benchmark archive database: Topics by Science.gov

Sample records for benchmark archive database

PDS: A Performance Database Server

DOE PAGES

Berry, Michael W.; Dongarra, Jack J.; Larose, Brian H.; ...

1994-01-01

The process of gathering, archiving, and distributing computer benchmark data is a cumbersome task usually performed by computer users and vendors with little coordination. Most important, there is no publicly available central depository of performance data for all ranges of machines from personal computers to supercomputers. We present an Internet-accessible performance database server (PDS) that can be used to extract current benchmark data and literature. As an extension to the X-Windows-based user interface (Xnetlib) to the Netlib archival system, PDS provides an on-line catalog of public domain computer benchmarks such as the LINPACK benchmark, Perfect benchmarks, and the NAS parallel benchmarks. PDS does not reformat or present the benchmark data in any way that conflicts with the original methodology of any particular benchmark; it is thereby devoid of any subjective interpretations of machine performance. We believe that all branches (research laboratories, academia, and industry) of the general computing community can use this facility to archive performance metrics and make them readily available to the public. PDS can provide a more manageable approach to the development and support of a large dynamic database of published performance metrics.
Validation of Shielding Analysis Capability of SuperMC with SINBAD

NASA Astrophysics Data System (ADS)

Chen, Chaobin; Yang, Qi; Wu, Bin; Han, Yuncheng; Song, Jing

2017-09-01

The shielding analysis capability of SuperMC was validated with the Shielding Integral Benchmark Archive Database (SINBAD). SINBAD, compiled by RSICC and NEA, includes numerous benchmark experiments performed with D-T fusion neutron source facilities such as OKTAVIAN, FNS, and IPPE. The results from SuperMC simulation were compared with experimental data and MCNP results. Very good agreement, with deviations below 1%, was achieved, suggesting that SuperMC is reliable for shielding calculations.
Montreal Archive of Sleep Studies: an open-access resource for instrument benchmarking and exploratory research.

PubMed

O'Reilly, Christian; Gosselin, Nadia; Carrier, Julie; Nielsen, Tore

2014-12-01

Manual processing of sleep recordings is extremely time-consuming. Efforts to automate this process have shown promising results, but automatic systems are generally evaluated on private databases, not allowing accurate cross-validation with other systems. Lacking a common benchmark, the relative performances of different systems are not compared easily and advances are compromised. To address this fundamental methodological impediment to sleep study, we propose an open-access database of polysomnographic biosignals. To build this database, whole-night recordings from 200 participants [97 males (aged 42.9 ± 19.8 years) and 103 females (aged 38.3 ± 18.9 years); age range: 18-76 years] were pooled from eight different research protocols performed in three different hospital-based sleep laboratories. All recordings feature a sampling frequency of 256 Hz and an electroencephalography (EEG) montage of 4-20 channels plus standard electro-oculography (EOG), electromyography (EMG), electrocardiography (ECG) and respiratory signals. Access to the database can be obtained through the Montreal Archive of Sleep Studies (MASS) website (http://www.ceams-carsm.ca/en/MASS), and requires only affiliation with a research institution and prior approval by the applicant's local ethical review board. Providing the research community with access to this free and open sleep database is expected to facilitate the development and cross-validation of sleep analysis automation systems. It is also expected that such a shared resource will be a catalyst for cross-centre collaborations on difficult topics such as improving inter-rater agreement on sleep stage scoring. © 2014 European Sleep Research Society.
Big heart data: advancing health informatics through data sharing in cardiovascular imaging.

PubMed

Suinesiaputra, Avan; Medrano-Gracia, Pau; Cowan, Brett R; Young, Alistair A

2015-07-01

The burden of heart disease is rapidly worsening due to the increasing prevalence of obesity and diabetes. Data sharing and open database resources for heart health informatics are important for advancing our understanding of cardiovascular function, disease progression and therapeutics. Data sharing enables valuable information, often obtained at considerable expense and effort, to be reused beyond the specific objectives of the original study. Many government funding agencies and journal publishers are requiring data reuse, and are providing mechanisms for data curation and archival. Tools and infrastructure are available to archive anonymous data from a wide range of studies, from descriptive epidemiological data to gigabytes of imaging data. Meta-analyses can be performed to combine raw data from disparate studies to obtain unique comparisons or to enhance statistical power. Open benchmark datasets are invaluable for validating data analysis algorithms and objectively comparing results. This review provides a rationale for increased data sharing and surveys recent progress in the cardiovascular domain. We also highlight the potential of recent large cardiovascular epidemiological studies enabling collaborative efforts to facilitate data sharing, algorithms benchmarking, disease modeling and statistical atlases.
Toward Automated Benchmarking of Atomistic Force Fields: Neat Liquid Densities and Static Dielectric Constants from the ThermoML Data Archive.

PubMed

Beauchamp, Kyle A; Behr, Julie M; Rustenburg, Ariën S; Bayly, Christopher I; Kroenlein, Kenneth; Chodera, John D

2015-10-08

Atomistic molecular simulations are a powerful way to make quantitative predictions, but the accuracy of these predictions depends entirely on the quality of the force field employed. Although experimental measurements of fundamental physical properties offer a straightforward approach for evaluating force field quality, the bulk of this information has been tied up in formats that are not machine-readable. Compiling benchmark data sets of physical properties from non-machine-readable sources requires substantial human effort and is prone to the accumulation of human errors, hindering the development of reproducible benchmarks of force-field accuracy. Here, we examine the feasibility of benchmarking atomistic force fields against the NIST ThermoML data archive of physicochemical measurements, which aggregates thousands of experimental measurements in a portable, machine-readable, self-annotating IUPAC-standard format. As a proof of concept, we present a detailed benchmark of the generalized Amber small-molecule force field (GAFF) using the AM1-BCC charge model against experimental measurements (specifically, bulk liquid densities and static dielectric constants at ambient pressure) automatically extracted from the archive and discuss the extent of data available for use in larger scale (or continuously performed) benchmarks. The results of even this limited initial benchmark highlight a general problem with fixed-charge force fields in the representation of low-dielectric environments, such as those seen in binding cavities or biological membranes.
The philosophy of benchmark testing a standards-based picture archiving and communications system.

PubMed

Richardson, N E; Thomas, J A; Lyche, D K; Romlein, J; Norton, G S; Dolecek, Q E

1999-05-01

The Department of Defense issued its requirements for a Digital Imaging Network-Picture Archiving and Communications System (DIN-PACS) in a Request for Proposals (RFP) to industry in January 1997, with subsequent contracts being awarded in November 1997 to the Agfa Division of Bayer and IBM Global Government Industry. The Government's technical evaluation process consisted of evaluating a written technical proposal as well as conducting a benchmark test of each proposed system at the vendor's test facility. The purpose of benchmark testing was to evaluate the performance of the fully integrated system in a simulated operational environment. The benchmark test procedures and test equipment were developed through a joint effort between the Government, academic institutions, and private consultants. Herein the authors discuss the resources required and the methods used to benchmark test a standards-based PACS.
Computational Chemistry Comparison and Benchmark Database

National Institute of Standards and Technology Data Gateway

SRD 101 NIST Computational Chemistry Comparison and Benchmark Database (Web, free access) The NIST Computational Chemistry Comparison and Benchmark Database is a collection of experimental and ab initio thermochemical properties for a selected set of molecules. The goals are to provide a benchmark set of molecules for the evaluation of ab initio computational methods and allow the comparison between different ab initio computational methods for the prediction of thermochemical properties.
Chemical annotation of small and peptide-like molecules at the Protein Data Bank

PubMed Central

Young, Jasmine Y.; Feng, Zukang; Dimitropoulos, Dimitris; Sala, Raul; Westbrook, John; Zhuravleva, Marina; Shao, Chenghua; Quesada, Martha; Peisach, Ezra; Berman, Helen M.

2013-01-01

Over the past decade, the number of polymers and their complexes with small molecules in the Protein Data Bank archive (PDB) has continued to increase significantly. To support scientific advancements and ensure the best quality and completeness of the data files over the next 10 years and beyond, the Worldwide PDB partnership that manages the PDB archive is developing a new deposition and annotation system. This system focuses on efficient data capture across all supported experimental methods. The new deposition and annotation system is composed of four major modules that together support all of the processing requirements for a PDB entry. In this article, we describe one such module called the Chemical Component Annotation Tool. This tool uses information from both the Chemical Component Dictionary and Biologically Interesting molecule Reference Dictionary to aid in annotation. Benchmark studies have shown that the Chemical Component Annotation Tool provides significant improvements in processing efficiency and data quality. Database URL: http://wwpdb.org PMID:24291661
3DNALandscapes: a database for exploring the conformational features of DNA.

PubMed

Zheng, Guohui; Colasanti, Andrew V; Lu, Xiang-Jun; Olson, Wilma K

2010-01-01

3DNALandscapes, located at: http://3DNAscapes.rutgers.edu, is a new database for exploring the conformational features of DNA. In contrast to most structural databases, which archive the Cartesian coordinates and/or derived parameters and images for individual structures, 3DNALandscapes enables searches of conformational information across multiple structures. The database contains a wide variety of structural parameters and molecular images, computed with the 3DNA software package and known to be useful for characterizing and understanding the sequence-dependent spatial arrangements of the DNA sugar-phosphate backbone, sugar-base side groups, base pairs, base-pair steps, groove structure, etc. The data comprise all DNA-containing structures--both free and bound to proteins, drugs and other ligands--currently available in the Protein Data Bank. The web interface allows the user to link, report, plot and analyze this information from numerous perspectives and thereby gain insight into DNA conformation, deformability and interactions in different sequence and structural contexts. The data accumulated from known, well-resolved DNA structures can serve as useful benchmarks for the analysis and simulation of new structures. The collective data can also help to understand how DNA deforms in response to proteins and other molecules and undergoes conformational rearrangements.
The mass storage testing laboratory at GSFC

NASA Technical Reports Server (NTRS)

Venkataraman, Ravi; Williams, Joel; Michaud, David; Gu, Heng; Kalluri, Atri; Hariharan, P. C.; Kobler, Ben; Behnke, Jeanne; Peavey, Bernard

1998-01-01

Industry-wide benchmarks exist for measuring the performance of processors (SPECmarks), and of database systems (Transaction Processing Council). Despite storage having become the dominant item in computing and IT (Information Technology) budgets, no such common benchmark is available in the mass storage field. Vendors and consultants provide services and tools for capacity planning and sizing, but these do not account for the complete set of metrics needed in today's archives. The availability of automated tape libraries, high-capacity RAID systems, and high-bandwidth interconnectivity between processor and peripherals has led to demands for services which traditional file systems cannot provide. File Storage and Management Systems (FSMS), which began to be marketed in the late 1980s, have helped to some extent with large tape libraries, but their use has introduced additional parameters affecting performance. The aim of the Mass Storage Test Laboratory (MSTL) at Goddard Space Flight Center is to develop a test suite that includes not only a comprehensive checklist to document a mass storage environment but also benchmark code. Benchmark code is being tested which will provide measurements for both baseline systems, i.e. applications interacting with peripherals through the operating system services, and for combinations involving an FSMS. The benchmarks are written in C, and are easily portable. They are initially being aimed at the UNIX Open Systems world. Measurements are being made using a Sun Ultra 170 Sparc with 256MB memory running Solaris 2.5.1 with the following configuration: 4mm tape stacker on SCSI 2 Fast/Wide; 4GB disk device on SCSI 2 Fast/Wide; and Sony Petaserve on Fast/Wide differential SCSI 2.
A Methodology for Benchmarking Relational Database Machines

DTIC Science & Technology

1984-01-01

user benchmarks is to compare the multiple users to the best-case performance The data for each query classification coll and the performance...called a benchmark. The term benchmark originates from the markers used by surveyors in establishing common reference points for their measure...formatted databases. In order to further simplify the problem, we restrict our study to those DBMs which support the relational model. A survey
Generating Shifting Workloads to Benchmark Adaptability in Relational Database Systems

NASA Astrophysics Data System (ADS)

Rabl, Tilmann; Lang, Andreas; Hackl, Thomas; Sick, Bernhard; Kosch, Harald

A large body of research concerns the adaptability of database systems. Many commercial systems already contain autonomic processes that adapt configurations as well as data structures and data organization. Yet there is virtually no way to measure the quality of such optimizations fairly. While standard benchmarks have been developed that simulate real-world database applications very precisely, none of them considers variations in workloads produced by human factors. Today’s benchmarks test the performance of database systems by measuring peak performance on homogeneous request streams. Nevertheless, in systems with user interaction, access patterns are constantly shifting. We present a benchmark that simulates a web information system with interaction of large user groups. It is based on the analysis of a real online eLearning management system with 15,000 users. The benchmark considers the temporal dependency of user interaction. The main focus is to measure the adaptability of a database management system to shifting workloads. We will give details on our design approach that uses sophisticated pattern analysis and data mining techniques.
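To make the contrast between a homogeneous request stream and a shifting one concrete, here is a minimal Python sketch. It is not the benchmark described in this record: the function name `shifting_workload`, the sinusoidal read/write mix, and the `period` parameter are all assumptions chosen purely for illustration.

```python
import math
import random


def shifting_workload(n_requests, period=1000, seed=0):
    """Yield (step, kind) pairs whose read/write mix drifts over time.

    Hypothetical model: the write probability follows a sine wave with
    the given period, oscillating between 0.1 and 0.9.
    """
    rng = random.Random(seed)  # seeded so that a run can be repeated exactly
    for step in range(n_requests):
        p_write = 0.5 + 0.4 * math.sin(2 * math.pi * step / period)
        kind = "write" if rng.random() < p_write else "read"
        yield step, kind


def write_fraction(requests):
    """Fraction of requests in the given list that are writes."""
    return sum(1 for _, kind in requests if kind == "write") / len(requests)


stream = list(shifting_workload(1000))
print("first half :", write_fraction(stream[:500]))
print("second half:", write_fraction(stream[500:]))
```

A system tuned for the first half of such a stream (write-heavy) faces a read-heavy mix in the second half, which is the kind of shift an adaptability measurement would need to exercise.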
Transaction Processing Performance Council (TPC): State of the Council 2010

NASA Astrophysics Data System (ADS)

Nambiar, Raghunath; Wakou, Nicholas; Carman, Forrest; Majdalany, Michael

The Transaction Processing Performance Council (TPC) is a non-profit corporation founded to define transaction processing and database benchmarks and to disseminate objective, verifiable performance data to the industry. Established in August 1988, the TPC has been integral in shaping the landscape of modern transaction processing and database benchmarks over the past twenty-two years. This paper provides an overview of the TPC's existing benchmark standards and specifications, introduces two new TPC benchmarks under development, and examines the TPC's active involvement in the early creation of additional future benchmarks.
The European Radiobiology Archives (ERA)--content, structure and use illustrated by an example.

PubMed

Gerber, G B; Wick, R R; Kellerer, A M; Hopewell, J W; Di Majo, V; Dudoignon, N; Gössner, W; Stather, J

2006-01-01

The European Radiobiology Archives (ERA), supported by the European Commission and the European Late Effect Project Group (EULEP), together with the US National Radiobiology Archives (NRA) and the Japanese Radiobiology Archives (JRA) have collected all information still available on long-term animal experiments, including some selected human studies. The archives consist of a database in Microsoft Access, a website, databases of references and information on the use of the database. At present, the archives contain a description of the exposure conditions, animal strains, etc. from approximately 350,000 individuals; data on survival and pathology are available from approximately 200,000 individuals. Care has been taken to render pathological diagnoses compatible among different studies and to allow the lumping of pathological diagnoses into more general classes. 'Forms' in Access with an underlying computer code facilitate the use of the database. This paper describes the structure and content of the archives and illustrates an example for a possible analysis of such data.
Water and carbon stable isotope records from natural archives: a new database and interactive online platform for data browsing, visualizing and downloading

NASA Astrophysics Data System (ADS)

Bolliet, Timothé; Brockmann, Patrick; Masson-Delmotte, Valérie; Bassinot, Franck; Daux, Valérie; Genty, Dominique; Landais, Amaelle; Lavrieux, Marlène; Michel, Elisabeth; Ortega, Pablo; Risi, Camille; Roche, Didier M.; Vimeux, Françoise; Waelbroeck, Claire

2016-08-01

Past climate is an important benchmark to assess the ability of climate models to simulate key processes and feedbacks. Numerous proxy records exist for stable isotopes of water and/or carbon, which are also implemented inside the components of a growing number of Earth system models. Model-data comparisons can help to constrain the uncertainties associated with transfer functions. This motivates the need to produce a comprehensive compilation of different proxy sources. We have put together a global database of proxy records of oxygen (δ18O), hydrogen (δD) and carbon (δ13C) stable isotopes from different archives: ocean and lake sediments, corals, ice cores, speleothems and tree-ring cellulose. Source records were obtained from the georeferenced open access PANGAEA and NOAA libraries, complemented by additional data obtained from a literature survey. About 3000 source records were screened for chronological information and temporal resolution of proxy records. Altogether, this database consists of hundreds of dated δ18O, δ13C and δD records in a standardized simple text format, complemented with a metadata Excel catalog. A quality control flag was implemented to describe age markers and inform on chronological uncertainty. This compilation effort highlights the need to homogenize and structure the format of datasets and chronological information as well as enhance the distribution of published datasets that are currently highly fragmented and scattered. We also provide an online portal based on the records included in this database with an intuitive and interactive platform (http://climateproxiesfinder.ipsl.fr/), allowing one to easily select, visualize and download subsets of the homogeneously formatted records that constitute this database, following a choice of search criteria, and to upload new datasets.
In the last part, we illustrate the type of application allowed by our database by comparing several key periods highly investigated by the paleoclimate community. For coherency with the Paleoclimate Modelling Intercomparison Project (PMIP), we focus on records spanning the past 200 years, the mid-Holocene (MH, 5.5-6.5 ka; calendar kiloyears before 1950), the Last Glacial Maximum (LGM, 19-23 ka), and those spanning the last interglacial period (LIG, 115-130 ka). Basic statistics have been applied to characterize anomalies between these different periods. Most changes from the MH to present day and from LIG to MH appear statistically insignificant. Significant global differences are reported from LGM to MH with regional discrepancies in signals from different archives and complex patterns.
Berkeley Lab - Materials Sciences Division

Science.gov Websites

Postdoc Forum Research Highlights Awards Publications Database Events Calendar Newsletter Archive People Database Events Calendar Newsletter Archive Send us your research highlights. Research Highlight Template (PowerPoint) Publications Database The MSD publications database has been updated to include all FY2014
A Benchmark and Comparative Study of Video-Based Face Recognition on COX Face Database.

PubMed

Huang, Zhiwu; Shan, Shiguang; Wang, Ruiping; Zhang, Haihong; Lao, Shihong; Kuerban, Alifu; Chen, Xilin

2015-12-01

Face recognition with still face images has been widely studied, while research on video-based face recognition remains relatively inadequate, especially in terms of benchmark datasets and comparisons. Real-world video-based face recognition applications require techniques for three distinct scenarios: 1) Video-to-Still (V2S); 2) Still-to-Video (S2V); and 3) Video-to-Video (V2V), respectively, taking video or still image as query or target. To the best of our knowledge, few datasets and evaluation protocols have been benchmarked for all three scenarios. In order to facilitate the study of this specific topic, this paper contributes a benchmarking and comparative study based on a newly collected still/video face database, named COX Face DB. Specifically, we make three contributions. First, we collect and release a large-scale still/video face database to simulate video surveillance with three different video-based face recognition scenarios (i.e., V2S, S2V, and V2V). Second, for benchmarking the three scenarios designed on our database, we review and experimentally compare a number of existing set-based methods. Third, we further propose a novel Point-to-Set Correlation Learning (PSCL) method, and experimentally show that it can be used as a promising baseline method for V2S/S2V face recognition on COX Face DB. Extensive experimental results clearly demonstrate that video-based face recognition needs more effort, and our COX Face DB is a good benchmark database for evaluation.
Planetary Data Archiving Plan at JAXA

NASA Astrophysics Data System (ADS)

Shinohara, Iku; Kasaba, Yasumasa; Yamamoto, Yukio; Abe, Masanao; Okada, Tatsuaki; Imamura, Takeshi; Sobue, Shinichi; Takashima, Takeshi; Terazono, Jun-Ya

After the successful rendezvous of Hayabusa with the asteroid Itokawa and the successful launch of Kaguya to the Moon, the Japanese planetary science community has obtained its own full-scale data. However, at this moment, these datasets are only available from the data sites managed by each mission team. The databases are individually constructed in different formats, and the user interfaces of these data sites are not compatible with foreign databases. To improve the usability of the planetary archives at JAXA and to enable smooth international data exchange, we are investigating the creation of a new planetary database. Within the coming decade, Japan will have rich datasets in planetary science from missions to Venus (Planet-C) and Mercury (BepiColombo) and from several missions in the planning phase (small bodies). In order to strongly assist international scientific collaboration using these mission archive data, the planned planetary data archive at JAXA should be managed in a unified manner and the database should be constructed according to international planetary database standards. In this presentation, we will show the current status and future plans of planetary data archiving at JAXA.

Archive of mass spectral data files on recordable CD-ROMs and creation and maintenance of a searchable computerized database.

PubMed

Amick, G D

1999-01-01

A database containing names of mass spectral data files generated in a forensic toxicology laboratory and two Microsoft Visual Basic programs to maintain and search this database are described. The data files (approximately 0.5 KB/each) were collected from six mass spectrometers during routine casework. Data files were archived on 650 MB (74 min) recordable CD-ROMs. Each recordable CD-ROM was given a unique name, and its list of data file names was placed into the database. The present manuscript describes the use of search and maintenance programs for searching and routine upkeep of the database and creation of CD-ROMs for archiving of data files.
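The catalog idea in this record (one entry per data file, keyed by the name of the disc that holds it) can be sketched in a few lines. The sketch below uses Python with SQLite rather than the Visual Basic programs the record mentions, and the table layout, function names, disc labels, and file names are all hypothetical.

```python
import sqlite3


def open_catalog(path=":memory:"):
    """Open (or create) a catalog that maps disc names to data file names."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS files ("
        " disc TEXT NOT NULL,"
        " filename TEXT NOT NULL,"
        " UNIQUE (disc, filename))"
    )
    return conn


def add_disc(conn, disc, filenames):
    """Maintenance step: register every file name written to one disc."""
    with conn:  # commits on success, rolls back on error
        conn.executemany(
            "INSERT OR IGNORE INTO files (disc, filename) VALUES (?, ?)",
            [(disc, name) for name in filenames],
        )


def find(conn, fragment):
    """Search step: return (disc, filename) pairs whose name contains fragment."""
    return conn.execute(
        "SELECT disc, filename FROM files WHERE filename LIKE ?"
        " ORDER BY disc, filename",
        (f"%{fragment}%",),
    ).fetchall()


catalog = open_catalog()
add_disc(catalog, "DISC-001", ["case0421.d", "case0422.d"])  # made-up names
add_disc(catalog, "DISC-002", ["case0501.d"])
print(find(catalog, "0422"))
```

Only the file names are stored in the database; the search result tells the analyst which disc to load to retrieve the data itself.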
Benchmarking Using Basic DBMS Operations

NASA Astrophysics Data System (ADS)

Crolotte, Alain; Ghazal, Ahmad

The TPC-H benchmark proved to be successful in the decision support area. Many commercial database vendors and their related hardware vendors used this benchmark to show the superiority and competitive edge of their products. However, over time, TPC-H became less representative of industry trends as vendors kept tuning their databases to this benchmark-specific workload. In this paper, we present XMarq, a simple benchmark framework that can be used to compare various software/hardware combinations. Our benchmark model is currently composed of 25 queries that measure the performance of basic operations such as scans, aggregations, joins and index access. This benchmark model is based on the TPC-H data model due to its maturity and well-understood data generation capability. We also propose metrics to evaluate single-system performance and compare two systems. Finally we illustrate the effectiveness of this model by showing experimental results comparing two systems under different conditions.
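As a rough illustration of timing basic operations one at a time, the following Python sketch runs one query per operation class (scan, aggregation, join, index access) against an in-memory SQLite database. The two-table schema and the four queries are hypothetical stand-ins; they are not the XMarq queries and do not follow the TPC-H data model.

```python
import sqlite3
import time

# Hypothetical toy queries, one per basic operation class.
QUERIES = {
    "scan": "SELECT COUNT(*) FROM orders WHERE amount > 50",
    "aggregation": "SELECT customer_id, SUM(amount) FROM orders"
                   " GROUP BY customer_id",
    "join": "SELECT c.region, COUNT(*) FROM orders o"
            " JOIN customers c ON c.id = o.customer_id GROUP BY c.region",
    "index_access": "SELECT amount FROM orders WHERE id = 1234",
}


def make_db(n_orders=5000, n_customers=100):
    """Build a small in-memory database with deterministic contents."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, region TEXT)")
    conn.execute(
        "CREATE TABLE orders ("
        " id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL)"
    )
    conn.executemany(
        "INSERT INTO customers VALUES (?, ?)",
        [(i, f"region{i % 5}") for i in range(n_customers)],
    )
    conn.executemany(
        "INSERT INTO orders VALUES (?, ?, ?)",
        [(i, i % n_customers, float(i % 100)) for i in range(n_orders)],
    )
    conn.commit()
    return conn


def time_queries(conn, repeats=3):
    """Return the best wall-clock time in seconds for each named query."""
    results = {}
    for name, sql in QUERIES.items():
        best = float("inf")
        for _ in range(repeats):
            start = time.perf_counter()
            conn.execute(sql).fetchall()
            best = min(best, time.perf_counter() - start)
        results[name] = best
    return results


for name, seconds in time_queries(make_db()).items():
    print(f"{name:13s} {seconds * 1e3:.3f} ms")
```

Running the same script against two configurations gives per-operation timings that can be compared side by side; how to combine them into a single metric is left open here.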
The NSO FTS database program and archive (FTSDBM)

NASA Technical Reports Server (NTRS)

Lytle, D. M.

1992-01-01

Data from the NSO Fourier transform spectrometer is being re-archived from half-inch tape onto write-once compact disk. In the process, information about each spectrum and a low resolution copy of each spectrum is being saved into an on-line database. FTSDBM is a simple database management program in the NSO external package for IRAF. A command language allows the FTSDBM user to add entries to the database, delete entries, select subsets from the database based on keyword values including ranges of values, create new database files based on these subsets, make keyword lists, examine low resolution spectra graphically, and make disk number/file number lists. Once the archive is complete, FTSDBM will allow the database to be efficiently searched for data of interest to the user and the compact disk format will allow random access to that data.
Astronomical Archive at Tartu Observatory

NASA Astrophysics Data System (ADS)

Annuk, K.

2007-10-01

Archiving astronomical data is an important task not only at large observatories but also at small ones. Here we describe the astronomical archive at Tartu Observatory. The archive consists of old photographic plate images, photographic spectrograms, CCD direct images and CCD spectroscopic data. The photographic plate digitizing project was started in 2005. An on-line database (based on MySQL) was created. The database includes CCD data as well as photographic data. A PHP-MySQL interface was written for access to all data.
The BioGRID interaction database: 2013 update.

PubMed

Chatr-Aryamontri, Andrew; Breitkreutz, Bobby-Joe; Heinicke, Sven; Boucher, Lorrie; Winter, Andrew; Stark, Chris; Nixon, Julie; Ramage, Lindsay; Kolas, Nadine; O'Donnell, Lara; Reguly, Teresa; Breitkreutz, Ashton; Sellam, Adnane; Chen, Daici; Chang, Christie; Rust, Jennifer; Livstone, Michael; Oughtred, Rose; Dolinski, Kara; Tyers, Mike

2013-01-01

The Biological General Repository for Interaction Datasets (BioGRID: http://thebiogrid.org) is an open access archive of genetic and protein interactions that are curated from the primary biomedical literature for all major model organism species. As of September 2012, BioGRID houses more than 500 000 manually annotated interactions from more than 30 model organisms. BioGRID maintains complete curation coverage of the literature for the budding yeast Saccharomyces cerevisiae, the fission yeast Schizosaccharomyces pombe and the model plant Arabidopsis thaliana. A number of themed curation projects in areas of biomedical importance are also supported. BioGRID has established collaborations and/or shares data records for the annotation of interactions and phenotypes with most major model organism databases, including Saccharomyces Genome Database, PomBase, WormBase, FlyBase and The Arabidopsis Information Resource. BioGRID also actively engages with the text-mining community to benchmark and deploy automated tools to expedite curation workflows. BioGRID data are freely accessible through both a user-defined interactive interface and in batch downloads in a wide variety of formats, including PSI-MI2.5 and tab-delimited files. BioGRID records can also be interrogated and analyzed with a series of new bioinformatics tools, which include a post-translational modification viewer, a graphical viewer, a REST service and a Cytoscape plugin.
The Role of Data Archives in Synoptic Solar Physics

NASA Astrophysics Data System (ADS)

Reardon, Kevin

The detailed study of solar cycle variations requires analysis of recorded datasets spanning many years of observations, that is, a data archive. The use of digital data, combined with powerful database server software, gives such archives new capabilities to provide, quickly and flexibly, selected pieces of information to scientists. Use of standardized protocols will allow multiple databases, independently maintained, to be seamlessly joined, allowing complex searches spanning multiple archives. These data archives also benefit from being developed in parallel with the telescope itself, which helps to assure data integrity and to provide close integration between the telescope and archive. Development of archives that can guarantee long-term data availability and strong compatibility with other projects makes solar-cycle studies easier to plan and realize.
CERES Search and Subset Tool

Atmospheric Science Data Center

2016-06-24

... data granules using a high resolution spatial metadata database and directly accessing the archived data granules. Subset results are ...
Using ontology databases for scalable query answering, inconsistency detection, and data integration

PubMed Central

Dou, Dejing

2011-01-01

An ontology database is a basic relational database management system that models an ontology plus its instances. To reason over the transitive closure of instances in the subsumption hierarchy, for example, an ontology database can either unfold views at query time or propagate assertions using triggers at load time. In this paper, we use existing benchmarks to evaluate our method—using triggers—and we demonstrate that by forward computing inferences, we not only improve query time, but the improvement appears to cost only more space (not time). However, we go on to show that the true penalties were simply opaque to the benchmark, i.e., the benchmark inadequately captures load-time costs. We have applied our methods to two case studies in biomedicine, using ontologies and data from genetics and neuroscience to illustrate two important applications: first, ontology databases answer ontology-based queries effectively; second, using triggers, ontology databases detect instance-based inconsistencies—something not possible using views. Finally, we demonstrate how to extend our methods to perform data integration across multiple, distributed ontology databases. PMID:22163378
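The load-time strategy described in this abstract (propagating assertions with triggers so that queries need no hierarchy traversal) can be illustrated with a minimal in-memory sketch. This is not the authors' implementation: the `OntologyStore` class, its method names, and the tiny class hierarchy are all hypothetical, and a Python method stands in for a database trigger.

```python
from collections import defaultdict


class OntologyStore:
    """Toy in-memory stand-in for an ontology database (hypothetical API)."""

    def __init__(self, subclass_of):
        # subclass_of maps each class name to its direct superclasses.
        self.subclass_of = subclass_of
        self.instances = defaultdict(set)

    def ancestors(self, cls):
        """Return every transitive superclass of cls."""
        seen = set()
        stack = list(self.subclass_of.get(cls, ()))
        while stack:
            parent = stack.pop()
            if parent not in seen:
                seen.add(parent)
                stack.extend(self.subclass_of.get(parent, ()))
        return seen

    def insert(self, individual, cls):
        """Load-time 'trigger': store the assertion and all inferred supertypes."""
        self.instances[cls].add(individual)
        for parent in self.ancestors(cls):
            self.instances[parent].add(individual)

    def query(self, cls):
        """Query time is a plain lookup over precomputed assertions."""
        return self.instances[cls]


# Hypothetical hierarchy: Neuron is a Cell, Cell is an AnatomicalEntity.
hierarchy = {"Neuron": ["Cell"], "Cell": ["AnatomicalEntity"]}
store = OntologyStore(hierarchy)
store.insert("n1", "Neuron")
store.insert("c7", "Cell")
```

The trade-off matches the one the abstract names: each insert does extra work and stores extra rows, so queries become cheap while load time and space grow.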
AccuNet/AP (Associated Press) Multimedia Archive

ERIC Educational Resources Information Center

Young, Terrence E., Jr.

2004-01-01

The AccuNet/AP Multimedia Archive is an electronic library containing the AP's current photos and a selection of pictures from their enormous print and negative library, as well as text and graphic material. It is composed of two photo databases as well as graphics, text, and audio databases. The features of this database are briefly described in…
Ionospheric characteristics for archiving at the World Data Centers. Technical report

DOE Office of Scientific and Technical Information (OSTI.GOV)

Gamache, R.R.; Reinisch, B.W.

1990-12-01

A database structure for archiving ionospheric characteristics at uneven data rates was developed at the July 1989 Ionospheric Informatics Working Group (IIWG) Lowell Workshop in Digital Ionogram Data Formats for World Data Center Archiving. This structure is proposed as a new URSI standard and is being employed by the World Data Center A for solar terrestrial physics for archiving characteristics. Here the database has been slightly refined for the application and programs written to generate these database files using as input Digisonde 256 ARTIST data, post processed by the ULCAR ADEP (ARTIST Data Editing Program) system. The characteristics program as well as supplemental programs developed for this task are described here. The new software will make it possible to archive the ionospheric characteristics from the Geophysics Laboratory high latitude Digisonde network, the AWS DISS and the international Digisonde networks, and other ionospheric sounding networks.
2016 update of the PRIDE database and its related tools

PubMed Central

Vizcaíno, Juan Antonio; Csordas, Attila; del-Toro, Noemi; Dianes, José A.; Griss, Johannes; Lavidas, Ilias; Mayer, Gerhard; Perez-Riverol, Yasset; Reisinger, Florian; Ternent, Tobias; Xu, Qing-Wei; Wang, Rui; Hermjakob, Henning

2016-01-01

The PRoteomics IDEntifications (PRIDE) database is one of the world-leading data repositories of mass spectrometry (MS)-based proteomics data. Since the beginning of 2014, PRIDE Archive (http://www.ebi.ac.uk/pride/archive/) is the new PRIDE archival system, replacing the original PRIDE database. Here we summarize the developments in PRIDE resources and related tools since the previous update manuscript in the Database Issue in 2013. PRIDE Archive constitutes a complete redevelopment of the original PRIDE, comprising a new storage backend, data submission system and web interface, among other components. PRIDE Archive supports the most-widely used PSI (Proteomics Standards Initiative) data standard formats (mzML and mzIdentML) and implements the data requirements and guidelines of the ProteomeXchange Consortium. The wide adoption of ProteomeXchange within the community has triggered an unprecedented increase in the number of submitted data sets (around 150 data sets per month). We outline some statistics on the current PRIDE Archive data contents. We also report on the status of the PRIDE related stand-alone tools: PRIDE Inspector, PRIDE Converter 2 and the ProteomeXchange submission tool. Finally, we will give a brief update on the resources under development ‘PRIDE Cluster’ and ‘PRIDE Proteomes’, which provide a complementary view and quality-scored information of the peptide and protein identification data available in PRIDE Archive. PMID:26527722
The Craft of Benchmarking: Finding and Utilizing District-Level, Campus-Level, and Program-Level Standards.

ERIC Educational Resources Information Center

McGregor, Ellen N.; Attinasi, Louis C., Jr.

This paper describes the processes involved in selecting peer institutions for appropriate benchmarking using national databases (NCES-IPEDS). Benchmarking involves the identification of peer institutions and/or best practices in specific operational areas for the purpose of developing standards. The benchmarking process was borne in the early…
Simple re-instantiation of small databases using cloud computing.

PubMed

Tan, Tin Wee; Xie, Chao; De Silva, Mark; Lim, Kuan Siong; Patro, C Pawan K; Lim, Shen Jean; Govindarajan, Kunde Ramamoorthy; Tong, Joo Chuan; Choo, Khar Heng; Ranganathan, Shoba; Khan, Asif M

2013-01-01

Small bioinformatics databases, unlike institutionally funded large databases, are vulnerable to discontinuation and many reported in publications are no longer accessible. This leads to irreproducible scientific work and redundant effort, impeding the pace of scientific progress. We describe a Web-accessible system, available online at http://biodb100.apbionet.org, for archival and future on demand re-instantiation of small databases within minutes. Depositors can rebuild their databases by downloading a Linux live operating system (http://www.bioslax.com), preinstalled with bioinformatics and UNIX tools. The database and its dependencies can be compressed into an ".lzm" file for deposition. End-users can search for archived databases and activate them on dynamically re-instantiated BioSlax instances, run as virtual machines over the two popular full virtualization standard cloud-computing platforms, Xen Hypervisor or vSphere. The system is adaptable to increasing demand for disk storage or computational load and allows database developers to use the re-instantiated databases for integration and development of new databases. Herein, we demonstrate that a relatively inexpensive solution can be implemented for archival of bioinformatics databases and their rapid re-instantiation should the live databases disappear.
Simple re-instantiation of small databases using cloud computing

PubMed Central

2013-01-01

Background Small bioinformatics databases, unlike institutionally funded large databases, are vulnerable to discontinuation and many reported in publications are no longer accessible. This leads to irreproducible scientific work and redundant effort, impeding the pace of scientific progress. Results We describe a Web-accessible system, available online at http://biodb100.apbionet.org, for archival and future on demand re-instantiation of small databases within minutes. Depositors can rebuild their databases by downloading a Linux live operating system (http://www.bioslax.com), preinstalled with bioinformatics and UNIX tools. The database and its dependencies can be compressed into an ".lzm" file for deposition. End-users can search for archived databases and activate them on dynamically re-instantiated BioSlax instances, run as virtual machines over the two popular full virtualization standard cloud-computing platforms, Xen Hypervisor or vSphere. The system is adaptable to increasing demand for disk storage or computational load and allows database developers to use the re-instantiated databases for integration and development of new databases. Conclusions Herein, we demonstrate that a relatively inexpensive solution can be implemented for archival of bioinformatics databases and their rapid re-instantiation should the live databases disappear. PMID:24564380
NVST Data Archiving System Based On FastBit NoSQL Database

NASA Astrophysics Data System (ADS)

Liu, Ying-bo; Wang, Feng; Ji, Kai-fan; Deng, Hui; Dai, Wei; Liang, Bo

2014-06-01

The New Vacuum Solar Telescope (NVST) is a 1-meter vacuum solar telescope that aims to observe the fine structures of active regions on the Sun. The main tasks of the NVST are high resolution imaging and spectral observations, including measurements of the solar magnetic field. The NVST has collected more than 20 million FITS files since it began routine observations in 2012 and produces up to 120 thousand files in a day. Given the large number of files, their effective archiving and retrieval becomes a critical and urgent problem. In this study, we implement a new data archiving system for the NVST based on the FastBit Not Only Structured Query Language (NoSQL) database. Compared with a relational database (i.e., MySQL; My Structured Query Language), the FastBit database shows distinct advantages in indexing and querying performance. In a large scale database of 40 million records, the multi-field combined query response time of the FastBit database is about 15 times faster and fully meets the requirements of the NVST. Our study offers a new approach to massive astronomical data archiving and should contribute to the design of data management systems for other astronomical telescopes.
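FastBit is a bitmap-index library, and the multi-field combined query mentioned above is the kind of workload that bitmap indexes suit well. The following is a minimal sketch of that general idea only, not of FastBit's API or of the NVST system: the `BitmapIndex` class and the `date` and `band` field names are hypothetical, and Python integers stand in for the compressed bitmaps a real library would use.

```python
from collections import defaultdict


class BitmapIndex:
    """Toy uncompressed bitmap index over a list of dict records (illustrative only)."""

    def __init__(self, records, fields):
        self.records = records
        # index[field][value] is an int whose bit i is set when record i has that value.
        self.index = {f: defaultdict(int) for f in fields}
        for i, rec in enumerate(records):
            for f in fields:
                self.index[f][rec[f]] |= 1 << i

    def query(self, **conditions):
        """Answer a multi-field equality query by ANDing one bitmap per condition."""
        bits = (1 << len(self.records)) - 1  # start with every record selected
        for field, value in conditions.items():
            bits &= self.index[field].get(value, 0)
        return [i for i in range(len(self.records)) if (bits >> i) & 1]


# Hypothetical FITS header metadata; field names are assumptions for this sketch.
records = [
    {"date": "2013-05-01", "band": "TiO"},
    {"date": "2013-05-01", "band": "Ha"},
    {"date": "2013-05-01", "band": "TiO"},
    {"date": "2013-05-02", "band": "TiO"},
]
idx = BitmapIndex(records, ["date", "band"])
matches = idx.query(date="2013-05-01", band="TiO")
```

Each extra condition costs one bitwise AND over the whole record set, which is why combining several fields stays cheap in this scheme.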
The DNA Data Bank of Japan launches a new resource, the DDBJ Omics Archive of functional genomics experiments.

PubMed

Kodama, Yuichi; Mashima, Jun; Kaminuma, Eli; Gojobori, Takashi; Ogasawara, Osamu; Takagi, Toshihisa; Okubo, Kousaku; Nakamura, Yasukazu

2012-01-01

The DNA Data Bank of Japan (DDBJ; http://www.ddbj.nig.ac.jp) maintains and provides archival, retrieval and analytical resources for biological information. The central DDBJ resource consists of public, open-access nucleotide sequence databases including raw sequence reads, assembly information and functional annotation. Database content is exchanged with EBI and NCBI within the framework of the International Nucleotide Sequence Database Collaboration (INSDC). In 2011, DDBJ launched two new resources: the 'DDBJ Omics Archive' (DOR; http://trace.ddbj.nig.ac.jp/dor) and BioProject (http://trace.ddbj.nig.ac.jp/bioproject). DOR is an archival database of functional genomics data generated by microarray and highly parallel new generation sequencers. Data are exchanged between the ArrayExpress at EBI and DOR in the common MAGE-TAB format. BioProject provides an organizational framework to access metadata about research projects and the data from the projects that are deposited into different databases. In this article, we describe major changes and improvements introduced to the DDBJ services, and the launch of two new resources: DOR and BioProject.
Benchmarking database performance for genomic data.

PubMed

Khushi, Matloob

2015-06-01

Genomic regions represent features such as gene annotations, transcription factor binding sites and epigenetic modifications. Genomic operations such as identifying overlapping/non-overlapping regions or nearest gene annotations are common research needs. The data can be saved in a database system for easy management; however, there is at present no comprehensive built-in database algorithm to identify overlapping regions. Therefore I have developed a novel region-mapping (RegMap) SQL-based algorithm to perform genomic operations and have benchmarked the performance of different databases. Benchmarking identified that PostgreSQL extracts overlapping regions much faster than MySQL. Insertion and data uploads in PostgreSQL were also better, although the general searching capability of both databases was almost equivalent. In addition, using the algorithm, pair-wise overlaps of >1000 datasets of transcription factor binding sites and histone marks, collected from previous publications, were reported and it was found that HNF4G significantly co-locates with cohesin subunit STAG1 (SA1). © 2015 Wiley Periodicals, Inc.
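The overlap operation that this abstract benchmarks can be stated compactly. The sketch below is not the RegMap algorithm, which the abstract does not specify; it is a naive pairwise check under the assumption that regions are `(chromosome, start, end)` tuples with inclusive coordinates, and the function names and sample data are hypothetical.

```python
def overlaps(a, b):
    """True when two (chrom, start, end) regions share at least one position.

    Coordinates are assumed inclusive on both ends.
    """
    chrom_a, start_a, end_a = a
    chrom_b, start_b, end_b = b
    return chrom_a == chrom_b and start_a <= end_b and start_b <= end_a


def overlapping_pairs(regions_a, regions_b):
    """Naive O(n*m) scan returning every overlapping pair across two region sets."""
    return [(a, b) for a in regions_a for b in regions_b if overlaps(a, b)]


# Hypothetical example data: binding sites versus histone marks.
tf_sites = [("chr1", 100, 200), ("chr1", 500, 600), ("chr2", 100, 200)]
marks = [("chr1", 150, 300), ("chr2", 250, 400)]
pairs = overlapping_pairs(tf_sites, marks)
```

A database implementation would typically express the same predicate as a join condition and rely on indexes to avoid the full pairwise scan.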
Qualitative Comparison of IGRA and ESRL Radiosonde Archived Databases

NASA Technical Reports Server (NTRS)

Walker, John R.

2014-01-01

Multiple databases of atmospheric profile information are freely available to individuals and groups such as the Natural Environments group. Two of the primary database archives provided by NOAA that are most frequently used are those from the Earth Science Research Laboratory (ESRL) and the Integrated Global Radiosonde Archive (IGRA). Inquiries have been made as to why one database is used as opposed to the other, yet to the best of our knowledge, no formal comparison has been performed. The goal of this study is to provide a qualitative comparison of the ESRL and IGRA radiosonde databases. For part of this analysis, 14 upper air observation sites were selected. These sites all have the common attribute of having been used or being planned for use in the development of Range Reference Atmospheres (RRAs) in support of NASA's and DOD's current and future goals.
Constructing Benchmark Databases and Protocols for Medical Image Analysis: Diabetic Retinopathy

PubMed Central

Kauppi, Tomi; Kämäräinen, Joni-Kristian; Kalesnykiene, Valentina; Sorri, Iiris; Uusitalo, Hannu; Kälviäinen, Heikki

2013-01-01

We address the performance evaluation practices for developing medical image analysis methods, in particular, how to establish and share databases of medical images with verified ground truth and solid evaluation protocols. Such databases support the development of better algorithms, execution of profound method comparisons, and, consequently, technology transfer from research laboratories to clinical practice. For this purpose, we propose a framework consisting of reusable methods and tools for the laborious task of constructing a benchmark database. We provide a software tool for medical image annotation helping to collect class label, spatial span, and expert's confidence on lesions and a method to appropriately combine the manual segmentations from multiple experts. The tool and all necessary functionality for method evaluation are provided as public software packages. As a case study, we utilized the framework and tools to establish the DiaRetDB1 V2.1 database for benchmarking diabetic retinopathy detection algorithms. The database contains a set of retinal images, ground truth based on information from multiple experts, and a baseline algorithm for the detection of retinopathy lesions. PMID:23956787
A MySQL Based EPICS Archiver

DOE Office of Scientific and Technical Information (OSTI.GOV)

Christopher Slominski

2009-10-01

Archiving a large fraction of the EPICS signals within the Jefferson Lab (JLAB) Accelerator control system is vital for postmortem and real-time analysis of the accelerator performance. This analysis is performed on a daily basis by scientists, operators, engineers, technicians, and software developers. Archiving poses unique challenges due to the magnitude of the control system. A MySQL Archiving system (Mya) was developed to scale to the needs of the control system; currently archiving 58,000 EPICS variables, updating at a rate of 11,000 events per second. In addition to the large collection rate, retrieval of the archived data must also be fast and robust. Archived data retrieval clients obtain data at a rate over 100,000 data points per second. Managing the data in a relational database provides a number of benefits. This paper describes an archiving solution that uses an open source database and standard off the shelf hardware to reach high performance archiving needs. Mya has been in production at Jefferson Lab since February of 2007.

Toward benchmarking in catalysis science: Best practices, challenges, and opportunities

DOE Office of Scientific and Technical Information (OSTI.GOV)

Bligaard, Thomas; Bullock, R. Morris; Campbell, Charles T.

Benchmarking is a community-based and (preferably) community-driven activity involving consensus-based decisions on how to make reproducible, fair, and relevant assessments. In catalysis science, important catalyst performance metrics include activity, selectivity, and the deactivation profile, which enable comparisons between new and standard catalysts. Benchmarking also requires careful documentation, archiving, and sharing of methods and measurements, to ensure that the full value of research data can be realized. Beyond these goals, benchmarking presents unique opportunities to advance and accelerate understanding of complex reaction systems by combining and comparing experimental information from multiple, in situ and operando techniques with theoretical insights derived from calculations characterizing model systems. This Perspective describes the origins and uses of benchmarking and its applications in computational catalysis, heterogeneous catalysis, molecular catalysis, and electrocatalysis. As a result, it also discusses opportunities and challenges for future developments in these fields.
Toward benchmarking in catalysis science: Best practices, challenges, and opportunities

DOE PAGES

Bligaard, Thomas; Bullock, R. Morris; Campbell, Charles T.; ...

2016-03-07

Benchmarking is a community-based and (preferably) community-driven activity involving consensus-based decisions on how to make reproducible, fair, and relevant assessments. In catalysis science, important catalyst performance metrics include activity, selectivity, and the deactivation profile, which enable comparisons between new and standard catalysts. Benchmarking also requires careful documentation, archiving, and sharing of methods and measurements, to ensure that the full value of research data can be realized. Beyond these goals, benchmarking presents unique opportunities to advance and accelerate understanding of complex reaction systems by combining and comparing experimental information from multiple, in situ and operando techniques with theoretical insights derived from calculations characterizing model systems. This Perspective describes the origins and uses of benchmarking and its applications in computational catalysis, heterogeneous catalysis, molecular catalysis, and electrocatalysis. As a result, it also discusses opportunities and challenges for future developments in these fields.
First results of MAO NASU SS bodies photographic archive digitizing

NASA Astrophysics Data System (ADS)

Pakuliak, L.; Andruk, V.; Shatokhina, S.; Golovnya, V.; Yizhakevych, O.; Kulyk, I.

2013-05-01

The MAO NASU glass archive contains about 1800 photographic plates with planets and their satellites (including nearly 80 images of Uranus, Pluto and Neptune), about 1700 plates with minor planets and about 900 plates with comets. Plates were made during 1949-1999 using 11 telescopes of different focus, mostly the Double Wide-angle Astrograph (F/D=2000/400) and the Double Long-focus Astrograph (F/D=5500/400) of MAO NASU. Observational sites are Kyiv, Lviv (Ukraine), Biurakan (Armenia), Abastumani (Georgia), Mt. Maidanak (Uzbekistan), Quito (Ecuador). Tables contain data about the most significant numbers of plates sub-divided by years and objects. The database with metadata of plates (DBGPA) is available on the computer cluster of MAO (http://gua.db.ukr-vo.org) via open access. The database accumulates archives of four Ukrainian observatories, involving the UkrVO national project. Together with the archive managing system, the database serves as a test area for JDA - Joint Digital Archive - the core of the UkrVO.
The HARPS-N archive through a Cassandra, NoSQL database suite?

NASA Astrophysics Data System (ADS)

Molinari, Emilio; Guerra, Jose; Harutyunyan, Avet; Lodi, Marcello; Martin, Adrian

2016-07-01

The TNG-INAF is developing the science archive for the WEAVE instrument. The underlying architecture of the archive is based on a non-relational database, more precisely, on an Apache Cassandra cluster, which uses NoSQL technology. In order to test and validate the use of this architecture, we created a local archive which we populated with all the HARPS-N spectra collected at the TNG since the instrument's start of operations in mid-2012, as well as developed tools for the analysis of this data set. The HARPS-N data set is two orders of magnitude smaller than WEAVE, but we want to demonstrate the ability to walk through a complete data set and produce scientific output, as valuable as that produced by an ordinary pipeline, though without accessing directly the FITS files. The analytics is done by Apache Solr and Spark and on a relational PostgreSQL database. As an example, we produce observables like metallicity indexes for the targets in the archive and compare the results with the ones coming from the HARPS-N regular data reduction software. The aim of this experiment is to explore the viability of a high availability cluster and distributed NoSQL database as a platform for complex scientific analytics on a large data set, which will then be ported to the WEAVE Archive System (WAS) which we are developing for the WEAVE multi object, fiber spectrograph.
Mean velocity and turbulence measurements in a 90 deg curved duct with thin inlet boundary layer

NASA Technical Reports Server (NTRS)

Crawford, R. A.; Peters, C. E.; Steinhoff, J.; Hornkohl, J. O.; Nourinejad, J.; Ramachandran, K.

1985-01-01

The experimental database established by this investigation of the flow in a large rectangular turning duct is of benchmark quality. The experimental Reynolds numbers, Dean numbers and boundary layer characteristics are significantly different from previous benchmark curved-duct experimental parameters. This investigation extends the experimental database to higher Reynolds number and thinner entrance boundary layers. The 5% to 10% thick boundary layers, based on duct half-width, result in a large region of near-potential flow in the duct core surrounded by developing boundary layers with large crossflows. The turbulent entrance boundary layer case at R_ed = 328,000 provides an incompressible flowfield which approaches real turbine blade cascade characteristics. The results of this investigation provide a challenging benchmark database for computational fluid dynamics code development.
Ray Modeling Methods for Range Dependent Ocean Environments

DTIC Science & Technology

1983-12-01

the eikonal equation, gives rise to equations for ray paths which are perpendicular to the wave fronts. Equation II.4, the transport equation, leads... databases for use by MEDUSA. The author has assisted in the installation of MEDUSA at computer facilities which possess databases containing archives of...sound velocity profiles, bathymetry, and bottom loss data. At each computer site, programs convert the archival data retrieved by the database system
NASA Records Database

NASA Technical Reports Server (NTRS)

Callac, Christopher; Lunsford, Michelle

2005-01-01

The NASA Records Database, comprising a Web-based application program and a database, is used to administer an archive of paper records at Stennis Space Center. The system begins with an electronic form, into which a user enters information about records that the user is sending to the archive. The form is smart: it provides instructions for entering information correctly and prompts the user to enter all required information. Once complete, the form is digitally signed and submitted to the database. The system determines which storage locations are not in use, assigns the user's boxes of records to some of them, and enters these assignments in the database. Thereafter, the software tracks the boxes and can be used to locate them. By use of search capabilities of the software, specific records can be sought by box storage locations, accession numbers, record dates, submitting organizations, or details of the records themselves. Boxes can be marked with such statuses as checked out, lost, transferred, and destroyed. The system can generate reports showing boxes awaiting destruction or transfer. When boxes are transferred to the National Archives and Records Administration (NARA), the system can automatically fill out NARA records-transfer forms. Currently, several other NASA Centers are considering deploying the NASA Records Database to help automate their records archives.
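The storage-assignment step described above (find unused locations, assign incoming boxes to some of them, record the assignments) might be sketched as follows. This is only an illustration under simple assumptions: the function name, the location labels and the box identifiers are hypothetical and are not taken from the NASA system.

```python
def assign_boxes(all_locations, occupied, boxes):
    """Assign each incoming box to a free storage location.

    all_locations: ordered list of location labels.
    occupied: set of labels already in use; updated in place.
    boxes: list of box identifiers to place.
    Returns a dict mapping box identifier to assigned location.
    """
    free = [loc for loc in all_locations if loc not in occupied]
    if len(boxes) > len(free):
        raise ValueError("not enough free storage locations")
    assignments = dict(zip(boxes, free))
    occupied.update(assignments.values())  # record the new assignments
    return assignments


# Hypothetical shelf labels and box identifiers.
locations = ["A1", "A2", "A3", "A4"]
in_use = {"A2"}
placed = assign_boxes(locations, in_use, ["box-17", "box-18"])
```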
Duplicates, redundancies and inconsistencies in the primary nucleotide databases: a descriptive study.

PubMed

Chen, Qingyu; Zobel, Justin; Verspoor, Karin

2017-01-01

GenBank, the EMBL European Nucleotide Archive and the DNA DataBank of Japan, known collectively as the International Nucleotide Sequence Database Collaboration or INSDC, are the three most significant nucleotide sequence databases. Their records are derived from laboratory work undertaken by different individuals, by different teams, with a range of technologies and assumptions and over a period of decades. As a consequence, they contain a great many duplicates, redundancies and inconsistencies, but neither the prevalence nor the characteristics of various types of duplicates have been rigorously assessed. Existing duplicate detection methods in bioinformatics only address specific duplicate types, with inconsistent assumptions; and the impact of duplicates in bioinformatics databases has not been carefully assessed, making it difficult to judge the value of such methods. Our goal is to assess the scale, kinds and impact of duplicates in bioinformatics databases, through a retrospective analysis of merged groups in INSDC databases. Our outcomes are threefold: (1) We analyse a benchmark dataset consisting of duplicates manually identified in INSDC (a dataset of 67 888 merged groups with 111 823 duplicate pairs across 21 organisms from INSDC databases) in terms of the prevalence, types and impacts of duplicates. (2) We categorize duplicates at both sequence and annotation level, with supporting quantitative statistics, showing that different organisms have different prevalence of distinct kinds of duplicate. (3) We show that the presence of duplicates has practical impact via a simple case study on duplicates, in terms of GC content and melting temperature. We demonstrate that duplicates not only introduce redundancy, but can lead to inconsistent results for certain tasks. 
Our findings lead to a better understanding of the problem of duplication in biological databases. Database URL: the merged records are available at https://cloudstor.aarnet.edu.au/plus/index.php/s/Xef2fvsebBEAv9w. © The Author(s) 2017. Published by Oxford University Press.
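To make concrete how two records treated as duplicates can yield inconsistent results, the sketch below computes GC content and a rough melting temperature for two short sequences. The abstract does not say which melting-temperature formula the authors used; the Wallace rule, Tm = 2(A+T) + 4(G+C), is assumed here purely for illustration, and the two sequences are made up.

```python
def gc_content(seq):
    """Fraction of bases in seq that are G or C."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def wallace_tm(seq):
    """Approximate melting temperature in degrees C by the Wallace rule.

    This rule is an assumption for this sketch and only suits short oligos.
    """
    seq = seq.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


# Two made-up records for the 'same' sequence, one carrying two extra bases.
record_a = "ATGCGC"
record_b = "ATGCGCAT"
gc_a, gc_b = gc_content(record_a), gc_content(record_b)
tm_a, tm_b = wallace_tm(record_a), wallace_tm(record_b)
```

Depending on which of the two records a pipeline happens to pick, both derived quantities differ, which is the kind of inconsistency the case study points to.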
Duplicates, redundancies and inconsistencies in the primary nucleotide databases: a descriptive study

PubMed Central

Chen, Qingyu; Zobel, Justin; Verspoor, Karin

2017-01-01

GenBank, the EMBL European Nucleotide Archive and the DNA DataBank of Japan, known collectively as the International Nucleotide Sequence Database Collaboration or INSDC, are the three most significant nucleotide sequence databases. Their records are derived from laboratory work undertaken by different individuals, by different teams, with a range of technologies and assumptions and over a period of decades. As a consequence, they contain a great many duplicates, redundancies and inconsistencies, but neither the prevalence nor the characteristics of various types of duplicates have been rigorously assessed. Existing duplicate detection methods in bioinformatics only address specific duplicate types, with inconsistent assumptions; and the impact of duplicates in bioinformatics databases has not been carefully assessed, making it difficult to judge the value of such methods. Our goal is to assess the scale, kinds and impact of duplicates in bioinformatics databases, through a retrospective analysis of merged groups in INSDC databases. Our outcomes are threefold: (1) We analyse a benchmark dataset consisting of duplicates manually identified in INSDC—a dataset of 67 888 merged groups with 111 823 duplicate pairs across 21 organisms from INSDC databases – in terms of the prevalence, types and impacts of duplicates. (2) We categorize duplicates at both sequence and annotation level, with supporting quantitative statistics, showing that different organisms have different prevalence of distinct kinds of duplicate. (3) We show that the presence of duplicates has practical impact via a simple case study on duplicates, in terms of GC content and melting temperature. We demonstrate that duplicates not only introduce redundancy, but can lead to inconsistent results for certain tasks. Our findings lead to a better understanding of the problem of duplication in biological databases. 
Database URL: the merged records are available at https://cloudstor.aarnet.edu.au/plus/index.php/s/Xef2fvsebBEAv9w PMID:28077566
Toward a National Computerized Database for Moving Image Materials.

ERIC Educational Resources Information Center

Gartenberg, Jon

This report summarizes a project conducted by a group of catalogers from film archives devoted to nitrate preservation, which explored ways of developing a database to provide a complete film and television information service that would be available nationwide and could contain filmographic data, information on holdings in archives and…
The MAO NASU Plate Archive Database. Current Status and Perspectives

NASA Astrophysics Data System (ADS)

Pakuliak, L. K.; Sergeeva, T. P.

2006-04-01

The preliminary online version of the database of the MAO NASU plate archive is built on the relational database management system MySQL. It permits easy addition of new collections of astronegatives to the database and provides high flexibility in constructing SQL queries for data search optimization, access to the administrative interface protected by PHP Basic Authorization, and a wide range of search parameters. The current status of the database will be reported, and a brief description of the search engine and of the means of supporting database integrity will be given. Methods and means of data verification and tasks for further development will be discussed.
An overview on integrated data system for archiving and sharing marine geology and geophysical data in Korea Institute of Ocean Science & Technology (KIOST)

NASA Astrophysics Data System (ADS)

Choi, Sang-Hwa; Kim, Sung Dae; Park, Hyuk Min; Lee, SeungHa

2016-04-01

We established and have operated an integrated data system for managing, archiving and sharing marine geology and geophysical data around Korea produced by various research projects and programs at the Korea Institute of Ocean Science & Technology (KIOST). First, to keep the data system consistent under continuous data updates, we set up standard operating procedures (SOPs) for data archiving, data processing and conversion, data quality control, data uploading, DB maintenance, etc. The system comprises two databases, ARCHIVE DB and GIS DB. ARCHIVE DB stores archived data in the original forms and formats received from data providers, and GIS DB manages all other compiled, processed and reproduced data and information for data services and GIS application services. The relational database management system Oracle 11g was adopted as the DBMS, and open source GIS techniques were applied for GIS services: OpenLayers for the user interface, GeoServer for the application server, and PostGIS and PostgreSQL for the GIS database. For convenient use of geophysical data in SEG Y format, a viewer program was developed and embedded in this system. Users can search data through the GIS user interface and save the results as a report.
Digitized Archival Primary Sources in STEM: A Selected Webliography

ERIC Educational Resources Information Center

Jankowski, Amy

2017-01-01

Accessibility and findability of digitized archival resources can be a challenge, particularly for students or researchers not familiar with archival formats and digital interfaces, which adhere to different descriptive standards than more widely familiar library resources. Numerous aggregate archival collection databases exist, which provide a…
Middle Level Teachers' Perceptions of Interim Reading Assessments: An Exploratory Study of Data-Based Decision Making

ERIC Educational Resources Information Center

Reed, Deborah K.

2015-01-01

This study explored the data-based decision making of 12 teachers in grades 6-8 who were asked about their perceptions and use of three required interim measures of reading performance: oral reading fluency (ORF), retell, and a benchmark comprised of released state test items. Focus group participants reported they did not believe the benchmark or…
Overview of TPC Benchmark E: The Next Generation of OLTP Benchmarks

NASA Astrophysics Data System (ADS)

Hogan, Trish

Set to replace the aging TPC-C, the TPC Benchmark E is the next generation OLTP benchmark, which more accurately models client database usage. TPC-E addresses the shortcomings of TPC-C. It has a much more complex workload, requires the use of RAID-protected storage, generates much less I/O, and is much cheaper and easier to set up, run, and audit. After a period of overlap, it is expected that TPC-E will become the de facto OLTP benchmark.
36 CFR 1235.50 - What specifications and standards for transfer apply to electronic records?

Code of Federal Regulations, 2011 CFR

2011-07-01

... characters and other non-data characters removed. Agencies must consult with the National Archives and... Road, College Park, MD 20740, phone number (301) 837-1578 to initiate transfer discussions. (b) Data files and databases. Data files and databases must be transferred to the National Archives of the United...
36 CFR 1235.50 - What specifications and standards for transfer apply to electronic records?

Code of Federal Regulations, 2010 CFR

2010-07-01

... characters and other non-data characters removed. Agencies must consult with the National Archives and... Road, College Park, MD 20740, phone number (301) 837-1578 to initiate transfer discussions. (b) Data files and databases. Data files and databases must be transferred to the National Archives of the United...
Archive and Database as Metaphor: Theorizing the Historical Record

ERIC Educational Resources Information Center

Manoff, Marlene

2010-01-01

Digital media increase the visibility and presence of the past while also reshaping our sense of history. We have extraordinary access to digital versions of books, journals, film, television, music, art and popular culture from earlier eras. New theoretical formulations of database and archive provide ways to think creatively about these changes…
GraphBench

DOE Office of Scientific and Technical Information (OSTI.GOV)

Sukumar, Sreenivas R.; Hong, Seokyong; Lee, Sangkeun

2016-06-01

GraphBench is a benchmark suite for graph pattern mining and graph analysis systems. The benchmark suite is a significant aid to conducting apples-to-apples comparisons of graph analysis software (databases, in-memory tools, triple stores, etc.).
EnergyIQ

DOE Office of Scientific and Technical Information (OSTI.GOV)

MILLS, EVAN; MATTHE, PAUL; STOUFER, MARTIN

2016-10-06

EnergyIQ, the first "action-oriented" benchmarking tool for non-residential buildings, provides a standardized opportunity assessment based on benchmarking results, along with decision-support information to help refine action plans. EnergyIQ offers a wide array of benchmark metrics, with visual as well as tabular display. These include energy, costs, greenhouse-gas emissions, and a large array of characteristics (e.g. building components or operational strategies). The tool supports cross-sectional benchmarking for comparing the user's building to its peers at one point in time, as well as longitudinal benchmarking for tracking the performance of an individual building or enterprise portfolio over time. Based on user inputs, the tool generates a list of opportunities and recommended actions. Users can then explore the "Decision Support" module for helpful information on how to refine action plans, create design-intent documentation, and implement improvements. This includes information on best practices, links to other energy analysis tools and more. A variety of databases are available within EnergyIQ from which users can specify peer groups for comparison. Using the tool, these data can be visually browsed and used as a backdrop against which to view a variety of energy benchmarking metrics for the user's own building. Users can save their project information and return at a later date to continue their exploration. The initial database is the CA Commercial End-Use Survey (CEUS), which provides details on energy use and characteristics for about 2800 buildings (and 62 building types). CEUS is likely the most thorough survey of its kind ever conducted. The tool is built as a web service. The EnergyIQ web application is written in JSP with pervasive use of JavaScript and CSS2. EnergyIQ also supports a SOAP-based web service to allow the flow of queries and data to occur with non-browser implementations. Data are stored in an Oracle 10g database.
References: Mills, Mathew, Brook and Piette. 2008. "Action Oriented Benchmarking: Concepts and Tools." Energy Engineering, Vol. 105, No. 4, pp 21-40. LBNL-358E; Mathew, Mills, Bourassa, Brook. 2008. "Action-Oriented Benchmarking: Using the CEUS Database to Benchmark Commercial Buildings in California." Energy Engineering, Vol. 105, No. 5, pp 6-18. LBNL-502E.

77 FR 65416 - Advisory Committee on the Electronic Records Archives (ACERA)

Federal Register 2010, 2011, 2012, 2013, 2014

2012-10-26

... Minutes ERA Program Update Business Priorities Presidential Directive on Records Management Online Public Access Discussions: Encouraging development of automated tools for electronic records management, impact of big data, and benchmarking Dated: October 24, 2012. Patrice Little Murray, Acting Committee...
A MYSQL-BASED DATA ARCHIVER: PRELIMINARY RESULTS

DOE Office of Scientific and Technical Information (OSTI.GOV)

Matthew Bickley; Christopher Slominski

2008-01-23

Following an evaluation of the archival requirements of the Jefferson Laboratory accelerator’s user community, a prototyping effort was executed to determine if an archiver based on MySQL had sufficient functionality to meet those requirements. This approach was chosen because an archiver based on a relational database enables the development effort to focus on data acquisition and management, letting the database take care of storage, indexing and data consistency. It was clear from the prototype effort that there were no performance impediments to successful implementation of a final system. With our performance concerns addressed, the lab undertook the design and development of an operational system. The system is in its operational testing phase now. This paper discusses the archiver system requirements, some of the design choices and their rationale, and presents the acquisition, storage and retrieval performance.
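To illustrate the point about letting the database take care of storage, indexing and consistency, here is a minimal sketch. It uses Python's built-in sqlite3 module as a stand-in for MySQL so that it runs without a server. The table name `samples`, its columns, and the helper functions are hypothetical and are not taken from the Jefferson Laboratory system.

```python
import sqlite3


def open_archive(path=":memory:"):
    """Open (or create) a tiny archive; the schema is an illustrative assumption."""
    conn = sqlite3.connect(path)
    # The composite primary key gives both an index for time-range lookups
    # and a consistency rule: one value per channel per timestamp.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS samples ("
        " channel TEXT NOT NULL,"
        " ts REAL NOT NULL,"
        " value REAL NOT NULL,"
        " PRIMARY KEY (channel, ts))"
    )
    return conn


def archive(conn, channel, ts, value):
    """Store one sample; the database rejects duplicates for us."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT INTO samples (channel, ts, value) VALUES (?, ?, ?)",
            (channel, ts, value),
        )


def retrieve(conn, channel, start, end):
    """Return (ts, value) pairs for a channel within [start, end], oldest first."""
    cur = conn.execute(
        "SELECT ts, value FROM samples"
        " WHERE channel = ? AND ts BETWEEN ? AND ? ORDER BY ts",
        (channel, start, end),
    )
    return cur.fetchall()
```

The acquisition code only calls `archive` and `retrieve`; ordering, indexing and duplicate rejection are delegated to the database engine, which is the division of labor the abstract describes.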
A web-based system architecture for ontology-based data integration in the domain of IT benchmarking

NASA Astrophysics Data System (ADS)

Pfaff, Matthias; Krcmar, Helmut

2018-03-01

In the domain of IT benchmarking (ITBM), a variety of data and information are collected. Although these data serve as the basis for business analyses, no unified semantic representation of such data yet exists. Consequently, data analysis across different distributed data sets and different benchmarks is almost impossible. This paper presents a system architecture and prototypical implementation for an integrated data management of distributed databases based on a domain-specific ontology. To preserve the semantic meaning of the data, the ITBM ontology is linked to data sources and functions as the central concept for database access. Thus, additional databases can be integrated by linking them to this domain-specific ontology and are directly available for further business analyses. Moreover, the web-based system supports the process of mapping ontology concepts to external databases by introducing a semi-automatic mapping recommender and by visualizing possible mapping candidates. The system also provides a natural language interface to easily query linked databases. The expected result of this ontology-based approach of knowledge representation and data access is an increase in knowledge and data sharing in this domain, which will enhance existing business analysis methods.
Toward public volume database management: a case study of NOVA, the National Online Volumetric Archive

NASA Astrophysics Data System (ADS)

Fletcher, Alex; Yoo, Terry S.

2004-04-01

Public databases today can be constructed with a wide variety of authoring and management structures. The widespread appeal of Internet search engines suggests that public information be made open and available to common search strategies, making accessible information that would otherwise be hidden by the infrastructure and software interfaces of a traditional database management system. We present the construction and organizational details for managing NOVA, the National Online Volumetric Archive. As an archival effort of the Visible Human Project for supporting medical visualization research, archiving 3D multimodal radiological teaching files, and enhancing medical education with volumetric data, our overall database structure is simplified; archives grow by accruing information, but seldom have to modify, delete, or overwrite stored records. NOVA is being constructed and populated so that it is transparent to the Internet; that is, much of its internal structure is mirrored in HTML allowing internet search engines to investigate, catalog, and link directly to the deep relational structure of the collection index. The key organizational concept for NOVA is the Image Content Group (ICG), an indexing strategy for cataloging incoming data as a set structure rather than by keyword management. These groups are managed through a series of XML files and authoring scripts. We cover the motivation for Image Content Groups, their overall construction, authorship, and management in XML, and the pilot results for creating public data repositories using this strategy.
Informatics in radiology: Efficiency metrics for imaging device productivity.

PubMed

Hu, Mengqi; Pavlicek, William; Liu, Patrick T; Zhang, Muhong; Langer, Steve G; Wang, Shanshan; Place, Vicki; Miranda, Rafael; Wu, Teresa Tong

2011-01-01

Acute awareness of the costs associated with medical imaging equipment is an ever-present aspect of the current healthcare debate. However, the monitoring of productivity associated with expensive imaging devices is likely to be labor intensive, to rely on summary statistics, and to lack accepted and standardized benchmarks of efficiency. In the context of the general Six Sigma DMAIC (define, measure, analyze, improve, and control) process, a World Wide Web-based productivity tool called the Imaging Exam Time Monitor was developed to accurately and remotely monitor imaging efficiency with use of Digital Imaging and Communications in Medicine (DICOM) combined with a picture archiving and communication system. Five device efficiency metrics (examination duration, table utilization, interpatient time, appointment interval time, and interseries time) were derived from DICOM values. These metrics allow the standardized measurement of productivity, to facilitate the comparative evaluation of imaging equipment use and ongoing efforts to improve efficiency. A relational database was constructed to store patient imaging data, along with device- and examination-related data. The database provides full access to ad hoc queries and can automatically generate detailed reports for administrative and business use, thereby allowing staff to monitor data for trends and to better identify possible changes that could lead to improved productivity and reduced costs in association with imaging services. © RSNA, 2011.
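The abstract does not define the five metrics, so the sketch below only illustrates the general idea of deriving time-based metrics from per-exam timestamps. It assumes each exam is a dictionary with hypothetical `start` and `end` keys, that examination duration is end minus start, and that interpatient time is the gap between one exam ending and the next beginning on the same device. These definitions are assumptions, not the authors' method.

```python
from datetime import datetime


def exam_duration_minutes(exam):
    """Assumed definition: time from exam start to exam end, in minutes."""
    return (exam["end"] - exam["start"]).total_seconds() / 60.0


def interpatient_minutes(exams):
    """Assumed definition: idle gap between consecutive exams on one device."""
    ordered = sorted(exams, key=lambda e: e["start"])
    return [
        (nxt["start"] - prev["end"]).total_seconds() / 60.0
        for prev, nxt in zip(ordered, ordered[1:])
    ]


# Made-up timestamps for a single device; real values would come from DICOM headers.
exams = [
    {"start": datetime(2011, 1, 3, 9, 45), "end": datetime(2011, 1, 3, 10, 15)},
    {"start": datetime(2011, 1, 3, 9, 0), "end": datetime(2011, 1, 3, 9, 30)},
]
durations = [exam_duration_minutes(e) for e in exams]
gaps = interpatient_minutes(exams)
```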
Integrating a local database into the StarView distributed user interface

NASA Technical Reports Server (NTRS)

Silberberg, D. P.

1992-01-01

A distributed user interface to the Space Telescope Data Archive and Distribution Service (DADS) known as StarView is being developed. The DADS architecture consists of the data archive as well as a relational database catalog describing the archive. StarView is a client/server system in which the user interface is the front-end client to the DADS catalog and archive servers. Users query the DADS catalog from the StarView interface. Query commands are transmitted via a network and evaluated by the database. The results are returned via the network and are displayed on StarView forms. Based on the results, users decide which data sets to retrieve from the DADS archive. Archive requests are packaged by StarView and sent to DADS, which returns the requested data sets to the users. The advantages of distributed client/server user interfaces over traditional one-machine systems are well known. Since users run software on machines separate from the database, the overall client response time is much faster. Also, since the server is free to process only database requests, the database response time is much faster. Disadvantages inherent in this architecture are slow overall database access time due to the network delays, lack of a 'get previous row' command, and that refinements of a previously issued query must be submitted to the database server, even though the domain of values have already been returned by the previous query. This architecture also does not allow users to cross correlate DADS catalog data with other catalogs. Clearly, a distributed user interface would be more powerful if it overcame these disadvantages. A local database is being integrated into StarView to overcome these disadvantages. When a query is made through a StarView form, which is often composed of fields from multiple tables, it is translated to an SQL query and issued to the DADS catalog. At the same time, a local database table is created to contain the resulting rows of the query. 
The returned rows are displayed on the form as well as inserted into the local database table. Identical results are produced by reissuing the query to either the DADS catalog or to the local table. Relational databases do not provide a 'get previous row' function because of the inherent complexity of retrieving previous rows of multiple-table joins. However, since this function is easily implemented on a single table, StarView uses the local table to retrieve the previous row. Also, StarView issues subsequent query refinements to the local table instead of the DADS catalog, eliminating the network transmission overhead. Finally, other catalogs can be imported into the local database for cross correlation with local tables. Overall, it is believed that this is a more powerful architecture for distributed database user interfaces.
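The local-table idea can be sketched as follows, using an in-memory SQLite table in place of StarView's local database. The class name `LocalResultCache`, the `pos` column, and the sample rows are hypothetical. The sketch only shows why "get previous row" and query refinement become simple once the joined result is flattened into a single local table.

```python
import sqlite3


class LocalResultCache:
    """Holds the rows of one remote query result in a single local table."""

    def __init__(self, columns, rows):
        # Column names are interpolated into SQL, so they must be trusted identifiers.
        self.columns = ", ".join(columns)
        self.conn = sqlite3.connect(":memory:")
        # pos is an auto-assigned row position starting at 1.
        self.conn.execute(
            f"CREATE TABLE result (pos INTEGER PRIMARY KEY, {self.columns})"
        )
        marks = ", ".join("?" for _ in columns)
        self.conn.executemany(
            f"INSERT INTO result ({self.columns}) VALUES ({marks})", rows
        )
        self.pos = 0  # cursor sits before the first row

    def _fetch(self, pos):
        return self.conn.execute(
            f"SELECT {self.columns} FROM result WHERE pos = ?", (pos,)
        ).fetchone()

    def next_row(self):
        row = self._fetch(self.pos + 1)
        if row is not None:
            self.pos += 1
        return row

    def previous_row(self):
        # Trivial on a single table: step the position back by one.
        if self.pos <= 1:
            return None
        self.pos -= 1
        return self._fetch(self.pos)

    def refine(self, where, params=()):
        """Run a narrower query locally instead of going back over the network."""
        return self.conn.execute(
            f"SELECT {self.columns} FROM result WHERE {where} ORDER BY pos", params
        ).fetchall()
```

A form would call `next_row` and `previous_row` as the user pages through results, and `refine` when the user narrows the query. None of these calls touches the remote catalog.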
The Golosiiv on-line plate archive database, management and maintenance

NASA Astrophysics Data System (ADS)

Pakuliak, L.; Sergeeva, T.

2007-08-01

We intend to create an online version of the database of the MAO NASU plate archive as VO-compatible structures, in accordance with the principles developed by the International Virtual Observatory Alliance, in order to make them available to the world astronomical community. The online version of the log-book database is built with MySQL+PHP. The data management system provides a user interface, supports detailed traditional form-filling radial searches of plates, returns some auxiliary samples and the listing of each collection, and lets users browse the detailed descriptions of collections. The administrative tool allows the database administrator to correct data, add new data sets, and control the integrity and consistency of the database as a whole. The VO-compatible database is currently being constructed to meet the requirements and principles of international data archives; it has to be strongly generalized in order to allow data mining by means of standard interfaces and to best fit the demands of the WFPDB Group for databases of plate catalogues. Ongoing enhancement of the database toward the WFPDB brings the problem of data verification to the forefront, as it demands a high degree of data reliability. The process of data verification is practically endless and inseparable from data management, owing to the diverse nature of data errors and hence the variety of ways of identifying and fixing them. The current status of the MAO NASU glass archive forces activity in both directions simultaneously: the enhancement of the log-book database with new sets of observational data, and the creation of the generalized database together with cross-identification between the two. The VO-compatible version of the database is being supplied with digitized plate data obtained with a MicroTek ScanMaker 9800 XL TMA. Scanning is not comprehensive but is conducted selectively within the framework of special projects.
BAO Plate Archive Project

NASA Astrophysics Data System (ADS)

Mickaelian, A. M.; Gigoyan, K. S.; Gyulzadyan, M. V.; Paronyan, G. M.; Abrahamyan, H. V.; Andreasyan, H. R.; Azatyan, N. M.; Kostandyan, G. R.; Samsonyan, A. L.; Mikayelyan, G. A.; Farmanyan, S. V.; Harutyunyan, V. L.

2017-12-01

We present the Byurakan Astrophysical Observatory (BAO) Plate Archive Project, which is aimed at digitization, extraction and analysis of archival data and at building an electronic database and interactive sky map. The BAO Plate Archive consists of 37,500 photographic plates and films, obtained with the 2.6m telescope, the 1m and 0.5m Schmidt telescopes and other smaller ones during 1947-1991. The 2000 plates of the famous Markarian Survey (or the First Byurakan Survey, FBS) were digitized in 2002-2005 and the Digitized FBS (DFBS, www.aras.am/Dfbs/dfbs.html) was created. New science projects have been conducted based on this low-dispersion spectroscopic material. Several other smaller digitization projects have been carried out as well, such as part of the Second Byurakan Survey (SBS) plates, photographic chain plates in Coma, where the blazar ON 231 is located, and 2.6m film spectra of FBS Blue Stellar Objects. However, most of the plates and films are not digitized. In 2015, we started a project on digitization of the whole BAO Plate Archive, creation of an electronic database and its scientific usage. The Armenian Virtual Observatory (ArVO, www.aras.am/Arvo/arvo.htm) database will accommodate all new data. The project runs in collaboration with the Armenian Institute of Informatics and Automation Problems (IIAP) and will continue for 4 years, 2015-2018. The final result will be an electronic database and online interactive sky map to be used for further research projects. ArVO will provide all standards and tools for efficient usage of the scientific output and its integration in international databases.
BAO plate archive digitization

NASA Astrophysics Data System (ADS)

Mickaelian, A. M.; Nikoghosyan, E. H.; Gigoyan, K. S.; Paronyan, G. M.; Abrahamyan, H. V.; Andreasyan, H. R.; Azatyan, N. M.; Kostandyan, G. R.; Khachatryan, K. G.; Vardanyan, A. V.; Gyulzadyan, M. V.; Mikayelyan, G. A.; Farmanyan, S. V.; Knyazyan, A. V.

Astronomical plate archives created on the basis of numerous observations at many observatories are an important part of the astronomical heritage. The Byurakan Astrophysical Observatory (BAO) plate archive consists of 37,000 photographic plates and films, obtained at the 2.6m telescope, the 1m and 0.5m Schmidt telescopes and other smaller ones during 1947-1991. In 2015, we started a project on digitization of the whole BAO Plate Archive, creation of an electronic database and its scientific usage. A Science Program Board has been created to evaluate the observing material, to investigate new possibilities and to propose new projects based on the combined usage of these observations together with other world databases. The Executing Team consists of 11 astronomers and 2 computer scientists and will use 2 EPSON Perfection V750 Pro scanners for the digitization. The project will run for 3 years, 2015-2017, and the final result will be an electronic database and online interactive sky map to be used for further research projects.
The Dutch Hospital Standardised Mortality Ratio (HSMR) method and cardiac surgery: benchmarking in a national cohort using hospital administration data versus a clinical database

PubMed Central

Siregar, S; Pouw, M E; Moons, K G M; Versteegh, M I M; Bots, M L; van der Graaf, Y; Kalkman, C J; van Herwerden, L A; Groenwold, R H H

2014-01-01

Objective: To compare the accuracy of data from hospital administration databases and a national clinical cardiac surgery database and to compare the performance of the Dutch hospital standardised mortality ratio (HSMR) method and the logistic European System for Cardiac Operative Risk Evaluation, for the purpose of benchmarking of mortality across hospitals. Methods: Information on all patients undergoing cardiac surgery between 1 January 2007 and 31 December 2010 in 10 centres was extracted from The Netherlands Association for Cardio-Thoracic Surgery database and the Hospital Discharge Registry. The number of cardiac surgery interventions was compared between both databases. The European System for Cardiac Operative Risk Evaluation and hospital standardised mortality ratio models were updated in the study population and compared using the C-statistic, calibration plots and the Brier score. Results: The number of cardiac surgery interventions performed could not be assessed using the administrative database as the intervention code was incorrect in 1.4–26.3%, depending on the type of intervention. In 7.3% no intervention code was registered. The updated administrative model was inferior to the updated clinical model with respect to discrimination (C-statistic of 0.77 vs 0.85, p<0.001) and calibration (Brier score of 2.8% vs 2.6%, p<0.001, maximum score 3.0%). Two average performing hospitals according to the clinical model became outliers when benchmarking was performed using the administrative model. Conclusions: In cardiac surgery, administrative data are less suitable than clinical data for the purpose of benchmarking. The use of either administrative or clinical risk-adjustment models can affect the outlier status of hospitals. Risk-adjustment models including procedure-specific clinical risk factors are recommended. PMID:24334377
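For readers unfamiliar with the two comparison measures named above, the following sketch computes them from predicted mortality probabilities and observed outcomes, using their standard textbook definitions. The C-statistic is the fraction of (death, survivor) pairs in which the death received the higher predicted risk, with ties counted as one half. The Brier score is the mean squared difference between prediction and outcome. The function names and the toy data are illustrative and unrelated to the study's data.

```python
def brier_score(probs, outcomes):
    """Mean squared error between predicted probability and 0/1 outcome."""
    return sum((p - y) ** 2 for p, y in zip(probs, outcomes)) / len(probs)


def c_statistic(probs, outcomes):
    """Probability that a random event case is ranked above a random non-event case."""
    events = [p for p, y in zip(probs, outcomes) if y == 1]
    non_events = [p for p, y in zip(probs, outcomes) if y == 0]
    if not events or not non_events:
        raise ValueError("need at least one event and one non-event")
    concordant = 0.0
    for pe in events:
        for pn in non_events:
            if pe > pn:
                concordant += 1.0
            elif pe == pn:
                concordant += 0.5  # ties count as half
    return concordant / (len(events) * len(non_events))


# Toy example: four patients, two of whom died (outcome 1).
probs = [0.1, 0.4, 0.35, 0.8]
outcomes = [0, 0, 1, 1]
c = c_statistic(probs, outcomes)      # 3 of 4 pairs are concordant
brier = brier_score(probs, outcomes)
```

A higher C-statistic indicates better discrimination, and a lower Brier score indicates better overall accuracy of the predicted probabilities.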
Requirements for benchmarking personal image retrieval systems

NASA Astrophysics Data System (ADS)

Bouguet, Jean-Yves; Dulong, Carole; Kozintsev, Igor; Wu, Yi

2006-01-01

It is now common to have accumulated tens of thousands of personal pictures. Efficient access to that many pictures is only possible with a robust image retrieval system. This application is of high interest to Intel processor architects. It is highly compute intensive, and could motivate end users to upgrade their personal computers to the next generations of processors. A key question is how to assess the robustness of a personal image retrieval system. Personal image databases are very different from the digital libraries that have been used by many Content Based Image Retrieval Systems [1]. For example, a personal image database has a lot of pictures of people, but of a small set of different people, typically family, relatives, and friends. Pictures are taken in a limited set of places such as home, work, school, and vacation destinations. The most frequent queries are searches for people and for places. These attributes, and many others, affect how a personal image retrieval system should be benchmarked, and benchmarks need to be different from existing ones based on art images or medical images, for example. The attributes of the data set do not change the list of components needed for the benchmarking of such systems as specified in [2]: data sets, query tasks, ground truth, evaluation measures, and benchmarking events. This paper proposes a way to build these components to be representative of personal image databases and of the corresponding usage models.
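As a small illustration of how three of the listed components fit together (a query task, its ground truth, and an evaluation measure), the sketch below scores one query with precision and recall. These two measures are a common choice assumed here for illustration. The abstract does not say which measures the paper adopts, and the image identifiers are made up.

```python
def precision_recall(retrieved, relevant):
    """Score one query: retrieved is the system output, relevant is the ground truth."""
    retrieved_set = set(retrieved)
    hits = len(retrieved_set & relevant)
    precision = hits / len(retrieved_set) if retrieved_set else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical query task: "find pictures of one family member".
ground_truth = {"img1", "img3", "img7"}           # labeled by hand
system_output = ["img1", "img2", "img3", "img4"]  # returned by the retrieval system
precision, recall = precision_recall(system_output, ground_truth)
```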
Decoys Selection in Benchmarking Datasets: Overview and Perspectives

PubMed Central

Réau, Manon; Langenfeld, Florent; Zagury, Jean-François; Lagarde, Nathalie; Montes, Matthieu

2018-01-01

Virtual Screening (VS) is designed to prospectively help identify potential hits, i.e., compounds capable of interacting with a given target and potentially modulating its activity, out of large compound collections. Among the variety of methodologies, it is crucial to select the protocol that is best adapted to the query/target system under study and that yields the most reliable output. To this aim, the performance of VS methods is commonly evaluated and compared by computing their ability to retrieve active compounds in benchmarking datasets. The benchmarking datasets contain a subset of known active compounds together with a subset of decoys, i.e., assumed non-active molecules. The composition of both the active and the decoy compound subsets is critical to limit the biases in the evaluation of the VS methods. In this review, we focus on the selection of decoy compounds, which has changed considerably over the years, from randomly selected compounds to highly customized or experimentally validated negative compounds. We first outline the evolution of decoy selection in benchmarking databases as well as current benchmarking databases that tend to minimize the introduction of biases, and secondly, we propose recommendations for the selection and the design of benchmarking datasets. PMID:29416509
The Alaska Arctic Vegetation Archive (AVA-AK)

Treesearch

Donald A. Walker; Amy L. Breen; Lisa A. Druckenmiller; Lisa W. Wirth; Will Fisher; Martha K. Raynolds; Jozef Šibík; Marilyn D. Walker; Stephan Hennekens; Keith Boggs; Tina Boucher; Marcel Buchhorn; Helga Bültmann; David J. Cooper; Fred J.A Daniëls; Scott J. Davidson; James J. Ebersole; Sara C. Elmendorf; Howard E. Epstein; William A. Gould; Robert D. Hollister; Colleen M. Iversen; M. Torre Jorgenson; Anja Kade; Michael T. Lee; William H. MacKenzie; Robert K. Peet; Jana L. Peirce; Udo Schickhoff; Victoria L. Sloan; Stephen S. Talbot; Craig E. Tweedie; Sandra Villarreal; Patrick J. Webber; Donatella Zona

2016-01-01

The Alaska Arctic Vegetation Archive (AVA-AK, GIVD-ID: NA-US-014) is a free, publicly available database archive of vegetation-plot data from the Arctic tundra region of northern Alaska. The archive currently contains 24 datasets with 3,026 non-overlapping plots. Of these, 74% have geolocation data with 25-m or better precision. Species cover data and header data are...
BAO Plate Archive Project: Digitization, Electronic Database and Research Programmes

NASA Astrophysics Data System (ADS)

Mickaelian, A. M.; Abrahamyan, H. V.; Andreasyan, H. R.; Azatyan, N. M.; Farmanyan, S. V.; Gigoyan, K. S.; Gyulzadyan, M. V.; Khachatryan, K. G.; Knyazyan, A. V.; Kostandyan, G. R.; Mikayelyan, G. A.; Nikoghosyan, E. H.; Paronyan, G. M.; Vardanyan, A. V.

2016-06-01

The most important part of the astronomical observational heritage is the astronomical plate archives created on the basis of numerous observations at many observatories. The Byurakan Astrophysical Observatory (BAO) plate archive consists of 37,000 photographic plates and films, obtained at the 2.6m telescope, the 1m and 0.5m Schmidt-type telescopes and other smaller telescopes during 1947-1991. In 2002-2005, the 1874 plates of the famous Markarian Survey (also called the First Byurakan Survey, FBS) were digitized and the Digitized FBS (DFBS) was created. New science projects have been conducted based on this low-dispersion spectroscopic material. A large project on digitization of the whole BAO Plate Archive, creation of an electronic database and its scientific usage was started in 2015. A Science Program Board has been created to evaluate the observing material, to investigate new possibilities and to propose new projects based on the combined usage of these observations together with other world databases. The Executing Team consists of 11 astronomers and 2 computer scientists and will use 2 EPSON Perfection V750 Pro scanners for the digitization; the Armenian Virtual Observatory (ArVO) database will be used to accommodate all new data. The project will run for 3 years, 2015-2017, and the final result will be an electronic database and online interactive sky map to be used for further research projects, mainly on high proper motion stars, variable objects and Solar System bodies.
BAO Plate Archive digitization, creation of electronic database and its scientific usage

NASA Astrophysics Data System (ADS)

Mickaelian, Areg M.

2015-08-01

Astronomical plate archives created on the basis of numerous observations at many observatories are an important part of the astronomical heritage. The Byurakan Astrophysical Observatory (BAO) plate archive consists of 37,500 photographic plates and films, obtained at the 2.6m telescope, the 1m and 0.5m Schmidt telescopes and other smaller ones during 1947-1991. In 2002-2005, the 2000 plates of the famous Markarian Survey (First Byurakan Survey, FBS) were digitized and the Digitized FBS (DFBS, http://www.aras.am/Dfbs/dfbs.html) was created. New science projects have been conducted based on this low-dispersion spectroscopic material. In 2015, we started a project on digitization of the whole BAO Plate Archive, creation of an electronic database and its scientific usage. A Science Program Board has been created to evaluate the observing material, to investigate new possibilities and to propose new projects based on the combined usage of these observations together with other world databases. The Executing Team consists of 9 astronomers and 3 computer scientists and will use 2 EPSON Perfection V750 Pro scanners for the digitization, as well as the Armenian Virtual Observatory (ArVO) database to accommodate all new data. The project will run for 3 years, 2015-2017, and the final result will be an electronic database and online interactive sky map to be used for further research projects.
Global ISR: Toward a Comprehensive Defense Against Unauthorized Code Execution

DTIC Science & Technology

2010-10-01

implementation using two of the most popular open-source servers: the Apache web server, and the MySQL database server. For Apache, we measure the effect that...utility ab. [Fig. 3 (axis: Total Time (sec); series: Native, Null ISR, ISR-MP). The MySQL test-insert benchmark measures...various SQL operations. The figure draws total execution time as reported by the benchmark utility.] Finally, we benchmarked a MySQL database server using
E-MSD: improving data deposition and structure quality.

PubMed

Tagari, M; Tate, J; Swaminathan, G J; Newman, R; Naim, A; Vranken, W; Kapopoulou, A; Hussain, A; Fillon, J; Henrick, K; Velankar, S

2006-01-01

The Macromolecular Structure Database (MSD) (http://www.ebi.ac.uk/msd/) [H. Boutselakis, D. Dimitropoulos, J. Fillon, A. Golovin, K. Henrick, A. Hussain, J. Ionides, M. John, P. A. Keller, E. Krissinel et al. (2003) E-MSD: the European Bioinformatics Institute Macromolecular Structure Database. Nucleic Acids Res., 31, 458-462.] group is one of the three partners in the worldwide Protein DataBank (wwPDB), the consortium entrusted with the collation, maintenance and distribution of the global repository of macromolecular structure data [H. Berman, K. Henrick and H. Nakamura (2003) Announcing the worldwide Protein Data Bank. Nature Struct. Biol., 10, 980.]. Since its inception, the MSD group has worked with partners around the world to improve the quality of PDB data, through a clean-up programme that addresses inconsistencies and inaccuracies in the legacy archive. The improvements in data quality in the legacy archive have been achieved largely through the creation of a unified data archive, in the form of a relational database that stores all of the data in the wwPDB. The three partners are working towards improving the tools and methods for the deposition of new data by the community at large. The implementation of the MSD database, together with the parallel development of improved tools and methodologies for data harvesting, validation and archival, has led to significant improvements in the quality of data that enters the archive. Through this and related projects in the NMR and EM realms, the MSD continues to improve the quality of publicly available structural data.
JNDMS Task Authorization 2 Report

DTIC Science & Technology

2013-10-01

uses Barnyard to store alarms from all DREnet Snort sensors in a MySQL database. Barnyard is an open source tool designed to work with Snort to take...Technology ITI Information Technology Infrastructure J2EE Java 2 Enterprise Edition JAR Java Archive. This is an archive file format defined by Java ...standards. JDBC Java Database Connectivity JDW JNDMS Data Warehouse JNDMS Joint Network and Defence Management System JNDMS Joint Network Defence and
Selection and implementation of a distributed phased archive for a multivendor incremental approach to PACS

NASA Astrophysics Data System (ADS)

Smith, Edward M.; Wandtke, John; Robinson, Arvin E.

1999-07-01

The selection criteria for the archive were based on the objectives of the Medical Information, Communication and Archive System (MICAS), a multi-vendor incremental approach to PACS. These objectives include interoperability between all components; seamless integration of the Radiology Information System (RIS) with MICAS and eventually other hospital databases; demonstrated DICOM compliance of all components prior to acceptance; and automated workflow that can be programmed to meet changes in the healthcare environment. The long-term multi-modality archive is being implemented in 3 or more phases with the first phase designed to provide a 12 to 18 month storage solution. This decision was made because the cost per GB of storage is rapidly decreasing and the speed at which data can be retrieved is increasing with time. The open solution selected allows incorporation of leading edge, 'best of breed' hardware and software and provides maximum flexibility of workflow both within and outside of radiology. The selected solution is media independent, supports multiple jukeboxes, provides expandable storage capacity and will provide redundancy and fault tolerance at minimal cost. Some of the required attributes of the archive include a scalable archive strategy, a virtual image database with global query, and an object-oriented database. The selection process took approximately 10 months with Cemax-Icon being the vendor selected. Prior to signing a purchase order, Cemax-Icon performed a site survey, agreed upon the acceptance test protocol and provided a written guarantee of connectivity between their archive and the imaging modalities and other MICAS components.
A Web Resource for Standardized Benchmark Datasets, Metrics, and Rosetta Protocols for Macromolecular Modeling and Design.

PubMed

Ó Conchúir, Shane; Barlow, Kyle A; Pache, Roland A; Ollikainen, Noah; Kundert, Kale; O'Meara, Matthew J; Smith, Colin A; Kortemme, Tanja

2015-01-01

The development and validation of computational macromolecular modeling and design methods depend on suitable benchmark datasets and informative metrics for comparing protocols. In addition, if a method is intended to be adopted broadly in diverse biological applications, there needs to be information on appropriate parameters for each protocol, as well as metrics describing the expected accuracy compared to experimental data. In certain disciplines, there exist established benchmarks and public resources where experts in a particular methodology are encouraged to supply their most efficient implementation of each particular benchmark. We aim to provide such a resource for protocols in macromolecular modeling and design. We present a freely accessible web resource (https://kortemmelab.ucsf.edu/benchmarks) to guide the development of protocols for protein modeling and design. The site provides benchmark datasets and metrics to compare the performance of a variety of modeling protocols using different computational sampling methods and energy functions, providing a "best practice" set of parameters for each method. Each benchmark has an associated downloadable benchmark capture archive containing the input files, analysis scripts, and tutorials for running the benchmark. The captures may be run with any suitable modeling method; we supply command lines for running the benchmarks using the Rosetta software suite. We have compiled initial benchmarks for the resource spanning three key areas: prediction of energetic effects of mutations, protein design, and protein structure prediction, each with associated state-of-the-art modeling protocols. With the help of the wider macromolecular modeling community, we hope to expand the variety of benchmarks included on the website and continue to evaluate new iterations of current methods as they become available.

Yucca Mountain licensing support network archive assistant.

DOE Office of Scientific and Technical Information (OSTI.GOV)

Dunlavy, Daniel M.; Bauer, Travis L.; Verzi, Stephen J.

2008-03-01

This report describes the Licensing Support Network (LSN) Assistant--a set of tools for categorizing e-mail messages and documents, and investigating and correcting existing archives of categorized e-mail messages and documents. The two main tools in the LSN Assistant are the LSN Archive Assistant (LSNAA) tool for recategorizing manually labeled e-mail messages and documents and the LSN Realtime Assistant (LSNRA) tool for categorizing new e-mail messages and documents. This report focuses on the LSNAA tool. There are two main components of the LSNAA tool. The first is the Sandia Categorization Framework, which is responsible for providing categorizations for documents in an archive and storing them in an appropriate Categorization Database. The second is the actual user interface, which primarily interacts with the Categorization Database, providing a way for finding and correcting categorization errors in the database. A procedure for applying the LSNAA tool and an example use case of the LSNAA tool applied to a set of e-mail messages are provided. Performance results of the categorization model designed for this example use case are presented.
Report on phase 2 of 1990 OSSA data census

NASA Technical Reports Server (NTRS)

King, Joseph H.

1991-01-01

The 1990 NASA Office of Space Science and Applications (OSSA) data census contributed significantly to a database intended to identify and briefly describe all archived data and potentially archive-desirable data from active and inactive NASA/OSSA spaceflight investigations. This database is being created at the National Space Science Data Center (NSSDC) and will be more comprehensive than the databases describing data held at individual sites such as NSSDC. Active investigations are defined as those flying on currently operational spacecraft or those that flew on recently operational spacecraft for which project-coordinated data archiving is continuing. Inactive investigations are those that flew on spacecraft that are no longer operational, and for which systematic archiving has ended. The principal purpose of this report is to describe phase two of the 1990 census. The previously reported phase one of this census, as well as a relevant 1981 census, are briefly discussed. The phase two survey instrument is described, those surveyed and respondents/responses are identified, and data sets from inactive OSSA investigations discovered during the 1990 census (phase two) are briefly discussed individually.
The CHARA Array Database

NASA Astrophysics Data System (ADS)

Jones, Jeremy; Schaefer, Gail; ten Brummelaar, Theo; Gies, Douglas; Farrington, Christopher

2018-01-01

We are building a searchable database for the CHARA Array data archive. The Array consists of six telescopes linked together as an interferometer, providing sub-milliarcsecond resolution in the optical and near-infrared. The Array enables a variety of scientific studies, including measuring stellar angular diameters, imaging stellar shapes and surface features, mapping the orbits of close binary companions, and resolving circumstellar environments. This database is one component of an NSF/MSIP funded program to provide open access to the CHARA Array to the broader astronomical community. This archive goes back to 2004 and covers all the beam combiners on the Array. We discuss the current status of and future plans for the public database, and give directions on how to access it.
An Electronic Finding Aid Using Extensible Markup Language (XML) and Encoded Archival Description (EAD).

ERIC Educational Resources Information Center

Chang, May

2000-01-01

Describes the development of electronic finding aids for archives at the University of Illinois, Urbana-Champaign that used XML (extensible markup language) and EAD (encoded archival description) to enable more flexible information management and retrieval than using MARC or a relational database management system. EAD template is appended.…
The ISO Data Archive and Interoperability with Other Archives

NASA Astrophysics Data System (ADS)

Salama, Alberto; Arviset, Christophe; Hernández, José; Dowson, John; Osuna, Pedro

The ESA's Infrared Space Observatory (ISO), an unprecedented observatory for infrared astronomy launched in November 1995, successfully made nearly 30,000 scientific observations in its 2.5-year mission. The ISO data can be retrieved from the ISO Data Archive (IDA), which comprises about 150,000 observations, including parallel and serendipity mode observations. A user-friendly Java interface permits queries to the database and data retrieval. The interface currently offers a wide variety of links to other archives, such as name resolution with NED and SIMBAD, access to electronic articles from ADS and CDS/VizieR, and access to IRAS data. In the past year development has been focused on improving the IDA interoperability with other astronomical archives, either by accessing other relevant archives or by providing direct access to the ISO data for external services. A mechanism of information transfer has been developed, allowing direct query to the IDA via a Java Server Page, returning quick look ISO images and relevant, observation-specific information embedded in an HTML page. This method has been used to link from the CDS/VizieR Data Centre and ADS, and work with IPAC to allow access to the ISO Archive from IRSA, including display capabilities of the observed sky regions onto other mission images, is in progress. Prospects for further links to and from other archives and databases are also addressed.
Maintaining Multimedia Data in a Geospatial Database

DTIC Science & Technology

2012-09-01

at PostgreSQL and MySQL as spatial databases was offered. Given their results, as each database produced result sets from zero to 100,000, it was...excelled given multiple conditions. A different look at PostgreSQL and MySQL as spatial databases was offered. Given their results, as each database... MySQL ... B. BENCHMARKING DATA RETRIEVED FROM TABLE
Neuroimaging Data Sharing on the Neuroinformatics Database Platform

PubMed Central

Book, Gregory A; Stevens, Michael; Assaf, Michal; Glahn, David; Pearlson, Godfrey D

2015-01-01

We describe the Neuroinformatics Database (NiDB), an open-source database platform for archiving, analysis, and sharing of neuroimaging data. Data from the multi-site projects Autism Brain Imaging Data Exchange (ABIDE), Bipolar-Schizophrenia Network on Intermediate Phenotypes parts one and two (B-SNIP1, B-SNIP2), and Monetary Incentive Delay task (MID) are available for download from the public instance of NiDB, with more projects sharing data as it becomes available. As demonstrated by making several large datasets available, NiDB is an extensible platform appropriately suited to archive and distribute shared neuroimaging data. PMID:25888923
Practice Benchmarking in the Age of Targeted Auditing

PubMed Central

Langdale, Ryan P.; Holland, Ben F.

2012-01-01

The frequency and sophistication of health care reimbursement auditing has progressed rapidly in recent years, leaving many oncologists wondering whether their private practices would survive a full-scale Office of the Inspector General (OIG) investigation. The Medicare Part B claims database provides a rich source of information for physicians seeking to understand how their billing practices measure up to their peers, both locally and nationally. This database was dissected by a team of cancer specialists to uncover important benchmarks related to targeted auditing. All critical Medicare charges, payments, denials, and service ratios in this article were derived from the full 2010 Medicare Part B claims database. Relevant claims were limited by using Medicare provider specialty codes 83 (hematology/oncology) and 90 (medical oncology), with an emphasis on claims filed from the physician office place of service (11). All charges, denials, and payments were summarized at the Current Procedural Terminology code level to drive practice benchmarking standards. A careful analysis of this data set, combined with the published audit priorities of the OIG, produced germane benchmarks from which medical oncologists can monitor, measure and improve on common areas of billing fraud, waste or abuse in their practices. Part II of this series and analysis will focus on information pertinent to radiation oncologists. PMID:23598847
Practice benchmarking in the age of targeted auditing.

PubMed

Langdale, Ryan P; Holland, Ben F

2012-11-01

The frequency and sophistication of health care reimbursement auditing has progressed rapidly in recent years, leaving many oncologists wondering whether their private practices would survive a full-scale Office of the Inspector General (OIG) investigation. The Medicare Part B claims database provides a rich source of information for physicians seeking to understand how their billing practices measure up to their peers, both locally and nationally. This database was dissected by a team of cancer specialists to uncover important benchmarks related to targeted auditing. All critical Medicare charges, payments, denials, and service ratios in this article were derived from the full 2010 Medicare Part B claims database. Relevant claims were limited by using Medicare provider specialty codes 83 (hematology/oncology) and 90 (medical oncology), with an emphasis on claims filed from the physician office place of service (11). All charges, denials, and payments were summarized at the Current Procedural Terminology code level to drive practice benchmarking standards. A careful analysis of this data set, combined with the published audit priorities of the OIG, produced germane benchmarks from which medical oncologists can monitor, measure and improve on common areas of billing fraud, waste or abuse in their practices. Part II of this series and analysis will focus on information pertinent to radiation oncologists.
48 CFR 22.404-1 - Types of wage determinations.

Code of Federal Regulations, 2010 CFR

2010-10-01

... in the “Archived DB WD” database on WDOL for information purposes only. Contracting officers may not use an archived wage determination in a contract action without obtaining prior approval of the...
Developing a Benchmarking Process in Perfusion: A Report of the Perfusion Downunder Collaboration

PubMed Central

Baker, Robert A.; Newland, Richard F.; Fenton, Carmel; McDonald, Michael; Willcox, Timothy W.; Merry, Alan F.

2012-01-01

Abstract: Improving and understanding clinical practice is an appropriate goal for the perfusion community. The Perfusion Downunder Collaboration has established a multi-center perfusion-focused database aimed at achieving these goals through the development of quantitative quality indicators for clinical improvement through benchmarking. Data were collected using the Perfusion Downunder Collaboration database from procedures performed in eight Australian and New Zealand cardiac centers between March 2007 and February 2011. At the Perfusion Downunder Meeting in 2010, it was agreed by consensus to report quality indicators (QI) for glucose level, arterial outlet temperature, and pCO2 management during cardiopulmonary bypass. The values chosen for each QI were: blood glucose ≥4 mmol/L and ≤10 mmol/L; arterial outlet temperature ≤37°C; and arterial blood gas pCO2 ≥35 and ≤45 mmHg. The QI data were used to derive benchmarks using the Achievable Benchmark of Care (ABC™) methodology to identify the incidence of QIs at the best performing centers. Five thousand four hundred and sixty-five procedures were evaluated to derive QI and benchmark data. The incidence of the blood glucose QI ranged from 37–96% of procedures, with a benchmark value of 90%. The arterial outlet temperature QI occurred in 16–98% of procedures with the benchmark of 94%; while the arterial pCO2 QI occurred in 21–91%, with the benchmark value of 80%. We have derived QIs and benchmark calculations for the management of several key aspects of cardiopulmonary bypass to provide a platform for improving the quality of perfusion practice. PMID:22730861
How to benchmark methods for structure-based virtual screening of large compound libraries.

PubMed

Christofferson, Andrew J; Huang, Niu

2012-01-01

Structure-based virtual screening is a useful computational technique for ligand discovery. To systematically evaluate different docking approaches, it is important to have a consistent benchmarking protocol that is both relevant and unbiased. Here, we describe the design of a benchmarking data set for docking screen assessment, a standard docking screening process, and the analysis and presentation of the enrichment of annotated ligands among a background decoy database.
Astronomical Surveys, Catalogs, Databases, and Archives

NASA Astrophysics Data System (ADS)

Mickaelian, A. M.

2016-06-01

All-sky and large-area astronomical surveys and their cataloged data over the whole range of electromagnetic spectrum are reviewed, from γ-ray to radio, such as Fermi-GLAST and INTEGRAL in γ-ray, ROSAT, XMM and Chandra in X-ray, GALEX in UV, SDSS and several POSS I and II based catalogues (APM, MAPS, USNO, GSC) in optical range, 2MASS in NIR, WISE and AKARI IRC in MIR, IRAS and AKARI FIS in FIR, NVSS and FIRST in radio and many others, as well as most important surveys giving optical images (DSS I and II, SDSS, etc.), proper motions (Tycho, USNO, Gaia), variability (GCVS, NSVS, ASAS, Catalina, Pan-STARRS) and spectroscopic data (FBS, SBS, Case, HQS, HES, SDSS, CALIFA, GAMA). Most important astronomical databases and archives are reviewed as well, including Wide-Field Plate DataBase (WFPDB), ESO, HEASARC, IRSA and MAST archives, CDS SIMBAD, VizieR and Aladin, NED and HyperLEDA extragalactic databases, ADS and astro-ph services. They are powerful sources for many-sided efficient research using Virtual Observatory tools. The use and analysis of Big Data accumulated in astronomy lead to many new discoveries.
An XML-based Generic Tool for Information Retrieval in Solar Databases

NASA Astrophysics Data System (ADS)

Scholl, Isabelle F.; Legay, Eric; Linsolas, Romain

This paper presents the current architecture of the 'Solar Web Project', now in its development phase. This tool will provide scientists interested in solar data with a single web-based interface for browsing distributed and heterogeneous catalogs of solar observations. The main goal is to have a generic application that can be easily extended to new sets of data or to new missions with a low level of maintenance. It is developed in Java, and XML is used as a powerful configuration language. The server, independent of any database schema, can communicate with a client (the user interface) and several local or remote archive access systems (such as existing web pages, ftp sites or SQL databases). Archive access systems are externally described in XML files. The user interface is also dynamically generated from an XML file containing the window building rules and a simplified database description. This project is developed at MEDOC (Multi-Experiment Data and Operations Centre), located at the Institut d'Astrophysique Spatiale (Orsay, France). Successful tests have been conducted with other solar archive access systems.
A global multiproxy database for temperature reconstructions of the Common Era.

PubMed

2017-07-11

Reproducible climate reconstructions of the Common Era (1 CE to present) are key to placing industrial-era warming into the context of natural climatic variability. Here we present a community-sourced database of temperature-sensitive proxy records from the PAGES2k initiative. The database gathers 692 records from 648 locations, including all continental regions and major ocean basins. The records are from trees, ice, sediment, corals, speleothems, documentary evidence, and other archives. They range in length from 50 to 2000 years, with a median of 547 years, while temporal resolution ranges from biweekly to centennial. Nearly half of the proxy time series are significantly correlated with HadCRUT4.2 surface temperature over the period 1850-2014. Global temperature composites show a remarkable degree of coherence between high- and low-resolution archives, with broadly similar patterns across archive types, terrestrial versus marine locations, and screening criteria. The database is suited to investigations of global and regional temperature variability over the Common Era, and is shared in the Linked Paleo Data (LiPD) format, including serializations in Matlab, R and Python.
A global multiproxy database for temperature reconstructions of the Common Era

USGS Publications Warehouse

Emile-Geay, Julien; McKay, Nicholas P.; Kaufman, Darrell S.; von Gunten, Lucien; Wang, Jianghao; Anchukaitis, Kevin J.; Abram, Nerilie J.; Addison, Jason A.; Curran, Mark A.J.; Evans, Michael N.; Henley, Benjamin J.; Hao, Zhixin; Martrat, Belen; McGregor, Helen V.; Neukom, Raphael; Pederson, Gregory T.; Stenni, Barbara; Thirumalai, Kaustubh; Werner, Johannes P.; Xu, Chenxi; Divine, Dmitry V.; Dixon, Bronwyn C.; Gergis, Joelle; Mundo, Ignacio A.; Nakatsuka, T.; Phipps, Steven J.; Routson, Cody C.; Steig, Eric J.; Tierney, Jessica E.; Tyler, Jonathan J.; Allen, Kathryn J.; Bertler, Nancy A. N.; Bjorklund, Jesper; Chase, Brian M.; Chen, Min-Te; Cook, Ed; de Jong, Rixt; DeLong, Kristine L.; Dixon, Daniel A.; Ekaykin, Alexey A.; Ersek, Vasile; Filipsson, Helena L.; Francus, Pierre; Freund, Mandy B.; Frezzotti, M.; Gaire, Narayan P.; Gajewski, Konrad; Ge, Quansheng; Goosse, Hugues; Gornostaeva, Anastasia; Grosjean, Martin; Horiuchi, Kazuho; Hormes, Anne; Husum, Katrine; Isaksson, Elisabeth; Kandasamy, Selvaraj; Kawamura, Kenji; Koc, Nalan; Leduc, Guillaume; Linderholm, Hans W.; Lorrey, Andrew M.; Mikhalenko, Vladimir; Mortyn, P. Graham; Motoyama, Hideaki; Moy, Andrew D.; Mulvaney, Robert; Munz, Philipp M.; Nash, David J.; Oerter, Hans; Opel, Thomas; Orsi, Anais J.; Ovchinnikov, Dmitriy V.; Porter, Trevor J.; Roop, Heidi; Saenger, Casey; Sano, Masaki; Sauchyn, David; Saunders, K.M.; Seidenkrantz, Marit-Solveig; Severi, Mirko; Shao, X.; Sicre, Marie-Alexandrine; Sigl, Michael; Sinclair, Kate; St. George, Scott; St. Jacques, Jeannine-Marie; Thamban, Meloth; Thapa, Udya Kuwar; Thomas, E.; Turney, Chris; Uemura, Ryu; Viau, A.E.; Vladimirova, Diana O.; Wahl, Eugene; White, James W. C.; Yu, Z.; Zinke, Jens

2017-01-01

Reproducible climate reconstructions of the Common Era (1 CE to present) are key to placing industrial-era warming into the context of natural climatic variability. Here we present a community-sourced database of temperature-sensitive proxy records from the PAGES2k initiative. The database gathers 692 records from 648 locations, including all continental regions and major ocean basins. The records are from trees, ice, sediment, corals, speleothems, documentary evidence, and other archives. They range in length from 50 to 2000 years, with a median of 547 years, while temporal resolution ranges from biweekly to centennial. Nearly half of the proxy time series are significantly correlated with HadCRUT4.2 surface temperature over the period 1850–2014. Global temperature composites show a remarkable degree of coherence between high- and low-resolution archives, with broadly similar patterns across archive types, terrestrial versus marine locations, and screening criteria. The database is suited to investigations of global and regional temperature variability over the Common Era, and is shared in the Linked Paleo Data (LiPD) format, including serializations in Matlab, R and Python.
A global multiproxy database for temperature reconstructions of the Common Era

PubMed Central

Emile-Geay, Julien; McKay, Nicholas P.; Kaufman, Darrell S.; von Gunten, Lucien; Wang, Jianghao; Anchukaitis, Kevin J.; Abram, Nerilie J.; Addison, Jason A.; Curran, Mark A.J.; Evans, Michael N.; Henley, Benjamin J.; Hao, Zhixin; Martrat, Belen; McGregor, Helen V.; Neukom, Raphael; Pederson, Gregory T.; Stenni, Barbara; Thirumalai, Kaustubh; Werner, Johannes P.; Xu, Chenxi; Divine, Dmitry V.; Dixon, Bronwyn C.; Gergis, Joelle; Mundo, Ignacio A.; Nakatsuka, Takeshi; Phipps, Steven J.; Routson, Cody C.; Steig, Eric J.; Tierney, Jessica E.; Tyler, Jonathan J.; Allen, Kathryn J.; Bertler, Nancy A.N.; Björklund, Jesper; Chase, Brian M.; Chen, Min-Te; Cook, Ed; de Jong, Rixt; DeLong, Kristine L.; Dixon, Daniel A.; Ekaykin, Alexey A.; Ersek, Vasile; Filipsson, Helena L.; Francus, Pierre; Freund, Mandy B.; Frezzotti, Massimo; Gaire, Narayan P.; Gajewski, Konrad; Ge, Quansheng; Goosse, Hugues; Gornostaeva, Anastasia; Grosjean, Martin; Horiuchi, Kazuho; Hormes, Anne; Husum, Katrine; Isaksson, Elisabeth; Kandasamy, Selvaraj; Kawamura, Kenji; Kilbourne, K. Halimeda; Koç, Nalan; Leduc, Guillaume; Linderholm, Hans W.; Lorrey, Andrew M.; Mikhalenko, Vladimir; Mortyn, P. Graham; Motoyama, Hideaki; Moy, Andrew D.; Mulvaney, Robert; Munz, Philipp M.; Nash, David J.; Oerter, Hans; Opel, Thomas; Orsi, Anais J.; Ovchinnikov, Dmitriy V.; Porter, Trevor J.; Roop, Heidi A.; Saenger, Casey; Sano, Masaki; Sauchyn, David; Saunders, Krystyna M.; Seidenkrantz, Marit-Solveig; Severi, Mirko; Shao, Xuemei; Sicre, Marie-Alexandrine; Sigl, Michael; Sinclair, Kate; St. George, Scott; St. Jacques, Jeannine-Marie; Thamban, Meloth; Kuwar Thapa, Udya; Thomas, Elizabeth R.; Turney, Chris; Uemura, Ryu; Viau, Andre E.; Vladimirova, Diana O.; Wahl, Eugene R.; White, James W.C.; Yu, Zicheng; Zinke, Jens

2017-01-01

Reproducible climate reconstructions of the Common Era (1 CE to present) are key to placing industrial-era warming into the context of natural climatic variability. Here we present a community-sourced database of temperature-sensitive proxy records from the PAGES2k initiative. The database gathers 692 records from 648 locations, including all continental regions and major ocean basins. The records are from trees, ice, sediment, corals, speleothems, documentary evidence, and other archives. They range in length from 50 to 2000 years, with a median of 547 years, while temporal resolution ranges from biweekly to centennial. Nearly half of the proxy time series are significantly correlated with HadCRUT4.2 surface temperature over the period 1850–2014. Global temperature composites show a remarkable degree of coherence between high- and low-resolution archives, with broadly similar patterns across archive types, terrestrial versus marine locations, and screening criteria. The database is suited to investigations of global and regional temperature variability over the Common Era, and is shared in the Linked Paleo Data (LiPD) format, including serializations in Matlab, R and Python. PMID:28696409
Between a Map and a Data Rod

NASA Technical Reports Server (NTRS)

Teng, William; Rui, Hualan; Strub, Richard; Vollmer, Bruce

2015-01-01

A Digital Divide has long stood between how NASA and other satellite-derived data are typically archived (time-step arrays or maps) and how hydrology and other point-time series oriented communities prefer to access those data. In essence, the desired method of data access is orthogonal to the way the data are archived. Our approach to bridging the Divide is part of a larger NASA-supported data rods project to enhance access to and use of NASA and other data by the Consortium of Universities for the Advancement of Hydrologic Science, Inc. (CUAHSI) Hydrologic Information System (HIS) and the larger hydrology community. Our main objective was to determine a way to reorganize data that is optimal for these communities. Two related objectives were to optimally reorganize data in a way that (1) is operational and fits in and leverages the existing Goddard Earth Sciences Data and Information Services Center (GES DISC) operational environment and (2) addresses the scaling up of data sets available as time series from those archived at the GES DISC to potentially include those from other Earth Observing System Data and Information System (EOSDIS) data archives. Through several prototype efforts and lessons learned, we arrived at a non-database solution that satisfied our objectives/constraints. We describe, in this presentation, how we implemented the operational production of pre-generated data rods and, considering the tradeoffs between length of time series (or number of time steps), resources needed, and performance, how we implemented the operational production of on-the-fly (virtual) data rods. For the virtual data rods, we leveraged a number of existing resources, including the NASA Giovanni Cache and NetCDF Operators (NCO) and used data cubes processed in parallel. Our current benchmark performance for virtual generation of data rods is about a year's worth of time series for hourly data (~9,000 time steps) in ~90 seconds.
Our approach is a specific implementation of the general optimal strategy of reorganizing data to match the desired means of access. Results from our project have already significantly extended NASA data to the large and important hydrology user community that has been, heretofore, mostly unable to easily access and use NASA data.
Between a Map and a Data Rod

NASA Astrophysics Data System (ADS)

Teng, W. L.; Rui, H.; Strub, R. F.; Vollmer, B.

2015-12-01

A "Digital Divide" has long stood between how NASA and other satellite-derived data are typically archived (time-step arrays or "maps") and how hydrology and other point-time series oriented communities prefer to access those data. In essence, the desired method of data access is orthogonal to the way the data are archived. Our approach to bridging the Divide is part of a larger NASA-supported "data rods" project to enhance access to and use of NASA and other data by the Consortium of Universities for the Advancement of Hydrologic Science, Inc. (CUAHSI) Hydrologic Information System (HIS) and the larger hydrology community. Our main objective was to determine a way to reorganize data that is optimal for these communities. Two related objectives were to optimally reorganize data in a way that (1) is operational and fits in and leverages the existing Goddard Earth Sciences Data and Information Services Center (GES DISC) operational environment and (2) addresses the scaling up of data sets available as time series from those archived at the GES DISC to potentially include those from other Earth Observing System Data and Information System (EOSDIS) data archives. Through several prototype efforts and lessons learned, we arrived at a non-database solution that satisfied our objectives/constraints. We describe, in this presentation, how we implemented the operational production of pre-generated data rods and, considering the tradeoffs between length of time series (or number of time steps), resources needed, and performance, how we implemented the operational production of on-the-fly ("virtual") data rods. For the virtual data rods, we leveraged a number of existing resources, including the NASA Giovanni Cache and NetCDF Operators (NCO) and used data cubes processed in parallel. Our current benchmark performance for virtual generation of data rods is about a year's worth of time series for hourly data (~9,000 time steps) in ~90 seconds. 
Our approach is a specific implementation of the general optimal strategy of reorganizing data to match the desired means of access. Results from our project have already significantly extended NASA data to the large and important hydrology user community that has been, heretofore, mostly unable to easily access and use NASA data.
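The reorganization described above, from time-step arrays ("maps") to per-location time series ("data rods"), can be illustrated with a minimal sketch. This is not the GES DISC implementation, which the abstract says relies on the Giovanni Cache and NCO; the function name `maps_to_rods` and the nested-list grid layout are assumptions made only for illustration.

```python
def maps_to_rods(time_steps):
    """Reorganize a sequence of 2-D grids (one per time step) into
    per-cell time series.

    time_steps: list of grids, each grid a list of rows of values.
    Returns a dict mapping (row, col) to the list of values over time.
    The layout is hypothetical, chosen only for illustration.
    """
    if not time_steps:
        return {}
    n_rows = len(time_steps[0])
    n_cols = len(time_steps[0][0])
    rods = {(r, c): [] for r in range(n_rows) for c in range(n_cols)}
    for grid in time_steps:
        for r in range(n_rows):
            for c in range(n_cols):
                # Append this time step's value to the cell's time series.
                rods[(r, c)].append(grid[r][c])
    return rods


# Three time steps of a 2 x 2 grid (made-up values).
maps = [
    [[1, 2], [3, 4]],
    [[5, 6], [7, 8]],
    [[9, 10], [11, 12]],
]
rods = maps_to_rods(maps)
print(rods[(0, 1)])  # time series for the cell at row 0, column 1
```

Reading one cell's series from `rods` is a single lookup, whereas the map layout requires touching every time step. That is the access pattern the abstract calls orthogonal to how the data are archived.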
The NASA Exoplanet Science Institute Archives: KOA and NStED

NASA Astrophysics Data System (ADS)

Berriman, G. B.; Ciardi, D.; Abajian, M.; Barlow, T.; Bryden, G.; von Braun, K.; Good, J.; Kane, S.; Kong, M.; Laity, A.; Lynn, M.; Elroy, D. M.; Plavchan, P.; Ramirez, S.; Schmitz, M.; Stauffer, J.; Wyatt, P.; Zhang, A.; Goodrich, R.; Mader, J.; Tran, H.; Tsubota, M.; Beekley, A.; Berukoff, S.; Chan, B.; Lau, C.; Regelson, M.; Saucedo, M.; Swain, M.

2010-12-01

The NASA Exoplanet Science Institute (NExScI) maintains a series of archival services in support of NASA’s planet finding and characterization goals. Two of the larger archival services at NExScI are the Keck Observatory Archive (KOA) and the NASA Star and Exoplanet Database (NStED). KOA, a collaboration between the W. M. Keck Observatory and NExScI, serves raw data from the High Resolution Echelle Spectrograph (HIRES) and extracted spectral browse products. As of June 2009, KOA hosts over 28 million files (4.7 TB) from over 2,000 nights. In Spring 2010, it will begin to serve data from the Near-Infrared Echelle Spectrograph (NIRSPEC). NStED is a general purpose archive with the aim of providing support for NASA’s planet finding and characterization goals, and stellar astrophysics. There are two principal components of NStED: a database of (currently) all known exoplanets, and images; and an archive dedicated to high precision photometric surveys for transiting exoplanets. NStED is the US portal to the CNES mission CoRoT, the first space mission dedicated to the discovery and characterization of exoplanets. These archives share a common software and hardware architecture with the NASA/IPAC Infrared Science Archive (IRSA). The software architecture consists of standalone utilities that perform generic query and retrieval functions. They are called through program interfaces and plugged together to form applications through a simple executive library.

AstroCloud, a Cyber-Infrastructure for Astronomy Research: Data Archiving and Quality Control

NASA Astrophysics Data System (ADS)

He, B.; Cui, C.; Fan, D.; Li, C.; Xiao, J.; Yu, C.; Wang, C.; Cao, Z.; Chen, J.; Yi, W.; Li, S.; Mi, L.; Yang, S.

2015-09-01

AstroCloud is a cyber-infrastructure for astronomy research initiated by the Chinese Virtual Observatory (China-VO) with funding support from the NDRC (National Development and Reform Commission) and CAS (Chinese Academy of Sciences) (Cui et al. 2014). To archive astronomical data in China, we present the implementation of the astronomical data archiving system (ADAS). Data archiving and quality control are the infrastructure for AstroCloud. Throughout the entire data life cycle, the data archiving system standardizes data, transfers data, logs observational data, archives ambient data, and stores these data and metadata in a database. Quality control covers the whole process and all aspects of data archiving.
gPhoton: Time-tagged GALEX photon events analysis tools

NASA Astrophysics Data System (ADS)

Million, Chase C.; Fleming, S. W.; Shiao, B.; Loyd, P.; Seibert, M.; Smith, M.

2016-03-01

Written in Python, gPhoton calibrates and sky-projects the ~1.1 trillion ultraviolet photon events detected by the microchannel plates on the Galaxy Evolution Explorer Spacecraft (GALEX), archives these events in a publicly accessible database at the Mikulski Archive for Space Telescopes (MAST), and provides tools for working with the database to extract scientific results, particularly over short time domains. The software includes a re-implementation of core functionality of the GALEX mission calibration pipeline to produce photon list files from raw spacecraft data as well as a suite of command line tools to generate calibrated light curves, images, and movies from the MAST database.
Measurements of 100 'Critical' Minor Planets from NEAT Archive

NASA Astrophysics Data System (ADS)

Deshmukh, Shishir

2017-07-01

Uncertainties associated with the orbits of minor planets can be reduced by analyzing archival imagery, as attempted in the current investigation. Archival images from NEAT and NASA’s Skymorph database were analyzed using standard software to identify the minor planets on the critical list. Findings for each minor planet were submitted to the Minor Planet Center (MPC) to support better orbital solutions.
The Alaska Arctic Vegetation Archive (AVA-AK)

DOE Office of Scientific and Technical Information (OSTI.GOV)

Walker, Donald; Breen, Amy; Druckenmiller, Lisa

The Alaska Arctic Vegetation Archive (AVA-AK, GIVD-ID: NA-US-014) is a free, publicly available database archive of vegetation-plot data from the Arctic tundra region of northern Alaska. The archive currently contains 24 datasets with 3,026 non-overlapping plots. Of these, 74% have geolocation data with 25-m or better precision. Species cover data and header data are stored in a Turboveg database. A standardized Pan Arctic Species List provides a consistent nomenclature for vascular plants, bryophytes, and lichens in the archive. A web-based online Alaska Arctic Geoecological Atlas (AGA-AK) allows viewing and downloading the species data in a variety of formats, and provides access to a wide variety of ancillary data. We conducted a preliminary cluster analysis of the first 16 datasets (1,613 plots) to examine how the spectrum of derived clusters is related to the suite of datasets, habitat types, and environmental gradients. Here, we present the contents of the archive, assess its strengths and weaknesses, and provide three supplementary files that include the data dictionary, a list of habitat types, an overview of the datasets, and details of the cluster analysis.
The Alaska Arctic Vegetation Archive (AVA-AK)

DOE PAGES

Walker, Donald; Breen, Amy; Druckenmiller, Lisa; ...

2016-05-17

The Alaska Arctic Vegetation Archive (AVA-AK, GIVD-ID: NA-US-014) is a free, publicly available database archive of vegetation-plot data from the Arctic tundra region of northern Alaska. The archive currently contains 24 datasets with 3,026 non-overlapping plots. Of these, 74% have geolocation data with 25-m or better precision. Species cover data and header data are stored in a Turboveg database. A standardized Pan Arctic Species List provides a consistent nomenclature for vascular plants, bryophytes, and lichens in the archive. A web-based online Alaska Arctic Geoecological Atlas (AGA-AK) allows viewing and downloading the species data in a variety of formats, and provides access to a wide variety of ancillary data. We conducted a preliminary cluster analysis of the first 16 datasets (1,613 plots) to examine how the spectrum of derived clusters is related to the suite of datasets, habitat types, and environmental gradients. Here, we present the contents of the archive, assess its strengths and weaknesses, and provide three supplementary files that include the data dictionary, a list of habitat types, an overview of the datasets, and details of the cluster analysis.
Superfund Public Information System (SPIS), January 1999

DOE Office of Scientific and Technical Information (OSTI.GOV)

NONE

1999-01-01

The Superfund Public Information System (SPIS) on CD-ROM contains Superfund data for the United States Environmental Protection Agency. The Superfund data is a collection of three databases: Records of Decision (RODS); Comprehensive Environmental Response, Compensation, and Liability Information System (CERCLIS); and Archive (NFRAP). Descriptions of these databases and CD contents are listed below. Data content: The CD contains the complete text of the official ROD documents signed and issued by EPA from fiscal years 1982-1996; 147 RODs for fiscal year 1997; and seven RODs for fiscal year 1998. The CD also contains 89 Explanation of Significant Difference (ESD) documents, as well as 48 ROD Amendments. CERCLIS and Archive (NFRAP) data is through January 19, 1999. RODS is the Records Of Decision System. RODS is used to track site clean-ups under the Superfund program to justify the type of treatment chosen at each site. RODS contains information on technology justification, site history, community participation, enforcement activities, site characteristics, scope and role of response action, and remedy. Explanation of Significant Differences (ESDs) are also available on the CD. CERCLIS is the Comprehensive Environmental Response, Compensation, and Liability Information System. It is the official repository for all Superfund site and incident data. It contains comprehensive information on hazardous waste sites, site inspections, preliminary assessments, and remedial status. The system is sponsored by the EPA's Office of Emergency and Remedial Response, Information Management Center. Archive (NFRAP) consists of hazardous waste sites that have no further remedial action planned; only basic identifying information is provided for archive sites. The sites found in the Archive database were originally in the CERCLIS database, but were removed beginning in the fall of 1995.
Warfighter Visualizations Compilations

DTIC Science & Technology

2013-05-01

list of the user’s favorite websites or other textual content, sub-categorized into types, such as blogs, social networking sites, comics, videos...available: The example in the prototype shows a random archived comic from the website. Other options include thumbnail strips of imagery or dynamic...varied, and range from serving as statistical benchmarks, for increasing social consciousness and interaction, for improving educational interactions
Profiles of international archives: Les archives Jean Piaget, University of Geneva, Switzerland.

PubMed

Burman, Jeremy Trevelyan

2013-05-01

This research report provides a look behind closed doors at the Jean Piaget Archives in Geneva, Switzerland. It situates the potential visitor, contextualizes the Archives in its own history, and then describes what scholars can expect to find. New details about Piaget's views on Equal Rights and Equal Pay are also provided, including a look at how they affected the women who worked his factory (esp. Bärbel Inhelder). (PsycINFO Database Record (c) 2013 APA, all rights reserved).
samiDB: A Prototype Data Archive for Big Science Exploration

NASA Astrophysics Data System (ADS)

Konstantopoulos, I. S.; Green, A. W.; Cortese, L.; Foster, C.; Scott, N.

2015-04-01

samiDB is an archive, database, and query engine to serve the spectra, spectral hypercubes, and high-level science products that make up the SAMI Galaxy Survey. Based on the versatile Hierarchical Data Format (HDF5), samiDB does not depend on relational database structures and hence lightens the setup and maintenance load imposed on science teams by metadata tables. The code, written in Python, covers the ingestion, querying, and exporting of data as well as the automatic setup of an HTML schema browser. samiDB serves as a maintenance-light data archive for Big Science and can be adopted and adapted by science teams that lack the means to hire professional archivists to set up the data back end for their projects.
NLTE4 Plasma Population Kinetics Database

National Institute of Standards and Technology Data Gateway

SRD 159 NLTE4 Plasma Population Kinetics Database (Web database for purchase) This database contains benchmark results for simulation of plasma population kinetics and emission spectra. The data were contributed by the participants of the 4th Non-LTE Code Comparison Workshop who have unrestricted access to the database. The only limitation for other users is in hidden labeling of the output results. Guest users can proceed to the database entry page without entering userid and password.
The Biological Macromolecule Crystallization Database and NASA Protein Crystal Growth Archive

PubMed Central

Gilliland, Gary L.; Tung, Michael; Ladner, Jane

1996-01-01

The NIST/NASA/CARB Biological Macromolecule Crystallization Database (BMCD), NIST Standard Reference Database 21, contains crystal data and crystallization conditions for biological macromolecules. The database entries include data abstracted from published crystallographic reports. Each entry consists of information describing the biological macromolecule crystallized and crystal data and the crystallization conditions for each crystal form. The BMCD serves as the NASA Protein Crystal Growth Archive in that it contains protocols and results of crystallization experiments undertaken in microgravity (space). These database entries report the results, whether successful or not, from NASA-sponsored protein crystal growth experiments in microgravity and from microgravity crystallization studies sponsored by other international organizations. The BMCD was designed as a tool to assist x-ray crystallographers in the development of protocols to crystallize biological macromolecules, those that have previously been crystallized, and those that have not been crystallized. PMID:11542472
Current status of the international Halley Watch infrared net archive

NASA Technical Reports Server (NTRS)

Mcguinness, Brian B.

1988-01-01

The primary purposes of the Halley Watch have been to promote Halley observations, coordinate and standardize the observing where useful, and to archive the results in a database readily accessible to cometary scientists. The intention of IHW is to store the observations themselves, along with any information necessary to allow users to understand and use the data, but to exclude interpretations of these data. Each of the archives produced by the IHW will appear in two versions: a printed archive and a digital archive on CD-ROMs. The archive is expected to have a very long lifetime. The IHW has already produced an archive for P/Crommelin. This consists of one printed volume and two 1600 bpi tapes. The Halley archive will contain at least twenty gigabytes of information.
RoMEO Studies 8: Self-Archiving: The Logic behind the Colour-Coding Used in the Copyright Knowledge Bank

ERIC Educational Resources Information Center

Jenkins, Celia; Probets, Steve; Oppenheim, Charles; Hubbard, Bill

2007-01-01

Purpose: The purpose of this research is to show how the self-archiving of journal papers is a major step towards providing open access to research. However, copyright transfer agreements (CTAs) that are signed by an author prior to publication often indicate whether, and in what form, self-archiving is allowed. The SHERPA/RoMEO database enables…
New Capabilities in the Astrophysics Multispectral Archive Search Engine

NASA Astrophysics Data System (ADS)

Cheung, C. Y.; Kelley, S.; Roussopoulos, N.

The Astrophysics Multispectral Archive Search Engine (AMASE) uses object-oriented database techniques to provide a uniform multi-mission and multi-spectral interface for searching data in distributed archives. We describe our experience of porting AMASE from the Illustra object-relational DBMS to the Informix Universal Data Server. New capabilities and utilities have been developed, including a spatial datablade that supports Nearest Neighbor queries.
Benchmark of four popular virtual screening programs: construction of the active/decoy dataset remains a major determinant of measured performance.

PubMed

Chaput, Ludovic; Martinez-Sanz, Juan; Saettel, Nicolas; Mouawad, Liliane

2016-01-01

In a structure-based virtual screening, the choice of the docking program is essential for the success of a hit identification. Benchmarks are meant to help in guiding this choice, especially when undertaken on a large variety of protein targets. Here, the performance of four popular virtual screening programs, Gold, Glide, Surflex and FlexX, is compared using the Directory of Useful Decoys-Enhanced database (DUD-E), which includes 102 targets with an average of 224 ligands per target and 50 decoys per ligand, generated to avoid biases in the benchmarking. Then, a relationship between these program performances and the properties of the targets or the small molecules was investigated. The comparison was based on two metrics, with three different parameters each. The BEDROC scores with α = 80.5 indicated that, on the overall database, Glide succeeded (score > 0.5) for 30 targets, Gold for 27, FlexX for 14 and Surflex for 11. The performance depended neither on the hydrophobicity or openness of the protein cavities nor on the families to which the proteins belong. However, despite the care taken in the construction of the DUD-E database, the small differences that remain between the actives and the decoys likely explain the successes of Gold, Surflex and FlexX. Moreover, the similarity between the actives of a target and its crystal structure ligand seems to underlie the good performance of Glide. When all targets with significant biases are removed from the benchmarking, a subset of 47 targets remains, for which Glide succeeded for only 5 targets, Gold for 4 and FlexX and Surflex for 2. The dramatic drop in performance of all four programs when the biases are removed shows that we should beware of virtual screening benchmarks, because good performance may be due to the wrong reasons.
Therefore, benchmarking would hardly provide guidelines for virtual screening experiments, despite the tendency that is maintained, i.e., Glide and Gold display better performance than FlexX and Surflex. We recommend always using several programs and combining their results. Graphical abstract: Summary of the results obtained by virtual screening with the four programs, Glide, Gold, Surflex and FlexX, on the 102 targets of the DUD-E database. The percentage of targets with successful results, i.e., with BEDROC(α = 80.5) > 0.5, is shown in blue when the entire database is considered and in red when targets with biased chemical libraries are removed.
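For readers unfamiliar with the BEDROC metric used above, the following is a minimal sketch of how such a score can be computed from the ranks of the actives in a screened list. It uses the normalized form: the exponentially weighted rank sum is rescaled between its worst and best possible values for the given numbers of actives and decoys. The function name `bedroc` and the example ranks are assumptions for illustration; this is not the authors' code.

```python
import math


def bedroc(active_ranks, n_total, alpha=80.5):
    """BEDROC score from the 1-based ranks of the actives.

    active_ranks: ranks of the active compounds in the sorted list.
    n_total: total number of compounds (actives plus decoys).
    alpha: early-recognition weight; 80.5 is the value quoted above.
    """
    n_actives = len(active_ranks)
    if n_actives == 0 or n_actives >= n_total:
        raise ValueError("need at least one active and one decoy")

    def weighted_sum(ranks):
        # Each active contributes exp(-alpha * rank / N), so early
        # ranks dominate the sum.
        return sum(math.exp(-alpha * r / n_total) for r in ranks)

    observed = weighted_sum(active_ranks)
    best = weighted_sum(range(1, n_actives + 1))  # all actives first
    worst = weighted_sum(range(n_total - n_actives + 1, n_total + 1))  # all last
    return (observed - worst) / (best - worst)


# Hypothetical screen: 100 compounds, 3 actives.
print(bedroc([1, 2, 3], 100))      # actives ranked first
print(bedroc([1, 40, 90], 100))    # one early hit, two late
print(bedroc([98, 99, 100], 100))  # actives ranked last
```

With a large α such as 80.5, almost all of the weight falls on the very top of the ranked list, so a program passes the 0.5 threshold only if it places actives very early.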
Ground truth and benchmarks for performance evaluation

NASA Astrophysics Data System (ADS)

Takeuchi, Ayako; Shneier, Michael; Hong, Tsai Hong; Chang, Tommy; Scrapper, Christopher; Cheok, Geraldine S.

2003-09-01

Progress in algorithm development and transfer of results to practical applications such as military robotics requires the setup of standard tasks and of standard qualitative and quantitative measurements for performance evaluation and validation. Although the evaluation and validation of algorithms have been discussed for over a decade, the research community still faces a lack of well-defined and standardized methodology. The range of fundamental problems includes a lack of quantifiable measures of performance, a lack of data from state-of-the-art sensors in calibrated real-world environments, and a lack of facilities for conducting realistic experiments. In this research, we propose three methods for creating ground truth databases and benchmarks using multiple sensors. The databases and benchmarks will provide researchers with high quality data from suites of sensors operating in complex environments representing real problems of great relevance to the development of autonomous driving systems. At NIST, we have prototyped a High Mobility Multi-purpose Wheeled Vehicle (HMMWV) system with a suite of sensors including a Riegl ladar, GDRS ladar, stereo CCD, several color cameras, Global Position System (GPS), Inertial Navigation System (INS), pan/tilt encoders, and odometry. All sensors are calibrated with respect to each other in space and time. This allows a database of features and terrain elevation to be built. Ground truth for each sensor can then be extracted from the database. The main goal of this research is to provide ground truth databases for researchers and engineers to evaluate algorithms for effectiveness, efficiency, reliability, and robustness, thus advancing the development of algorithms.
Image Databases.

ERIC Educational Resources Information Center

Pettersson, Rune

Different kinds of pictorial databases are described with respect to aims, user groups, search possibilities, storage, and distribution. Some specific examples are given for databases used for the following purposes: (1) labor markets for artists; (2) document management; (3) telling a story; (4) preservation (archives and museums); (5) research;…
SPINS: standardized protein NMR storage. A data dictionary and object-oriented relational database for archiving protein NMR spectra.

PubMed

Baran, Michael C; Moseley, Hunter N B; Sahota, Gurmukh; Montelione, Gaetano T

2002-10-01

Modern protein NMR spectroscopy laboratories have a rapidly growing need for an easily queried local archival system of raw experimental NMR datasets. SPINS (Standardized ProteIn Nmr Storage) is an object-oriented relational database that provides facilities for high-volume NMR data archival, organization of analyses, and dissemination of results to the public domain by automatic preparation of the header files required for submission of data to the BioMagResBank (BMRB). The current version of SPINS coordinates the process from data collection to BMRB deposition of raw NMR data by standardizing and integrating the storage and retrieval of these data in a local laboratory file system. Additional facilities include a data mining query tool, graphical database administration tools, and an NMRStar v2.1.1 file generator. SPINS also includes a user-friendly internet-based graphical user interface, which is optionally integrated with Varian VNMR NMR data collection software. This paper provides an overview of the data model underlying the SPINS database system, a description of its implementation in Oracle, and an outline of future plans for the SPINS project.
Validation of electronic structure methods for isomerization reactions of large organic molecules.

PubMed

Luo, Sijie; Zhao, Yan; Truhlar, Donald G

2011-08-14

In this work the ISOL24 database of isomerization energies of large organic molecules presented by Huenerbein et al. [Phys. Chem. Chem. Phys., 2010, 12, 6940] is updated, resulting in the new benchmark database called ISOL24/11, and this database is used to test 50 electronic model chemistries. To accomplish the update, the very expensive and highly accurate CCSD(T)-F12a/aug-cc-pVDZ method is first exploited to investigate a six-reaction subset of the 24 reactions, and by comparison of various methods with the benchmark, MCQCISD-MPW is confirmed to be of high accuracy. The final ISOL24/11 database is composed of six reaction energies calculated by CCSD(T)-F12a/aug-cc-pVDZ and 18 calculated by MCQCISD-MPW. We then tested 40 single-component density functionals (both local and hybrid), eight doubly hybrid functionals, and two other methods against ISOL24/11. It is found that the SCS-MP3/CBS method, which is used as benchmark for the original ISOL24, has an MUE of 1.68 kcal mol(-1), which is close to or larger than some of the best tested DFT methods. Using the new benchmark, we find ωB97X-D and MC3MPWB to be the best single-component and doubly hybrid functionals respectively, with PBE0-D3 and MC3MPW performing almost as well. The best single-component density functionals without molecular mechanics dispersion-like terms are M08-SO, M08-HX, M05-2X, and M06-2X. The best single-component density functionals without Hartree-Fock exchange are M06-L-D3 when MM terms are included and M06-L when they are not.
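The mean unsigned error (MUE) quoted above is the average absolute deviation between a method's computed energies and the benchmark values. A minimal sketch follows; the function name and the sample energies are hypothetical and are not taken from ISOL24/11.

```python
def mean_unsigned_error(calculated, reference):
    """Average of |calculated - reference| over all reactions."""
    if len(calculated) != len(reference) or not reference:
        raise ValueError("need two non-empty sequences of equal length")
    return sum(abs(c - r) for c, r in zip(calculated, reference)) / len(reference)


# Made-up isomerization energies in kcal/mol, for illustration only.
benchmark = [10.0, -4.0, 22.5]
method = [11.0, -6.0, 23.0]
print(mean_unsigned_error(method, benchmark))
```

Because the errors are taken in absolute value, over- and underestimates do not cancel, which is why MUE is preferred to the mean signed error when ranking methods against a benchmark.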
Radiation Embrittlement Archive Project

DOE Office of Scientific and Technical Information (OSTI.GOV)

Klasky, Hilda B; Bass, Bennett Richard; Williams, Paul T

2013-01-01

The Radiation Embrittlement Archive Project (REAP), which is being conducted by the Probabilistic Integrity Safety Assessment (PISA) Program at Oak Ridge National Laboratory under funding from the U.S. Nuclear Regulatory Commission's (NRC) Office of Nuclear Regulatory Research, aims to provide an archival source of information about the effect of neutron radiation on the properties of reactor pressure vessel (RPV) steels. Specifically, this project is an effort to create an Internet-accessible RPV steel embrittlement database. The project's website, https://reap.ornl.gov, provides information in two forms: (1) a document archive with surveillance capsule(s) reports and related technical reports, in PDF format, for the 104 commercial nuclear power plants (NPPs) in the United States, with similar reports from other countries; and (2) a relational database archive with detailed information extracted from the reports. The REAP project focuses on data collected from surveillance capsule programs for light-water moderated, nuclear power reactor vessels operated in the United States, including data on Charpy V-notch energy testing results, tensile properties, composition, exposure temperatures, neutron flux (rate of irradiation damage), and fluence (fast neutron fluence, a cumulative measure of irradiation for E > 1 MeV). Additionally, REAP contains data from surveillance programs conducted in other countries. REAP is presently being extended to focus on embrittlement data analysis, as well. This paper summarizes the current status of the REAP database and highlights opportunities to access the data and to participate in the project.

Imaged Document Optical Correlation and Conversion System (IDOCCS)

NASA Astrophysics Data System (ADS)

Stalcup, Bruce W.; Dennis, Phillip W.; Dydyk, Robert B.

1999-03-01

Today, the paper document is fast becoming a thing of the past. With the rapid development of fast, inexpensive computing and storage devices, many government and private organizations are archiving their documents in electronic form (e.g., personnel records, medical records, patents, etc.). In addition, many organizations are converting their paper archives to electronic images, which are stored in a computer database. Because of this, there is a need to efficiently organize this data into comprehensive and accessible information resources. The Imaged Document Optical Correlation and Conversion System (IDOCCS) provides a total solution to the problem of managing and retrieving textual and graphic information from imaged document archives. At the heart of IDOCCS, optical correlation technology provides the search and retrieval capability of document images. The IDOCCS can be used to rapidly search for key words or phrases within the imaged document archives and can even determine the types of languages contained within a document. In addition, IDOCCS can automatically compare an input document with the archived database to determine if it is a duplicate, thereby reducing the overall resources required to maintain and access the document database. Embedded graphics on imaged pages can also be exploited, e.g., imaged documents containing an agency's seal or logo, or documents with a particular individual's signature block, can be singled out. With this dual capability, IDOCCS outperforms systems that rely on optical character recognition as a basis for indexing and storing only the textual content of documents for later retrieval.
21 CFR 830.320 - Submission of unique device identification information.

Code of Federal Regulations, 2014 CFR

2014-04-01

... Identification Database § 830.320 Submission of unique device identification information. (a) Designation of... Unique Device Identification Database (GUDID) in a format that we can process, review, and archive...
Neutron Deep Penetration Calculations in Light Water with Monte Carlo TRIPOLI-4® Variance Reduction Techniques

NASA Astrophysics Data System (ADS)

Lee, Yi-Kang

2017-09-01

Nuclear decommissioning takes place in several stages due to the radioactivity in the reactor structure materials. A good estimation of the neutron activation products distributed in the reactor structure materials has a direct impact on decommissioning planning and low-level radioactive waste management. The continuous-energy Monte Carlo radiation transport code TRIPOLI-4 has been applied to radiation protection and shielding analyses. To enhance the application of TRIPOLI-4 in nuclear decommissioning activities, both experimental and computational benchmarks are being performed. To calculate the neutron activation of the shielding and structure materials of nuclear facilities, the 3D neutron flux map and energy spectra must first be determined. To perform this type of neutron deep penetration calculation with the Monte Carlo transport code, variance reduction techniques are necessary in order to reduce the uncertainty of the neutron activation estimation. In this study, variance reduction options of the TRIPOLI-4 code were used on the NAIADE 1 light water shielding benchmark. This benchmark document is available from the OECD/NEA SINBAD shielding benchmark database. From this benchmark database, a simplified NAIADE 1 water shielding model was first proposed in this work in order to make the code validation easier. Determination of the fission neutron transport was performed in light water for penetration up to 50 cm for fast neutrons and up to about 180 cm for thermal neutrons. Measurement and calculation results were benchmarked. Variance reduction options and their performance were discussed and compared.
Enhancement of real-time EPICS IOC PV management for the data archiving system

NASA Astrophysics Data System (ADS)

Kim, Jae-Ha

2015-10-01

For the operation of a 100-MeV linear proton accelerator, the major driving values and experimental data need to be archived. Different data are required depending on the experimental conditions, so functions that can add and delete data in real time need to be implemented. In an experimental physics and industrial control system (EPICS) input output controller (IOC), the values of process variables (PVs) are matched with the driving values and data. The PV values are archived in text file format by using the channel archiver. There is no need to create a database (DB) server; only a large hard disk is needed. Through the web, the archived data can be loaded, and new PV values can be archived without stopping the archive engine. The details of the implementation of a data archiving system with the channel archiver are presented, and some preliminary results are reported.
Analysis of Students' Assessments in Middle School Curriculum Materials: Aiming Precisely at Benchmarks and Standards.

ERIC Educational Resources Information Center

Stern, Luli; Ahlgren, Andrew

2002-01-01

Project 2061 of the American Association for the Advancement of Science (AAAS) developed and field-tested a procedure for analyzing curriculum materials, including assessments, in terms of contribution to the attainment of benchmarks and standards. Using this procedure, Project 2061 produced a database of reports on nine science middle school…
Mass-storage management for distributed image/video archives

NASA Astrophysics Data System (ADS)

Franchi, Santina; Guarda, Roberto; Prampolini, Franco

1993-04-01

The realization of an image/video database requires a specific design for both database structures and mass storage management. This issue has been addressed in the digital image/video database system project designed at the IBM SEMEA Scientific & Technical Solution Center. Proper database structures have been defined to catalog image/video coding techniques with the related parameters, and the description of image/video contents. User workstations and servers are distributed along a local area network. Image/video files are not managed directly by the DBMS server. Because of their large size, they are stored outside the database on network devices. The database contains the pointers to the image/video files and the description of the storage devices. The system can use different kinds of storage media, organized in a hierarchical structure. Three levels of functions are available to manage the storage resources. The functions of the lower level provide media management. They make it possible to catalog devices and to modify device status and device network location. The medium level manages image/video files on a physical basis. It manages file migration between high capacity media and low access time media. The functions of the upper level work on image/video files on a logical basis, as they archive, move and copy image/video data selected by user-defined queries. These functions are used to support the implementation of a storage management strategy. The database information about characteristics of both storage devices and coding techniques is used by the third level functions to fit delivery/visualization requirements and to reduce archiving costs.
A Database of Historical Information on Landslides and Floods in Italy

NASA Astrophysics Data System (ADS)

Guzzetti, F.; Tonelli, G.

2003-04-01

For the past 12 years we have maintained and updated a database of historical information on landslides and floods in Italy, known as the National Research Council's AVI (Damaged Urban Areas) Project archive. The database was originally designed to respond to a specific request of the Minister of Civil Protection, and was aimed at helping the regional assessment of landslide and flood risk in Italy. The database was first constructed in 1991-92 to cover the period 1917 to 1990. Information on damaging landslide and flood events was collected by searching archives, by screening thousands of newspaper issues, by reviewing the existing technical and scientific literature on landslides and floods in Italy, and by interviewing landslide and flood experts. The database was then updated chiefly through the analysis of hundreds of newspaper articles, and it now covers systematically the period 1900 to 1998, and non-systematically the periods 1900 to 1916 and 1999 to 2002. Non-systematic information on landslide and flood events older than the 20th century is also present in the database. The database currently contains information on more than 32,000 landslide events that occurred at more than 25,700 sites, and on more than 28,800 flood events that occurred at more than 15,600 sites. After a brief outline of the history and evolution of the AVI Project archive, we present and discuss: (a) the present structure of the database, including the hardware and software solutions adopted to maintain, manage, use and disseminate the information stored in the database, (b) the type and amount of information stored in the database, including an estimate of its completeness, and (c) examples of recent applications of the database, including a web-based GIS system to show the location of sites historically affected by landslides and floods, and an estimate of geo-hydrological (i.e., landslide and flood) risk in Italy based on the available historical information.
The impact of database quality on keystroke dynamics authentication

NASA Astrophysics Data System (ADS)

Panasiuk, Piotr; Rybnik, Mariusz; Saeed, Khalid; Rogowski, Marcin

2016-06-01

This paper concerns keystroke dynamics, also partially in the context of touchscreen devices. The authors concentrate on the impact of database quality and propose their algorithm to test database quality issues. The algorithm is used on their own as well as the well-known . The following specific problems were researched: classification accuracy, development of user typing proficiency, time precision during sample acquisition, representativeness of the training set, and sample length.
DOE Office of Scientific and Technical Information (OSTI.GOV)

Prather, J. C.; Smith, S. K.; Watson, C. R.

The National Radiobiology Archives is a comprehensive effort to gather, organize, and catalog original data, representative specimens, and supporting materials related to significant radiobiology studies. This provides researchers with information for analyses which compare or combine results of these and other studies and with materials for analysis by advanced molecular biology techniques. This Programmer's Guide document describes the database access software, NRADEMO, and the subset loading script NRADEMO/MAINT/MAINTAIN, which comprise the National Laboratory Archives Distributed Access Package. The guide is intended for use by an experienced database management specialist. It contains information about the physical and logical organization of the software and data files. It also contains printouts of all the scripts and associated batch processing files. It is part of a suite of documents published by the National Radiobiology Archives.
The new Gemini Observatory archive: a fast and low cost observatory data archive running in the cloud

NASA Astrophysics Data System (ADS)

Hirst, Paul; Cardenes, Ricardo

2016-08-01

We have developed and deployed a new data archive for the Gemini Observatory. Focused on simplicity and ease of use, the archive provides a number of powerful and novel features including automatic association of calibration data with the science data, and the ability to bookmark searches. A simple but powerful API allows programmatic search and download of data. The archive is hosted on Amazon Web Services, which provides us with excellent internet connectivity and significant cost savings in both operations and development over more traditional deployment options. The code is written in Python, utilizing a PostgreSQL database and Apache web server.
OWL2 benchmarking for the evaluation of knowledge based systems.

PubMed

Khan, Sher Afgun; Qadir, Muhammad Abdul; Abbas, Muhammad Azeem; Afzal, Muhammad Tanvir

2017-01-01

OWL2 semantics are becoming increasingly popular for real-world domain applications such as gene engineering and health MIS. The present work identifies a research gap: negligible attention has been paid to the performance evaluation of Knowledge Base Systems (KBS) using OWL2 semantics. To fill this gap, an OWL2 benchmark for the evaluation of KBS is proposed. The proposed benchmark addresses the foundational blocks of an ontology benchmark, i.e. data schema, workload and performance metrics. The proposed benchmark is tested on memory-based, file-based, relational-database and graph-based KBS for performance and scalability measures. The results show that the proposed benchmark is able to evaluate the behaviour of different state-of-the-art KBS on OWL2 semantics. On the basis of the results, end users (i.e. domain experts) would be able to select a KBS suitable for their domain.
Programmed database system at the Chang Gung Craniofacial Center: part II--digitizing photographs.

PubMed

Chuang, Shiow-Shuh; Hung, Kai-Fong; de Villa, Glenda H; Chen, Philip K T; Lo, Lun-Jou; Chang, Sophia C N; Yu, Chung-Chih; Chen, Yu-Ray

2003-07-01

The archival tools used for digital images in advertising do not fulfill clinical requirements and are just beginning to develop. Storing a large number of conventional photographic slides requires a lot of space and special conditions. In spite of special precautions, degradation of the slides still occurs. The most common degradation is the appearance of fungus flecks. With the recent advances in digital technology, it is now possible to store voluminous numbers of photographs on a computer hard drive and keep them for a long time. A self-programmed interface has been developed to integrate a database and an image browser system that can build and locate needed archive files in a matter of seconds with the click of a button. The system requires hardware and software that are commercially available. There are 25,200 patients recorded in the database, involving 24,331 procedures. In the image files, there are 6,384 patients with 88,366 digital picture files. From 1999 through 2002, NT$400,000 was saved using the new system. Photographs can be managed with the integrated database and browser software for database archiving. This allows labeling of the individual photographs with demographic information and browsing. Digitized images are not only more efficient and economical than conventional slide images, but they also facilitate clinical studies.
Evaluation of Database Coverage: A Comparison of Two Methodologies.

ERIC Educational Resources Information Center

Tenopir, Carol

1982-01-01

Describes experiment which compared two techniques used for evaluating and comparing database coverage of a subject area, e.g., "bibliography" and "subject profile." Differences in time, cost, and results achieved are compared by applying techniques to field of volcanology using two databases, Geological Reference File and GeoArchive. Twenty…
HIV Structural Database

National Institute of Standards and Technology Data Gateway

SRD 102 HIV Structural Database (Web, free access) The HIV Protease Structural Database is an archive of experimentally determined 3-D structures of Human Immunodeficiency Virus 1 (HIV-1), Human Immunodeficiency Virus 2 (HIV-2) and Simian Immunodeficiency Virus (SIV) Proteases and their complexes with inhibitors or products of substrate cleavage.
RBscore&NBench: a high-level web server for nucleic acid binding residues prediction with a large-scale benchmarking database.

PubMed

Miao, Zhichao; Westhof, Eric

2016-07-08

RBscore&NBench combines a web server, RBscore and a database, NBench. RBscore predicts RNA-/DNA-binding residues in proteins and visualizes the prediction scores and features on protein structures. The scoring scheme of RBscore directly links feature values to nucleic acid binding probabilities and illustrates the nucleic acid binding energy funnel on the protein surface. To avoid dataset, binding site definition and assessment metric biases, we compared RBscore with 18 web servers and 3 stand-alone programs on 41 datasets, which demonstrated the high and stable accuracy of RBscore. A comprehensive comparison led us to develop a benchmark database named NBench. The web server is available on: http://ahsoka.u-strasbg.fr/rbscorenbench/. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.
Buckets: Smart Objects for Digital Libraries

NASA Technical Reports Server (NTRS)

Nelson, Michael L.

2001-01-01

Current discussion of digital libraries (DLs) is often dominated by the merits of the respective storage, search and retrieval functionality of archives, repositories, search engines, search interfaces and database systems. While these technologies are necessary for information management, the information content is more important than the systems used for its storage and retrieval. Digital information should have the same long-term survivability prospects as traditional hardcopy information and should be protected to the extent possible from evolving search engine technologies and vendor vagaries in database management systems. Information content and information retrieval systems should progress on independent paths and make limited assumptions about the status or capabilities of the other. Digital information can achieve independence from archives and DL systems through the use of buckets. Buckets are an aggregative, intelligent construct for publishing in DLs. Buckets allow the decoupling of information content from information storage and retrieval. Buckets exist within the Smart Objects and Dumb Archives model for DLs in that many of the functionalities and responsibilities traditionally associated with archives are pushed down (making the archives dumber) into the buckets (making them smarter). Some of the responsibilities imbued to buckets are the enforcement of their terms and conditions, and maintenance and display of their contents.
Conversion of a traditional image archive into an image resource on compact disc.

PubMed Central

Andrew, S M; Benbow, E W

1997-01-01

A traditional archive of pathology images organised on 35 mm slides was converted into a database of images stored on compact disc (CD-ROM), and textual descriptions were added to each image record. Students on a didactic pathology course found this resource useful as an aid to revision, despite relative computer illiteracy, and it is anticipated that students on a new problem based learning course, which incorporates experience with information technology, will benefit even more readily when they use the database as an educational resource. A text and image database on CD-ROM can be updated repeatedly, and the content manipulated to reflect the content and style of the courses it supports. PMID:9306931
STARS 2.0: 2nd-generation open-source archiving and query software

NASA Astrophysics Data System (ADS)

Winegar, Tom

2008-07-01

The Subaru Telescope is in the process of developing an open-source alternative to the 1st-generation software and databases (STARS 1) used for archiving and query. For STARS 2, we have chosen PHP and Python for scripting and MySQL as the database software. We have collected feedback from staff and observers, and used this feedback to significantly improve the design and functionality of our future archiving and query software. Archiving - We identified two weaknesses in 1st-generation STARS archiving software: a complex and inflexible table structure and uncoordinated system administration for our business model: taking pictures from the summit and archiving them in both Hawaii and Japan. We adopted a simplified and normalized table structure with passive keyword collection, and we are designing an archive-to-archive file transfer system that automatically reports real-time status and error conditions and permits error recovery. Query - We identified several weaknesses in 1st-generation STARS query software: inflexible query tools, poor sharing of calibration data, and no automatic file transfer mechanisms to observers. We are developing improved query tools and sharing of calibration data, and multi-protocol unassisted file transfer mechanisms for observers. In the process, we have redefined a 'query': from an invisible search result that can only transfer once in-house right now, with little status and error reporting and no error recovery - to a stored search result that can be monitored, transferred to different locations with multiple protocols, reporting status and error conditions and permitting recovery from errors.
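The two design ideas in this abstract, a normalized keyword table and a query redefined as a stored, monitorable search result, can be sketched in a few lines. This is a hedged illustration only: the abstract names MySQL, but SQLite is swapped in here so the example is self-contained, and every table name, column name, frame identifier and keyword below is a hypothetical stand-in rather than the STARS 2 schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE frame (frame_id TEXT PRIMARY KEY, obs_date TEXT);
-- One row per (frame, keyword): a new header keyword needs no schema change.
CREATE TABLE keyword (frame_id TEXT, name TEXT, value TEXT);
-- A query is a stored object with its own status, not a transient result.
CREATE TABLE stored_query (query_id INTEGER PRIMARY KEY, status TEXT);
CREATE TABLE query_result (query_id INTEGER, frame_id TEXT);
""")


def ingest(frame_id, obs_date, header):
    """Record a frame and passively collect whatever keywords it carries."""
    conn.execute("INSERT INTO frame VALUES (?, ?)", (frame_id, obs_date))
    conn.executemany(
        "INSERT INTO keyword VALUES (?, ?, ?)",
        [(frame_id, k, str(v)) for k, v in header.items()],
    )


def store_query(name, value):
    """Run a keyword search once and persist the matching frame ids."""
    cur = conn.execute("INSERT INTO stored_query (status) VALUES ('pending')")
    query_id = cur.lastrowid
    conn.execute(
        "INSERT INTO query_result "
        "SELECT ?, frame_id FROM keyword WHERE name = ? AND value = ?",
        (query_id, name, value),
    )
    return query_id


def mark(query_id, status):
    """Update the status so that a transfer can be monitored or retried."""
    conn.execute(
        "UPDATE stored_query SET status = ? WHERE query_id = ?",
        (status, query_id),
    )


ingest("F001", "2008-07-01", {"FILTER": "V", "EXPTIME": 30})
ingest("F002", "2008-07-01", {"FILTER": "R"})
qid = store_query("FILTER", "V")
mark(qid, "transferred")
frames = [r[0] for r in conn.execute(
    "SELECT frame_id FROM query_result WHERE query_id = ?", (qid,))]
status = conn.execute(
    "SELECT status FROM stored_query WHERE query_id = ?", (qid,)).fetchone()[0]
```

Because the result set is persisted under a query id, a failed transfer can be retried against the same list of frames instead of re-running the search.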
Water level ingest, archive and processing system - an integral part of NOAA's tsunami database

NASA Astrophysics Data System (ADS)

McLean, S. J.; Mungov, G.; Dunbar, P. K.; Price, D. J.; Mccullough, H.

2013-12-01

The National Oceanic and Atmospheric Administration (NOAA), National Geophysical Data Center (NGDC) and collocated World Data Service for Geophysics (WDS) provides long-term archive, data management, and access to national and global tsunami data. Archive responsibilities include the NOAA Global Historical Tsunami event and runup database, damage photos, as well as other related hazards data. Beginning in 2008, NGDC was given the responsibility of archiving, processing and distributing all tsunami and hazards-related water level data collected from NOAA observational networks in a coordinated and consistent manner. These data include the Deep-ocean Assessment and Reporting of Tsunami (DART) data provided by the National Data Buoy Center (NDBC), coastal-tide-gauge data from the National Ocean Service (NOS) network and tide-gauge data from the two National Weather Service (NWS) Tsunami Warning Centers (TWCs) regional networks. Taken together, this integrated archive supports tsunami forecast, warning, research, mitigation and education efforts of NOAA and the Nation. Due to the variety of the water level data, the automatic ingest system was redesigned, along with upgrading the inventory, archive and delivery capabilities based on modern digital data archiving practices. The data processing system was also upgraded and redesigned focusing on data quality assessment in an operational manner. This poster focuses on data availability highlighting the automation of all steps of data ingest, archive, processing and distribution. Examples are given from recent events such as the October 2012 hurricane Sandy, the Feb 06, 2013 Solomon Islands tsunami, and the June 13, 2013 meteotsunami along the U.S. East Coast.
Database resources of the National Center for Biotechnology Information

PubMed Central

2015-01-01

The National Center for Biotechnology Information (NCBI) provides a large suite of online resources for biological information and data, including the GenBank® nucleic acid sequence database and the PubMed database of citations and abstracts for published life science journals. Additional NCBI resources focus on literature (Bookshelf, PubMed Central (PMC) and PubReader); medical genetics (ClinVar, dbMHC, the Genetic Testing Registry, HIV-1/Human Protein Interaction Database and MedGen); genes and genomics (BioProject, BioSample, dbSNP, dbVar, Epigenomics, Gene, Gene Expression Omnibus (GEO), Genome, HomoloGene, the Map Viewer, Nucleotide, PopSet, Probe, RefSeq, Sequence Read Archive, the Taxonomy Browser, Trace Archive and UniGene); and proteins and chemicals (Biosystems, COBALT, the Conserved Domain Database (CDD), the Conserved Domain Architecture Retrieval Tool (CDART), the Molecular Modeling Database (MMDB), Protein Clusters, Protein and the PubChem suite of small molecule databases). The Entrez system provides search and retrieval operations for many of these databases. Augmenting many of the Web applications are custom implementations of the BLAST program optimized to search specialized data sets. All of these resources can be accessed through the NCBI home page at http://www.ncbi.nlm.nih.gov. PMID:25398906

Database resources of the National Center for Biotechnology Information

PubMed Central

2016-01-01

The National Center for Biotechnology Information (NCBI) provides a large suite of online resources for biological information and data, including the GenBank® nucleic acid sequence database and the PubMed database of citations and abstracts for published life science journals. Additional NCBI resources focus on literature (PubMed Central (PMC), Bookshelf and PubReader), health (ClinVar, dbGaP, dbMHC, the Genetic Testing Registry, HIV-1/Human Protein Interaction Database and MedGen), genomes (BioProject, Assembly, Genome, BioSample, dbSNP, dbVar, Epigenomics, the Map Viewer, Nucleotide, Probe, RefSeq, Sequence Read Archive, the Taxonomy Browser and the Trace Archive), genes (Gene, Gene Expression Omnibus (GEO), HomoloGene, PopSet and UniGene), proteins (Protein, the Conserved Domain Database (CDD), COBALT, Conserved Domain Architecture Retrieval Tool (CDART), the Molecular Modeling Database (MMDB) and Protein Clusters) and chemicals (Biosystems and the PubChem suite of small molecule databases). The Entrez system provides search and retrieval operations for most of these databases. Augmenting many of the web applications are custom implementations of the BLAST program optimized to search specialized datasets. All of these resources can be accessed through the NCBI home page at www.ncbi.nlm.nih.gov. PMID:26615191
Towards the Interoperability of Web, Database, and Mass Storage Technologies for Petabyte Archives

NASA Technical Reports Server (NTRS)

Moore, Reagan; Marciano, Richard; Wan, Michael; Sherwin, Tom; Frost, Richard

1996-01-01

At the San Diego Supercomputer Center, a massive data analysis system (MDAS) is being developed to support data-intensive applications that manipulate terabyte sized data sets. The objective is to support scientific application access to data whether it is located at a Web site, stored as an object in a database, and/or stored in an archival storage system. We are developing a suite of demonstration programs which illustrate how Web, database (DBMS), and archival storage (mass storage) technologies can be integrated. An application presentation interface is being designed that integrates data access to all of these sources. We have developed a data movement interface between the Illustra object-relational database and the NSL UniTree archival storage system running in a production mode at the San Diego Supercomputer Center. With this interface, an Illustra client can transparently access data on UniTree under the control of the Illustra DBMS server. The current implementation is based on the creation of a new DBMS storage manager class, and a set of library functions that allow the manipulation and migration of data stored as Illustra 'large objects'. We have extended this interface to allow a Web client application to control data movement between its local disk, the Web server, the DBMS Illustra server, and the UniTree mass storage environment. This paper describes some of the current approaches successfully integrating these technologies. This framework is measured against a representative sample of environmental data extracted from the San Diego Bay Environmental Data Repository. Practical lessons are drawn and critical research areas are highlighted.
NCO Production Management Branch

Science.gov Websites

Climate Climate Prediction Climate Archives Weather Safety Storm Ready NOAA Central Library Photo Library Management Branch Production Management Branch About the Production Management Branch NCO REQUEST FOR CHANGE (RFC) DATABASE ACCESS NCO Request For Change (RFC) Archive [For INTERNAL Use Only] NCO Request For
Cracking the Egg: The South Carolina Digital Library's New Perspective

ERIC Educational Resources Information Center

Vinson, Christopher G.; Boyd, Kate Foster

2008-01-01

This article explores the historical foundations of the South Carolina Digital Library, a collaborative statewide program that ties together academic special collections and archives, public libraries, state government archives, and other cultural resource institutions in an effort to provide the state with a comprehensive database of online…
Natural Learning Case Study Archives

ERIC Educational Resources Information Center

Lawler, Robert W.

2015-01-01

Natural Learning Case Study Archives (NLCSA) is a research facility for those interested in using case study analysis to deepen their understanding of common sense knowledge and natural learning (how the mind interacts with everyday experiences to develop common sense knowledge). The database comprises three case study corpora based on experiences…
36 CFR 1225.24 - When can an agency apply previously approved schedules to electronic records?

Code of Federal Regulations, 2012 CFR

2012-07-01

... Property NATIONAL ARCHIVES AND RECORDS ADMINISTRATION RECORDS MANAGEMENT SCHEDULING RECORDS § 1225.24 When... must notify the National Archives and Records Administration, Modern Records Programs (NWM), 8601... authority reference; and (v) Format of the records (e.g., database, scanned images, digital photographs, etc...
36 CFR 1225.24 - When can an agency apply previously approved schedules to electronic records?

Code of Federal Regulations, 2011 CFR

2011-07-01

... Property NATIONAL ARCHIVES AND RECORDS ADMINISTRATION RECORDS MANAGEMENT SCHEDULING RECORDS § 1225.24 When... must notify the National Archives and Records Administration, Modern Records Programs (NWM), 8601... authority reference; and (v) Format of the records (e.g., database, scanned images, digital photographs, etc...
36 CFR 1225.24 - When can an agency apply previously approved schedules to electronic records?

Code of Federal Regulations, 2014 CFR

2014-07-01

... Property NATIONAL ARCHIVES AND RECORDS ADMINISTRATION RECORDS MANAGEMENT SCHEDULING RECORDS § 1225.24 When... must notify the National Archives and Records Administration, Modern Records Programs (NWM), 8601... authority reference; and (v) Format of the records (e.g., database, scanned images, digital photographs, etc...
The Midwest State Archives Guide Project: Collateral Reports.

ERIC Educational Resources Information Center

Evans, Max J.; And Others

This document comprises four reports related to the Midwest State Archives Guide Project. The first report, "Producing In-House Finding Aids and Administrative Reports: An Expanded Data Base Design," details an expanded SPINDEX database design that could be used to produce a variety of administrative reports. Specific examples are given…
Dose-Response Analysis of RNA-Seq Profiles in Archival ...

EPA Pesticide Factsheets

Use of archival resources has been limited to date by inconsistent methods for genomic profiling of degraded RNA from formalin-fixed paraffin-embedded (FFPE) samples. RNA-sequencing offers a promising way to address this problem. Here we evaluated transcriptomic dose responses using RNA-sequencing in paired FFPE and frozen (FROZ) samples from two archival studies in mice, one 20 years old. Experimental treatments included 3 different doses of di(2-ethylhexyl)phthalate or dichloroacetic acid for the recently archived and older studies, respectively. Total RNA was ribo-depleted and sequenced using the Illumina HiSeq platform. In the recently archived study, FFPE samples had 35% lower total counts compared to FROZ samples but high concordance in fold-change values of differentially expressed genes (DEGs) (r2 = 0.99), highly enriched pathways (90% overlap with FROZ), and benchmark dose estimates for preselected target genes (2% difference vs FROZ). In contrast, older FFPE samples had markedly lower total counts (3% of FROZ) and poor concordance in global DEGs and pathways. However, counts from FFPE and FROZ samples still positively correlated (r2 = 0.84 across all transcripts) and showed comparable dose responses for more highly expressed target genes. These findings highlight potential applications and issues in using RNA-sequencing data from FFPE samples. Recently archived FFPE samples were highly similar to FROZ samples in sequencing q
Imaged document information location and extraction using an optical correlator

NASA Astrophysics Data System (ADS)

Stalcup, Bruce W.; Dennis, Phillip W.; Dydyk, Robert B.

1999-12-01

Today, the paper document is fast becoming a thing of the past. With the rapid development of fast, inexpensive computing and storage devices, many government and private organizations are archiving their documents in electronic form (e.g., personnel records, medical records, patents, etc.). Many of these organizations are converting their paper archives to electronic images, which are then stored in a computer database. Because of this, there is a need to efficiently organize this data into comprehensive and accessible information resources and provide for rapid access to the information contained within these imaged documents. To meet this need, Litton PRC and Litton Data Systems Division are developing a system, the Imaged Document Optical Correlation and Conversion System (IDOCCS), to provide a total solution to the problem of managing and retrieving textual and graphic information from imaged document archives. At the heart of IDOCCS, optical correlation technology provides a means for the search and retrieval of information from imaged documents. IDOCCS can be used to rapidly search for key words or phrases within the imaged document archives and has the potential to determine the types of languages contained within a document. In addition, IDOCCS can automatically compare an input document with the archived database to determine if it is a duplicate, thereby reducing the overall resources required to maintain and access the document database. Embedded graphics on imaged pages can also be exploited, e.g., imaged documents containing an agency's seal or logo can be singled out. In this paper, we present a description of IDOCCS as well as preliminary performance results and theoretical projections.
Recovery and archiving key Arctic Alaska vegetation map and plot data for the Arctic-Boreal Vulnerability Field Experiment (ABoVE)

NASA Astrophysics Data System (ADS)

Walker, D. A.; Breen, A. L.; Broderson, D.; Epstein, H. E.; Fisher, W.; Grunblatt, J.; Heinrichs, T.; Raynolds, M. K.; Walker, M. D.; Wirth, L.

2013-12-01

Abundant ground-based information will be needed to inform remote-sensing and modeling studies of NASA's Arctic-Boreal Vulnerability Experiment (ABoVE). A large body of plot and map data collected by the Alaska Geobotany Center (AGC) and collaborators from the Arctic regions of Alaska and the circumpolar Arctic over the past several decades is being archived and made accessible to scientists and the public via the Geographic Information Network of Alaska's (GINA's) 'Catalog' display and portal system. We are building two main types of data archives: Vegetation Plot Archive: For the plot information we use a Turboveg database to construct the Alaska portion of the international Arctic Vegetation Archive (AVA) http://www.geobotany.uaf.edu/ava/. High quality plot data and non-digital legacy datasets in danger of being lost have highest priority for entry into the archive. A key aspect of the database is the PanArctic Species List (PASL-1), developed specifically for the AVA to provide a standard of species nomenclature for the entire Arctic biome. A wide variety of reports, documents, and ancillary data are linked to each plot's geographic location. Geoecological Map Archive: This database includes maps and remote sensing products and links to other relevant data associated with the maps, mainly those produced by the Alaska Geobotany Center. Map data include GIS shape files of vegetation, land-cover, soils, landforms and other categorical variables and digital raster data of elevation, multispectral satellite-derived data, and data products and metadata associated with these. The map archive will contain all the information that is currently in the hierarchical Toolik-Arctic Geobotanical Atlas (T-AGA) in Alaska http://www.arcticatlas.org, plus several additions that are in the process of development and will be combined with GINA's already substantial holdings of spatial data from northern Alaska. 
The Geoecological Atlas Portal uses GINA's Catalog tool to develop a web interface to view and access the plot and map data. The mapping portal allows visualization of GIS data, sample-point locations and imagery and access to the map data. Catalog facilitates the discovery and dissemination of science-based information products in support of analysis and decision-making concerned with development and climate change and is currently used by GINA in several similar archive/distribution portals.
Long-term data archiving

DOE Office of Scientific and Technical Information (OSTI.GOV)

Moore, David Steven

2009-01-01

Long term data archiving has much value for chemists, not only to retain access to research and product development records, but also to enable new developments and new discoveries. There are some recent regulatory requirements (e.g., FDA 21 CFR Part 11), but good science and good business both benefit regardless. A particular example of the benefits of and need for long term data archiving is the management of data from spectroscopic laboratory instruments. The sheer amount of spectroscopic data is increasing at a scary rate, and the pressures to archive come from the expense to create the data (or recreate it if it is lost) as well as its high information content. The goal of long-term data archiving is to save and organize instrument data files as well as any needed meta data (such as sample ID, LIMS information, operator, date, time, instrument conditions, sample type, excitation details, environmental parameters, etc.). This editorial explores the issues involved in long-term data archiving using the example of Raman spectral databases. There are at present several such databases, including common data format libraries and proprietary libraries. However, such databases and libraries should ultimately satisfy stringent criteria for long term data archiving, including readability for long times into the future, robustness to changes in computer hardware and operating systems, and use of public domain data formats. The latter criterion implies the data format should be platform independent and the tools to create the data format should be easily and publicly obtainable or developable. Several examples of attempts at spectral libraries exist, such as the ASTM ANDI format, and the JCAMP-DX format. On the other hand, proprietary library spectra can be exchanged and manipulated using proprietary tools. 
As the above examples have deficiencies according to the three long term data archiving criteria, Extensible Markup Language (XML; a product of the World Wide Web Consortium, an independent standards body) as a new data interchange tool is being investigated and implemented. In order to facilitate data archiving, Raman data needs calibration as well as some other kinds of data treatment. Figure 1 illustrates schematically the present situation for Raman data calibration in the world-wide Raman spectroscopy community, and presents some of the terminology used.
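To make the idea of an XML interchange record concrete, the following sketch writes a spectrum together with a few of the metadata items the abstract lists (sample ID, operator, excitation details) and reads the data points back. It is an assumption-laden illustration: the element and attribute names (`spectrum`, `metadata`, `sampleId`, `point`, and so on) are invented for this example and do not correspond to any published Raman or spectroscopy schema, and the sample values are made up.

```python
import xml.etree.ElementTree as ET


def build_record(sample_id, operator, excitation_nm, shifts_cm1, intensities):
    """Serialize one spectrum plus metadata as plain-text XML (hypothetical layout)."""
    if len(shifts_cm1) != len(intensities):
        raise ValueError("each Raman shift needs exactly one intensity")
    root = ET.Element("spectrum", {"technique": "Raman"})
    meta = ET.SubElement(root, "metadata")
    ET.SubElement(meta, "sampleId").text = sample_id
    ET.SubElement(meta, "operator").text = operator
    ET.SubElement(meta, "excitation", {"unit": "nm"}).text = str(excitation_nm)
    # Units travel with the data so the file stays interpretable on its own.
    data = ET.SubElement(root, "data", {"xUnit": "cm-1", "yUnit": "counts"})
    for x, y in zip(shifts_cm1, intensities):
        ET.SubElement(data, "point", {"x": str(x), "y": str(y)})
    return ET.tostring(root, encoding="unicode")


def read_points(xml_text):
    """Recover (shift, intensity) pairs using only a generic XML parser."""
    root = ET.fromstring(xml_text)
    return [(float(p.get("x")), float(p.get("y"))) for p in root.iter("point")]


xml_text = build_record("S-001", "operator-a", 785, [520.5, 1001.0], [1200, 340])
points = read_points(xml_text)
```

The record is human-readable text and needs nothing beyond a standard XML parser to recover, which is the property the archiving criteria in the abstract (platform independence, publicly obtainable tools) are asking for.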
Macromolecular Structure Database. Final Progress Report

DOE Office of Scientific and Technical Information (OSTI.GOV)

Gilliland, Gary L.

2003-09-23

The central activity of the PDB continues to be the collection, archiving and distribution of high quality structural data to the scientific community on a timely basis. In support of these activities NIST has continued its roles in developing the physical archive, in developing data uniformity, in dealing with NMR issues and in the distribution of PDB data through CD-ROMs. The physical archive holdings have been organized, inventoried, and a database has been created to facilitate their use. Data from individual PDB entries have been annotated to produce uniform values improving tremendously the accuracy of results of queries. Working with the NMR community we have established data items specific for NMR that will be included in new entries and facilitate data deposition. The PDB CD-ROM production has continued on a quarterly basis, and new products are being distributed.
Database resources of the National Center for Biotechnology Information.

PubMed

Sayers, Eric W; Barrett, Tanya; Benson, Dennis A; Bolton, Evan; Bryant, Stephen H; Canese, Kathi; Chetvernin, Vyacheslav; Church, Deanna M; Dicuccio, Michael; Federhen, Scott; Feolo, Michael; Fingerman, Ian M; Geer, Lewis Y; Helmberg, Wolfgang; Kapustin, Yuri; Krasnov, Sergey; Landsman, David; Lipman, David J; Lu, Zhiyong; Madden, Thomas L; Madej, Tom; Maglott, Donna R; Marchler-Bauer, Aron; Miller, Vadim; Karsch-Mizrachi, Ilene; Ostell, James; Panchenko, Anna; Phan, Lon; Pruitt, Kim D; Schuler, Gregory D; Sequeira, Edwin; Sherry, Stephen T; Shumway, Martin; Sirotkin, Karl; Slotta, Douglas; Souvorov, Alexandre; Starchenko, Grigory; Tatusova, Tatiana A; Wagner, Lukas; Wang, Yanli; Wilbur, W John; Yaschenko, Eugene; Ye, Jian

2012-01-01

In addition to maintaining the GenBank® nucleic acid sequence database, the National Center for Biotechnology Information (NCBI) provides analysis and retrieval resources for the data in GenBank and other biological data made available through the NCBI Website. NCBI resources include Entrez, the Entrez Programming Utilities, MyNCBI, PubMed, PubMed Central (PMC), Gene, the NCBI Taxonomy Browser, BLAST, BLAST Link (BLink), Primer-BLAST, COBALT, Splign, RefSeq, UniGene, HomoloGene, ProtEST, dbMHC, dbSNP, dbVar, Epigenomics, Genome and related tools, the Map Viewer, Model Maker, Evidence Viewer, Trace Archive, Sequence Read Archive, BioProject, BioSample, Retroviral Genotyping Tools, HIV-1/Human Protein Interaction Database, Gene Expression Omnibus (GEO), Probe, Online Mendelian Inheritance in Animals (OMIA), the Molecular Modeling Database (MMDB), the Conserved Domain Database (CDD), the Conserved Domain Architecture Retrieval Tool (CDART), Biosystems, Protein Clusters and the PubChem suite of small molecule databases. Augmenting many of the Web applications are custom implementations of the BLAST program optimized to search specialized data sets. All of these resources can be accessed through the NCBI home page at www.ncbi.nlm.nih.gov.
Database resources of the National Center for Biotechnology Information

PubMed Central

Acland, Abigail; Agarwala, Richa; Barrett, Tanya; Beck, Jeff; Benson, Dennis A.; Bollin, Colleen; Bolton, Evan; Bryant, Stephen H.; Canese, Kathi; Church, Deanna M.; Clark, Karen; DiCuccio, Michael; Dondoshansky, Ilya; Federhen, Scott; Feolo, Michael; Geer, Lewis Y.; Gorelenkov, Viatcheslav; Hoeppner, Marilu; Johnson, Mark; Kelly, Christopher; Khotomlianski, Viatcheslav; Kimchi, Avi; Kimelman, Michael; Kitts, Paul; Krasnov, Sergey; Kuznetsov, Anatoliy; Landsman, David; Lipman, David J.; Lu, Zhiyong; Madden, Thomas L.; Madej, Tom; Maglott, Donna R.; Marchler-Bauer, Aron; Karsch-Mizrachi, Ilene; Murphy, Terence; Ostell, James; O'Sullivan, Christopher; Panchenko, Anna; Phan, Lon; Preuss, Don; Pruitt, Kim D.; Rubinstein, Wendy; Sayers, Eric W.; Schneider, Valerie; Schuler, Gregory D.; Sequeira, Edwin; Sherry, Stephen T.; Shumway, Martin; Sirotkin, Karl; Siyan, Karanjit; Slotta, Douglas; Soboleva, Alexandra; Soussov, Vladimir; Starchenko, Grigory; Tatusova, Tatiana A.; Trawick, Bart W.; Vakatov, Denis; Wang, Yanli; Ward, Minghong; Wilbur, W. John; Yaschenko, Eugene; Zbicz, Kerry

2014-01-01

In addition to maintaining the GenBank® nucleic acid sequence database, the National Center for Biotechnology Information (NCBI, http://www.ncbi.nlm.nih.gov) provides analysis and retrieval resources for the data in GenBank and other biological data made available through the NCBI Web site. NCBI resources include Entrez, the Entrez Programming Utilities, MyNCBI, PubMed, PubMed Central, PubReader, Gene, the NCBI Taxonomy Browser, BLAST, BLAST Link, Primer-BLAST, COBALT, RefSeq, UniGene, HomoloGene, ProtEST, dbMHC, dbSNP, dbVar, Epigenomics, the Genetic Testing Registry, Genome and related tools, the Map Viewer, Trace Archive, Sequence Read Archive, BioProject, BioSample, ClinVar, MedGen, HIV-1/Human Protein Interaction Database, Gene Expression Omnibus, Probe, Online Mendelian Inheritance in Animals, the Molecular Modeling Database, the Conserved Domain Database, the Conserved Domain Architecture Retrieval Tool, Biosystems, Protein Clusters and the PubChem suite of small molecule databases. Augmenting many of the Web applications are custom implementations of the BLAST program optimized to search specialized data sets. All these resources can be accessed through the NCBI home page. PMID:24259429
Troubleshooting Public Data Archiving: Suggestions to Increase Participation

PubMed Central

Roche, Dominique G.; Lanfear, Robert; Binning, Sandra A.; Haff, Tonya M.; Schwanz, Lisa E.; Cain, Kristal E.; Kokko, Hanna; Jennions, Michael D.; Kruuk, Loeske E. B.

2014-01-01

An increasing number of publishers and funding agencies require public data archiving (PDA) in open-access databases. PDA has obvious group benefits for the scientific community, but many researchers are reluctant to share their data publicly because of real or perceived individual costs. Improving participation in PDA will require lowering costs and/or increasing benefits for primary data collectors. Small, simple changes can enhance existing measures to ensure that more scientific data are properly archived and made publicly available: (1) facilitate more flexible embargoes on archived data, (2) encourage communication between data generators and re-users, (3) disclose data re-use ethics, and (4) encourage increased recognition of publicly archived data. PMID:24492920
The Planetary Archive

NASA Astrophysics Data System (ADS)

Penteado, Paulo F.; Trilling, David; Szalay, Alexander; Budavári, Tamás; Fuentes, César

2014-11-01

We are building the first system that will allow efficient data mining in the astronomical archives for observations of Solar System Bodies. While the Virtual Observatory has enabled data-intensive research making use of large collections of observations across multiple archives, Planetary Science has largely been denied this opportunity: most astronomical data services are built based on sky positions, and moving objects are often filtered out. To identify serendipitous observations of Solar System objects, we ingest the archive metadata. The coverage of each image in an archive is a volume in a 3D space (RA,Dec,time), which we can represent efficiently through a hierarchical triangular mesh (HTM) for the spatial dimensions, plus a contiguous time interval. In this space, an asteroid occupies a curve, which we determine integrating its orbit into the past. Thus when an asteroid trajectory intercepts the volume of an archived image, we have a possible observation of that body. Our pipeline then looks in the archive's catalog for a source with the corresponding coordinates, to retrieve its photometry. All these matches are stored into a database, which can be queried by object identifier. This database consists of archived observations of known Solar System objects. This means that it grows not only from the ingestion of new images, but also from the growth in the number of known objects. As new bodies are discovered, our pipeline can find archived observations where they could have been recorded, providing colors for these newly-found objects. This growth becomes more relevant with the new generation of wide-field surveys, particularly LSST. We also present one use case of our prototype archive: after ingesting the metadata for SDSS, 2MASS and GALEX, we were able to identify serendipitous observations of Solar System bodies in these 3 archives.
Cross-matching these occurrences provided us with colors from the UV to the IR, a much wider spectral range than that commonly used for asteroid taxonomy. We present here archive-derived spectrophotometry from searching for 440 thousand asteroids, from 0.3 to 3 µm. In the future we will expand to other archives, including HST, Spitzer, WISE and Pan-STARRS.
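The matching step described in this abstract (an asteroid trajectory crossing an image's coverage volume in RA, Dec and time) can be sketched roughly as follows. This is only an illustration under simplifying assumptions: the image footprint is approximated by a circle on the sky rather than a hierarchical triangular mesh, the trajectory is a list of precomputed samples, and the names ImageCoverage, EphemerisPoint and candidate_observations are hypothetical and not taken from the authors' pipeline.

```python
import math
from dataclasses import dataclass


@dataclass
class ImageCoverage:
    """Simplified coverage of one archived image (hypothetical structure)."""
    image_id: str
    ra_deg: float       # field center, right ascension
    dec_deg: float      # field center, declination
    radius_deg: float   # circular stand-in for the real footprint
    t_start: float      # exposure start (e.g. MJD)
    t_end: float        # exposure end (e.g. MJD)


@dataclass
class EphemerisPoint:
    """One sample of an asteroid's computed trajectory."""
    t: float
    ra_deg: float
    dec_deg: float


def angular_separation_deg(ra1, dec1, ra2, dec2):
    """Great-circle separation of two sky positions in degrees (haversine form)."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    h = (math.sin((dec2 - dec1) / 2) ** 2
         + math.cos(dec1) * math.cos(dec2) * math.sin((ra2 - ra1) / 2) ** 2)
    return math.degrees(2 * math.asin(min(1.0, math.sqrt(h))))


def candidate_observations(trajectory, images):
    """Return (image_id, t) pairs where a trajectory sample falls inside an
    image's time interval and within its footprint radius."""
    hits = []
    for img in images:
        for p in trajectory:
            if img.t_start <= p.t <= img.t_end:
                sep = angular_separation_deg(p.ra_deg, p.dec_deg,
                                             img.ra_deg, img.dec_deg)
                if sep <= img.radius_deg:
                    hits.append((img.image_id, p.t))
    return hits
```

Each hit is only a possible observation; as the abstract notes, the archive's source catalog still has to be searched at the predicted coordinates to confirm a detection and retrieve photometry.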
XWeB: The XML Warehouse Benchmark

NASA Astrophysics Data System (ADS)

Mahboubi, Hadj; Darmont, Jérôme

With the emergence of XML as a standard for representing business data, new decision support applications are being developed. These XML data warehouses aim at supporting On-Line Analytical Processing (OLAP) operations that manipulate irregular XML data. To ensure feasibility of these new tools, important performance issues must be addressed. Performance is customarily assessed with the help of benchmarks. However, decision support benchmarks do not currently support XML features. In this paper, we introduce the XML Warehouse Benchmark (XWeB), which aims at filling this gap. XWeB derives from the relational decision support benchmark TPC-H. It is mainly composed of a test data warehouse that is based on a unified reference model for XML warehouses and that features XML-specific structures, and its associate XQuery decision support workload. XWeB's usage is illustrated by experiments on several XML database management systems.
SysML model of exoplanet archive functionality and activities

NASA Astrophysics Data System (ADS)

Ramirez, Solange

2016-08-01

The NASA Exoplanet Archive is an online service that serves data and information on exoplanets and their host stars to help astronomical research related to the search for and characterization of extra-solar planetary systems. In order to provide the most up-to-date data sets to the users, the exoplanet archive performs weekly updates that include additions into the database and updates to the services as needed. These weekly updates are complex due to interfaces within the archive. I will be presenting a SysML model that helps us perform these update activities on a weekly basis.

Informatics in radiology: use of CouchDB for document-based storage of DICOM objects.

PubMed

Rascovsky, Simón J; Delgado, Jorge A; Sanz, Alexander; Calvo, Víctor D; Castrillón, Gabriel

2012-01-01

Picture archiving and communication systems traditionally have depended on schema-based Structured Query Language (SQL) databases for imaging data management. To optimize database size and performance, many such systems store a reduced set of Digital Imaging and Communications in Medicine (DICOM) metadata, discarding informational content that might be needed in the future. As an alternative to traditional database systems, document-based key-value stores recently have gained popularity. These systems store documents containing key-value pairs that facilitate data searches without predefined schemas. Document-based key-value stores are especially suited to archive DICOM objects because DICOM metadata are highly heterogeneous collections of tag-value pairs conveying specific information about imaging modalities, acquisition protocols, and vendor-supported postprocessing options. The authors used an open-source document-based database management system (Apache CouchDB) to create and test two such databases; CouchDB was selected for its overall ease of use, capability for managing attachments, and reliance on HTTP and Representational State Transfer standards for accessing and retrieving data. A large database was created first in which the DICOM metadata from 5880 anonymized magnetic resonance imaging studies (1,949,753 images) were loaded by using a Ruby script. To provide the usual DICOM query functionality, several predefined "views" (standard queries) were created by using JavaScript. For performance comparison, the same queries were executed in both the CouchDB database and a SQL-based DICOM archive. The capabilities of CouchDB for attachment management and database replication were separately assessed in tests of a similar, smaller database. 
Results showed that CouchDB allowed efficient storage and interrogation of all DICOM objects; with the use of information retrieval algorithms such as map-reduce, all the DICOM metadata stored in the large database were searchable with only a minimal increase in retrieval time over that with the traditional database management system. Results also indicated possible uses for document-based databases in data mining applications such as dose monitoring, quality assurance, and protocol optimization. RSNA, 2012
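The idea of predefined "views" over schema-less tag-value documents can be illustrated with a small in-memory sketch. It does not call the CouchDB API and is written in Python rather than the JavaScript the authors used; it only mimics what a map function emitting key-value pairs and a counting reduce would do. The function names and the choice of the Modality and StudyInstanceUID tags as view keys are assumptions made for the example.

```python
from collections import defaultdict


def map_by_modality(doc):
    """Map step: emit (key, value) pairs for one document.
    Documents that lack the tags are skipped; no schema is required."""
    if "Modality" in doc and "StudyInstanceUID" in doc:
        yield doc["Modality"], doc["StudyInstanceUID"]


def build_view(docs, map_fn):
    """Group emitted values by key, sorted by key like a view index."""
    index = defaultdict(list)
    for doc in docs:
        for key, value in map_fn(doc):
            index[key].append(value)
    return dict(sorted(index.items()))


def reduce_count(view):
    """Reduce step: number of emitted rows per key."""
    return {key: len(values) for key, values in view.items()}
```

Because each document carries its own set of tags, heterogeneous records (for example, one with EchoTime and another with KVP) can sit in the same collection and still be indexed by the same view.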
The National Deep-Sea Coral and Sponge Database: A Comprehensive Resource for United States Deep-Sea Coral and Sponge Records

NASA Astrophysics Data System (ADS)

Dornback, M.; Hourigan, T.; Etnoyer, P.; McGuinn, R.; Cross, S. L.

2014-12-01

Research on deep-sea corals has expanded rapidly over the last two decades, as scientists began to realize their value as long-lived structural components of high biodiversity habitats and archives of environmental information. The NOAA Deep Sea Coral Research and Technology Program's National Database for Deep-Sea Corals and Sponges is a comprehensive resource for georeferenced data on these organisms in U.S. waters. The National Database currently includes more than 220,000 deep-sea coral records representing approximately 880 unique species. Database records from museum archives, commercial and scientific bycatch, and from journal publications provide baseline information with relatively coarse spatial resolution dating back as far as 1842. These data are complemented by modern, in-situ submersible observations with high spatial resolution, from surveys conducted by NOAA and NOAA partners. Management of high volumes of modern high-resolution observational data can be challenging. NOAA is working with our data partners to incorporate this occurrence data into the National Database, along with images and associated information related to geoposition, time, biology, taxonomy, environment, provenance, and accuracy. NOAA is also working to link associated datasets collected by our program's research, to properly archive them to the NOAA National Data Centers, to build a robust metadata record, and to establish a standard protocol to simplify the process. Access to the National Database is provided through an online mapping portal. The map displays point based records from the database. Records can be refined by taxon, region, time, and depth. The queries and extent used to view the map can also be used to download subsets of the database. The database, map, and website are already in use by NOAA, regional fishery management councils, and regional ocean planning bodies, but we envision them as a model that can expand to accommodate data on a global scale.
Development of geotechnical analysis and design modules for the Virginia Department of Transportation's geotechnical database.

DOT National Transportation Integrated Search

2005-01-01

In 2003, an Internet-based Geotechnical Database Management System (GDBMS) was developed for the Virginia Department of Transportation (VDOT) using distributed Geographic Information System (GIS) methodology for data management, archival, retrieval, ...
Tracking vegetation phenology across diverse North American biomes using PhenoCam imagery

DOE PAGES

Richardson, Andrew D.; Hufkens, Koen; Milliman, Tom; ...

2018-03-13

Vegetation phenology controls the seasonality of many ecosystem processes, as well as numerous biosphere-atmosphere feedbacks. Phenology is also highly sensitive to climate change and variability. Here we present a series of datasets, together consisting of almost 750 years of observations, characterizing vegetation phenology in diverse ecosystems across North America. Our data are derived from conventional, visible-wavelength, automated digital camera imagery collected through the PhenoCam network. For each archived image, we extracted RGB (red, green, blue) colour channel information, with means and other statistics calculated across a region-of-interest (ROI) delineating a specific vegetation type. From the high-frequency (typically, 30 min) imagery, we derived time series characterizing vegetation colour, including "canopy greenness", processed to 1- and 3-day intervals. For ecosystems with one or more annual cycles of vegetation activity, we provide estimates, with uncertainties, for the start of the "greenness rising" and end of the "greenness falling" stages. Lastly, the database can be used for phenological model validation and development, evaluation of satellite remote sensing data products, benchmarking earth system models, and studies of climate change impacts on terrestrial ecosystems.
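As a rough illustration of how a greenness time series could be derived from ROI-averaged RGB values, the sketch below computes a green chromatic coordinate, G / (R + G + B), and averages it over fixed-length windows. The abstract does not give the exact formula or the aggregation statistic, so both the index and the use of a simple mean per window are assumptions for this example; the function names are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def green_chromatic_coordinate(r, g, b):
    """Relative greenness of an ROI from its mean R, G, B values (assumed index)."""
    total = r + g + b
    if total == 0:
        raise ValueError("ROI is completely black; greenness is undefined")
    return g / total


def aggregate_by_window(samples, window_days):
    """Average (day_of_year, value) samples over consecutive windows.

    Returns {first_day_of_window: mean_value}; window_days=1 or 3 mirrors the
    1- and 3-day products mentioned in the abstract.
    """
    bins = defaultdict(list)
    for day, value in samples:
        start = ((day - 1) // window_days) * window_days + 1
        bins[start].append(value)
    return {start: mean(values) for start, values in sorted(bins.items())}
```

For instance, an ROI with mean channel values (100, 150, 50) gives a greenness of 0.5, and samples on days 1, 2 and 4 fall into the 3-day windows starting on days 1 and 4.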
Tracking vegetation phenology across diverse North American biomes using PhenoCam imagery

PubMed Central

Richardson, Andrew D.; Hufkens, Koen; Milliman, Tom; Aubrecht, Donald M.; Chen, Min; Gray, Josh M.; Johnston, Miriam R.; Keenan, Trevor F.; Klosterman, Stephen T.; Kosmala, Margaret; Melaas, Eli K.; Friedl, Mark A.; Frolking, Steve

2018-01-01

Vegetation phenology controls the seasonality of many ecosystem processes, as well as numerous biosphere-atmosphere feedbacks. Phenology is also highly sensitive to climate change and variability. Here we present a series of datasets, together consisting of almost 750 years of observations, characterizing vegetation phenology in diverse ecosystems across North America. Our data are derived from conventional, visible-wavelength, automated digital camera imagery collected through the PhenoCam network. For each archived image, we extracted RGB (red, green, blue) colour channel information, with means and other statistics calculated across a region-of-interest (ROI) delineating a specific vegetation type. From the high-frequency (typically, 30 min) imagery, we derived time series characterizing vegetation colour, including “canopy greenness”, processed to 1- and 3-day intervals. For ecosystems with one or more annual cycles of vegetation activity, we provide estimates, with uncertainties, for the start of the “greenness rising” and end of the “greenness falling” stages. The database can be used for phenological model validation and development, evaluation of satellite remote sensing data products, benchmarking earth system models, and studies of climate change impacts on terrestrial ecosystems. PMID:29533393
Tracking vegetation phenology across diverse North American biomes using PhenoCam imagery

DOE Office of Scientific and Technical Information (OSTI.GOV)

Richardson, Andrew D.; Hufkens, Koen; Milliman, Tom

Vegetation phenology controls the seasonality of many ecosystem processes, as well as numerous biosphere-atmosphere feedbacks. Phenology is also highly sensitive to climate change and variability. Here we present a series of datasets, together consisting of almost 750 years of observations, characterizing vegetation phenology in diverse ecosystems across North America. Our data are derived from conventional, visible-wavelength, automated digital camera imagery collected through the PhenoCam network. For each archived image, we extracted RGB (red, green, blue) colour channel information, with means and other statistics calculated across a region-of-interest (ROI) delineating a specific vegetation type. From the high-frequency (typically, 30 min) imagery, we derived time series characterizing vegetation colour, including "canopy greenness", processed to 1- and 3-day intervals. For ecosystems with one or more annual cycles of vegetation activity, we provide estimates, with uncertainties, for the start of the "greenness rising" and end of the "greenness falling" stages. Lastly, the database can be used for phenological model validation and development, evaluation of satellite remote sensing data products, benchmarking earth system models, and studies of climate change impacts on terrestrial ecosystems.
Tracking vegetation phenology across diverse North American biomes using PhenoCam imagery

NASA Astrophysics Data System (ADS)

Richardson, Andrew D.; Hufkens, Koen; Milliman, Tom; Aubrecht, Donald M.; Chen, Min; Gray, Josh M.; Johnston, Miriam R.; Keenan, Trevor F.; Klosterman, Stephen T.; Kosmala, Margaret; Melaas, Eli K.; Friedl, Mark A.; Frolking, Steve

2018-03-01

Vegetation phenology controls the seasonality of many ecosystem processes, as well as numerous biosphere-atmosphere feedbacks. Phenology is also highly sensitive to climate change and variability. Here we present a series of datasets, together consisting of almost 750 years of observations, characterizing vegetation phenology in diverse ecosystems across North America. Our data are derived from conventional, visible-wavelength, automated digital camera imagery collected through the PhenoCam network. For each archived image, we extracted RGB (red, green, blue) colour channel information, with means and other statistics calculated across a region-of-interest (ROI) delineating a specific vegetation type. From the high-frequency (typically, 30 min) imagery, we derived time series characterizing vegetation colour, including “canopy greenness”, processed to 1- and 3-day intervals. For ecosystems with one or more annual cycles of vegetation activity, we provide estimates, with uncertainties, for the start of the “greenness rising” and end of the “greenness falling” stages. The database can be used for phenological model validation and development, evaluation of satellite remote sensing data products, benchmarking earth system models, and studies of climate change impacts on terrestrial ecosystems.
The Social and Organizational Life Data Archive (SOLDA).

ERIC Educational Resources Information Center

Reed, Ken; Blunsdon, Betsy; Rimme, Malcolm

2000-01-01

Outlines the rationale and design of the Social and Organizational Life Data Archive (SOLDA), an on-line collection of survey and other statistical data relevant to research in the fields of management, organizational studies, industrial relations, marketing, and related social sciences. The database uses CD-ROM technology and the World Wide Web…
36 CFR § 1225.24 - When can an agency apply previously approved schedules to electronic records?

Code of Federal Regulations, 2013 CFR

2013-07-01

... Property NATIONAL ARCHIVES AND RECORDS ADMINISTRATION RECORDS MANAGEMENT SCHEDULING RECORDS § 1225.24 When... must notify the National Archives and Records Administration, Modern Records Programs (NWM), 8601... authority reference; and (v) Format of the records (e.g., database, scanned images, digital photographs, etc...
The McIntosh Archive: A solar feature database spanning four solar cycles

NASA Astrophysics Data System (ADS)

Gibson, S. E.; Malanushenko, A. V.; Hewins, I.; McFadden, R.; Emery, B.; Webb, D. F.; Denig, W. F.

2016-12-01

The McIntosh Archive consists of a set of hand-drawn solar Carrington maps created by Patrick McIntosh from 1964 to 2009. McIntosh used mainly H-alpha, He-1 10830 and photospheric magnetic measurements from both ground-based and NASA satellite observations. With these he traced coronal holes, polarity inversion lines, filaments, sunspots and plage, yielding a unique 45-year record of the features associated with the large-scale solar magnetic field. We will present the results of recent efforts to preserve and digitize this archive. Most of the original hand-drawn maps have been scanned, a method for processing these scans into digital, searchable format has been developed and streamlined, and an archival repository at NOAA's National Centers for Environmental Information (NCEI) has been created. We will demonstrate how Solar Cycle 23 data may now be accessed and how it may be utilized for scientific applications. In addition, we will discuss how this database of human-recognized features, which overlaps with the onset of high-resolution, continuous modern solar data, may act as a training set for computer feature recognition algorithms.
NASDA's earth observation satellite data archive policy for the earth observation data and information system (EOIS)

NASA Technical Reports Server (NTRS)

Sobue, Shin-ichi; Yoshida, Fumiyoshi; Ochiai, Osamu

1996-01-01

NASDA's new Advanced Earth Observing Satellite (ADEOS) is scheduled for launch in August, 1996. ADEOS carries 8 sensors to observe earth environmental phenomena and sends their data to NASDA, NASA, and other foreign ground stations around the world. The downlink data bit rate for ADEOS is 126 MB/s and the total volume of data is about 100 GB per day. To archive and manage such a large quantity of data with high reliability and easy accessibility it was necessary to develop a new mass storage system with a catalogue information database using advanced database management technology. The data will be archived and maintained in the Master Data Storage Subsystem (MDSS) which is one subsystem in NASDA's new Earth Observation data and Information System (EOIS). The MDSS is based on a SONY ID1 digital tape robotics system. This paper provides an overview of the EOIS system, with a focus on the Master Data Storage Subsystem and the NASDA Earth Observation Center (EOC) archive policy for earth observation satellite data.
STScI Archive Manual, Version 7.0

NASA Astrophysics Data System (ADS)

Padovani, Paolo

1999-06-01

The STScI Archive Manual provides information a user needs to know to access the HST archive via its two user interfaces: StarView and a World Wide Web (WWW) interface. It provides descriptions of the StarView screens used to access information in the database and the format of that information, and introduces the user to the WWW interface. Using the two interfaces, users can search for observations, preview public data, and retrieve data from the archive. Using StarView one can also find calibration reference files and perform detailed association searches. With the WWW interface archive users can access, and obtain information on, all Multimission Archive at Space Telescope (MAST) data, a collection of mainly optical and ultraviolet datasets which include, amongst others, the International Ultraviolet Explorer (IUE) Final Archive. Both interfaces feature a name resolver which simplifies searches based on target name.
Development of bilateral data transferability in the Virginia Department of Transportation's Geotechnical Database Management System Framework.

DOT National Transportation Integrated Search

2006-01-01

An Internet-based, spatiotemporal Geotechnical Database Management System (GDBMS) Framework was designed, developed, and implemented at the Virginia Department of Transportation (VDOT) in 2002 to retrieve, manage, archive, and analyze geotechnical da...
Seventy Years of the Journal “Medical Archives”

PubMed Central

Masic, Izet

2016-01-01

This year the journal “Medical Archives” celebrates the 70th anniversary of its continuous publication. Medical Archives is the oldest biomedical journal in Bosnia and Herzegovina and one of the oldest medical journals in Europe, established in 1947 as the official scientific and professional journal of the Association of Physicians of Bosnia and Herzegovina. To date, Medical Archives has published over 5000 articles. Today Medical Archives is an internationally recognized, peer-reviewed, indexed medical journal, visible in more than 30 international on-line databases. PMID:26980925
Improving NGDC Track-line Data Quality Control

NASA Astrophysics Data System (ADS)

Chandler, M. T.; Wessel, P.

2004-12-01

Ship-board gravity, magnetic and bathymetry data archived at the National Geophysical Data Center (NGDC) represent decades of seagoing research, containing over 4,500 cruises. Cruise data remain relevant despite the prominence of satellite altimetry-derived global grids because many geologic processes remain resolvable by oceanographic research alone. Due to the tremendous investment put forth by scientists and taxpayers to compile this vast archive and the significant errors found within it, additional quality assessment and corrections are warranted. These can best be accomplished by adding to existing quality control measures at NGDC. We are currently developing open source software to provide additional quality control. Along with NGDC's current sanity checking, new data at NGDC will also be subjected to an along-track "sniffer" which will detect and flag suspicious data for later graphical inspection using a visual editor. If new data pass these tests, they will undergo further scrutiny using a crossover error (COE) calculator which will compare new data values to existing values at points of intersection within the archive. Data passing these tests will be deemed "quality data" and suitable for permanent addition to the archive, while data that fail will be returned to the source institution for correction. Crossover errors will be stored and an online COE database will be available. The COE database will allow users to apply corrections to the NGDC track-line database to produce corrected data files. At no time will the archived data itself be modified. An attempt will also be made to reduce navigational errors for pre-GPS navigated cruises. Upon completion these programs will be used to explore and model systematic errors within the archive, generate correction tables for all cruises, and to quantify the error budget in marine geophysical observations.
Software will be released and these procedures will be implemented in cooperation with NGDC staff.
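The two checks named in this abstract, an along-track "sniffer" and a crossover error (COE) calculation, might look roughly like the sketch below. The abstract does not specify the actual tests, so the gradient threshold, the linear interpolation at the crossover point, and all function names here are assumptions chosen for illustration.

```python
def flag_suspicious(distances_km, values, max_gradient):
    """Return indices i where the change between samples i-1 and i is suspicious.

    A sample is flagged if the along-track distance does not increase, or if
    |delta value / delta distance| exceeds max_gradient (assumed criterion).
    """
    flagged = []
    for i in range(1, len(values)):
        dx = distances_km[i] - distances_km[i - 1]
        if dx <= 0:
            flagged.append(i)
            continue
        if abs(values[i] - values[i - 1]) / dx > max_gradient:
            flagged.append(i)
    return flagged


def interpolate_at(distances_km, values, x):
    """Linearly interpolate a track's values at along-track distance x."""
    for i in range(1, len(distances_km)):
        x0, x1 = distances_km[i - 1], distances_km[i]
        if x0 <= x <= x1:
            if x1 == x0:
                return values[i - 1]
            w = (x - x0) / (x1 - x0)
            return values[i - 1] + w * (values[i] - values[i - 1])
    raise ValueError("x lies outside the track")


def crossover_error(new_track, archived_track, x_new, x_archived):
    """COE at an intersection: new value minus archived value.

    Each track is a (distances_km, values) pair; x_new and x_archived are the
    along-track distances of the crossing point on the respective tracks.
    """
    return (interpolate_at(*new_track, x_new)
            - interpolate_at(*archived_track, x_archived))
```

In line with the abstract, such a COE would be stored separately and applied as a correction on request, leaving the archived values themselves unmodified.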
The Isothermal Dendritic Growth Experiment Archive

NASA Astrophysics Data System (ADS)

Koss, Matthew

2009-03-01

The growth of dendrites is governed by the interplay between two simple and familiar processes: the irreversible diffusion of energy, and the reversible work done in the formation of new surface area. To advance our understanding of these processes, NASA sponsored a project that flew on the Space Shuttle Columbia in 1994, 1996, and 1997 to record and analyze benchmark data in an apparent-microgravity "laboratory." In this laboratory, energy transfer by gravity driven convection was essentially eliminated and one could test independently, for the first time, both components of dendritic growth theory. The analysis of this data shows that although the diffusion of energy can be properly accounted for, the results from interfacial physics appear to be in disagreement and alternate models should receive increased attention. Unfortunately, currently and for the foreseeable future, there is no access or financial support to develop and conduct additional experiments of this type. However, the benchmark data of 35mm photonegatives, video, and all supporting instrument data are now available at the IDGE Archive at the College of the Holy Cross. This data may still have considerable relevance to researchers working specifically with dendritic growth, and more generally those working in the synthesis, growth & processing of materials, multiscale computational modeling, pattern formation, and systems far from equilibrium.
Mining Connected Data

NASA Astrophysics Data System (ADS)

Michel, L.; Motch, C.; Pineau, F. X.

2009-05-01

As members of the Survey Science Consortium of the XMM-Newton mission, the Strasbourg Observatory is in charge of the real-time cross-correlations of X-ray data with archival catalogs. We are also committed to providing specific tools to handle these cross-correlations and propose identifications at other wavelengths. To do so, we developed a database generator (Saada) that manages persistent links and supports heterogeneous input datasets. This system makes it easy to build an archive containing numerous and complex links between individual items [1]. It also offers a powerful query engine able to select sources on the basis of the properties (existence, distance, colours) of the X-ray-archival associations. We present such a database in operation for the 2XMMi catalogue. This system is flexible enough to provide both a public data interface and a servicing interface which could be used in the framework of the Simbol-X ground segment.
Automated search and retrieval of information from imaged documents using optical correlation techniques

NASA Astrophysics Data System (ADS)

Stalcup, Bruce W.; Dennis, Phillip W.; Dydyk, Robert B.

1999-10-01

Litton PRC and Litton Data Systems Division are developing a system, the Imaged Document Optical Correlation and Conversion System (IDOCCS), to provide a total solution to the problem of managing and retrieving textual and graphic information from imaged document archives. At the heart of IDOCCS, optical correlation technology provides the search and retrieval of information from imaged documents. IDOCCS can be used to rapidly search for key words or phrases within the imaged document archives. In addition, IDOCCS can automatically compare an input document with the archived database to determine if it is a duplicate, thereby reducing the overall resources required to maintain and access the document database. Embedded graphics on imaged pages can also be exploited; e.g., imaged documents containing an agency's seal or logo can be singled out. In this paper, we present a description of IDOCCS as well as preliminary performance results and theoretical projections.
The Cambridge Structural Database: a quarter of a million crystal structures and rising.

PubMed

Allen, Frank H

2002-06-01

The Cambridge Structural Database (CSD) now contains data for more than a quarter of a million small-molecule crystal structures. The information content of the CSD, together with methods for data acquisition, processing and validation, are summarized, with particular emphasis on the chemical information added by CSD editors. Nearly 80% of new structural data arrives electronically, mostly in CIF format, and the CCDC acts as the official crystal structure data depository for 51 major journals. The CCDC now maintains both a CIF archive (more than 73,000 CIFs dating from 1996), as well as the distributed binary CSD archive; the availability of data in both archives is discussed. A statistical survey of the CSD is also presented and projections concerning future accession rates indicate that the CSD will contain at least 500,000 crystal structures by the year 2010.
A complete database for the Einstein imaging proportional counter

NASA Technical Reports Server (NTRS)

Helfand, David J.

1991-01-01

A complete database for the Einstein Imaging Proportional Counter (IPC) was completed. The original data that makes up the archive is described as well as the structure of the database, the Op-Ed analysis system, the technical advances achieved relative to the analysis of (IPC) data, the data products produced, and some uses to which the database has been put by scientists outside Columbia University over the past year.

Meet Spinky: An Open-Source Spindle and K-Complex Detection Toolbox Validated on the Open-Access Montreal Archive of Sleep Studies (MASS).

PubMed

Lajnef, Tarek; O'Reilly, Christian; Combrisson, Etienne; Chaibi, Sahbi; Eichenlaub, Jean-Baptiste; Ruby, Perrine M; Aguera, Pierre-Emmanuel; Samet, Mounir; Kachouri, Abdennaceur; Frenette, Sonia; Carrier, Julie; Jerbi, Karim

2017-01-01

Sleep spindles and K-complexes are among the most prominent micro-events observed in electroencephalographic (EEG) recordings during sleep. These EEG microstructures are thought to be hallmarks of sleep-related cognitive processes. Although tedious and time-consuming, their identification and quantification are important for sleep studies in both healthy subjects and patients with sleep disorders. Therefore, procedures for automatic detection of spindles and K-complexes could provide valuable assistance to researchers and clinicians in the field. Recently, we proposed a framework for joint spindle and K-complex detection (Lajnef et al., 2015a) based on a Tunable Q-factor Wavelet Transform (TQWT; Selesnick, 2011a) and morphological component analysis (MCA). Using a wide range of performance metrics, the present article provides critical validation and benchmarking of the proposed approach by applying it to open-access EEG data from the Montreal Archive of Sleep Studies (MASS; O'Reilly et al., 2014). Importantly, the obtained scores were compared to alternative methods that were previously tested on the same database. With respect to spindle detection, our method achieved higher performance than most of the alternative methods. This was corroborated with statistical tests that took into account both sensitivity and precision (i.e., Matthews correlation coefficient (MCC), F1, Cohen κ). Our proposed method has been made available to the community via an open-source tool named Spinky (for spindle and K-complex detection). Thanks to a GUI implementation and access to Matlab and Python resources, Spinky is expected to contribute to an open-science approach that will enhance replicability and reliable comparisons of classifier performances for the detection of sleep EEG microstructure in both healthy and patient populations.
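The three scores named in this abstract (MCC, F1, Cohen κ) can all be derived from a binary confusion matrix. The sketch below is a minimal illustration using the textbook definitions; it is not taken from Spinky, and the function name `detection_scores` and the example counts are hypothetical.

```python
import math


def detection_scores(tp, fp, fn, tn):
    """Compute precision, recall, F1, MCC and Cohen's kappa from
    true/false positive and negative counts (hypothetical helper)."""
    n = tp + fp + fn + tn
    if n == 0:
        raise ValueError("confusion matrix is empty")

    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0  # also called sensitivity
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)

    # Matthews correlation coefficient; defined as 0 when a margin is empty.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0

    # Cohen's kappa: observed agreement corrected for chance agreement.
    observed = (tp + tn) / n
    expected = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / (n * n)
    kappa = (observed - expected) / (1 - expected) if expected != 1 else 0.0

    return {"precision": precision, "recall": recall,
            "f1": f1, "mcc": mcc, "kappa": kappa}


if __name__ == "__main__":
    # Made-up counts, for illustration only.
    print(detection_scores(tp=40, fp=10, fn=10, tn=40))
```

Unlike F1, both MCC and κ use the true-negative count, which is one reason detection studies often report more than one of these scores.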
High-Order Methods for Computational Physics

DTIC Science & Technology

1999-03-01

computation is running in parallel. Instead we use the concept of a voxel database (VDB) of geometric positions in the mesh [85...processor 0 Fig. 4.19. Connectivity and communications are established by building a voxel database (VDB) of positions. A VDB maps each position to a...studies such as the highly accurate stability computations considered help expand the database for this benchmark problem. The two-dimensional linear
Making geospatial data in ASF archive readily accessible

NASA Astrophysics Data System (ADS)

Gens, R.; Hogenson, K.; Wolf, V. G.; Drew, L.; Stern, T.; Stoner, M.; Shapran, M.

2015-12-01

The way geospatial data is searched, managed, processed and used has changed significantly in recent years. A data archive such as the one at the Alaska Satellite Facility (ASF), one of NASA's twelve interlinked Distributed Active Archive Centers (DAACs), used to be searched solely via user interfaces that were specifically developed for its particular archive and data sets. ASF then moved to using an application programming interface (API) that defined a set of routines, protocols, and tools for distributing the geospatial information stored in the database in real time. This provided more flexible access to the geospatial data. Yet, it was up to the user to develop the tools to get more tailored access to the data they needed. We present two new approaches for serving data to users. In response to the recent Nepal earthquake we developed a data feed for distributing ESA's Sentinel data. Users can subscribe to the data feed and are provided with the relevant metadata the moment a new data set is available for download. The second approach was an Open Geospatial Consortium (OGC) web feature service (WFS). The WFS hosts the metadata along with a direct link from which the data can be downloaded. It uses the open-source GeoServer software (Youngblood and Iacovella, 2013) and provides an interface to include the geospatial information in the archive directly into the user's geographic information system (GIS) as an additional data layer. Both services are run on top of a geospatial PostGIS database, an open-source geographic extension for the PostgreSQL object-relational database (Marquez, 2015). Marquez, A., 2015. PostGIS essentials. Packt Publishing, 198 p. Youngblood, B. and Iacovella, S., 2013. GeoServer Beginner's Guide, Packt Publishing, 350 p.
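A client of an OGC web feature service such as the one described above typically retrieves metadata with a key-value `GetFeature` request. The sketch below only builds such a request URL and does not contact any server. The endpoint `https://example.org/geoserver/wfs` and the layer name `demo:granules` are placeholders rather than ASF's actual service, and the `application/json` output format is assumed to be enabled on the server.

```python
from urllib.parse import urlencode, urlsplit, parse_qs


def build_getfeature_url(endpoint, type_name, bbox=None, count=10):
    """Build a WFS 2.0.0 GetFeature URL using key-value-pair encoding.

    endpoint and type_name are supplied by the caller; the values used
    in the example below are hypothetical.
    """
    params = {
        "service": "WFS",
        "version": "2.0.0",
        "request": "GetFeature",
        "typeNames": type_name,      # WFS 1.x uses "typeName" instead
        "count": str(count),         # WFS 1.x uses "maxFeatures" instead
        "outputFormat": "application/json",  # assumed to be supported
    }
    if bbox is not None:
        # Axis order depends on the coordinate reference system and the
        # WFS version, so check the server's capabilities before relying on it.
        params["bbox"] = ",".join(str(v) for v in bbox)
    return f"{endpoint}?{urlencode(params)}"


if __name__ == "__main__":
    url = build_getfeature_url(
        "https://example.org/geoserver/wfs",   # placeholder endpoint
        "demo:granules",                       # placeholder layer name
        bbox=(-150.0, 60.0, -140.0, 65.0),
        count=5,
    )
    print(url)
    print(parse_qs(urlsplit(url).query))
```

A GIS client performs the same kind of request behind the scenes when a WFS layer is added, which is what lets the archive metadata appear as an additional data layer.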
Project MICAS: a multivendor open-system incremental approach to implementing an integrated enterprise-wide PACS: works in progress

NASA Astrophysics Data System (ADS)

Smith, Edward M.; Wright, Jeffrey; Fontaine, Marc T.; Robinson, Arvin E.

1998-07-01

The Medical Information, Communication and Archive System (MICAS) is a multi-vendor incremental approach to PACS. MICAS is a multi-modality integrated image management system that incorporates the radiology information system (RIS) and radiology image database (RID) with future 'hooks' to other hospital databases. Even though this approach to PACS is more risky than a single-vendor turn-key approach, it offers significant advantages. The vendors involved in the initial phase of MICAS are IDX Corp., ImageLabs, Inc. and Digital Equipment Corp (DEC). The network architecture operates at 100 MBits per sec except between the modalities and the stackable intelligent switch which is used to segment MICAS by modality. Each modality segment contains the acquisition engine for the modality, a temporary archive and one or more diagnostic workstations. All archived studies are available at all workstations, but there is no permanent archive at this time. At present, the RIS vendor is responsible for study acquisition and workflow as well as maintenance of the temporary archive. Management of study acquisition, workflow and the permanent archive will become the responsibility of the archive vendor when the archive is installed in the second quarter of 1998. The modalities currently interfaced to MICAS are MRI, CT and a Howtek film digitizer with Nuclear Medicine and computed radiography (CR) to be added when the permanent archive is installed. There are six dual-monitor diagnostic workstations which use ImageLabs Shared Vision viewer software located in MRI, CT, Nuclear Medicine, musculoskeletal reading areas and two in Radiology's main reading area. One of the major lessons learned to date is that the permanent archive should have been part of the initial MICAS installation and the archive vendor should have been responsible for image acquisition rather than the RIS vendor. 
Currently an archive vendor is being selected who will be responsible for the management of the archive plus the HIS/RIS interface, image acquisition, modality work list manager and interfacing to the current DICOM viewer software. The next phase of MICAS will include interfacing ultrasound, locating servers outside of the Radiology LAN to support the distribution of images and reports to the clinical floors and physician offices both within and outside of the University of Rochester Medical Center (URMC) campus and the teaching archive.
Architecture and evolution of Goddard Space Flight Center Distributed Active Archive Center

NASA Technical Reports Server (NTRS)

Bedet, Jean-Jacques; Bodden, Lee; Rosen, Wayne; Sherman, Mark; Pease, Phil

1994-01-01

The Goddard Space Flight Center (GSFC) Distributed Active Archive Center (DAAC) has been developed to enhance Earth Science research by improved access to remote sensor earth science data. Building and operating an archive, even one of a moderate size (a few Terabytes), is a challenging task. One of the critical components of this system is Unitree, the Hierarchical File Storage Management System. Unitree, selected two years ago as the best available solution, requires constant system administrative support. It is not always suitable as an archive and distribution data center, and has moderate performance. The Data Archive and Distribution System (DADS) software developed to monitor, manage, and automate the ingestion, archive, and distribution functions turned out to be more challenging than anticipated. Having the software and tools is not sufficient to succeed. Human interaction within the system must be fully understood to improve efficiency and ensure that the right tools are developed. One of the lessons learned is that the operability, reliability, and performance aspects should be thoroughly addressed in the initial design. However, the GSFC DAAC has demonstrated that it is capable of distributing over 40 GB per day. A backup system to archive a second copy of all data ingested is under development. This backup system will be used not only for disaster recovery but will also replace the main archive when it is unavailable during maintenance or hardware replacement. The GSFC DAAC has put a strong emphasis on quality at all levels of its organization. A Quality team has also been formed to identify quality issues and to propose improvements. The DAAC has conducted numerous tests to benchmark the performance of the system. These tests proved to be extremely useful in identifying bottlenecks and deficiencies in operational procedures.
An Image Archive With The ACR/NEMA Message Formats

NASA Astrophysics Data System (ADS)

Seshadri, Sridhar B.; Khalsa, Satjeet; Arenson, Ronald L.; Brikman, Inna; Davey, Michael J.

1988-06-01

An image archive has been designed to manage and store radiologic images received from within the main Hospital and from a suburban orthopedic clinic. Images are stored on both magnetic and optical media. Prior comparison examinations are combined with the current examination to generate a 'viewing folder' that is sent to the display station for primary diagnosis. An 'archive-manager' controls the database management, periodic optical disk backup and 'viewing-folder' generation. Images are converted into the ACR/NEMA message format before being written to the optical disk. The software design of the 'archive-manager' and its associated modules is presented. Enhancements to the system are discussed.
PLEXdb: Gene expression resources for plants and plant pathogens

USDA-ARS's Scientific Manuscript database

PLEXdb (Plant Expression Database), in partnership with community databases, supports comparisons of gene expression across multiple plant and pathogen species, promoting individuals and/or consortia to upload genome-scale data sets to contrast them to previously archived data. These analyses facili...
The Rules of the Game: Properties of a Database of Expository Language Samples

ERIC Educational Resources Information Center

Heilmann, John; Malone, Thomas O.

2014-01-01

Purpose: The authors created a database of expository oral language samples with the aims of describing the nature of students' expository discourse and providing benchmark data for typically developing preteen and teenage students. Method: Using a favorite game or sport protocol, language samples were collected from 235 typically developing…
Selecting a Relational Database Management System for Library Automation Systems.

ERIC Educational Resources Information Center

Shekhel, Alex; O'Brien, Mike

1989-01-01

Describes the evaluation of four relational database management systems (RDBMSs) (Informix Turbo, Oracle 6.0 TPS, Unify 2000 and Relational Technology's Ingres 5.0) to determine which is best suited for library automation. The evaluation criteria used to develop a benchmark specifically designed to test RDBMSs for libraries are discussed. (CLB)
Visual Attention Modeling for Stereoscopic Video: A Benchmark and Computational Model.

PubMed

Fang, Yuming; Zhang, Chi; Li, Jing; Lei, Jianjun; Perreira Da Silva, Matthieu; Le Callet, Patrick

2017-10-01

In this paper, we investigate the visual attention modeling for stereoscopic video from the following two aspects. First, we build one large-scale eye tracking database as the benchmark of visual attention modeling for stereoscopic video. The database includes 47 video sequences and their corresponding eye fixation data. Second, we propose a novel computational model of visual attention for stereoscopic video based on Gestalt theory. In the proposed model, we extract the low-level features, including luminance, color, texture, and depth, from discrete cosine transform coefficients, which are used to calculate feature contrast for the spatial saliency computation. The temporal saliency is calculated by the motion contrast from the planar and depth motion features in the stereoscopic video sequences. The final saliency is estimated by fusing the spatial and temporal saliency with uncertainty weighting, which is estimated by the laws of proximity, continuity, and common fate in Gestalt theory. Experimental results show that the proposed method outperforms the state-of-the-art stereoscopic video saliency detection models on our built large-scale eye tracking database and one other database (DML-ITRACK-3D).
The EXOSAT database and archive

NASA Technical Reports Server (NTRS)

Reynolds, A. P.; Parmar, A. N.

1992-01-01

The EXOSAT database provides on-line access to the results and data products (spectra, images, and lightcurves) from the EXOSAT mission as well as access to data and logs from a number of other missions (such as EINSTEIN, COS-B, ROSAT, and IRAS). In addition, a number of familiar optical, infrared, and x ray catalogs, including the Hubble Space Telescope (HST) guide star catalog are available. The complete database is located at the EXOSAT observatory at ESTEC in the Netherlands and is accessible remotely via a captive account. The database management system was specifically developed to efficiently access the database and to allow the user to perform statistical studies on large samples of astronomical objects as well as to retrieve scientific and bibliographic information on single sources. The system was designed to be mission independent and includes timing, image processing, and spectral analysis packages as well as software to allow the easy transfer of analysis results and products to the user's own institute. The archive at ESTEC comprises a subset of the EXOSAT observations, stored on magnetic tape. Observations of particular interest were copied in compressed format to an optical jukebox, allowing users to retrieve and analyze selected raw data entirely from their terminals. Such analysis may be necessary if the user's needs are not accommodated by the products contained in the database (in terms of time resolution, spectral range, and the finesse of the background subtraction, for instance). Long-term archiving of the full final observation data is taking place at ESRIN in Italy as part of the ESIS program, again using optical media, and ESRIN have now assumed responsibility for distributing the data to the community. Tests showed that raw observational data (typically several tens of megabytes for a single target) can be transferred via the existing networks in reasonable time.
A Geospatial Database that Supports Derivation of Climatological Features of Severe Weather

NASA Astrophysics Data System (ADS)

Phillips, M.; Ansari, S.; Del Greco, S.

2007-12-01

The Severe Weather Data Inventory (SWDI) at NOAA's National Climatic Data Center (NCDC) provides user access to archives of several datasets critical to the detection and evaluation of severe weather. These datasets include archives of:
· NEXRAD Level-III point features describing general storm structure, hail, mesocyclone and tornado signatures
· National Weather Service Storm Events Database
· National Weather Service Local Storm Reports collected from storm spotters
· National Weather Service Warnings
· Lightning strikes from Vaisala's National Lightning Detection Network (NLDN)
SWDI archives all of these datasets in a spatial database that allows for convenient searching and subsetting. These data are accessible via the NCDC web site, Web Feature Services (WFS) or automated web services. The results of interactive web page queries may be saved in a variety of formats, including plain text, XML, Google Earth's KMZ, standards-based NetCDF and Shapefile. NCDC's Storm Risk Assessment Project (SRAP) uses data from the SWDI database to derive gridded climatology products that show the spatial distributions of the frequency of various events. SRAP also can relate SWDI events to other spatial data such as roads, population, watersheds, and other geographic, sociological, or economic data to derive products that are useful in municipal planning, emergency management, the insurance industry, and other areas where there is a need to quantify and qualify how severe weather patterns affect people and property.
RADIUS: Research Archive on Disability in the United States. [CD-ROMs].

ERIC Educational Resources Information Center

Sociometrics Corp., Los Altos, CA.

This Research Archive on Disability in the United States (RADIUS), a database on CD-ROM, contains 19 data sets on the prevalence, incidence, correlates, and consequences of disability in the United States. The 19 data sets are: (1) 1991 National Maternal and Infant Health Follow-Up Survey; (2) National Pediatric Trauma Registry, 1988-1994; (3)…
DOE Research and Development Accomplishments Nobel Chemists Associated with the DOE and Predecessors

Science.gov Websites
Kellogg Library and Archive Retrieval System (KLARS) Document Capture Manual. Draft Version.

ERIC Educational Resources Information Center

Hugo, Jane

This manual is designed to supply background information for Kellogg Library and Archive Retrieval System (KLARS) processors and others who might work with the system, outline detailed policies and procedures for processors who prepare and enter data into the adult education database on KLARS, and inform general readers about the system. KLARS is…
Combustion Science to Reduce PM Emissions for Military Platforms

DTIC Science & Technology

2012-01-01

355 7.0 References 356 Appendix: List of Archival Publications and Conference Papers 376 vi List...carbonaddition HITRAN Database of infra-red spectra HP High Pressure HW Harris and Weiner ICCD Intensified charge coupled device ID internal diameter IR ...archival publication based on this work received a distinguished outstanding paper award at the 32nd International Combustion Symposium
MIMIC II: a massive temporal ICU patient database to support research in intelligent patient monitoring

NASA Technical Reports Server (NTRS)

Saeed, M.; Lieu, C.; Raber, G.; Mark, R. G.

2002-01-01

Development and evaluation of Intensive Care Unit (ICU) decision-support systems would be greatly facilitated by the availability of a large-scale ICU patient database. Following our previous efforts with the MIMIC (Multi-parameter Intelligent Monitoring for Intensive Care) Database, we have leveraged advances in networking and storage technologies to develop a far more massive temporal database, MIMIC II. MIMIC II is an ongoing effort: data is continuously and prospectively archived from all ICU patients in our hospital. MIMIC II now consists of over 800 ICU patient records including over 120 gigabytes of data and is growing. A customized archiving system was used to store continuously up to four waveforms and 30 different parameters from ICU patient monitors. An integrated user-friendly relational database was developed for browsing of patients' clinical information (lab results, fluid balance, medications, nurses' progress notes). Based upon its unprecedented size and scope, MIMIC II will prove to be an important resource for intelligent patient monitoring research, and will support efforts in medical data mining and knowledge-discovery.
The ERESE Project: Interfacing with the ERDA Digital Archive and ERR Reference Database in EarthRef.org

NASA Astrophysics Data System (ADS)

Koppers, A. A.; Staudigel, H.; Mills, H.; Keller, M.; Wallace, A.; Bachman, N.; Helly, J.; Helly, M.; Miller, S. P.; Massell Symons, C.

2004-12-01

To bridge the gap between Earth science teachers, librarians, scientists and data archive managers, we have started the ERESE project that will create, archive and make available "Enduring Resources in Earth Science Education" through information technology (IT) portals. In the first phase of this National Science Digital Library (NSDL) project, we are focusing on the development of these ERESE resources for middle and high school teachers to be used in lesson plans with "plate tectonics" and "magnetics" as their main theme. In this presentation, we will show how these new ERESE resources are being generated, how they can be uploaded via online web wizards, how they are archived, how we make them available via the EarthRef.org Digital Archive (ERDA) and Reference Database (ERR), and how they relate to the SIOExplorer database containing data objects for all seagoing cruises carried out by the Scripps Institution of Oceanography. The EarthRef.org web resource uses the vision of a "general description" of the Earth as a geological system to provide an IT infrastructure for the Earth sciences. This emphasizes the marriage of the "scientific process" (and its results) with an educational cyber-infrastructure for teaching Earth sciences, on any level, from middle school to college and graduate levels. Eight different databases reside under EarthRef.org from which ERDA holds any digital object that has been uploaded by other scientists, teachers and students for free, while the ERR holds more than 80,000 publications. For more than 1,500 of these publications, this latter database makes available for downloading JPG/PDF images of the abstracts, data tables, methods and appendices, together with their digitized contents in Microsoft Word and Excel format. Both holdings are being used to store the ERESE objects that are being generated by a group of undergraduate students majoring in the Environmental Systems (ESYS) program at UCSD with an emphasis on the Earth Sciences.
These students perform library and internet research in order to design and generate these "Enduring Resources in Earth Science Education" that they test by closely interacting with the research faculty at the Scripps Institution of Oceanography. Typical ERESE resources can be diagrams, model cartoons, maps, data sets for analyses, and glossary items and essays to explain certain Earth Science concepts and are ready to be used in the classroom.
Database resources of the National Center for Biotechnology Information

PubMed Central

Sayers, Eric W.; Barrett, Tanya; Benson, Dennis A.; Bolton, Evan; Bryant, Stephen H.; Canese, Kathi; Chetvernin, Vyacheslav; Church, Deanna M.; DiCuccio, Michael; Federhen, Scott; Feolo, Michael; Fingerman, Ian M.; Geer, Lewis Y.; Helmberg, Wolfgang; Kapustin, Yuri; Krasnov, Sergey; Landsman, David; Lipman, David J.; Lu, Zhiyong; Madden, Thomas L.; Madej, Tom; Maglott, Donna R.; Marchler-Bauer, Aron; Miller, Vadim; Karsch-Mizrachi, Ilene; Ostell, James; Panchenko, Anna; Phan, Lon; Pruitt, Kim D.; Schuler, Gregory D.; Sequeira, Edwin; Sherry, Stephen T.; Shumway, Martin; Sirotkin, Karl; Slotta, Douglas; Souvorov, Alexandre; Starchenko, Grigory; Tatusova, Tatiana A.; Wagner, Lukas; Wang, Yanli; Wilbur, W. John; Yaschenko, Eugene; Ye, Jian

2012-01-01

In addition to maintaining the GenBank® nucleic acid sequence database, the National Center for Biotechnology Information (NCBI) provides analysis and retrieval resources for the data in GenBank and other biological data made available through the NCBI Website. NCBI resources include Entrez, the Entrez Programming Utilities, MyNCBI, PubMed, PubMed Central (PMC), Gene, the NCBI Taxonomy Browser, BLAST, BLAST Link (BLink), Primer-BLAST, COBALT, Splign, RefSeq, UniGene, HomoloGene, ProtEST, dbMHC, dbSNP, dbVar, Epigenomics, Genome and related tools, the Map Viewer, Model Maker, Evidence Viewer, Trace Archive, Sequence Read Archive, BioProject, BioSample, Retroviral Genotyping Tools, HIV-1/Human Protein Interaction Database, Gene Expression Omnibus (GEO), Probe, Online Mendelian Inheritance in Animals (OMIA), the Molecular Modeling Database (MMDB), the Conserved Domain Database (CDD), the Conserved Domain Architecture Retrieval Tool (CDART), Biosystems, Protein Clusters and the PubChem suite of small molecule databases. Augmenting many of the Web applications are custom implementations of the BLAST program optimized to search specialized data sets. All of these resources can be accessed through the NCBI home page at www.ncbi.nlm.nih.gov. PMID:22140104
Database resources of the National Center for Biotechnology Information

PubMed Central

2013-01-01

In addition to maintaining the GenBank® nucleic acid sequence database, the National Center for Biotechnology Information (NCBI, http://www.ncbi.nlm.nih.gov) provides analysis and retrieval resources for the data in GenBank and other biological data made available through the NCBI web site. NCBI resources include Entrez, the Entrez Programming Utilities, MyNCBI, PubMed, PubMed Central, Gene, the NCBI Taxonomy Browser, BLAST, BLAST Link (BLink), Primer-BLAST, COBALT, Splign, RefSeq, UniGene, HomoloGene, ProtEST, dbMHC, dbSNP, dbVar, Epigenomics, the Genetic Testing Registry, Genome and related tools, the Map Viewer, Model Maker, Evidence Viewer, Trace Archive, Sequence Read Archive, BioProject, BioSample, Retroviral Genotyping Tools, HIV-1/Human Protein Interaction Database, Gene Expression Omnibus, Probe, Online Mendelian Inheritance in Animals, the Molecular Modeling Database, the Conserved Domain Database, the Conserved Domain Architecture Retrieval Tool, Biosystems, Protein Clusters and the PubChem suite of small molecule databases. Augmenting many of the web applications are custom implementations of the BLAST program optimized to search specialized data sets. All of these resources can be accessed through the NCBI home page. PMID:23193264

Regression Tree-Based Methodology for Customizing Building Energy Benchmarks to Individual Commercial Buildings

NASA Astrophysics Data System (ADS)

Kaskhedikar, Apoorva Prakash

According to the U.S. Energy Information Administration, commercial buildings represent about 40% of the United States' energy consumption, of which office buildings consume a major portion. Gauging the extent to which an individual building consumes energy in excess of its peers is the first step in initiating energy efficiency improvement. Energy benchmarking offers initial building energy performance assessment without rigorous evaluation. Energy benchmarking tools based on the Commercial Buildings Energy Consumption Survey (CBECS) database are investigated in this thesis. This study proposes a new benchmarking methodology based on decision trees, where a relationship between the energy use intensities (EUI) and building parameters (continuous and categorical) is developed for different building types. This methodology was applied to medium office and school building types contained in the CBECS database. The Random Forest technique was used to find the most influential parameters that impact building energy use intensities. Subsequently, significant correlations were identified between EUIs and CBECS variables. Other than floor area, some of the important variables were number of workers, location, number of PCs and main cooling equipment. The coefficient of variation was used to evaluate the effectiveness of the new model. The customization technique proposed in this thesis was compared with another benchmarking model that is widely used by building owners and designers, namely ENERGY STAR's Portfolio Manager. This tool relies on standard linear regression methods, which are only able to handle continuous variables. The proposed model uses a data mining technique and was found to perform slightly better than the Portfolio Manager.
The broader impact of the proposed benchmarking methodology is that it allows for identifying important categorical variables and then incorporating them in a local, as against a global, model framework for EUI pertinent to the building type. The ability to identify and rank the important variables is of great importance in practical implementation of the benchmarking tools which rely on query-based building and HVAC variable filters specified by the user.
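The core step of a regression tree, as used for this kind of benchmarking, is choosing the split that makes the energy use intensities within each peer group as homogeneous as possible. The sketch below shows one such split on a single continuous variable, followed by a coefficient of variation for a resulting group. It is an illustration under stated assumptions, not the thesis's implementation: the helper names are hypothetical, the numbers are made up rather than drawn from CBECS, and reading the coefficient of variation as standard deviation divided by mean within a group is an assumption about how the metric was applied.

```python
from statistics import mean, pstdev


def sse(values):
    """Sum of squared deviations from the mean (0 for an empty list)."""
    if not values:
        return 0.0
    m = mean(values)
    return sum((v - m) ** 2 for v in values)


def best_split(xs, ys):
    """Return (threshold, total_sse) for the binary split x <= threshold
    that minimises the summed within-group squared error of ys.
    Returns (None, sse(ys)) if no split improves on the unsplit data."""
    pairs = sorted(zip(xs, ys))
    best = (None, sse(list(ys)))
    for i in range(1, len(pairs)):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # cannot split between identical feature values
        threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
        left = [y for _, y in pairs[:i]]
        right = [y for _, y in pairs[i:]]
        total = sse(left) + sse(right)
        if total < best[1]:
            best = (threshold, total)
    return best


def coefficient_of_variation(values):
    """Population standard deviation divided by the mean."""
    return pstdev(values) / mean(values)


if __name__ == "__main__":
    # Made-up illustrative data: number of workers and EUI per building.
    workers = [5, 10, 15, 50, 60, 70]
    eui = [40, 42, 44, 80, 82, 84]
    threshold, total = best_split(workers, eui)
    print("split at workers <=", threshold, "with SSE", total)
    small = [e for w, e in zip(workers, eui) if w <= threshold]
    print("CV before split:", coefficient_of_variation(eui))
    print("CV of small-building group:", coefficient_of_variation(small))
```

A full tree repeats this search over every candidate variable, including categorical ones, and recurses into each group; a building is then benchmarked against the peers that fall into the same leaf.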
The Biomolecular Interaction Network Database and related tools 2005 update

PubMed Central

Alfarano, C.; Andrade, C. E.; Anthony, K.; Bahroos, N.; Bajec, M.; Bantoft, K.; Betel, D.; Bobechko, B.; Boutilier, K.; Burgess, E.; Buzadzija, K.; Cavero, R.; D'Abreo, C.; Donaldson, I.; Dorairajoo, D.; Dumontier, M. J.; Dumontier, M. R.; Earles, V.; Farrall, R.; Feldman, H.; Garderman, E.; Gong, Y.; Gonzaga, R.; Grytsan, V.; Gryz, E.; Gu, V.; Haldorsen, E.; Halupa, A.; Haw, R.; Hrvojic, A.; Hurrell, L.; Isserlin, R.; Jack, F.; Juma, F.; Khan, A.; Kon, T.; Konopinsky, S.; Le, V.; Lee, E.; Ling, S.; Magidin, M.; Moniakis, J.; Montojo, J.; Moore, S.; Muskat, B.; Ng, I.; Paraiso, J. P.; Parker, B.; Pintilie, G.; Pirone, R.; Salama, J. J.; Sgro, S.; Shan, T.; Shu, Y.; Siew, J.; Skinner, D.; Snyder, K.; Stasiuk, R.; Strumpf, D.; Tuekam, B.; Tao, S.; Wang, Z.; White, M.; Willis, R.; Wolting, C.; Wong, S.; Wrong, A.; Xin, C.; Yao, R.; Yates, B.; Zhang, S.; Zheng, K.; Pawson, T.; Ouellette, B. F. F.; Hogue, C. W. V.

2005-01-01

The Biomolecular Interaction Network Database (BIND) (http://bind.ca) archives biomolecular interaction, reaction, complex and pathway information. Our aim is to curate the details about molecular interactions that arise from published experimental research and to provide this information, as well as tools to enable data analysis, freely to researchers worldwide. BIND data are curated into a comprehensive machine-readable archive of computable information and provide users with methods to discover interactions and molecular mechanisms. BIND has worked to develop new methods for visualization that amplify the underlying annotation of genes and proteins to facilitate the study of molecular interaction networks. BIND has maintained an open database policy since its inception in 1999. Data growth has proceeded at a tremendous rate, approaching over 100 000 records. New services provided include a new BIND Query and Submission interface, a Standard Object Access Protocol service and the Small Molecule Interaction Database (http://smid.blueprint.org) that allows users to determine probable small molecule binding sites of new sequences and examine conserved binding residues. PMID:15608229
Benchmarking protein classification algorithms via supervised cross-validation.

PubMed

Kertész-Farkas, Attila; Dhir, Somdutta; Sonego, Paolo; Pacurar, Mircea; Netoteia, Sergiu; Nijveen, Harm; Kuzniar, Arnold; Leunissen, Jack A M; Kocsor, András; Pongor, Sándor

2008-04-24

Development and testing of protein classification algorithms are hampered by the fact that the protein universe is characterized by groups vastly different in the number of members, in average protein size, similarity within group, etc. Datasets based on traditional cross-validation (k-fold, leave-one-out, etc.) may not give reliable estimates on how an algorithm will generalize to novel, distantly related subtypes of the known protein classes. Supervised cross-validation, i.e., selection of test and train sets according to the known subtypes within a database has been successfully used earlier in conjunction with the SCOP database. Our goal was to extend this principle to other databases and to design standardized benchmark datasets for protein classification. Hierarchical classification trees of protein categories provide a simple and general framework for designing supervised cross-validation strategies for protein classification. Benchmark datasets can be designed at various levels of the concept hierarchy using a simple graph-theoretic distance. A combination of supervised and random sampling was selected to construct reduced size model datasets, suitable for algorithm comparison. Over 3000 new classification tasks were added to our recently established protein classification benchmark collection that currently includes protein sequence (including protein domains and entire proteins), protein structure and reading frame DNA sequence data. We carried out an extensive evaluation based on various machine-learning algorithms such as nearest neighbor, support vector machines, artificial neural networks, random forests and logistic regression, used in conjunction with comparison algorithms, BLAST, Smith-Waterman, Needleman-Wunsch, as well as 3D comparison methods DALI and PRIDE. The resulting datasets provide lower, and in our opinion more realistic estimates of the classifier performance than do random cross-validation schemes. 
A combination of supervised and random sampling was used to construct model datasets, suitable for algorithm comparison.
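The idea of supervised cross-validation can be sketched as a leave-one-subtype-out split: every member of one known subtype is held out for testing while the remaining subtypes are used for training. This is a simplified illustration only. The function name and the flat (item, subtype) input are assumptions, and the hierarchy-based distance and sampling steps described in the abstract are not reproduced.

```python
from collections import defaultdict


def supervised_splits(items):
    """Yield (held_out_subtype, train_ids, test_ids) for each known subtype.

    items is a list of (item_id, subtype) pairs. Unlike random k-fold,
    all members of the held-out subtype go to the test set together, so
    the classifier is always tested on a subtype it has never seen.
    """
    by_subtype = defaultdict(list)
    for item_id, subtype in items:
        by_subtype[subtype].append(item_id)

    for held_out in sorted(by_subtype):
        test_ids = list(by_subtype[held_out])
        train_ids = [
            item_id
            for subtype in sorted(by_subtype)
            if subtype != held_out
            for item_id in by_subtype[subtype]
        ]
        yield held_out, train_ids, test_ids


if __name__ == "__main__":
    # Hypothetical protein identifiers and subtype labels.
    proteins = [("p1", "famA"), ("p2", "famA"), ("p3", "famB"), ("p4", "famC")]
    for held_out, train, test in supervised_splits(proteins):
        print(held_out, "train:", train, "test:", test)
```

Because no member of the test subtype appears in training, a split of this kind would be expected to give lower performance estimates than random cross-validation, in line with the abstract's conclusion.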
The archive of the History of Psychology at the University of Rome, Sapienza.

PubMed

Bartolucci, Chiara; Fox Lee, Shayna

2016-02-01

The History of Psychology Archive at the University of Rome, Sapienza was founded in 2008 in the Department of Dynamic and Clinical Psychology. The archive aspires to become an indispensable tool to (a) understand the currents, schools, and research traditions that have marked the path of Italian psychology, (b) focus on issues of general and applied psychology developed in each university, (c) identify experimental and clinical-differential methodologies specific to each lab, (d) reconstruct the genesis and consolidation of psychology institutions and, ultimately, (e) write a "story," set according to the most recent historiographical criteria. The archive is designed according to scholarship on the history of Italian psychology from the past two decades. The online archive is divided into five sections for ease of access. The Sapienza archive is a work in progress and it has plans for expansion. (PsycINFO Database Record (c) 2016 APA, all rights reserved).
New super-computing facility in RIKEN

DOE Office of Scientific and Technical Information (OSTI.GOV)

Ohta, Shigemi

1994-12-31

A new supercomputer, Fujitsu VPP500/28, was installed in the Institute of Physical and Chemical Research (RIKEN) at the end of March, 1994. It consists of 28 processing elements (PEs) connected by a high-speed crossbar switch. The switch is a combination of GaAs and ECL circuitry with peak bandwidth of 800 Mbyte per second. Each PE consists of a GaAs/ECL vector processor with 1.6 Gflops peak speed and 256 Mbyte SRAM local memory. In addition, there are 8 GByte of DRAM space, two 100 Gbyte RAID disks and a 10 TByte archive based on the SONY File Bank system. The author ran three major benchmarks on this machine: modified LINPACK, lattice QCD and FFT. In the modified LINPACK benchmark, a sustained speed of about 28 Gflops is achieved, by removing the restriction on the size of the matrices. In the lattice QCD benchmark, a sustained speed of about 30 Gflops is achieved for inverting the staggered fermion propagation matrix on a 32^4 lattice. In the FFT benchmark, real data of 32, 128, 512, and 2048 MByte are Fourier-transformed. The sustained speed for each is respectively 21, 21, 20, and 19 Gflops. The numbers are obtained after only a few weeks of coding efforts and can be improved further.
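As a rough sanity check on the figures quoted above, the aggregate peak speed can be taken as the number of processing elements times the per-element peak, and each sustained result can be expressed as a fraction of that peak. This is a back-of-the-envelope sketch. It assumes the aggregate peak is a simple product, which the abstract does not state explicitly.

```python
# Figures quoted in the abstract.
PROCESSING_ELEMENTS = 28
PEAK_GFLOPS_PER_PE = 1.6

# Sustained speeds quoted in the abstract, in Gflops.
SUSTAINED_GFLOPS = {
    "modified LINPACK": 28,
    "lattice QCD": 30,
    "FFT, 32 MByte": 21,
    "FFT, 2048 MByte": 19,
}


def aggregate_peak_gflops():
    """Assumed aggregate peak: number of PEs times per-PE peak speed."""
    return PROCESSING_ELEMENTS * PEAK_GFLOPS_PER_PE


def fraction_of_peak(sustained_gflops):
    """Sustained speed as a fraction of the assumed aggregate peak."""
    return sustained_gflops / aggregate_peak_gflops()


if __name__ == "__main__":
    print(f"aggregate peak: {aggregate_peak_gflops():.1f} Gflops")
    for name, sustained in SUSTAINED_GFLOPS.items():
        print(f"{name}: {fraction_of_peak(sustained):.1%} of peak")
```

Under this assumption the aggregate peak is about 44.8 Gflops, so the modified LINPACK result corresponds to roughly 62.5% of peak.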
Digital Archiving: Where the Past Lives Again

NASA Astrophysics Data System (ADS)

Paxson, K. B.

2012-06-01

The process of digital archiving for variable star data by manual entry with an Excel spreadsheet is described. Excel-based tools including a Step Magnitude Calculator and a Julian Date Calculator for variable star observations where magnitudes and Julian dates have not been reduced are presented. Variable star data in the literature and the AAVSO International Database prior to 1911 are presented and reviewed, with recent archiving work being highlighted. Digitization using optical character recognition software conversion is also demonstrated, with editing and formatting suggestions for the OCR-converted text.
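The Julian Date reduction mentioned above can be illustrated with a short sketch. The paper's tools are Excel-based; this Python analogue is an assumption for illustration only and is not the author's calculator. It relies on the fact that Python's proleptic Gregorian day ordinal differs from the Julian Date at 0h UT by a constant offset of 1721424.5, and it takes the input as a Gregorian calendar date in UT.

```python
from datetime import datetime

# Offset between Python's proleptic Gregorian ordinal (day 1 = 0001-01-01)
# and the Julian Date at 0h UT of the same calendar day.
ORDINAL_TO_JD = 1721424.5


def julian_date(year, month, day, hour=0, minute=0, second=0):
    """Julian Date for a Gregorian calendar date and time given in UT."""
    dt = datetime(year, month, day, hour, minute, second)
    day_fraction = (dt.hour * 3600 + dt.minute * 60 + dt.second) / 86400
    return dt.toordinal() + ORDINAL_TO_JD + day_fraction


if __name__ == "__main__":
    # Hypothetical observation time, chosen only for illustration.
    print(julian_date(1905, 3, 14, 21, 30))
```

Observations recorded in local time would need to be converted to UT before calling the function, a step this sketch leaves out.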
U.S. Geological Survey archived data recovery in Texas, 2008-11

USGS Publications Warehouse

Wehmeyer, Loren L.; Reece, Brian D.

2011-01-01

The 2008–11 data rescue and recovery efforts by the U.S. Geological Survey (USGS) Texas Water Science Center resulted in an efficient workflow process, database, and Web user interface for scientists and citizens to access archived environmental information with practical applications. Much of this information is unique and has never been readily available to the public. The methods developed and lessons learned during this effort are now being applied to facilitate recovering archived information requested by USGS scientists, cooperators, and the general public.
EarthRef.org: Exploring aspects of a Cyber Infrastructure in Earth Science and Education

NASA Astrophysics Data System (ADS)

Staudigel, H.; Koppers, A.; Tauxe, L.; Constable, C.; Helly, J.

2004-12-01

EarthRef.org is the common host and (co-) developer of a range of earth science databases and IT resources providing a test bed for a Cyberinfrastructure in Earth Science and Education (CIESE). EarthRef.org data base efforts include in particular the Geochemical Earth Reference Model (GERM), the Magnetics Information Consortium (MagIC), the Educational Resources for Earth Science Education (ERESE) project, the Seamount Catalog, the Mid-Ocean Ridge Catalog, the Radio-Isotope Geochronology (RiG) initiative for CHRONOS, and the Microbial Observatory for Fe oxidizing microbes on Loihi Seamount (FeMO; the most recent development). These diverse databases are developed under a single database umbrella and webserver at the San Diego Supercomputing Center. All the data bases have similar structures, with consistent metadata concepts, a common database layout, and automated upload wizards. Shared resources include supporting databases like an address book, a reference/publication catalog, and a common digital archive making database development and maintenance cost-effective, while guaranteeing interoperability. The EarthRef.org CIESE provides a common umbrella for synthesis information as well as sample-based data, and it bridges the gap between science and science education in middle and high schools, validating the potential for a system wide data infrastructure in a CIESE. EarthRef.org experiences have shown that effective communication with the respective communities is a key part of a successful CIESE facilitating both utility and community buy-in. GERM has been particularly successful at developing a metadata scheme for geochemistry and in the development of a new electronic journal (G-cubed) that has made much progress in data publication and linkages between journals and community data bases. 
GERM has also worked, through editors and publishers, towards interfacing databases with the publication process, to accomplish a more scholarly and database-friendly data publication environment, and to interface with the respective science communities. MagIC has held several workshops that have resulted in an integrated data archival environment using metadata that are interchangeable with the geochemical metadata. MagIC archives a wide array of paleo and rock magnetic directional, intensity and magnetic property data as well as integrating computational tools. ERESE brought together librarians, teachers, and scientists to create an educational environment that supports inquiry-driven education and the use of science data. Experience with EarthRef.org demonstrates the feasibility of an effective, community-wide CIESE for data publication, archival and modeling, as well as the outreach to the educational community.
The use of the Hirsch index in benchmarking hepatic surgery research.

PubMed

Cucchetti, Alessandro; Mazzotti, Federico; Pellegrini, Sara; Cescon, Matteo; Maroni, Lorenzo; Ercolani, Giorgio; Pinna, Antonio Daniele

2013-10-01

The Hirsch index (h-index) is recognized as an effective way to summarize an individual's scientific research output. However, a benchmark for evaluating surgeon scientists in the field of hepatic surgery is still not available. A total of 3,251 authors who published between 1949 and 2011 were identified using the Scopus identification number. The h-index, the total number of cited documents, the total number of citations, and the scientific age were calculated for each author using both Scopus and Google Scholar. The median h-index was 6 and the median scientific age, assessed with Google Scholar, was 19 years. The numbers of cited documents, numbers of citations, and h-indexes obtained from Scopus and Google Scholar showed good correlation with one another; however, the results from the 2 databases were modified in different ways by scientific age. By plotting scientific age against h-index percentiles, an h-index growth chart for both the Scopus database and Google Scholar was provided. This analysis provides a first benchmark to assess surgeon scientists' productivity in the field of liver surgery. Copyright © 2013 Elsevier Inc. All rights reserved.
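For readers unfamiliar with the metric, the h-index is the largest number h such that an author has at least h documents with at least h citations each. A minimal sketch of that definition follows. The citation counts are made up and the function name is an assumption; the study itself took its values from Scopus and Google Scholar.

```python
def h_index(citations):
    """Largest h such that at least h documents have h or more citations."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, count in enumerate(ranked, start=1):
        if count >= rank:
            h = rank
        else:
            break
    return h


if __name__ == "__main__":
    # Made-up citation counts for one hypothetical author.
    print(h_index([10, 8, 5, 4, 3]))
```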
The Star Schema Benchmark and Augmented Fact Table Indexing

NASA Astrophysics Data System (ADS)

O'Neil, Patrick; O'Neil, Elizabeth; Chen, Xuedong; Revilak, Stephen

We provide a benchmark measuring star schema queries retrieving data from a fact table with Where clause column restrictions on dimension tables. Clustering is crucial to performance with modern disk technology, since retrievals with filter factors down to 0.0005 are now performed most efficiently by sequential table search rather than by indexed access. DB2’s Multi-Dimensional Clustering (MDC) provides methods to "dice" the fact table along a number of orthogonal "dimensions", but only when these dimensions are columns in the fact table. The diced cells cluster fact rows on several of these "dimensions" at once so queries restricting several such columns can access crucially localized data, with much faster query response. Unfortunately, columns of dimension tables of a star schema are not usually represented in the fact table. In this paper, we show a simple way to adjoin physical copies of dimension columns to the fact table, dicing data to effectively cluster query retrieval, and explain how such dicing can be achieved on database products other than DB2. We provide benchmark measurements to show successful use of this methodology on three commercial database products.
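The core idea of adjoining a dimension column to the fact table can be sketched with an in-memory SQLite database. This is an illustration under stated assumptions: the table and column names are chosen for the example, SQLite has no multi-dimensional clustering, and the index created here is only a stand-in for the physical clustering the paper describes. The sketch shows only that a query restricted on the adjoined copy returns the same answer as the original join.

```python
import sqlite3


def run_demo():
    """Return (join_total, adjoined_total) for the same region restriction."""
    con = sqlite3.connect(":memory:")
    cur = con.cursor()

    # A tiny star schema: one dimension table and one fact table.
    cur.execute("CREATE TABLE customer (c_custkey INTEGER PRIMARY KEY, c_region TEXT)")
    cur.execute(
        "CREATE TABLE lineorder ("
        "lo_orderkey INTEGER PRIMARY KEY, lo_custkey INTEGER, lo_revenue INTEGER)"
    )
    cur.executemany("INSERT INTO customer VALUES (?, ?)", [(1, "ASIA"), (2, "EUROPE")])
    cur.executemany(
        "INSERT INTO lineorder VALUES (?, ?, ?)",
        [(1, 1, 100), (2, 2, 50), (3, 1, 25)],
    )

    # Star-schema query: the restriction sits on the dimension table.
    join_total = cur.execute(
        "SELECT SUM(lo_revenue) FROM lineorder "
        "JOIN customer ON lo_custkey = c_custkey WHERE c_region = ?",
        ("ASIA",),
    ).fetchone()[0]

    # Adjoin a physical copy of the dimension column to the fact table.
    cur.execute("ALTER TABLE lineorder ADD COLUMN lo_c_region TEXT")
    cur.execute(
        "UPDATE lineorder SET lo_c_region = "
        "(SELECT c_region FROM customer WHERE c_custkey = lo_custkey)"
    )
    # Stand-in for clustering: an ordinary index on the adjoined column.
    cur.execute("CREATE INDEX idx_lo_c_region ON lineorder (lo_c_region)")

    # The same restriction now touches the fact table only.
    adjoined_total = cur.execute(
        "SELECT SUM(lo_revenue) FROM lineorder WHERE lo_c_region = ?",
        ("ASIA",),
    ).fetchone()[0]

    con.close()
    return join_total, adjoined_total


if __name__ == "__main__":
    print(run_demo())
```

The trade-off is extra storage and the need to keep the copied column consistent with the dimension table, in exchange for restrictions that can be evaluated on the fact table alone.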
The Digitization of Early English Books: A Database Comparison of Internet Archive and Early English Books Online

ERIC Educational Resources Information Center

Brightenburg, Cindy

2016-01-01

The use of digital books is diverse, ranging from casual reading to in-depth primary source research. Digitization of early English printed books in particular, has provided greater access to a previously limited resource for academic faculty and researchers. Internet Archive, a free, internet website and Early English Books Online, a subscription…
Using and Distributing Spaceflight Data: The Johnson Space Center Life Sciences Data Archive

NASA Technical Reports Server (NTRS)

Cardenas, J. A.; Buckey, J. C.; Turner, J. N.; White, T. S.; Havelka, J. A.

1995-01-01

Life sciences data collected before, during and after spaceflight are valuable and often irreplaceable. The Johnson Space Center Life Sciences Data Archive has been designed to provide researchers, engineers, managers and educators interactive access to information about and data from human spaceflight experiments. The archive system consists of a Data Acquisition System, Database Management System, CD-ROM Mastering System and Catalog Information System (CIS). The catalog information system is the heart of the archive. The CIS provides detailed experiment descriptions (both written and as QuickTime movies), hardware descriptions, hardware images, documents, and data. An initial evaluation of the archive at a scientific meeting showed that 88% of those who evaluated the catalog want to use the system when completed. The majority of the evaluators found the archive flexible, satisfying and easy to use. We conclude that the data archive effectively provides key life sciences data to interested users.
Newspaper archives + text mining = rich sources of historical geo-spatial data

NASA Astrophysics Data System (ADS)

Yzaguirre, A.; Smit, M.; Warren, R.

2016-04-01

Newspaper archives are rich sources of cultural, social, and historical information. These archives, even when digitized, are typically unstructured and organized by date rather than by subject or location, and require substantial manual effort to analyze. The effort of journalists to be accurate and precise means that there is often rich geo-spatial data embedded in the text, alongside text describing events that editors considered to be of sufficient importance to the region or the world to merit column inches. A regional newspaper can add over 100,000 articles to its database each year, and extracting information from this data for even a single country would pose a substantial Big Data challenge. In this paper, we describe a pilot study on the construction of a database of historical flood events (location(s), date, cause, magnitude) to be used in flood assessment projects, for example to calibrate models, estimate frequency, establish high water marks, or plan for future events in contexts ranging from urban planning to climate change adaptation. We then present a vision for extracting and using the rich geospatial data available in unstructured text archives, and suggest future avenues of research.
The Global Streamflow Indices and Metadata archive (G-SIM): A compilation of global streamflow time series indices and meta-data

NASA Astrophysics Data System (ADS)

Do, Hong; Gudmundsson, Lukas; Leonard, Michael; Westra, Seth; Seneviratne, Sonia

2017-04-01

In-situ observations of daily streamflow with global coverage are a crucial asset for understanding large-scale freshwater resources, which are an essential component of the Earth system and a prerequisite for societal development. Here we present the Global Streamflow Indices and Metadata archive (G-SIM), a collection of indices derived from more than 20,000 daily streamflow time series across the globe. These indices are designed to support global assessments of change in wet and dry extremes, and have been compiled from 12 free-to-access online databases (seven national databases and five international collections). The G-SIM archive also includes significant metadata to help support detailed understanding of streamflow dynamics, with the inclusion of drainage area shapefiles and many essential catchment properties such as land cover type, soil and topographic characteristics. The automated procedure in data handling and quality control of the project makes G-SIM a reproducible, extendible archive that can be utilised for many purposes in large-scale hydrology. Some potential applications include the identification of observational trends in hydrological extremes, the assessment of climate change impacts on streamflow regimes, and the validation of global hydrological models.
The SAMI Galaxy Survey: A prototype data archive for Big Science exploration

NASA Astrophysics Data System (ADS)

Konstantopoulos, I. S.; Green, A. W.; Foster, C.; Scott, N.; Allen, J. T.; Fogarty, L. M. R.; Lorente, N. P. F.; Sweet, S. M.; Hopkins, A. M.; Bland-Hawthorn, J.; Bryant, J. J.; Croom, S. M.; Goodwin, M.; Lawrence, J. S.; Owers, M. S.; Richards, S. N.

2015-11-01

We describe the data archive and database for the SAMI Galaxy Survey, an ongoing observational program that will cover ≈3400 galaxies with integral-field (spatially-resolved) spectroscopy. Amounting to some three million spectra, this is the largest sample of its kind to date. The data archive and built-in query engine use the versatile Hierarchical Data Format (HDF5), which precludes the need for external metadata tables and hence the setup and maintenance overhead those carry. The code produces simple outputs that can easily be translated to plots and tables, and the combination of these tools makes for a light system that can handle heavy data. This article acts as a contextual companion to the SAMI Survey Database source code repository, samiDB, which is freely available online and written entirely in Python. We also discuss the decisions related to the selection of tools and the creation of data visualisation modules. It is our aim that the work presented in this article (descriptions, rationale, and source code) will be of use to scientists looking to set up a maintenance-light data archive for a Big Science data load.
EMMA—mouse mutant resources for the international scientific community

PubMed Central

Wilkinson, Phil; Sengerova, Jitka; Matteoni, Raffaele; Chen, Chao-Kung; Soulat, Gaetan; Ureta-Vidal, Abel; Fessele, Sabine; Hagn, Michael; Massimi, Marzia; Pickford, Karen; Butler, Richard H.; Marschall, Susan; Mallon, Ann-Marie; Pickard, Amanda; Raspa, Marcello; Scavizzi, Ferdinando; Fray, Martin; Larrigaldie, Vanessa; Leyritz, Johan; Birney, Ewan; Tocchini-Valentini, Glauco P.; Brown, Steve; Herault, Yann; Montoliu, Lluis; de Angelis, Martin Hrabé; Smedley, Damian

2010-01-01

The laboratory mouse is the premier animal model for studying human disease and thousands of mutants have been identified or produced, most recently through gene-specific mutagenesis approaches. High throughput strategies by the International Knockout Mouse Consortium (IKMC) are producing mutants for all protein coding genes. Generating a knock-out line involves huge monetary and time costs so capture of both the data describing each mutant alongside archiving of the line for distribution to future researchers is critical. The European Mouse Mutant Archive (EMMA) is a leading international network infrastructure for archiving and worldwide provision of mouse mutant strains. It operates in collaboration with the other members of the Federation of International Mouse Resources (FIMRe), EMMA being the European component. Additionally EMMA is one of four repositories involved in the IKMC, and therefore the current figure of 1700 archived lines will rise markedly. The EMMA database gathers and curates extensive data on each line and presents it through a user-friendly website. A BioMart interface allows advanced searching including integrated querying with other resources e.g. Ensembl. Other resources are able to display EMMA data by accessing our Distributed Annotation System server. EMMA database access is publicly available at http://www.emmanet.org. PMID:19783817
Operating a petabyte class archive at ESO

NASA Astrophysics Data System (ADS)

Suchar, Dieter; Lockhart, John S.; Burrows, Andrew

2008-07-01

The challenges of setting up and operating a Petabyte Class Archive will be described in terms of computer systems within a complex Data Centre environment. The computer systems, including the ESO Primary and Secondary Archive and the associated computational environments such as relational databases will be explained. This encompasses the entire system project cycle, including the technical specifications, procurement process, equipment installation and all further operational phases. The ESO Data Centre construction and the complexity of managing the environment will be presented. Many factors had to be considered during the construction phase, such as power consumption, targeted cooling and the accumulated load on the building structure to enable the smooth running of a Petabyte class Archive.
Adsorption structures and energetics of molecules on metal surfaces: Bridging experiment and theory

NASA Astrophysics Data System (ADS)

Maurer, Reinhard J.; Ruiz, Victor G.; Camarillo-Cisneros, Javier; Liu, Wei; Ferri, Nicola; Reuter, Karsten; Tkatchenko, Alexandre

2016-05-01

Adsorption geometry and stability of organic molecules on surfaces are key parameters that determine the observable properties and functions of hybrid inorganic/organic systems (HIOSs). Despite many recent advances in precise experimental characterization and improvements in first-principles electronic structure methods, reliable databases of structures and energetics for large adsorbed molecules are largely amiss. In this review, we present such a database for a range of molecules adsorbed on metal single-crystal surfaces. The systems we analyze include noble-gas atoms, conjugated aromatic molecules, carbon nanostructures, and heteroaromatic compounds adsorbed on five different metal surfaces. The overall objective is to establish a diverse benchmark dataset that enables an assessment of current and future electronic structure methods, and motivates further experimental studies that provide ever more reliable data. Specifically, the benchmark structures and energetics from experiment are here compared with the recently developed van der Waals (vdW) inclusive density-functional theory (DFT) method, DFT + vdWsurf. In comparison to 23 adsorption heights and 17 adsorption energies from experiment we find a mean average deviation of 0.06 Å and 0.16 eV, respectively. This confirms the DFT + vdWsurf method as an accurate and efficient approach to treat HIOSs. A detailed discussion identifies remaining challenges to be addressed in future development of electronic structure methods, for which the here presented benchmark database may serve as an important reference.
GRAFLAB 2.3 for UNIX - A MATLAB database, plotting, and analysis tool: User's guide

DOE Office of Scientific and Technical Information (OSTI.GOV)

Dunn, W.N.

1998-03-01

This report is a user's manual for GRAFLAB, which is a new database, analysis, and plotting package that has been written entirely in the MATLAB programming language. GRAFLAB is currently used for data reduction, analysis, and archival. GRAFLAB was written to replace GRAFAID, which is a FORTRAN database, analysis, and plotting package that runs on VAX/VMS.
Database resources of the National Center for Biotechnology Information.

PubMed

2016-01-04

The National Center for Biotechnology Information (NCBI) provides a large suite of online resources for biological information and data, including the GenBank(®) nucleic acid sequence database and the PubMed database of citations and abstracts for published life science journals. Additional NCBI resources focus on literature (PubMed Central (PMC), Bookshelf and PubReader), health (ClinVar, dbGaP, dbMHC, the Genetic Testing Registry, HIV-1/Human Protein Interaction Database and MedGen), genomes (BioProject, Assembly, Genome, BioSample, dbSNP, dbVar, Epigenomics, the Map Viewer, Nucleotide, Probe, RefSeq, Sequence Read Archive, the Taxonomy Browser and the Trace Archive), genes (Gene, Gene Expression Omnibus (GEO), HomoloGene, PopSet and UniGene), proteins (Protein, the Conserved Domain Database (CDD), COBALT, Conserved Domain Architecture Retrieval Tool (CDART), the Molecular Modeling Database (MMDB) and Protein Clusters) and chemicals (Biosystems and the PubChem suite of small molecule databases). The Entrez system provides search and retrieval operations for most of these databases. Augmenting many of the web applications are custom implementations of the BLAST program optimized to search specialized datasets. All of these resources can be accessed through the NCBI home page at www.ncbi.nlm.nih.gov. Published by Oxford University Press on behalf of Nucleic Acids Research 2015. This work is written by (a) US Government employee(s) and is in the public domain in the US.

Database resources of the National Center for Biotechnology Information.

PubMed

2015-01-01

The National Center for Biotechnology Information (NCBI) provides a large suite of online resources for biological information and data, including the GenBank(®) nucleic acid sequence database and the PubMed database of citations and abstracts for published life science journals. Additional NCBI resources focus on literature (Bookshelf, PubMed Central (PMC) and PubReader); medical genetics (ClinVar, dbMHC, the Genetic Testing Registry, HIV-1/Human Protein Interaction Database and MedGen); genes and genomics (BioProject, BioSample, dbSNP, dbVar, Epigenomics, Gene, Gene Expression Omnibus (GEO), Genome, HomoloGene, the Map Viewer, Nucleotide, PopSet, Probe, RefSeq, Sequence Read Archive, the Taxonomy Browser, Trace Archive and UniGene); and proteins and chemicals (Biosystems, COBALT, the Conserved Domain Database (CDD), the Conserved Domain Architecture Retrieval Tool (CDART), the Molecular Modeling Database (MMDB), Protein Clusters, Protein and the PubChem suite of small molecule databases). The Entrez system provides search and retrieval operations for many of these databases. Augmenting many of the Web applications are custom implementations of the BLAST program optimized to search specialized data sets. All of these resources can be accessed through the NCBI home page at http://www.ncbi.nlm.nih.gov. Published by Oxford University Press on behalf of Nucleic Acids Research 2014. This work is written by (a) US Government employee(s) and is in the public domain in the US.
Performance Evaluation of NoSQL Databases: A Case Study

DTIC Science & Technology

2015-02-01

...a centralized relational database. The customer decided to consider NoSQL technologies for two specific uses, namely: the primary data store for... The choice of a particular NoSQL database imposes a specific distributed software architecture and data model, and is a major determinant of the...
Comparison of hospital databases on antibiotic consumption in France, for a single management tool.

PubMed

Henard, S; Boussat, S; Demoré, B; Clément, S; Lecompte, T; May, T; Rabaud, C

2014-07-01

The surveillance of antibiotic use in hospitals and of data on resistance is an essential measure for antibiotic stewardship. There are 3 national systems in France to collect data on antibiotic use: DREES, ICATB, and ATB RAISIN. We compared these databases and drafted recommendations for the creation of an optimized database of information on antibiotic use, available to all concerned personnel: healthcare authorities, healthcare facilities, and healthcare professionals. We processed and analyzed the 3 databases (2008 data), and surveyed users. The qualitative analysis demonstrated major discrepancies in terms of objectives, healthcare facilities, participation rate, units of consumption, conditions for collection, consolidation, and control of data, and delay before availability of results. The quantitative analysis revealed that the consumption data for a given healthcare facility differed from one database to another, challenging the reliability of data collection. We specified user expectations: to compare consumption and resistance data, to carry out benchmarking, to obtain data on the prescribing habits in healthcare units, or to help understand results. The study results demonstrated the need for a reliable, single, and automated tool to manage data on antibiotic consumption compared with resistance data on several levels (national, regional, healthcare facility, healthcare units), providing rapid local feedback and educational benchmarking. Copyright © 2014 Elsevier Masson SAS. All rights reserved.
Databases for Microbiologists

DOE PAGES

Zhulin, Igor B.

2015-05-26

Databases play an increasingly important role in biology. They archive, store, maintain, and share information on genes, genomes, expression data, protein sequences and structures, metabolites and reactions, interactions, and pathways. All these data are critically important to microbiologists. Furthermore, microbiology has its own databases that deal with model microorganisms, microbial diversity, physiology, and pathogenesis. Thousands of biological databases are currently available, and it becomes increasingly difficult to keep up with their development. Finally, the purpose of this minireview is to provide a brief survey of current databases that are of interest to microbiologists.
Databases for Microbiologists

PubMed Central

2015-01-01

Databases play an increasingly important role in biology. They archive, store, maintain, and share information on genes, genomes, expression data, protein sequences and structures, metabolites and reactions, interactions, and pathways. All these data are critically important to microbiologists. Furthermore, microbiology has its own databases that deal with model microorganisms, microbial diversity, physiology, and pathogenesis. Thousands of biological databases are currently available, and it becomes increasingly difficult to keep up with their development. The purpose of this minireview is to provide a brief survey of current databases that are of interest to microbiologists. PMID:26013493
A benchmark study of scoring methods for non-coding mutations.

PubMed

Drubay, Damien; Gautheret, Daniel; Michiels, Stefan

2018-05-15

Detailed knowledge of coding sequences has led to different candidate models for pathogenic variant prioritization. Several deleteriousness scores have been proposed for the non-coding part of the genome, but no large-scale comparison of their performance has been carried out to date. We compared the leading scoring tools (CADD, FATHMM-MKL, Funseq2 and GWAVA) and some recent competitors (DANN, SNP and SOM scores) for their ability to discriminate assumed pathogenic variants from assumed benign variants (using the ClinVar, COSMIC and 1000 Genomes Project databases). Using the ClinVar benchmark, CADD was the best tool for detecting the pathogenic variants that are mainly located in protein coding gene regions. Using the COSMIC benchmark, FATHMM-MKL, GWAVA and SOMliver outperformed the other tools for pathogenic variants that are typically located in lincRNAs, pseudogenes and other parts of the non-coding genome. However, all tools had low precision, which could potentially be improved by future non-coding genome feature discoveries. These results may have been influenced by the presence of potential benign variants in the COSMIC database. The development of a gold standard as consistent as ClinVar for these regions will be necessary to confirm our tool ranking. The Snakemake, C++ and R codes are freely available from https://github.com/Oncostat/BenchmarkNCVTools and supported on Linux. Contact: damien.drubay@gustaveroussy.fr or stefan.michiels@gustaveroussy.fr. Supplementary data are available at Bioinformatics online.
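The abstract above judges scoring tools by how well they separate assumed pathogenic from assumed benign variants and notes that precision was low. As a minimal sketch of what such an evaluation could look like, the snippet below computes precision at a score threshold and a rank-based ROC AUC. The function names, the toy scores and the labels are all illustrative assumptions; they are not taken from the paper or from its BenchmarkNCVTools code, and the abstract does not state which metrics the authors actually used beyond precision.

```python
def precision_at_threshold(scores, labels, threshold):
    """Fraction of variants scored >= threshold that are labelled pathogenic (1)."""
    called = [label for score, label in zip(scores, labels) if score >= threshold]
    return sum(called) / len(called) if called else 0.0


def roc_auc(scores, labels):
    """Probability that a random pathogenic variant outscores a random benign one.

    Ties count as half a win. This is the pairwise definition of ROC AUC,
    written for clarity rather than speed (quadratic in the number of variants).
    """
    pathogenic = [s for s, label in zip(scores, labels) if label == 1]
    benign = [s for s, label in zip(scores, labels) if label == 0]
    wins = sum((p > b) + 0.5 * (p == b) for p in pathogenic for b in benign)
    return wins / (len(pathogenic) * len(benign))


# Toy data (hypothetical): one deleteriousness score per variant,
# with label 1 = assumed pathogenic and 0 = assumed benign.
scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.1]
labels = [1, 0, 1, 1, 0, 0]

print(precision_at_threshold(scores, labels, 0.5))
print(roc_auc(scores, labels))
```

A tool can rank variants reasonably well (moderate AUC) and still have low precision at any usable threshold when benign variants greatly outnumber pathogenic ones, which is one reason the two numbers are worth reporting together.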
CARDS - comprehensive aerological reference data set. Station history, Version 2.1

DOE Office of Scientific and Technical Information (OSTI.GOV)

NONE

1994-03-01

The possibility of anthropogenic climate change has reached the attention of Government officials and researchers. However, one cannot study climate change without climate data. The CARDS project will produce high-quality upper-air data for the research community and for policy-makers. The authors intend to produce a dataset which is: easy to use, as complete as possible, as free of random errors as possible. They will also attempt to identify biases and remove them whenever possible. In this report, they relate progress toward their goal. They created a robust new format for archiving upper-air data, and designed a relational database structure to hold them. The authors have converted 13 datasets to the new format and have archived over 10,000,000 individual soundings from 10 separate data sources. They produce and archive a metadata summary of each sounding they load. They have researched station histories, and have built a preliminary upper-air station history database. They have converted station-sorted data from their primary database into synoptic-sorted data in a parallel database. They have tested and will soon implement an advanced quality-control procedure, capable of detecting and often repairing errors in geopotential height, temperature, humidity, and wind. This unique quality-control method uses simultaneous vertical, horizontal, and temporal checks of several meteorological variables. It can detect errors other methods cannot. This report contains the station histories for the CARDS data set.
Database Resources of the BIG Data Center in 2018

PubMed Central

Xu, Xingjian; Hao, Lili; Zhu, Junwei; Tang, Bixia; Zhou, Qing; Song, Fuhai; Chen, Tingting; Zhang, Sisi; Dong, Lili; Lan, Li; Wang, Yanqing; Sang, Jian; Hao, Lili; Liang, Fang; Cao, Jiabao; Liu, Fang; Liu, Lin; Wang, Fan; Ma, Yingke; Xu, Xingjian; Zhang, Lijuan; Chen, Meili; Tian, Dongmei; Li, Cuiping; Dong, Lili; Du, Zhenglin; Yuan, Na; Zeng, Jingyao; Zhang, Zhewen; Wang, Jinyue; Shi, Shuo; Zhang, Yadong; Pan, Mengyu; Tang, Bixia; Zou, Dong; Song, Shuhui; Sang, Jian; Xia, Lin; Wang, Zhennan; Li, Man; Cao, Jiabao; Niu, Guangyi; Zhang, Yang; Sheng, Xin; Lu, Mingming; Wang, Qi; Xiao, Jingfa; Zou, Dong; Wang, Fan; Hao, Lili; Liang, Fang; Li, Mengwei; Sun, Shixiang; Zou, Dong; Li, Rujiao; Yu, Chunlei; Wang, Guangyu; Sang, Jian; Liu, Lin; Li, Mengwei; Li, Man; Niu, Guangyi; Cao, Jiabao; Sun, Shixiang; Xia, Lin; Yin, Hongyan; Zou, Dong; Xu, Xingjian; Ma, Lina; Chen, Huanxin; Sun, Yubin; Yu, Lei; Zhai, Shuang; Sun, Mingyuan; Zhang, Zhang; Zhao, Wenming; Xiao, Jingfa; Bao, Yiming; Song, Shuhui; Hao, Lili; Li, Rujiao; Ma, Lina; Sang, Jian; Wang, Yanqing; Tang, Bixia; Zou, Dong; Wang, Fan

2018-01-01

The BIG Data Center at Beijing Institute of Genomics (BIG) of the Chinese Academy of Sciences provides freely open access to a suite of database resources in support of worldwide research activities in both academia and industry. With the vast amounts of omics data generated at ever-greater scales and rates, the BIG Data Center is continually expanding, updating and enriching its core database resources through big-data integration and value-added curation, including BioCode (a repository archiving bioinformatics tool codes), BioProject (a biological project library), BioSample (a biological sample library), Genome Sequence Archive (GSA, a data repository for archiving raw sequence reads), Genome Warehouse (GWH, a centralized resource housing genome-scale data), Genome Variation Map (GVM, a public repository of genome variations), Gene Expression Nebulas (GEN, a database of gene expression profiles based on RNA-Seq data), Methylation Bank (MethBank, an integrated databank of DNA methylomes), and Science Wikis (a series of biological knowledge wikis for community annotations). In addition, three featured web services are provided, viz., BIG Search (search as a service; a scalable inter-domain text search engine), BIG SSO (single sign-on as a service; a user access control system to gain access to multiple independent systems with a single ID and password) and Gsub (submission as a service; a unified submission service for all relevant resources). All of these resources are publicly accessible through the home page of the BIG Data Center at http://bigd.big.ac.cn. PMID:29036542
The Longhorn Array Database (LAD): An Open-Source, MIAME compliant implementation of the Stanford Microarray Database (SMD)

PubMed Central

Killion, Patrick J; Sherlock, Gavin; Iyer, Vishwanath R

2003-01-01

Background: The power of microarray analysis can be realized only if data is systematically archived and linked to biological annotations as well as analysis algorithms. Description: The Longhorn Array Database (LAD) is a MIAME compliant microarray database that operates on PostgreSQL and Linux. It is a fully open source version of the Stanford Microarray Database (SMD), one of the largest microarray databases. LAD is available at [...]. Conclusions: Our development of LAD provides a simple, free, open, reliable and proven solution for storage and analysis of two-color microarray data. PMID:12930545
How to Search, Write, Prepare and Publish the Scientific Papers in the Biomedical Journals

PubMed Central

Masic, Izet

2011-01-01

This article describes the methodology of preparing, writing and publishing scientific papers in biomedical journals. It gives a concise overview of the concept and structure of the system of biomedical scientific and technical information and of how biomedical literature is retrieved from worldwide biomedical databases. It describes the scientific and professional medical journals currently published in Bosnia and Herzegovina, and gives a comparative review of the number and structure of papers published in the journals from Bosnia and Herzegovina that are indexed in the MEDLINE database. Three B&H journals indexed in MEDLINE in 2010 are analyzed: Medical Archives (Medicinski Arhiv), Bosnian Journal of Basic Medical Sciences and Medical Gazette (Medicinski Glasnik). The largest number of original papers was published in Medical Archives. There is a statistically significant difference in the number of papers published by local authors in relation to international journals, in favor of Medical Archives. However, the Bosnian Journal of Basic Medical Sciences does not categorize its articles, so comparisons with it could not be made. Medical Archives and the Bosnian Journal of Basic Medical Sciences published, by percentage, the largest number of articles by authors from Sarajevo and Tuzla, the two oldest and largest university medical centers in Bosnia and Herzegovina. The author believes that qualitative changes are needed in the reception and reviewing of papers for publication in biomedical journals published in Bosnia and Herzegovina, and that this should be the responsibility of a separate scientific authority or committee, composed of experts in the field of medicine, at the state level. PMID:23572850
A VBA Desktop Database for Proposal Processing at National Optical Astronomy Observatories

NASA Astrophysics Data System (ADS)

Brown, Christa L.

National Optical Astronomy Observatories (NOAO) has developed a relational Microsoft Windows desktop database using Microsoft Access and the Microsoft Office programming language, Visual Basic for Applications (VBA). The database is used to track data relating to observing proposals from original receipt through the review process, scheduling, observing, and final statistical reporting. The database has automated proposal processing and distribution of information. It allows NOAO to collect and archive data so as to query and analyze information about our science programs in new ways.
Flexible Decision Support in Device-Saturated Environments

DTIC Science & Technology

2003-10-01

also output tuples to a remote MySQL or Postgres database. 3.3 GUI The GUI allows the user to pose queries using SQL and to display query...DatabaseConnection.java – handles connections to an external database (such as MySQL or Postgres). • Debug.java – contains the code for printing out Debug messages...also provided. It is possible to output the results of queries to a MySQL or Postgres database for archival and the GUI can query those results
ELAPSE - NASA AMES LISP AND ADA BENCHMARK SUITE: EFFICIENCY OF LISP AND ADA PROCESSING - A SYSTEM EVALUATION

NASA Technical Reports Server (NTRS)

Davis, G. J.

1994-01-01

One area of research of the Information Sciences Division at NASA Ames Research Center is devoted to the analysis and enhancement of processors and advanced computer architectures, specifically in support of automation and robotic systems. To compare systems' abilities to efficiently process Lisp and Ada, scientists at Ames Research Center have developed a suite of non-parallel benchmarks called ELAPSE. The benchmark suite was designed to test a single computer's efficiency as well as alternate machine comparisons on Lisp, and/or Ada languages. ELAPSE tests the efficiency with which a machine can execute the various routines in each environment. The sample routines are based on numeric and symbolic manipulations and include two-dimensional fast Fourier transformations, Cholesky decomposition and substitution, Gaussian elimination, high-level data processing, and symbol-list references. Also included is a routine based on a Bayesian classification program sorting data into optimized groups. The ELAPSE benchmarks are available for any computer with a validated Ada compiler and/or Common Lisp system. Of the 18 routines that comprise ELAPSE, provided within this package are 14 developed or translated at Ames. The others are readily available through literature. The benchmark that requires the most memory is CHOLESKY.ADA. Under VAX/VMS, CHOLESKY.ADA requires 760K of main memory. ELAPSE is available on either two 5.25 inch 360K MS-DOS format diskettes (standard distribution) or a 9-track 1600 BPI ASCII CARD IMAGE format magnetic tape. The contents of the diskettes are compressed using the PKWARE archiving tools. The utility to unarchive the files, PKUNZIP.EXE, is included. The ELAPSE benchmarks were written in 1990. VAX and VMS are trademarks of Digital Equipment Corporation. MS-DOS is a registered trademark of Microsoft Corporation.
Use of integral experiments in support to the validation of JEFF-3.2 nuclear data evaluation

NASA Astrophysics Data System (ADS)

Leclaire, Nicolas; Cochet, Bertrand; Jinaphanh, Alexis; Haeck, Wim

2017-09-01

For many years now, IRSN has developed its own Monte Carlo continuous energy capability, which allows testing various nuclear data libraries. To that end, a validation database of 1136 experiments was built from cases used for the validation of the APOLLO2-MORET 5 multigroup route of the CRISTAL V2.0 package. In this paper, the keff values obtained for more than 200 benchmarks using the JEFF-3.1.1 and JEFF-3.2 libraries are compared to benchmark keff values, and the main discrepancies are analyzed with regard to the neutron spectrum. Special attention is paid to benchmarks for which the results changed substantially between the two JEFF-3 versions.
Landsat-4 and Landsat-5 thematic mapper band 6 historical performance and calibration

USGS Publications Warehouse

Barsi, J.A.; Chander, G.; Markham, B.L.; Higgs, N.

2005-01-01

Launched in 1982 and 1984 respectively, the Landsat-4 and -5 Thematic Mappers (TM) are the backbone of an extensive archive of moderate resolution Earth imagery. However, these sensors and their data products were not subjected to the type of intensive monitoring that has been part of the Landsat-7 system since its launch in 1999. With Landsat-4's 11 year and Landsat-5's 20+ year data record, there is a need to understand the historical behavior of the instruments in order to verify the scientific integrity of the archive and processed products. Performance indicators of the Landsat-4 and -5 thermal bands have recently been extracted from a processing system database allowing for a more complete study of thermal band characteristics and calibration than was previously possible. The database records responses to the internal calibration system, instrument temperatures and applied gains and offsets for each band for every scene processed through the National Landsat Archive Production System (NLAPS). Analysis of this database has allowed for greater understanding of the calibration and improvement in the processing system. This paper will cover the trends in the Landsat-4 and -5 thermal bands, the effect of the changes seen in the trends, and how these trends affect the use of the thermal data.
A spatio-temporal landslide inventory for the NW of Spain: BAPA database

NASA Astrophysics Data System (ADS)

Valenzuela, Pablo; Domínguez-Cuesta, María José; Mora García, Manuel Antonio; Jiménez-Sánchez, Montserrat

2017-09-01

A landslide database has been created for the Principality of Asturias, NW Spain: the BAPA (Base de datos de Argayos del Principado de Asturias - Principality of Asturias Landslide Database). Data collection is mainly performed through searching local newspaper archives. Moreover, a BAPA App and a BAPA website (http://geol.uniovi.es/BAPA) have been developed to obtain additional information from citizens and institutions. Presently, the dataset covers the period 1980-2015, recording 2063 individual landslides. The use of free cartographic servers, such as Google Maps, Google Street View and Iberpix (Government of Spain), combined with the spatial descriptions and pictures contained in the press news, makes it possible to assess different levels of spatial accuracy. In the database, 59% of the records show an exact spatial location, and 51% of the records provided accurate dates, showing the usefulness of press archives as temporal records. Thus, 32% of the landslides show the highest spatial and temporal accuracy levels. The database also gathers information about the type and characteristics of the landslides, the triggering factors and the damage and costs caused. Field work was conducted to validate the methodology used in assessing the spatial location, temporal occurrence and characteristics of the landslides.
Database resources of the National Center for Biotechnology Information.

PubMed

Sayers, Eric W; Barrett, Tanya; Benson, Dennis A; Bolton, Evan; Bryant, Stephen H; Canese, Kathi; Chetvernin, Vyacheslav; Church, Deanna M; DiCuccio, Michael; Federhen, Scott; Feolo, Michael; Fingerman, Ian M; Geer, Lewis Y; Helmberg, Wolfgang; Kapustin, Yuri; Landsman, David; Lipman, David J; Lu, Zhiyong; Madden, Thomas L; Madej, Tom; Maglott, Donna R; Marchler-Bauer, Aron; Miller, Vadim; Mizrachi, Ilene; Ostell, James; Panchenko, Anna; Phan, Lon; Pruitt, Kim D; Schuler, Gregory D; Sequeira, Edwin; Sherry, Stephen T; Shumway, Martin; Sirotkin, Karl; Slotta, Douglas; Souvorov, Alexandre; Starchenko, Grigory; Tatusova, Tatiana A; Wagner, Lukas; Wang, Yanli; Wilbur, W John; Yaschenko, Eugene; Ye, Jian

2011-01-01

In addition to maintaining the GenBank® nucleic acid sequence database, the National Center for Biotechnology Information (NCBI) provides analysis and retrieval resources for the data in GenBank and other biological data made available through the NCBI Web site. NCBI resources include Entrez, the Entrez Programming Utilities, MyNCBI, PubMed, PubMed Central (PMC), Entrez Gene, the NCBI Taxonomy Browser, BLAST, BLAST Link (BLink), Primer-BLAST, COBALT, Electronic PCR, OrfFinder, Splign, ProSplign, RefSeq, UniGene, HomoloGene, ProtEST, dbMHC, dbSNP, dbVar, Epigenomics, Cancer Chromosomes, Entrez Genomes and related tools, the Map Viewer, Model Maker, Evidence Viewer, Trace Archive, Sequence Read Archive, Retroviral Genotyping Tools, HIV-1/Human Protein Interaction Database, Gene Expression Omnibus (GEO), Entrez Probe, GENSAT, Online Mendelian Inheritance in Man (OMIM), Online Mendelian Inheritance in Animals (OMIA), the Molecular Modeling Database (MMDB), the Conserved Domain Database (CDD), the Conserved Domain Architecture Retrieval Tool (CDART), IBIS, Biosystems, Peptidome, OMSSA, Protein Clusters and the PubChem suite of small molecule databases. Augmenting many of the Web applications are custom implementations of the BLAST program optimized to search specialized data sets. All of these resources can be accessed through the NCBI home page at www.ncbi.nlm.nih.gov.
Challenges and Experiences of Building Multidisciplinary Datasets across Cultures

NASA Astrophysics Data System (ADS)

Jamiyansharav, K.; Laituri, M.; Fernandez-Gimenez, M.; Fassnacht, S. R.; Venable, N. B. H.; Allegretti, A. M.; Reid, R.; Baival, B.; Jamsranjav, C.; Ulambayar, T.; Linn, S.; Angerer, J.

2017-12-01

Efficient data sharing and management are key challenges to multidisciplinary scientific research. These challenges are further complicated by adding a multicultural component. We address the construction of a complex database for social-ecological analysis in Mongolia. Funded by the National Science Foundation (NSF) Dynamics of Coupled Natural and Human (CNH) Systems, the Mongolian Rangelands and Resilience (MOR2) project focuses on the vulnerability of Mongolian pastoral systems to climate change and adaptive capacity. The MOR2 study spans over three years of fieldwork in 36 paired districts (Soum) from 18 provinces (Aimag) of Mongolia that covers steppe, mountain forest steppe, desert steppe and eastern steppe ecological zones. Our project team is composed of hydrologists, social scientists, geographers, and ecologists. The MOR2 database includes multiple ecological, social, meteorological, geospatial and hydrological datasets, as well as archives of original data and surveys in multiple formats. Managing this complex database requires significant organizational skills, attention to detail and the ability to communicate among team members from diverse disciplines and across multiple institutions in the US and Mongolia. We describe the database's rich content, organization, structure and complexity. We discuss lessons learned, best practices and recommendations for complex database management, sharing, and archiving in creating a cross-cultural and multi-disciplinary database.
Fast in-database cross-matching of high-cadence, high-density source lists with an up-to-date sky model

NASA Astrophysics Data System (ADS)

Scheers, B.; Bloemen, S.; Mühleisen, H.; Schellart, P.; van Elteren, A.; Kersten, M.; Groot, P. J.

2018-04-01

Coming high-cadence wide-field optical telescopes will image hundreds of thousands of sources per minute. Besides inspecting the near real-time data streams for transient and variability events, the accumulated data archive is a wealthy laboratory for making complementary scientific discoveries. The goal of this work is to optimise column-oriented database techniques to enable the construction of a full-source and light-curve database for large-scale surveys, that is accessible by the astronomical community. We adopted LOFAR's Transients Pipeline as the baseline and modified it to enable the processing of optical images that have much higher source densities. The pipeline adds new source lists to the archive database, while cross-matching them with the known catalogued sources in order to build a full light-curve archive. We investigated several techniques of indexing and partitioning the largest tables, allowing for faster positional source look-ups in the cross matching algorithms. We monitored all query run times in long-term pipeline runs where we processed a subset of IPHAS data that have image source density peaks over 170,000 per field of view (500,000 deg^-2). Our analysis demonstrates that horizontal table partitions of declination widths of one-degree control the query run times. Usage of an index strategy where the partitions are densely sorted according to source declination yields another improvement. Most queries run in sublinear time and a few (< 20%) run in linear time, because of dependencies on input source-list and result-set size. We observed that for this logical database partitioning schema the limiting cadence the pipeline achieved with processing IPHAS data is 25 s.
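The partitioning idea in the abstract above (one-degree declination strips, each kept sorted by declination) can be illustrated outside a database. The following is a minimal in-memory sketch, not the authors' column-store implementation: the function names, the tuple layout and the toy coordinates are assumptions made for illustration. Each new source is compared only with catalogue sources in the strips its search radius touches, and a binary search skips to the first candidate inside each strip.

```python
import bisect
import math
from collections import defaultdict

ZONE_HEIGHT_DEG = 1.0  # one-degree declination strips, as described in the abstract


def zone_of(dec_deg):
    """Index of the declination strip that contains dec_deg."""
    return math.floor((dec_deg + 90.0) / ZONE_HEIGHT_DEG)


def angular_sep_deg(ra1, dec1, ra2, dec2):
    """Great-circle separation in degrees (haversine formula)."""
    r1, d1, r2, d2 = map(math.radians, (ra1, dec1, ra2, dec2))
    a = (math.sin((d2 - d1) / 2) ** 2
         + math.cos(d1) * math.cos(d2) * math.sin((r2 - r1) / 2) ** 2)
    return math.degrees(2 * math.asin(math.sqrt(a)))


def build_zones(catalogue):
    """Group (id, ra, dec) rows by strip and sort each strip by declination."""
    zones = defaultdict(list)
    for src_id, ra, dec in catalogue:
        zones[zone_of(dec)].append((dec, ra, src_id))
    for rows in zones.values():
        rows.sort()
    return zones


def cross_match(new_sources, zones, radius_deg):
    """Map each new source id to the nearest catalogue id within radius_deg, or None."""
    matches = {}
    for new_id, ra, dec in new_sources:
        best_id, best_sep = None, radius_deg
        # Visit only the strips that the search radius overlaps.
        for zone in range(zone_of(dec - radius_deg), zone_of(dec + radius_deg) + 1):
            rows = zones.get(zone, [])
            # Binary search for the first row with declination >= dec - radius.
            start = bisect.bisect_left(rows, (dec - radius_deg,))
            for cat_dec, cat_ra, cat_id in rows[start:]:
                if cat_dec > dec + radius_deg:
                    break  # rows are sorted, so nothing further can match
                sep = angular_sep_deg(ra, dec, cat_ra, cat_dec)
                if sep <= best_sep:
                    best_id, best_sep = cat_id, sep
        matches[new_id] = best_id
    return matches


# Toy data (hypothetical): (id, right ascension, declination) in degrees.
catalogue = [("c1", 10.0, 20.0), ("c2", 10.0, 20.9995), ("c3", 200.0, -45.0)]
new_sources = [("n1", 10.0001, 20.0001), ("n2", 10.0, 21.0002), ("n3", 50.0, 50.0)]
print(cross_match(new_sources, build_zones(catalogue), radius_deg=0.001))
```

In the toy data, source n2 sits just across a strip boundary from c2, which is why the loop visits neighbouring strips rather than only the strip the new source falls in.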

GROWTH OF THE INTERNATIONAL CRITICALITY SAFETY AND REACTOR PHYSICS EXPERIMENT EVALUATION PROJECTS

DOE Office of Scientific and Technical Information (OSTI.GOV)

J. Blair Briggs; John D. Bess; Jim Gulliford

2011-09-01

Since the International Conference on Nuclear Criticality Safety (ICNC) 2007, the International Criticality Safety Benchmark Evaluation Project (ICSBEP) and the International Reactor Physics Experiment Evaluation Project (IRPhEP) have continued to expand their efforts and broaden their scope. Eighteen countries participated in the ICSBEP in 2007. Now, there are 20, with recent contributions from Sweden and Argentina. The IRPhEP has also expanded from eight contributing countries in 2007 to 16 in 2011. Since ICNC 2007, the contents of the 'International Handbook of Evaluated Criticality Safety Benchmark Experiments' (1) have increased from 442 evaluations (38000 pages), containing benchmark specifications for 3955 critical or subcritical configurations, to 516 evaluations (nearly 55000 pages), containing benchmark specifications for 4405 critical or subcritical configurations in the 2010 Edition of the ICSBEP Handbook. The contents of the Handbook have also increased from 21 to 24 criticality-alarm-placement/shielding configurations with multiple dose points for each, and from 20 to 200 configurations categorized as fundamental physics measurements relevant to criticality safety applications. Approximately 25 new evaluations and 150 additional configurations are expected to be added to the 2011 edition of the Handbook. Since ICNC 2007, the contents of the 'International Handbook of Evaluated Reactor Physics Benchmark Experiments' (2) have increased from 16 different experimental series that were performed at 12 different reactor facilities to 53 experimental series that were performed at 30 different reactor facilities in the 2011 edition of the Handbook. Considerable effort has also been made to improve the functionality of the searchable database, DICE (Database for the International Criticality Benchmark Evaluation Project), and verify the accuracy of the data contained therein. DICE will be discussed in separate papers at ICNC 2011.
The status of the ICSBEP and the IRPhEP will be discussed in the full paper, selected benchmarks that have been added to the ICSBEP Handbook will be highlighted, and a preview of the new benchmarks that will appear in the September 2011 edition of the Handbook will be provided. Accomplishments of the IRPhEP will also be highlighted and the future of both projects will be discussed. REFERENCES (1) International Handbook of Evaluated Criticality Safety Benchmark Experiments, NEA/NSC/DOC(95)03/I-IX, Organisation for Economic Co-operation and Development-Nuclear Energy Agency (OECD-NEA), September 2010 Edition, ISBN 978-92-64-99140-8. (2) International Handbook of Evaluated Reactor Physics Benchmark Experiments, NEA/NSC/DOC(2006)1, Organisation for Economic Co-operation and Development-Nuclear Energy Agency (OECD-NEA), March 2011 Edition, ISBN 978-92-64-99141-5.
Superfund Public Information System (SPIS), June 1998 (on CD-ROM). Data file

DOE Office of Scientific and Technical Information (OSTI.GOV)

NONE

1998-06-01

The Superfund Public Information System (SPIS) on CD-ROM contains Superfund data for the United States Environmental Protection Agency. The Superfund data is a collection of four databases, CERCLIS, Archive (NFRAP), RODS, and NPL Sites. Descriptions of these databases and CD contents are listed below. The FolioViews browse and retrieval engine is used as a graphical interface to the data. Users can access simple queries and can do complex searching on key words or fields. In addition, context sensitive help, a Superfund process overview, and an integrated data dictionary are available. RODS is the Records Of Decision System. RODS is used to track site clean-ups under the Superfund program to justify the type of treatment chosen at each site. RODS contains information on technology justification, site history, community participation, enforcement activities, site characteristics, scope and role of response action, and remedy. Explanation of Significant Differences (ESDs) are also available on the CD. CERCLIS is the Comprehensive Environmental Response, Compensation, and Liability Information System. It is the official repository for all Superfund site and incident data. It contains comprehensive information on hazardous waste sites, site inspections, preliminary assessments, and remedial status. The system is sponsored by the EPA's Office of Emergency and Remedial Response, Information Management Center. Archive (NFRAP) consists of hazardous waste sites that have no further remedial action planned; only basic identifying information is provided for archive sites. The sites found in the Archive database were originally in the CERCLIS database, but were removed beginning in the fall of 1995. NPL sites (available online) are fact sheets that describe the location and history of Superfund sites. Included are descriptions of the most recent activities and past actions at the sites that have contributed to the contamination.
Population estimates, land usages, and nearby resources give background on the local setting surrounding a site.
Ensuring Safety of Navigation: A Three-Tiered Approach

NASA Astrophysics Data System (ADS)

Johnson, S. D.; Thompson, M.; Brazier, D.

2014-12-01

The primary responsibility of the Hydrographic Department at the Naval Oceanographic Office (NAVOCEANO) is to support US Navy surface and sub-surface Safety of Navigation (SoN) requirements. These requirements are interpreted, surveys are conducted, and accurate products are compiled and archived for future exploitation. For a number of years NAVOCEANO has employed a two-tiered data-basing structure to support SoN. The first tier (Data Warehouse, or DWH) provides access to the full-resolution sonar and lidar data. DWH preserves the original data such that any scale product can be built. The second tier (Digital Bathymetric Database - Variable resolution, or DBDB-V) served as the final archive for SoN chart scale, gridded products compiled from source bathymetry. DBDB-V has been incorporated into numerous DoD tactical decision aids and serves as the foundation bathymetry for ocean modeling. With the evolution of higher density survey systems and the addition of high-resolution gridded bathymetry product requirements, a two-tiered model did not provide an efficient solution for SoN. The two-tiered approach required scientists to exploit full-resolution data in order to build any higher resolution product. A new perspective on the archival and exploitation of source data was required. This new perspective has taken the form of a third tier, the Navigation Surface Database (NSDB). NSDB is an SQLite relational database populated with International Hydrographic Organization (IHO), S-102 compliant Bathymetric Attributed Grids (BAGs). BAGs archived within NSDB are developed at the highest resolution that the collection sensor system can support and contain nodal estimates for depth, uncertainty, separation values and metadata. Gridded surface analysis efforts culminate in the generation of the source resolution BAG files and their storage within NSDB. Exploitation of these resources eliminates the time and effort needed to re-grid and re-analyze native source file formats.
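The abstract above describes NSDB as an SQLite relational database populated with Bathymetric Attributed Grids (BAGs), but it does not give the schema. As a rough sketch of how such a store might be queried, the example below uses Python's standard sqlite3 module with an entirely hypothetical table: the table name, the column names, the file names and the bounding boxes are illustrative assumptions, not NAVOCEANO's actual design. It looks up the finest-resolution grid whose bounding box covers a given position.

```python
import sqlite3

# Hypothetical schema: one row per archived BAG file, with its grid
# resolution and geographic bounding box in decimal degrees.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE bag (
        bag_id       INTEGER PRIMARY KEY,
        file_name    TEXT NOT NULL,
        resolution_m REAL NOT NULL,
        west REAL NOT NULL, south REAL NOT NULL,
        east REAL NOT NULL, north REAL NOT NULL
    )
    """
)
conn.executemany(
    "INSERT INTO bag (file_name, resolution_m, west, south, east, north) "
    "VALUES (?, ?, ?, ?, ?, ?)",
    [
        ("survey_a.bag", 2.0, -90.0, 28.0, -89.0, 29.0),
        ("survey_b.bag", 0.5, -89.5, 28.4, -89.2, 28.6),
        ("survey_c.bag", 1.0, -80.0, 30.0, -79.0, 31.0),
    ],
)


def finest_bag_covering(connection, lon, lat):
    """Return (file_name, resolution_m) of the finest grid covering the point, or None."""
    return connection.execute(
        "SELECT file_name, resolution_m FROM bag "
        "WHERE west <= ? AND east >= ? AND south <= ? AND north >= ? "
        "ORDER BY resolution_m ASC LIMIT 1",
        (lon, lon, lat, lat),
    ).fetchone()


print(finest_bag_covering(conn, -89.3, 28.5))
print(finest_bag_covering(conn, -89.8, 28.2))
```

Keeping only metadata in the relational table and leaving the grids in their files is one plausible arrangement; whether NSDB stores the grids themselves in the database is not stated in the abstract.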
StarView: The object oriented design of the ST DADS user interface

NASA Technical Reports Server (NTRS)

Williams, J. D.; Pollizzi, J. A.

1992-01-01

StarView is the user interface being developed for the Hubble Space Telescope Data Archive and Distribution Service (ST DADS). ST DADS is the data archive for HST observations and a relational database catalog describing the archived data. Users will use StarView to query the catalog and select appropriate datasets for study. StarView sends requests for archived datasets to ST DADS which processes the requests and returns the database to the user. StarView is designed to be a powerful and extensible user interface. Unique features include an internal relational database to navigate query results, a form definition language that will work with both CRT and X interfaces, a data definition language that will allow StarView to work with any relational database, and the ability to generate adhoc queries without requiring the user to understand the structure of the ST DADS catalog. Ultimately, StarView will allow the user to refine queries in the local database for improved performance and merge in data from external sources for correlation with other query results. The user will be able to create a query from single or multiple forms, merging the selected attributes into a single query. Arbitrary selection of attributes for querying is supported. The user will be able to select how query results are viewed. A standard form or table-row format may be used. Navigation capabilities are provided to aid the user in viewing query results. Object oriented analysis and design techniques were used in the design of StarView to support the mechanisms and concepts required to implement these features. One such mechanism is the Model-View-Controller (MVC) paradigm. The MVC allows the user to have multiple views of the underlying database, while providing a consistent mechanism for interaction regardless of the view. This approach supports both CRT and X interfaces while providing a common mode of user interaction. Another powerful abstraction is the concept of a Query Model. 
This concept allows a single query to be built from a single or multiple forms before it is submitted to ST DADS. Supporting this concept is the adhoc query generator which allows the user to select and qualify an indeterminate number of attributes from the database. The user does not need any knowledge of how the joins across various tables are to be resolved. The adhoc generator calculates the joins automatically and generates the correct SQL query.
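The abstract above says the query generator works out the joins between tables automatically once the user has picked attributes. One common way to do this, shown here as a minimal sketch and not as StarView's actual algorithm, is to treat the catalog as a graph whose edges are join conditions and to search for the shortest chain of joins linking the tables that the chosen attributes belong to. The table names, column names and join conditions below are hypothetical and do not come from the ST DADS catalog.

```python
from collections import deque

# Hypothetical catalog description: each pair of tables that can be joined
# directly, with the join condition to use.
JOINS = {
    ("observation", "dataset"): "observation.dataset_id = dataset.id",
    ("observation", "proposal"): "observation.proposal_id = proposal.id",
}


def neighbours(table):
    """Yield (other_table, condition) for every direct join involving table."""
    for (left, right), condition in JOINS.items():
        if left == table:
            yield right, condition
        elif right == table:
            yield left, condition


def join_path(start, goal):
    """Breadth-first search for the shortest chain of joins from start to goal."""
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        table, path = queue.popleft()
        if table == goal:
            return path
        for other, condition in neighbours(table):
            if other not in seen:
                seen.add(other)
                queue.append((other, path + [(other, condition)]))
    raise ValueError(f"no join path from {start} to {goal}")


def build_query(attributes, where=""):
    """Build a SELECT for 'table.column' attributes, adding the joins needed.

    The where argument is inserted verbatim; user-supplied values should be
    passed as bound parameters (for example '?') rather than spliced in.
    """
    tables = []
    for attribute in attributes:
        table = attribute.split(".")[0]
        if table not in tables:
            tables.append(table)
    root, joined, clauses = tables[0], [tables[0]], []
    for table in tables[1:]:
        for other, condition in join_path(root, table):
            if other not in joined:
                joined.append(other)
                clauses.append(f"JOIN {other} ON {condition}")
    sql = f"SELECT {', '.join(attributes)} FROM {root}"
    if clauses:
        sql += " " + " ".join(clauses)
    if where:
        sql += f" WHERE {where}"
    return sql


print(build_query(["dataset.file_name", "proposal.pi_name"],
                  where="proposal.cycle = ?"))
```

Here the user names only attributes from dataset and proposal; the intermediate observation table is pulled in by the path search, which mirrors the abstract's point that the user need not know how the tables relate.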
LIPS database with LIPService: a microscopic image database of intracellular structures in Arabidopsis guard cells.

PubMed

Higaki, Takumi; Kutsuna, Natsumaro; Hasezawa, Seiichiro

2013-05-16

Intracellular configuration is an important feature of cell status. Recent advances in microscopic imaging techniques allow us to easily obtain a large number of microscopic images of intracellular structures. In this circumstance, automated microscopic image recognition techniques are of extreme importance to future phenomics/visible screening approaches. However, there was no benchmark microscopic image dataset for intracellular organelles in a specified plant cell type. We previously established the Live Images of Plant Stomata (LIPS) database, a publicly available collection of optical-section images of various intracellular structures of plant guard cells, as a model system of environmental signal perception and transduction. Here we report recent updates to the LIPS database and the establishment of a database table, LIPService. We updated the LIPS dataset and established a new interface named LIPService to promote efficient inspection of intracellular structure configurations. Cell nuclei, microtubules, actin microfilaments, mitochondria, chloroplasts, endoplasmic reticulum, peroxisomes, endosomes, Golgi bodies, and vacuoles can be filtered using probe names or morphometric parameters such as stomatal aperture. In addition to the serial optical sectional images of the original LIPS database, new volume-rendering data for easy web browsing of three-dimensional intracellular structures have been released to allow easy inspection of their configurations or relationships with cell status/morphology. We also demonstrated the utility of the new LIPS image database for automated organelle recognition of images from another plant cell image database with image clustering analyses. The updated LIPS database provides a benchmark image dataset for representative intracellular structures in Arabidopsis guard cells. The newly released LIPService allows users to inspect the relationship between organellar three-dimensional configurations and morphometrical parameters.
myPhyloDB: a local web-server and database for the storage and analysis of metagenomics data

USDA-ARS's Scientific Manuscript database

The advent of next-generation sequencing has resulted in an explosion of metagenomics data associated with microbial communities from a variety of ecosystems. However, no database and/or analytical software is currently available that allows for archival and cross-study comparison of such data. my...
Sentence-Based Metadata: An Approach and Tool for Viewing Database Designs.

ERIC Educational Resources Information Center

Boyle, John M.; Gunge, Jakob; Bryden, John; Librowski, Kaz; Hanna, Hsin-Yi

2002-01-01

Describes MARS (Museum Archive Retrieval System), a research tool which enables organizations to exchange digital images and documents by means of a common thesaurus structure, and merge the descriptive data and metadata of their collections. Highlights include theoretical basis; searching the MARS database; and examples in European museums.…
A database of archived drilling records of the drill cuttings piles at the North West Hutton oil platform.

PubMed

Marsh, Roy

2003-05-01

Drill cuttings piles are found underneath several hundred oil platforms in the North Sea, and are contaminated with hydrocarbons and chemical products. This study characterised the environmental risk posed by the cuttings pile at the North West Hutton (NWH) oil platform. Data on the drilling fluids and chemical products used over the platform's drilling history were transferred from archived well reports into a custom database, to which were added toxicological and safety data. Although the database contained many gaps, it established that only seven chemical products used at NWH were not in the lowest category of the Offshore Chemicals Notification Scheme, and were used in only small quantities. The study therefore supports the view that the main environmental risk posed by cuttings piles comes from hydrocarbon contamination. The (dated) well records could help future core sampling to be targeted at specific locations in the cuttings piles. Data from many platforms could also be pooled to determine generic 'discharge profiles.' Future study would benefit from the existence, in the public domain, of a standardised, 'legacy' database of chemical products.
IMAGESEER - IMAGEs for Education and Research

NASA Technical Reports Server (NTRS)

Le Moigne, Jacqueline; Grubb, Thomas; Milner, Barbara

2012-01-01

IMAGESEER is a new Web portal that brings easy access to NASA image data for non-NASA researchers, educators, and students. The IMAGESEER Web site and database are specifically designed to be utilized by the university community, to enable teaching image processing (IP) techniques on NASA data, as well as to provide reference benchmark data to validate new IP algorithms. Along with the data and a Web user interface front-end, basic knowledge of the application domains, benchmark information, and specific NASA IP challenges (or case studies) are provided.
An approach to estimate body dimensions through constant body ratio benchmarks.

PubMed

Chao, Wei-Cheng; Wang, Eric Min-Yang

2010-12-01

Building a new anthropometric database is a difficult and costly job that requires considerable manpower and time. However, most designers and engineers do not know how to convert old anthropometric data into applicable new data with minimal errors and costs (Wang et al., 1999). To simplify the process of converting old anthropometric data into useful new data, this study analyzed the available data in paired body dimensions in an attempt to determine constant body ratio (CBR) benchmarks that are independent of gender and age. In total, 483 CBR benchmarks were identified and verified from 35,245 ratios analyzed. Additionally, 197 estimation formulae, taking as inputs 19 easily measured body dimensions, were built using 483 CBR benchmarks. Based on the results for 30 recruited participants, this study determined that the described approach is more accurate and cost-effective than alternative techniques. Copyright © 2010 Elsevier Ltd. All rights reserved.
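The idea of a constant body ratio can be sketched in a few lines: if the ratio of two body dimensions varies little across people, the mean ratio can be used to estimate one dimension from the other. The code below is a minimal illustration under assumptions of our own. The coefficient-of-variation threshold, the function names, and the sample numbers are hypothetical and are not the criteria or data used in the study.

```python
from statistics import mean, pstdev


def constant_ratio(pairs, max_cv=0.05):
    """Return the mean ratio a/b if it is stable across subjects, else None.

    `pairs` holds (dimension_a, dimension_b) measurements, one per subject.
    Stability is judged by the coefficient of variation of the ratios;
    the 5% default threshold is an assumption made for this sketch.
    """
    ratios = [a / b for a, b in pairs]
    r = mean(ratios)
    cv = pstdev(ratios) / r
    return r if cv <= max_cv else None


def estimate_dimension(ratio, measured_b):
    """Estimate dimension a from an easily measured dimension b."""
    return ratio * measured_b


# Hypothetical measurements in centimetres: (dimension a, dimension b).
sample = [(50.0, 100.0), (51.0, 100.0), (49.0, 100.0)]
ratio = constant_ratio(sample)
if ratio is not None:
    estimated = estimate_dimension(ratio, 120.0)
```

A ratio that fails the stability check returns `None`, so no estimation formula would be built from it.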
Results of the GABLS3 diurnal-cycle benchmark for wind energy applications

DOE Office of Scientific and Technical Information (OSTI.GOV)

Rodrigo, J. Sanz; Allaerts, D.; Avila, M.

We present results of the GABLS3 model intercomparison benchmark revisited for wind energy applications. The case consists of a diurnal cycle, measured at the 200-m tall Cabauw tower in the Netherlands, including a nocturnal low-level jet. The benchmark includes a sensitivity analysis of WRF simulations using two input meteorological databases and five planetary boundary-layer schemes. A reference set of mesoscale tendencies is used to drive microscale simulations using RANS k-ϵ and LES turbulence models. The validation is based on rotor-based quantities of interest. Cycle-integrated mean absolute errors are used to quantify model performance. The results of the benchmark are used to discuss input uncertainties from mesoscale modelling, different meso-micro coupling strategies (online vs offline) and consistency between RANS and LES codes when dealing with boundary-layer mean flow quantities. Altogether, all the microscale simulations produce a consistent coupling with mesoscale forcings.
Results of the GABLS3 diurnal-cycle benchmark for wind energy applications

DOE PAGES

Rodrigo, J. Sanz; Allaerts, D.; Avila, M.; ...

2017-06-13

We present results of the GABLS3 model intercomparison benchmark revisited for wind energy applications. The case consists of a diurnal cycle, measured at the 200-m tall Cabauw tower in the Netherlands, including a nocturnal low-level jet. The benchmark includes a sensitivity analysis of WRF simulations using two input meteorological databases and five planetary boundary-layer schemes. A reference set of mesoscale tendencies is used to drive microscale simulations using RANS k-ϵ and LES turbulence models. The validation is based on rotor-based quantities of interest. Cycle-integrated mean absolute errors are used to quantify model performance. The results of the benchmark are used to discuss input uncertainties from mesoscale modelling, different meso-micro coupling strategies (online vs offline) and consistency between RANS and LES codes when dealing with boundary-layer mean flow quantities. Altogether, all the microscale simulations produce a consistent coupling with mesoscale forcings.
Archive interoperability in the Virtual Observatory

NASA Astrophysics Data System (ADS)

Genova, Françoise

2003-02-01

The main goals of Virtual Observatory projects are to build interoperability between astronomical on-line services, observatory archives, databases and results published in journals, and to develop tools permitting the best scientific usage of the very large data sets stored in observatory archives and produced by large surveys. The different Virtual Observatory projects collaborate to define common exchange standards, which are the key to a truly International Virtual Observatory: for instance, their first common milestone has been a standard allowing exchange of tabular data, called VOTable. The Interoperability Work Area of the European Astrophysical Virtual Observatory project aims at networking European archives, by building a prototype using the CDS VizieR and Aladin tools, and at defining basic rules to help archive providers in interoperability implementation. The prototype is accessible for scientific usage, to get user feedback (and science results!) at an early stage of the project. The ISO archive participates very actively in this endeavour, and more generally in information networking. The on-going inclusion of the ISO log in SIMBAD will allow higher level links for users.
Cassini Archive Tracking System

NASA Technical Reports Server (NTRS)

Conner, Diane; Sayfi, Elias; Tinio, Adrian

2006-01-01

The Cassini Archive Tracking System (CATS) is a computer program that enables tracking of scientific data transfers from originators to the Planetary Data System (PDS) archives. Without CATS, there is no systematic means of locating products in the archive process or ensuring their completeness. By keeping a database of transfer communications and status, CATS enables the Cassini Project and the PDS to efficiently and accurately report on archive status. More importantly, problem areas are easily identified through customized reports that can be generated on the fly from any Web-enabled computer. A Web-browser interface and clearly defined authorization scheme provide safe distributed access to the system, where users can perform functions such as creating customized reports, recording a transfer, and responding to a transfer. CATS ensures that Cassini provides complete science archives to the PDS on schedule and that those archives are made available to the science community by the PDS. The three-tier architecture is loosely coupled and designed for simple adaptation to multimission use. Written in the Java programming language, it is portable and can be run on any Java-enabled Web server.
St. Petersburg Coastal and Marine Science Center's Core Archive Portal

USGS Publications Warehouse

Reich, Chris; Streubert, Matt; Dwyer, Brendan; Godbout, Meg; Muslic, Adis; Umberger, Dan

2012-01-01

This Web site contains information on rock cores archived at the U.S. Geological Survey (USGS) St. Petersburg Coastal and Marine Science Center (SPCMSC). Archived cores consist of 3- to 4-inch-diameter coral cores, 1- to 2-inch-diameter rock cores, and a few unlabeled loose coral and rock samples. This document - and specifically the archive Web site portal - is intended to be a 'living' document that will be updated continually as additional cores are collected and archived. This document may also contain future references and links to a catalog of sediment cores. Sediment cores will include vibracores, pushcores, and other loose sediment samples collected for research purposes. This document will: (1) serve as a database for locating core material currently archived at the USGS SPCMSC facility; (2) provide a protocol for entry of new core material into the archive system; and, (3) set the procedures necessary for checking out core material for scientific purposes. Core material may be loaned to other governmental agencies, academia, or non-governmental organizations at the discretion of the USGS SPCMSC curator.
The Gediz River fluvial archive: A benchmark for Quaternary research in Western Anatolia

NASA Astrophysics Data System (ADS)

Maddy, D.; Veldkamp, A.; Demir, T.; van Gorp, W.; Wijbrans, J. R.; van Hinsbergen, D. J. J.; Dekkers, M. J.; Schreve, D.; Schoorl, J. M.; Scaife, R.; Stemerdink, C.; van der Schriek, T.; Bridgland, D. R.; Aytaç, A. S.

2017-06-01

The Gediz River, one of the principal rivers of Western Anatolia, has an extensive Pleistocene fluvial archive that potentially offers a unique window into fluvial system behaviour on the western margins of Asia during the Quaternary. In this paper we review our work on the Quaternary Gediz River Project (2001-2010) and present new data which leads to a revised stratigraphical model for the Early Pleistocene development of this fluvial system. In previous work we confirmed the preservation of eleven buried Early Pleistocene fluvial terraces of the Gediz River (designated GT11, the oldest and highest, to GT1, the youngest and lowest) which lie beneath the basalt-covered plateaux of the Kula Volcanic Province. Deciphering the information locked in this fluvial archive requires the construction of a robust geochronology. Fortunately, the Gediz archive provides ample opportunity for age-constraint based upon age estimates derived from basaltic lava flows that repeatedly entered the palaeo-Gediz valley floors. In this paper we present, for the first time, our complete dataset of 40Ar/39Ar age estimates and associated palaeomagnetic measurements. These data, which can be directly related to the underlying fluvial deposits, provide age constraints critical to our understanding of this sequence. The new chronology establishes the onset of Quaternary volcanism at ∼1320ka (MIS42). This volcanism, which is associated with GT6, confirms a pre-MIS42 age for terraces GT11-GT7. Evidence from the colluvial sequences directly overlying these early terraces suggests that they formed in response to hydrological and sediment budget changes forced by climate-driven vegetation change. The cyclic formation of terraces and their timing suggests they represent the obliquity-driven climate changes of the Early Pleistocene. 
By way of contrast, the GT5-GT1 terrace sequence, constrained by a lava flow with an age estimate of ∼1247ka, spans the time-interval MIS42 - MIS38 and therefore does not match the frequency of climate change as previously suggested. The onset of volcanism breaks the simple linkage of terracing to climate-driven change. These younger terraces more likely reflect a localized terracing process triggered by base level changes forced by volcanic eruptions and associated reactivation of pre-existing faults, lava dam construction, landsliding and subsequent lava-dammed lake drainage. Establishing a firm stratigraphy and geochronology for the Early Pleistocene archive provides a secure framework for future exploitation of this part of the archive and sets the standard as we begin our work on the Middle-Late Pleistocene sequence. We believe this work forms a benchmark study for detailed Quaternary research in Turkey.
[A new concept for integration of image databanks into a comprehensive patient documentation].

PubMed

Schöll, E; Holm, J; Eggli, S

2001-05-01

Image processing and archiving are of increasing importance in the practice of modern medicine. Particularly due to the introduction of computer-based investigation methods, physicians are dealing with a wide variety of analogue and digital picture archives. On the other hand, clinical information is stored in various text-based information systems without integration of image components. The link between such traditional medical databases and picture archives is a prerequisite for efficient data management as well as for continuous quality control and medical education. At the Department of Orthopedic Surgery, University of Berne, a software program was developed to create a complete multimedia electronic patient record. The client-server system contains all patients' data, questionnaire-based quality control, and a digital picture archive. Different interfaces guarantee the integration into the hospital's data network. This article describes our experiences in the development and introduction of a comprehensive image archiving system at a large orthopedic center.
A case Study of Applying Object-Relational Persistence in Astronomy Data Archiving

NASA Astrophysics Data System (ADS)

Yao, S. S.; Hiriart, R.; Barg, I.; Warner, P.; Gasson, D.

2005-12-01

The NOAO Science Archive (NSA) team is developing a comprehensive domain model to capture the science data in the archive. Java and an object model derived from the domain model well address the application layer of the archive system. However, since RDBMS is the best proven technology for data management, the challenge is the paradigm mismatch between the object and the relational models. Transparent object-relational mapping (ORM) persistence is a successful solution to this challenge. In the data modeling and persistence implementation of NSA, we are using Hibernate, a well-accepted ORM tool, to bridge the object model in the business tier and the relational model in the database tier. Thus, the database is isolated from the Java application. The application queries directly on objects using a DBMS-independent object-oriented query API, which frees the application developers from the low level JDBC and SQL so that they can focus on the domain logic. We present the detailed design of the NSA R3 (Release 3) data model and object-relational persistence, including mapping, retrieving and caching. Persistence layer optimization and performance tuning will be analyzed. The system is being built on J2EE, so the integration of Hibernate into the EJB container and the transaction management are also explored.
The SSABLE system - Automated archive, catalog, browse and distribution of satellite data in near-real time

NASA Technical Reports Server (NTRS)

Simpson, James J.; Harkins, Daniel N.

1993-01-01

Historically, locating and browsing satellite data has been a cumbersome and expensive process. This has impeded the efficient and effective use of satellite data in the geosciences. SSABLE is a new interactive tool for the archive, browse, order, and distribution of satellite data based upon X Window, high bandwidth networks, and digital image rendering techniques. SSABLE provides for automatically constructing relational database queries to archived image datasets based on time, data, geographical location, and other selection criteria. SSABLE also provides a visual representation of the selected archived data for viewing on the user's X terminal. SSABLE is a near real-time system; for example, data are added to SSABLE's database within 10 min after capture. SSABLE is network and machine independent; it will run identically on any machine which satisfies the following three requirements: 1) has a bitmapped display (monochrome or greater); 2) is running the X Window system; and 3) is on a network directly reachable by the SSABLE system. SSABLE has been evaluated at over 100 international sites. Network response time in the United States and Canada varies between 4 and 7 s for browse image updates; reported transmission times to Europe and Australia typically are 20-25 s.

Telecommunications issues of intelligent database management for ground processing systems in the EOS era

NASA Technical Reports Server (NTRS)

Touch, Joseph D.

1994-01-01

Future NASA earth science missions, including the Earth Observing System (EOS), will be generating vast amounts of data that must be processed and stored at various locations around the world. Here we present a stepwise-refinement of the intelligent database management (IDM) of the distributed active archive center (DAAC - one of seven regionally-located EOSDIS archive sites) architecture, to showcase the telecommunications issues involved. We develop this architecture into a general overall design. We show that the current evolution of protocols is sufficient to support IDM at Gbps rates over large distances. We also show that network design can accommodate a flexible data ingestion storage pipeline and a user extraction and visualization engine, without interference between the two.
A user's guide to LUGSAN II. A computer program to calculate and archive lug and sway brace loads for aircraft-carried stores

DOE Office of Scientific and Technical Information (OSTI.GOV)

Dunn, W.N.

1998-03-01

LUG and Sway brace ANalysis (LUGSAN) II is an analysis and database computer program that is designed to calculate store lug and sway brace loads for aircraft captive carriage. LUGSAN II combines the rigid body dynamics code, SWAY85, with a Macintosh Hypercard database to function both as an analysis and archival system. This report describes the LUGSAN II application program, which operates on the Macintosh System (Hypercard 2.2 or later) and includes function descriptions, layout examples, and sample sessions. Although this report is primarily a user's manual, a brief overview of the LUGSAN II computer code is included with suggested resources for programmers.
Developing a benchmark for emotional analysis of music

PubMed Central

Yang, Yi-Hsuan; Soleymani, Mohammad

2017-01-01

The field of music emotion recognition (MER) has expanded rapidly in the last decade. Many new methods and new audio features have been developed to improve the performance of MER algorithms. However, it is very difficult to compare the performance of the new methods because of the diversity of data representations and the scarcity of publicly available data. In this paper, we address these problems by creating a data set and a benchmark for MER. The data set that we release, a MediaEval Database for Emotional Analysis in Music (DEAM), is the largest available data set of dynamic annotations (valence and arousal annotations for 1,802 songs and song excerpts licensed under Creative Commons with 2Hz time resolution). Using DEAM, we organized the ‘Emotion in Music’ task at MediaEval Multimedia Evaluation Campaign from 2013 to 2015. The benchmark attracted, in total, 21 active teams to participate in the challenge. We analyze the results of the benchmark: the winning algorithms and feature-sets. We also describe the design of the benchmark, the evaluation procedures and the data cleaning and transformations that we suggest. The results from the benchmark suggest that the recurrent neural network based approaches combined with large feature-sets work best for dynamic MER. PMID:28282400
Developing a benchmark for emotional analysis of music.

PubMed

Aljanaki, Anna; Yang, Yi-Hsuan; Soleymani, Mohammad

2017-01-01

The field of music emotion recognition (MER) has expanded rapidly in the last decade. Many new methods and new audio features have been developed to improve the performance of MER algorithms. However, it is very difficult to compare the performance of the new methods because of the diversity of data representations and the scarcity of publicly available data. In this paper, we address these problems by creating a data set and a benchmark for MER. The data set that we release, a MediaEval Database for Emotional Analysis in Music (DEAM), is the largest available data set of dynamic annotations (valence and arousal annotations for 1,802 songs and song excerpts licensed under Creative Commons with 2Hz time resolution). Using DEAM, we organized the 'Emotion in Music' task at MediaEval Multimedia Evaluation Campaign from 2013 to 2015. The benchmark attracted, in total, 21 active teams to participate in the challenge. We analyze the results of the benchmark: the winning algorithms and feature-sets. We also describe the design of the benchmark, the evaluation procedures and the data cleaning and transformations that we suggest. The results from the benchmark suggest that the recurrent neural network based approaches combined with large feature-sets work best for dynamic MER.
Convolutional Neural Network on Embedded Linux(trademark) System-on-Chip: A Methodology and Performance Benchmark

DTIC Science & Technology

2016-05-01

A9 CPU and 15 W for the i7 CPU. A method of accelerating this computation is by using a customized hardware unit called a field-programmable gate...implementation of custom logic to accelerate computational workloads. This FPGA fabric, in addition to the standard programmable logic, contains 220...chip; field-programmable gate array Daniel Gebhardt
Benchmarking Ligand-Based Virtual High-Throughput Screening with the PubChem Database

PubMed Central

Butkiewicz, Mariusz; Lowe, Edward W.; Mueller, Ralf; Mendenhall, Jeffrey L.; Teixeira, Pedro L.; Weaver, C. David; Meiler, Jens

2013-01-01

With the rapidly increasing availability of High-Throughput Screening (HTS) data in the public domain, such as the PubChem database, methods for ligand-based computer-aided drug discovery (LB-CADD) have the potential to accelerate and reduce the cost of probe development and drug discovery efforts in academia. We assemble nine data sets from realistic HTS campaigns representing major families of drug target proteins for benchmarking LB-CADD methods. Each data set is public domain through PubChem and carefully collated through confirmation screens validating active compounds. These data sets provide the foundation for benchmarking a new cheminformatics framework BCL::ChemInfo, which is freely available for non-commercial use. Quantitative structure activity relationship (QSAR) models are built using Artificial Neural Networks (ANNs), Support Vector Machines (SVMs), Decision Trees (DTs), and Kohonen networks (KNs). Problem-specific descriptor optimization protocols are assessed including Sequential Feature Forward Selection (SFFS) and various information content measures. Measures of predictive power and confidence are evaluated through cross-validation, and a consensus prediction scheme is tested that combines orthogonal machine learning algorithms into a single predictor. Enrichments ranging from 15 to 101 for a TPR cutoff of 25% are observed. PMID:23299552
Bio-inspired benchmark generator for extracellular multi-unit recordings

PubMed Central

Mondragón-González, Sirenia Lizbeth; Burguière, Eric

2017-01-01

The analysis of multi-unit extracellular recordings of brain activity has led to the development of numerous tools, ranging from signal processing algorithms to electronic devices and applications. Currently, the evaluation and optimisation of these tools are hampered by the lack of ground-truth databases of neural signals. These databases must be parameterisable, easy to generate and bio-inspired, i.e. containing features encountered in real electrophysiological recording sessions. Towards that end, this article introduces an original computational approach to create fully annotated and parameterised benchmark datasets, generated from the summation of three components: neural signals from compartmental models and recorded extracellular spikes, non-stationary slow oscillations, and a variety of different types of artefacts. We present three application examples. (1) We reproduced in-vivo extracellular hippocampal multi-unit recordings from either tetrode or polytrode designs. (2) We simulated recordings in two different experimental conditions: anaesthetised and awake subjects. (3) Last, we also conducted a series of simulations to study the impact of different levels of artefacts on extracellular recordings and their influence in the frequency domain. Beyond the results presented here, such a benchmark dataset generator has many applications such as calibration, evaluation and development of both hardware and software architectures. PMID:28233819
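The three-component summation described in the abstract can be illustrated with a toy generator. This is only a sketch under our own assumptions and not the authors' generator: the spike template, the 4 Hz drifting oscillation, the artefact model, and every parameter name below are hypothetical stand-ins for the compartmental-model spikes, slow oscillations, and artefact types used in the article.

```python
import math
import random


def synth_recording(n_samples, spike_times=(), fs=30000.0,
                    artefact_prob=0.001, seed=0):
    """Return (signal, labels) for one synthetic channel.

    The signal is the sample-wise sum of three components, and `labels`
    records the ground truth needed to score a detection algorithm.
    """
    rng = random.Random(seed)

    # Component 1: a fixed spike waveform placed at each spike time.
    template = [-1.0, -0.6, 0.3, 0.15, 0.05]  # hypothetical shape
    spikes = [0.0] * n_samples
    for t in spike_times:
        for k, value in enumerate(template):
            if 0 <= t + k < n_samples:
                spikes[t + k] += value

    # Component 2: a slow oscillation whose amplitude drifts over time,
    # which makes it non-stationary.
    slow = [
        (0.2 + 0.1 * i / n_samples) * math.sin(2 * math.pi * 4.0 * i / fs)
        for i in range(n_samples)
    ]

    # Component 3: rare large transients standing in for artefacts.
    artefacts = [0.0] * n_samples
    artefact_samples = []
    for i in range(n_samples):
        if rng.random() < artefact_prob:
            artefacts[i] = rng.choice((-5.0, 5.0))
            artefact_samples.append(i)

    signal = [s + o + a for s, o, a in zip(spikes, slow, artefacts)]
    labels = {
        "spike_times": list(spike_times),
        "artefact_samples": artefact_samples,
    }
    return signal, labels
```

Because the generator is seeded and returns its own annotations, the same recording can be regenerated and any spike-sorting output compared against `labels["spike_times"]`.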
IUEAGN: A database of ultraviolet spectra of active galactic nuclei

NASA Technical Reports Server (NTRS)

Pike, G.; Edelson, R.; Shull, J. M.; Saken, J.

1993-01-01

In 13 years of operation, IUE has gathered approximately 5000 spectra of almost 600 Active Galactic Nuclei (AGN). In order to undertake AGN studies which require large amounts of data, we are consistently reducing this entire archive and creating a homogeneous, easy-to-use database. First, the spectra are extracted using the Optimal extraction algorithm. Continuum fluxes are then measured across predefined bands, and line fluxes are measured with a multi-component fit. These results, along with source information such as redshifts and positions, are placed in the IUEAGN relational database. Analysis algorithms, statistical tests, and plotting packages run within the structure, and this flexible database can accommodate future data when they are released. This archival approach has already been used to survey line and continuum variability in six bright Seyfert 1s and rapid continuum variability in 14 blazars. Among the results that could only be obtained using a large archival study is evidence that blazars show a positive correlation between degree of variability and apparent luminosity, while Seyfert 1s show an anti-correlation. This suggests that beaming dominates the ultraviolet properties for blazars, while thermal emission from an accretion disk dominates for Seyfert 1s. Our future plans include a survey of line ratios in Seyfert 1s, to be fitted with photoionization models to test the models and determine the range of temperatures, densities and ionization parameters. We will also include data from IRAS, Einstein, EXOSAT, and ground-based telescopes to measure multi-wavelength correlations and broadband spectral energy distributions.
Database Resources of the BIG Data Center in 2018.

PubMed

2018-01-04

The BIG Data Center at Beijing Institute of Genomics (BIG) of the Chinese Academy of Sciences provides freely open access to a suite of database resources in support of worldwide research activities in both academia and industry. With the vast amounts of omics data generated at ever-greater scales and rates, the BIG Data Center is continually expanding, updating and enriching its core database resources through big-data integration and value-added curation, including BioCode (a repository archiving bioinformatics tool codes), BioProject (a biological project library), BioSample (a biological sample library), Genome Sequence Archive (GSA, a data repository for archiving raw sequence reads), Genome Warehouse (GWH, a centralized resource housing genome-scale data), Genome Variation Map (GVM, a public repository of genome variations), Gene Expression Nebulas (GEN, a database of gene expression profiles based on RNA-Seq data), Methylation Bank (MethBank, an integrated databank of DNA methylomes), and Science Wikis (a series of biological knowledge wikis for community annotations). In addition, three featured web services are provided, viz., BIG Search (search as a service; a scalable inter-domain text search engine), BIG SSO (single sign-on as a service; a user access control system to gain access to multiple independent systems with a single ID and password) and Gsub (submission as a service; a unified submission service for all relevant resources). All of these resources are publicly accessible through the home page of the BIG Data Center at http://bigd.big.ac.cn. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.
Modeling Urban Scenarios & Experiments: Fort Indiantown Gap Data Collections Summary and Analysis

DOE Office of Scientific and Technical Information (OSTI.GOV)

Archer, Daniel E.; Bandstra, Mark S.; Davidson, Gregory G.

This report summarizes experimental radiation detector, contextual sensor, weather, and global positioning system (GPS) data collected to inform and validate a comprehensive, operational radiation transport modeling framework to evaluate radiation detector system and algorithm performance. This framework will be used to study the influence of systematic effects (such as geometry, background activity, background variability, environmental shielding, etc.) on detector responses and algorithm performance using synthetic time series data. This work consists of performing data collection campaigns at a canonical, controlled environment for complete radiological characterization to help construct and benchmark a high-fidelity model with quantified system geometries, detector response functions, and source terms for background and threat objects. This data also provides an archival, benchmark dataset that can be used by the radiation detection community. The data reported here spans four data collection campaigns conducted between May 2015 and September 2016.
Open Rotor - Analysis of Diagnostic Data

NASA Technical Reports Server (NTRS)

Envia, Edmane

2011-01-01

NASA is researching open rotor propulsion as part of its technology research and development plan for addressing the subsonic transport aircraft noise, emission and fuel burn goals. The low-speed wind tunnel test for investigating the aerodynamic and acoustic performance of a benchmark blade set at the approach and takeoff conditions has recently concluded. A high-speed wind tunnel diagnostic test campaign has begun to investigate the performance of this benchmark open rotor blade set at the cruise condition. Databases from both speed regimes will comprise a comprehensive collection of benchmark open rotor data for use in assessing/validating aerodynamic and noise prediction tools (component & system level) as well as providing insights into the physics of open rotors to help guide the development of quieter open rotors.
Bias-Free Chemically Diverse Test Sets from Machine Learning.

PubMed

Swann, Ellen T; Fernandez, Michael; Coote, Michelle L; Barnard, Amanda S

2017-08-14

Current benchmarking methods in quantum chemistry rely on databases that are built using a chemist's intuition. It is not fully understood how diverse or representative these databases truly are. Multivariate statistical techniques like archetypal analysis and K-means clustering have previously been used to summarize large sets of nanoparticles; however, molecules are more diverse and not as easily characterized by descriptors. In this work, we compare three sets of descriptors based on the one-, two-, and three-dimensional structure of a molecule. Using data from the NIST Computational Chemistry Comparison and Benchmark Database and machine learning techniques, we demonstrate the functional relationship between these structural descriptors and the electronic energy of molecules. Archetypes and prototypes found with topological or Coulomb matrix descriptors can be used to identify smaller, statistically significant test sets that better capture the diversity of chemical space. We apply this same method to find a diverse subset of organic molecules to demonstrate how the methods can easily be reapplied to individual research projects. Finally, we use our bias-free test sets to assess the performance of density functional theory and quantum Monte Carlo methods.
Web-Resources for Astronomical Data in the Ultraviolet

NASA Astrophysics Data System (ADS)

Sachkov, M. E.; Malkov, O. Yu.

2017-12-01

In this paper we describe databases of space projects that are operating or have operated in the ultraviolet spectral region. We give brief descriptions and links to major sources for UV data on the web: archives, space mission sites, databases, catalogues. We pay special attention to the World Space Observatory—Ultraviolet mission that will be launched in 2021.
Brave New World: Data Intensive Science with SDSS and the VO

NASA Astrophysics Data System (ADS)

Thakar, A. R.; Szalay, A. S.; O'Mullane, W.; Nieto-Santisteban, M.; Budavari, T.; Li, N.; Carliles, S.; Haridas, V.; Malik, T.; Gray, J.

2004-12-01

With the advent of digital archives and the VO, astronomy is quickly changing from a data-hungry to a data-intensive science. Local and specialized access to data will remain the most direct and efficient way to get data out of individual archives, especially if you know what you are looking for. However, the enormous sizes of the upcoming archives will preclude this type of access for most institutions, and will not allow researchers to tap the vast potential for discovery in cross-matching and comparing data between different archives. The VO makes this type of interoperability and distributed data access possible by adopting industry standards for data access (SQL) and data interchange (SOAP/XML) with platform independence (Web services). As a sneak preview of this brave new world where astronomers may need to become SQL warriors, we present a look at VO-enabled access to catalog data in the SDSS Catalog Archive Server (CAS): CasJobs - a workbench environment that allows arbitrarily complex SQL queries and your own personal database (MyDB) that you can share with collaborators; OpenSkyQuery - an IVOA (International Virtual Observatory Alliance) compliant federation of multiple archives (OpenSkyNodes) that currently links nearly 20 catalogs and allows cross-match queries (in ADQL - Astronomical Data Query Language) between them; Spectrum and Filter Profile Web services that provide access to an open database of spectra (registered users may add their own spectra); and VO-enabled Mirage - a Java visualization tool developed at Bell Labs and enhanced at JHU that allows side-by-side comparison of SDSS catalog and FITS image data. Anticipating the next generation of Petabyte archives like LSST by the end of the decade, we are developing a parallel cross-match engine for all-sky cross-matches between large surveys, along with a 100-Terabyte data intensive science laboratory with high-speed parallel data access.
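To make the idea of a positional cross-match between two catalogs concrete, here is a minimal Python sketch. It is not the CasJobs, OpenSkyQuery or parallel cross-match engine implementation described in the abstract; the function names, the tuple layout `(id, ra_deg, dec_deg)` and the default 1-arcsecond radius are assumptions chosen for illustration, and the brute-force loop would not scale to survey-sized catalogs.

```python
import math

def angular_separation_deg(ra1, dec1, ra2, dec2):
    """Great-circle separation in degrees between two sky positions given in degrees."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    # The haversine form stays numerically stable for very small separations.
    h = (math.sin((dec2 - dec1) / 2) ** 2
         + math.cos(dec1) * math.cos(dec2) * math.sin((ra2 - ra1) / 2) ** 2)
    return math.degrees(2 * math.asin(min(1.0, math.sqrt(h))))

def cross_match(catalog_a, catalog_b, radius_arcsec=1.0):
    """For each source in catalog_a, return its nearest catalog_b source within the radius.

    Both catalogs are lists of (id, ra_deg, dec_deg) tuples (a hypothetical layout).
    The result is a list of (id_a, id_b, separation_arcsec) tuples.
    """
    matches = []
    for id_a, ra_a, dec_a in catalog_a:
        best = None
        for id_b, ra_b, dec_b in catalog_b:
            sep = angular_separation_deg(ra_a, dec_a, ra_b, dec_b) * 3600.0
            if sep <= radius_arcsec and (best is None or sep < best[1]):
                best = (id_b, sep)
        if best is not None:
            matches.append((id_a, best[0], best[1]))
    return matches
```

A production engine would typically replace the inner loop with a spatial index (for example a sky-pixelization scheme) so that each source is compared only against nearby candidates.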
Costs of landslides and floods in XX Century in a Calabrian town starting from the data stored in the Historical Archive of IRPI (Cosenza)

NASA Astrophysics Data System (ADS)

Giampa', Vincenzo; Pasqua, A. Aurora; Petrucci, Olga

2015-04-01

The paper first presents the historical archive of the Cosenza IRPI Section and the historical database built from the data it contains. Then, an application of these data to Catanzaro, the administrative center of the Calabria region (Southern Italy), is presented. The gathering of historical data on past floods and landslides at the Cosenza IRPI Section started in 1996 and is still in progress. In 2005, donations from regional and municipal Public Works offices greatly enlarged the documentary corpus and required a more rigorous classification and management, which led us to organize the documents into a proper historical archive. Documents were sorted according to the municipalities they concern. In this way, a set of documents, maps and images became available for each of the 409 municipalities of Calabria. The collected documents mainly concern damage caused, since the XIX century, by phenomena such as floods, flash floods and landslides triggered by extreme meteorological events, as well as damage caused by strong earthquakes. At the beginning of 2014, the central office of IRPI (Perugia) funded a project to digitize the archive and subsequently publish it on a web platform. In this paper, the procedure adopted to build the archive and implement the database is described. Then, the elaboration of the historical data series for the town of Catanzaro, which has frequently been damaged by rainfall-induced landslides and floods, is presented. Based on the documents from the archive of the Ministry of Public Works stored in our Historical Archive, the costs of damage to the town's houses during the XX century have been assessed. The research identified the most damaging types of phenomena, the municipal sectors most frequently damaged, and the evolution of damaged areas through the years as urbanization increased.
An intercomparison of tropical cyclone best-track products for the southwest Pacific

NASA Astrophysics Data System (ADS)

Magee, Andrew D.; Verdon-Kidd, Danielle C.; Kiem, Anthony S.

2016-06-01

Recent efforts to understand tropical cyclone (TC) activity in the southwest Pacific (SWP) have led to the development of numerous TC databases. The methods used to compile each database vary and are based on data from different meteorological centres, standalone TC databases and archived synoptic charts. Therefore the aims of this study are to (i) provide a spatio-temporal comparison of three TC best-track (BT) databases and explore any differences between them (and any associated implications) and (ii) investigate whether there are any spatial, temporal or statistical differences between pre-satellite (1945-1969), post-satellite (1970-2011) and post-geostationary satellite (1982-2011) era TC data given the changing observational technologies with time. To achieve this, we compare three best-track TC databases for the SWP region (0-35° S, 135° E-120° W) from 1945 to 2011: the Joint Typhoon Warning Center (JTWC), the International Best Track Archive for Climate Stewardship (IBTrACS) and the Southwest Pacific Enhanced Archive of Tropical Cyclones (SPEArTC). The results of this study suggest that SPEArTC is the most complete repository of TCs for the SWP region. In particular, we show that the SPEArTC database includes a number of additional TCs, not included in either the JTWC or IBTrACS database. These SPEArTC events do occur under environmental conditions conducive to tropical cyclogenesis (TC genesis), including anomalously negative 700 hPa vorticity (VORT), anomalously negative vertical shear of zonal winds (VSZW), anomalously negative 700 hPa geopotential height (GPH), cyclonic (absolute) 700 hPa winds and low values of absolute vertical wind shear (EVWS). Further, while changes in observational technologies from 1945 have undoubtedly improved our ability to detect and monitor TCs, we show that the number of TCs detected prior to the satellite era (1945-1969) is not statistically different from that in the post-satellite era (post-1970). Although data from pre-satellite and pre-geostationary satellite periods are currently inadequate for investigating TC intensity, this study suggests that SPEArTC data (from 1945) may be used to investigate long-term variability of TC counts and TC genesis locations.
Studying Venus using a GIS database

NASA Technical Reports Server (NTRS)

Price, Maribeth; Suppe, John

1993-01-01

A Geographic Information System (GIS) can significantly enhance geological studies on Venus because it facilitates concurrent analysis of many sources of data, as demonstrated by our work on topographic and deformation characteristics of tesserae. We are creating a database of structures referenced to real-world coordinates to encourage the archival of Venusian studies in digital format and to foster quantitative analysis of many combinations of data. Contributions to this database from all aspects of Venusian science are welcome.
Seismic Calibration of Group 1 IMS Stations in Eastern Asia for Improved IDC Event Location

DTIC Science & Technology

2006-04-01

database has been assembled and delivered to the SMR (formerly CMR) Research and Development Support Services (RDSS) data archive. This database ...Data used in these tomographic inversions have been collected into a uniform database and delivered to the RDSS at the SMR. Extensive testing of these...complex 3-D velocity models is based on a finite difference approximation to the eikonal equation developed by Podvin and Lecomte (1991) and
Building a Massive Volcano Archive and the Development of a Tool for the Science Community

NASA Technical Reports Server (NTRS)

Linick, Justin

2012-01-01

The Jet Propulsion Laboratory has traditionally housed one of the world's largest databases of volcanic satellite imagery, the ASTER Volcano Archive (10Tb), making these data accessible online for public and scientific use. However, a series of changes in how satellite imagery is housed by the Earth Observing System (EOS) Data Information System has meant that JPL has been unable to systematically maintain its database for the last several years. We have provided a fast, transparent, machine-to-machine client that has updated JPL's database and will keep it current in near real-time. The development of this client has also given us the capability to retrieve any data provided by NASA's Earth Observing System Clearinghouse (ECHO) that covers a volcanic event reported by U.S. Air Force Weather Agency (AFWA). We will also provide a publicly available tool that interfaces with ECHO that can provide functionality not available in any of ECHO's Earth science discovery tools.

Audio stream classification for multimedia database search

NASA Astrophysics Data System (ADS)

Artese, M.; Bianco, S.; Gagliardi, I.; Gasparini, F.

2013-03-01

Search and retrieval in huge archives of multimedia data is a challenging task. A classification step is often used to reduce the number of entries on which to perform the subsequent search. In particular, when new entries are continuously added to the database, a fast classification based on simple threshold evaluation is desirable. In this work we present a CART-based (Classification And Regression Tree [1]) classification framework for audio streams belonging to multimedia databases. The database considered is the Archive of Ethnography and Social History (AESS) [2], which is mainly composed of popular songs and other audio records describing the popular traditions handed down from generation to generation, such as traditional fairs and customs. The peculiarities of this database are that it is continuously updated, that the audio recordings are acquired in unconstrained environments, and that it is difficult for non-expert users to create the ground truth labels. In our experiments, half of all the available audio files were randomly extracted and used as the training set. The remaining ones were used as the test set. The classifier was trained to distinguish among three different classes: speech, music, and song. All the audio files in the dataset had previously been manually labeled by domain experts into the three classes defined above.
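The "simple threshold evaluation" at the heart of a CART classifier can be illustrated with a short Python sketch. This is not the authors' framework or feature set: it shows only how a single tree node picks a threshold on one numeric feature by minimising Gini impurity, together with a random half/half train-test split like the one the abstract describes. The function names and the idea of a single scalar audio feature are assumptions for illustration.

```python
import random
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels (0.0 means a pure node)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())

def best_threshold(values, labels):
    """Return (threshold, weighted_gini) for the best binary split on one feature.

    Candidate thresholds are midpoints between consecutive distinct sorted values,
    as in a standard CART split search. Returns (None, inf) if no split exists.
    """
    pairs = sorted(zip(values, labels))
    n = len(pairs)
    best = (None, float("inf"))
    for i in range(1, n):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # equal values cannot be separated by a threshold
        threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
        left = [label for _, label in pairs[:i]]
        right = [label for _, label in pairs[i:]]
        score = (len(left) * gini(left) + len(right) * gini(right)) / n
        if score < best[1]:
            best = (threshold, score)
    return best

def split_half(items, seed=0):
    """Randomly assign half of the items to a training set and the rest to a test set."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    mid = len(shuffled) // 2
    return shuffled[:mid], shuffled[mid:]
```

A full tree applies `best_threshold` recursively over all features and child nodes; at query time, classification reduces to a handful of comparisons against the stored thresholds, which is what makes the approach fast for a continuously growing database.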
GenderMedDB: an interactive database of sex and gender-specific medical literature.

PubMed

Oertelt-Prigione, Sabine; Gohlke, Björn-Oliver; Dunkel, Mathias; Preissner, Robert; Regitz-Zagrosek, Vera

2014-01-01

Searches for sex and gender-specific publications are complicated by the absence of a specific algorithm within search engines and by the lack of adequate archives to collect the retrieved results. We previously addressed this issue by initiating the first systematic archive of medical literature containing sex and/or gender-specific analyses. This initial collection has now been greatly enlarged and re-organized as a free user-friendly database with multiple functions: GenderMedDB (http://gendermeddb.charite.de). GenderMedDB retrieves the included publications from the PubMed database. Manuscripts containing sex and/or gender-specific analysis are continuously screened and the relevant findings organized systematically into disciplines and diseases. Publications are furthermore classified by research type, subject and participant numbers. More than 11,000 abstracts are currently included in the database, after screening more than 40,000 publications. The main functions of the database include searches by publication data or content analysis based on pre-defined classifications. In addition, registrants are enabled to upload relevant publications, access descriptive publication statistics and interact in an open user forum. Overall, GenderMedDB offers the advantages of a discipline-specific search engine as well as the functions of a participative tool for the gender medicine community.
Publication of nuclear magnetic resonance experimental data with semantic web technology and the application thereof to biomedical research of proteins.

PubMed

Yokochi, Masashi; Kobayashi, Naohiro; Ulrich, Eldon L; Kinjo, Akira R; Iwata, Takeshi; Ioannidis, Yannis E; Livny, Miron; Markley, John L; Nakamura, Haruki; Kojima, Chojiro; Fujiwara, Toshimichi

2016-05-05

The nuclear magnetic resonance (NMR) spectroscopic data for biological macromolecules archived at the BioMagResBank (BMRB) provide a rich resource of biophysical information at atomic resolution. The NMR data archived in NMR-STAR ASCII format have been implemented in a relational database. However, it is still fairly difficult for users to retrieve data from the NMR-STAR files or the relational database in association with data from other biological databases. To enhance the interoperability of the BMRB database, we present a full conversion of BMRB entries to two standard structured data formats, XML and RDF, as common open representations of the NMR-STAR data. Moreover, a SPARQL endpoint has been deployed. The described case study demonstrates that a simple query of the SPARQL endpoints of the BMRB, UniProt, and Online Mendelian Inheritance in Man (OMIM), can be used in NMR and structure-based analysis of proteins combined with information of single nucleotide polymorphisms (SNPs) and their phenotypes. We have developed BMRB/XML and BMRB/RDF and demonstrate their use in performing a federated SPARQL query linking the BMRB to other databases through standard semantic web technologies. This will facilitate data exchange across diverse information resources.
Astroinformatics as a New Research Field. UkrVO Astroinformation Resources: Tasks and Prospective

NASA Astrophysics Data System (ADS)

Vavilova, I. B.

Data-oriented astronomy has led to the classification of astroinformatics as a new academic research field, which covers various multi-disciplinary applications of e-Astronomy. Among them are data modeling, data mining, metadata standards development, data access, digital astronomical databases, image archives and visualization, machine learning, statistics and other computational methods and software for working with astronomical surveys and catalogues and their tera- to peta-scale astroinformation resources. In this review we briefly describe the astroinformatics applications and software/services developed for different astronomical tasks within the framework of the Virtual Roentgen and Gamma Observatory (VIRGO) and the Ukrainian Virtual Observatory (UkrVO). Among them are projects based on the archival space-borne data of X-ray and gamma space observatories and on the Joint Digitized Archive (JDA) database of astroplate network collections. The UkrVO JDA DR1 deals with the star catalogues (FON, Polar zone, open clusters, GRB star fields), while the UkrVO JDA DR2 deals with the Solar System bodies (giant and small planets, satellites, astronomical heritage images).
The electronic Rothamsted Archive (e-RA), an online resource for data from the Rothamsted long-term experiments.

PubMed

Perryman, Sarah A M; Castells-Brooke, Nathalie I D; Glendining, Margaret J; Goulding, Keith W T; Hawkesford, Malcolm J; Macdonald, Andy J; Ostler, Richard J; Poulton, Paul R; Rawlings, Christopher J; Scott, Tony; Verrier, Paul J

2018-05-15

The electronic Rothamsted Archive, e-RA (www.era.rothamsted.ac.uk) provides a permanent managed database to both securely store and disseminate data from Rothamsted Research's long-term field experiments (since 1843) and meteorological stations (since 1853). Both historical and contemporary data are made available via this online database which provides the scientific community with access to a unique continuous record of agricultural experiments and weather measured since the mid-19th century. Qualitative information, such as treatment and management practices, plans and soil information, accompanies the data and is made available on the e-RA website. e-RA was released externally to the wider scientific community in 2013 and this paper describes its development, content, curation and the access process for data users. Case studies illustrate the diverse applications of the data, including its original intended purposes and recent unforeseen applications. Usage monitoring demonstrates the data are of increasing interest. Future developments, including adopting FAIR data principles, are proposed as the resource is increasingly recognised as a unique archive of data relevant to sustainable agriculture, agroecology and the environment.
Fusion of Dependent and Independent Biometric Information Sources

DTIC Science & Technology

2005-03-01

palmprint, DNA, ECG, signature, etc. The comparison of various biometric techniques is given in [13] and is presented in Table 1. Since, each...theory. Experimental studies on the M2VTS database [32] showed that a reduction in error rates is up to about 40%. Four combination strategies are...taken from the CEDAR benchmark database. The word recognition results were the highest (91%) among published results for handwritten words (before 2001
Diagnosis and prediction of neuroendocrine liver metastases: a protocol of six systematic reviews.

PubMed

Arigoni, Stephan; Ignjatovic, Stefan; Sager, Patrizia; Betschart, Jonas; Buerge, Tobias; Wachtl, Josephine; Tschuor, Christoph; Limani, Perparim; Puhan, Milo A; Lesurtel, Mickael; Raptis, Dimitri A; Breitenstein, Stefan

2013-12-23

Patients with hepatic metastases from neuroendocrine tumors (NETs) benefit from an early diagnosis, which is crucial for the optimal therapy and management. Diagnostic procedures include morphological and functional imaging, identification of biomarkers, and biopsy. The aim of six systematic reviews discussed in this study is to assess the predictive value of Ki67 index and other biomarkers, to compare the diagnostic accuracy of morphological and functional imaging, and to define the role of biopsy in the diagnosis and prediction of neuroendocrine tumor liver metastases. An objective group of librarians will provide an electronic search strategy to examine the following databases: MEDLINE, EMBASE and The Cochrane Library (Cochrane Database of Systematic Reviews, Cochrane Central Register of Controlled Trials (CENTRAL), Database of Abstracts of Reviews of Effects). There will be no restriction concerning language and publication date. The qualitative and quantitative synthesis of the systematic review will be conducted with randomized controlled trials (RCT), prospective and retrospective comparative cohort studies, and case-control studies. Case series will be collected in a separate database and only used for descriptive purposes. This study is ongoing and presents a protocol of six systematic reviews to elucidate the role of histopathological and biochemical markers, biopsies of the primary tumor and the metastases as well as morphological and functional imaging modalities for the diagnosis and prediction of neuroendocrine liver metastases. These systematic reviews will assess the value and accuracy of several diagnostic modalities in patients with NET liver metastases, and will provide a basis for the development of clinical practice guidelines. 
The systematic reviews have been prospectively registered with the International Prospective Register of Systematic Reviews (PROSPERO): CRD42012002644; http://www.metaxis.com/prospero/full_doc.asp?RecordID=2644 (Archived by WebCite at http://www.webcitation.org/6LzCLd5sF), CRD42012002647; http://www.metaxis.com/prospero/full_doc.asp?RecordID=2647 (Archived by WebCite at http://www.webcitation.org/6LzCRnZnO), CRD42012002648; http://www.metaxis.com/prospero/full_doc.asp?RecordID=2648 (Archived by WebCite at http://www.webcitation.org/6LzCVeuVR), CRD42012002649; http://www.metaxis.com/prospero/full_doc.asp?RecordID=2649 (Archived by WebCite at http://www.webcitation.org/6LzCZzZWU), CRD42012002650; http://www.metaxis.com/prospero/full_doc.asp?RecordID=2650 (Archived by WebCite at http://www.webcitation.org/6LzDPhGb8), CRD42012002651; http://www.crd.york.ac.uk/PROSPERO/display_record.asp?ID=CRD42012002651#.UrMglPRDuVo (Archived by WebCite at http://www.webcitation.org/6LzClCNff).
From Field to Laboratory: A New Database Approach for Linking Microbial Field Ecology with Laboratory Studies

NASA Technical Reports Server (NTRS)

Bebout, Leslie; Keller, R.; Miller, S.; Jahnke, L.; DeVincenzi, D. (Technical Monitor)

2002-01-01

The Ames Exobiology Culture Collection Database (AECC-DB) has been developed as a collaboration between microbial ecologists and information technology specialists. It allows for extensive web-based archiving of information regarding field samples to document microbial co-habitation of specific ecosystem micro-environments. Documentation and archiving continues as pure cultures are isolated, metabolic properties determined, and DNA extracted and sequenced. In this way metabolic properties and molecular sequences are clearly linked back to specific isolates and the location of those microbes in the ecosystem of origin. Use of this database system presents a significant advancement over traditional bookkeeping wherein there is generally little or no information regarding the environments from which microorganisms were isolated. Generally there is only a general ecosystem designation (e.g., hot spring). However, within each of these there are a myriad of microenvironments with very different properties, and determining exactly where (which microenvironment) a given microbe comes from is critical in designing appropriate isolation media and interpreting physiological properties. We are currently using the database to aid in the isolation of a large number of cyanobacterial species and will present results by PIs and students demonstrating the utility of this new approach.
ESA Science Archives, VO tools and remote Scientific Data reduction in Grid Architectures

NASA Astrophysics Data System (ADS)

Arviset, C.; Barbarisi, I.; de La Calle, I.; Fajersztejn, N.; Freschi, M.; Gabriel, C.; Gomez, P.; Guainazzi, M.; Ibarra, A.; Laruelo, A.; Leon, I.; Micol, A.; Parrilla, E.; Ortiz, I.; Osuna, P.; Salgado, J.; Stebe, A.; Tapiador, D.

2008-08-01

This paper presents the latest functionalities of the ESA Science Archives located at ESAC, Spain, in particular the following archives: the ISO Data Archive (IDA {http://iso.esac.esa.int/ida}), the XMM-Newton Science Archive (XSA {http://xmm.esac.esa.int/xsa}), the Integral SOC Science Data Archive (ISDA {http://integral.esac.esa.int/isda}) and the Planetary Science Archive (PSA {http://www.rssd.esa.int/psa}), both the classical and the map-based Mars Express interfaces. Furthermore, the ESA VOSpec {http://esavo.esac.esa.int/vospecapp} spectra analysis tool is described, which allows users to access and display spectral information from VO resources (both real observational and theoretical spectra), including access to Lines database and recent analysis functionalities. In addition, we detail the first implementation of RISA (Remote Interface for Science Analysis), a web service providing remote users the ability to create fully configurable XMM-Newton data analysis workflows, and to deploy and run them on the ESAC Grid. RISA makes full use of the interoperability provided by the SIAP (Simple Image Access Protocol) services as data input, and at the same time its VO-compatible output can directly be used by general VO-tools.
A precipitation database of station-based daily and monthly measurements for West Africa: Overview, quality control and harmonization

NASA Astrophysics Data System (ADS)

Bliefernicht, Jan; Waongo, Moussa; Annor, Thompson; Laux, Patrick; Lorenz, Manuel; Salack, Seyni; Kunstmann, Harald

2017-04-01

West Africa is a data-sparse region. High quality and long-term precipitation data are often not readily available for applications in hydrology, agriculture, meteorology and other needs. To close this gap, we use multiple data sources to develop a precipitation database with long-term daily and monthly time series. This database was compiled from 16 archives including global databases e.g. from the Global Historical Climatology Network (GHCN), databases from research projects (e.g. the AMMA database) and databases of the national meteorological services of some West African countries. The collection consists of more than 2000 precipitation gauges with measurements dating from 1850 to 2015. Due to erroneous measurements (e.g. temporal offsets, unit conversion errors), missing values and inconsistent meta-data, the merging of this precipitation dataset is not straightforward and requires a thorough quality control and harmonization. To this end, we developed geostatistical-based algorithms for quality control of individual databases and harmonization to a joint database. The algorithms are based on a pairwise comparison of the correspondence of precipitation time series as a function of the distance between stations. They were tested for precipitation time series from gauges located in a rectangular domain covering Burkina Faso, Ghana, Benin and Togo. This harmonized and quality controlled precipitation database was recently used for several applications such as the validation of a high resolution regional climate model and the bias correction of precipitation projections provided by the Coordinated Regional Climate Downscaling Experiment (CORDEX). In this presentation, we will give an overview of the novel daily and monthly precipitation database and the algorithms used for quality control and harmonization. We will also highlight the quality of global and regional archives (e.g. GHCN, GSOD, AMMA database) in comparison to the precipitation databases provided by the national meteorological services.
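The pairwise, distance-dependent comparison of station time series mentioned in the abstract can be sketched in Python as follows. This is only an illustration of the general idea, not the authors' geostatistical algorithm: it flags pairs of nearby stations whose series correlate poorly, which is one simple symptom of temporal offsets or unit errors. The function names, the station dictionary layout, and the 50 km and 0.5 thresholds are all assumptions.

```python
import math
from itertools import combinations

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in degrees (mean Earth radius 6371 km)."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dlat, dlon = p2 - p1, math.radians(lon2 - lon1)
    h = math.sin(dlat / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlon / 2) ** 2
    return 2 * 6371.0 * math.asin(min(1.0, math.sqrt(h)))

def pearson(x, y):
    """Pearson correlation over time steps where both series have data (None marks a gap)."""
    pairs = [(a, b) for a, b in zip(x, y) if a is not None and b is not None]
    n = len(pairs)
    if n < 3:
        return None  # too few overlapping values to be meaningful
    mx = sum(a for a, _ in pairs) / n
    my = sum(b for _, b in pairs) / n
    sxx = sum((a - mx) ** 2 for a, _ in pairs)
    syy = sum((b - my) ** 2 for _, b in pairs)
    if sxx == 0 or syy == 0:
        return None  # a constant series has no defined correlation
    sxy = sum((a - mx) * (b - my) for a, b in pairs)
    return sxy / math.sqrt(sxx * syy)

def flag_suspect_pairs(stations, max_km=50.0, min_corr=0.5):
    """Return (name_a, name_b, distance_km, correlation) for nearby but poorly correlated pairs.

    `stations` maps a station name to (lat, lon, series); all thresholds are hypothetical.
    """
    suspects = []
    for (name_a, (lat_a, lon_a, series_a)), (name_b, (lat_b, lon_b, series_b)) in combinations(
            sorted(stations.items()), 2):
        dist = haversine_km(lat_a, lon_a, lat_b, lon_b)
        if dist > max_km:
            continue  # distant stations are not expected to agree
        r = pearson(series_a, series_b)
        if r is not None and r < min_corr:
            suspects.append((name_a, name_b, dist, r))
    return suspects
```

A station that appears in many flagged pairs is a natural candidate for closer inspection; a fuller treatment would let the expected correlation decay with distance rather than use a single cut-off.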
NADIR: A Flexible Archiving System Current Development

NASA Astrophysics Data System (ADS)

Knapic, C.; De Marco, M.; Smareglia, R.; Molinaro, M.

2014-05-01

The New Archiving Distributed InfrastructuRe (NADIR) is under development at the Italian center for Astronomical Archives (IA2) to improve the performance of the current archival software tools at the data center. Traditional software usually offers simple and robust solutions for data archiving and distribution but is awkward to adapt and reuse in projects that have different purposes. Data evolution in terms of data model, format, publication policy, version, and meta-data content is the main threat to reuse. NADIR, using stable and mature framework features, addresses these challenging issues. Its main characteristics are a configuration database, a multi-threaded, multi-language environment (C++, Java, Python), special features to guarantee high scalability, modularity, robustness, error tracking, and tools to monitor with confidence the status of each project at each archiving site. In this contribution, the development of the core components is presented, commenting also on some performance and innovative features (multi-cast and publisher-subscriber paradigms). NADIR is intended to be kept as simple as possible, with default configurations for every project, starting with LBT and other IA2 projects.
Databases and archiving for cryoEM

PubMed Central

Patwardhan, Ardan; Lawson, Catherine L.

2017-01-01

Cryo-EM in structural biology is currently served by three public archives – EMDB for 3DEM reconstructions, PDB for models built from 3DEM reconstructions and EMPIAR for the raw 2D image data used to obtain the 3DEM reconstructions. These archives play a vital role for both the structural community and the wider biological community in making the data accessible so that results may be reused, reassessed and integrated with other structural and bioinformatics resources. The important role of the archives is underpinned by the fact that many journals mandate the deposition of data to PDB and EMDB on publication. The field is currently undergoing transformative changes where on the one hand high-resolution structures are becoming a routine occurrence while on the other hand electron tomography is enabling the study of macromolecules in the cellular context. Concomitantly the archives are evolving to best serve their stakeholder communities. In this chapter we describe the current state of the archives, resources available for depositing, accessing, searching, visualising and validating data, on-going community-wide initiatives and opportunities and challenges for the future. PMID:27572735
15 CFR 4.3 - Records under the FOIA.

Code of Federal Regulations, 2012 CFR

2012-01-01

... undertaking significant programming work) or merely extracts them from an existing database may be unclear... the United States Code or the National Archives and Records Administration's General Records Schedule...
15 CFR 4.3 - Records under the FOIA.

Code of Federal Regulations, 2010 CFR

2010-01-01

... undertaking significant programming work) or merely extracts them from an existing database may be unclear... the United States Code or the National Archives and Records Administration's General Records Schedule...
15 CFR 4.3 - Records under the FOIA.

Code of Federal Regulations, 2011 CFR

2011-01-01

... undertaking significant programming work) or merely extracts them from an existing database may be unclear... the United States Code or the National Archives and Records Administration's General Records Schedule...
15 CFR 4.3 - Records under the FOIA.

Code of Federal Regulations, 2013 CFR

2013-01-01

... undertaking significant programming work) or merely extracts them from an existing database may be unclear... the United States Code or the National Archives and Records Administration's General Records Schedule...
A complete public archive for the Einstein IPC

NASA Technical Reports Server (NTRS)

Helfand, David J.

1995-01-01

This report documents progress made in the period 24 Sept. 1993 - 23 Sept. 1995 on the project described in our proposal 'A Complete Public Archive for the Einstein IPC' which was approved under the Astrophysics Data Program in 1992. We have completed most of the principal objectives of the original proposal; a NFE was recently approved so that costs for publications in press can be covered and we can complete the public record for the Einstein IPC database.
Acta Informatica Medica Is Indexed In Pubmed And Archived In Pubmed Central

PubMed Central

Masic, Izet

2013-01-01

Acta Informatica Medica journal has been accepted for archiving in PubMed Central from 2011 onward. The journal started in 1993 as the official journal of the Society for Medical Informatics of Bosnia and Herzegovina. During the last 3 years, Acta Informatica Medica has been included in almost all prestigious online databases, including PubMed, Scopus and EMBASE. The 20th volume of the journal is fully international, with papers from 18 countries. PMID:23572852
Acta informatica medica is indexed in pubmed and archived in pubmed central.

PubMed

Masic, Izet

2013-03-01

Acta Informatica Medica journal has been accepted for archiving in PubMed Central from 2011 onward. The journal started in 1993 as the official journal of the Society for Medical Informatics of Bosnia and Herzegovina. During the last 3 years, Acta Informatica Medica has been included in almost all prestigious online databases, including PubMed, Scopus and EMBASE. The 20th volume of the journal is fully international, with papers from 18 countries.
Introducing the PRIDE Archive RESTful web services.

PubMed

Reisinger, Florian; del-Toro, Noemi; Ternent, Tobias; Hermjakob, Henning; Vizcaíno, Juan Antonio

2015-07-01

The PRIDE (PRoteomics IDEntifications) database is one of the world-leading public repositories of mass spectrometry (MS)-based proteomics data and it is a founding member of the ProteomeXchange Consortium of proteomics resources. In the original PRIDE database system, users could access data programmatically by accessing the web services provided by the PRIDE BioMart interface. New REST (REpresentational State Transfer) web services have been developed to serve the most popular functionality provided by BioMart (now discontinued due to data scalability issues) and address the data access requirements of the newly developed PRIDE Archive. Using the API (Application Programming Interface) it is now possible to programmatically query for and retrieve peptide and protein identifications, project and assay metadata and the originally submitted files. Searching and filtering is also possible by metadata information, such as sample details (e.g. species and tissues), instrumentation (mass spectrometer), keywords and other provided annotations. The PRIDE Archive web services were first made available in April 2014. The API has already been adopted by a few applications and standalone tools such as PeptideShaker, PRIDE Inspector, the Unipept web application and the Python-based BioServices package. This application is free and open to all users with no login requirement and can be accessed at http://www.ebi.ac.uk/pride/ws/archive/. © The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research.
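As a rough illustration of the metadata-based querying described in this record, the sketch below builds a GET URL against the base address quoted in the abstract. The resource path `project/list` and the filter names `query`, `species` and `tissue` are hypothetical placeholders chosen for the example, not names confirmed by the abstract, and the sketch only constructs the URL without contacting the service.

```python
from urllib.parse import urlencode, urljoin

# Base URL quoted in the abstract above.
BASE_URL = "http://www.ebi.ac.uk/pride/ws/archive/"


def build_query_url(resource, **filters):
    """Return a GET URL for a resource path plus metadata filters.

    The resource path and filter names passed in are hypothetical;
    the real ones must be taken from the service documentation.
    """
    # Drop unset filters and sort the rest so the URL is deterministic.
    params = sorted((k, v) for k, v in filters.items() if v is not None)
    url = urljoin(BASE_URL, resource)
    return f"{url}?{urlencode(params)}" if params else url


# Hypothetical search for projects by keyword and species.
example = build_query_url("project/list", query="liver", species="human", tissue=None)
print(example)
```

Keeping URL construction separate from the network call makes the request easy to inspect before it is sent with any HTTP client.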

Cost-effective data storage/archival subsystem for functional PACS

NASA Astrophysics Data System (ADS)

Chen, Y. P.; Kim, Yongmin

1993-09-01

Not the least of the requirements of a workable PACS is the ability to store and archive vast amounts of information. A medium-size hospital will generate between 1 and 2 TBytes of data annually on a fully functional PACS. A high-speed image transmission network coupled with a comparably high-speed central data storage unit can make local memory and magnetic disks in the PACS workstations less critical and, in an extreme case, unnecessary. Under these circumstances, the capacity and performance of the central data storage subsystem and database are critical in determining the response time at the workstations, thus significantly affecting clinical acceptability. The central data storage subsystem not only needs to provide sufficient capacity to store about ten days' worth of images (five days' worth of new studies and, on average, about one comparison study for each new study), but also must supply images to the requesting workstation in a timely fashion. The database must provide fast retrieval responses upon users' requests for images. This paper analyzes the advantages and disadvantages of multiple parallel transfer disks versus RAID disks for the short-term central data storage subsystem, as well as an optical disk jukebox versus a digital recorder tape subsystem for long-term archive. Furthermore, an example high-performance, cost-effective storage subsystem that integrates both the RAID disks and a high-speed digital tape subsystem as a PACS data storage/archival unit is presented.
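The sizing figures in this abstract can be turned into a back-of-the-envelope estimate of short-term storage. The sketch below is only an illustration under stated assumptions that are not in the abstract: images are generated evenly over 365 days, units are decimal (1 TByte = 1000 GBytes), and a comparison study is about the same size as a new study, so five days of new studies plus their comparisons equal ten days of daily volume.

```python
DAYS_PER_YEAR = 365  # assumption: data is generated evenly over the year
GB_PER_TB = 1000     # assumption: decimal units


def short_term_capacity_gb(annual_tb, days_online=10):
    """Estimate short-term storage for `days_online` days' worth of images."""
    daily_gb = annual_tb * GB_PER_TB / DAYS_PER_YEAR
    return daily_gb * days_online


# Range quoted in the abstract: 1 to 2 TBytes generated per year.
low = short_term_capacity_gb(1.0)
high = short_term_capacity_gb(2.0)
print(f"{low:.1f} GB to {high:.1f} GB")
```

Under these assumptions the ten-day short-term store works out to roughly 27 to 55 GBytes.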
The Advanced Spaceborne Thermal Emission and Reflection Radiometer (ASTER) after fifteen years: Review of global products

NASA Astrophysics Data System (ADS)

Abrams, Michael; Tsu, Hiroji; Hulley, Glynn; Iwao, Koki; Pieri, David; Cudahy, Tom; Kargel, Jeffrey

2015-06-01

The Advanced Spaceborne Thermal Emission and Reflection Radiometer (ASTER) is a 15-channel imaging instrument operating on NASA's Terra satellite. A joint project between the U.S. National Aeronautics and Space Administration and Japan's Ministry of Economy, Trade, and Industry, ASTER has been acquiring data for 15 years, since March 2000. The archive now contains over 2.8 million scenes; for the majority of them, a stereo pair was collected using nadir and backward telescopes imaging in the NIR wavelength. The majority of users require only a few to a few dozen scenes for their work. Studies have ranged over numerous scientific disciplines, and many practical applications have benefited from ASTER's unique data. A few researchers have been able to mine the entire ASTER archive, which is now global in extent owing to the long duration of the mission. Six examples of global products are described in this contribution: the ASTER Global Digital Elevation Model (GDEM), the most complete, highest resolution DEM available to all users; the ASTER Emissivity Database (ASTER GED), a global 5-band emissivity map of the land surface; the ASTER Global Urban Area Map (AGURAM), a 15-m resolution database of over 3500 cities; the ASTER Volcano Archive (AVA), an archive of over 1500 active volcanoes; ASTER Geoscience products of the continent of Australia; and the Global Ice Monitoring from Space (GLIMS) project.
CompaRNA: a server for continuous benchmarking of automated methods for RNA secondary structure prediction

PubMed Central

Puton, Tomasz; Kozlowski, Lukasz P.; Rother, Kristian M.; Bujnicki, Janusz M.

2013-01-01

We present a continuous benchmarking approach for the assessment of RNA secondary structure prediction methods implemented in the CompaRNA web server. As of 3 October 2012, the performance of 28 single-sequence and 13 comparative methods has been evaluated on RNA sequences/structures released weekly by the Protein Data Bank. We also provide a static benchmark generated on RNA 2D structures derived from the RNAstrand database. Benchmarks on both data sets offer insight into the relative performance of RNA secondary structure prediction methods on RNAs of different size and with respect to different types of structure. According to our tests, on the average, the most accurate predictions obtained by a comparative approach are generated by CentroidAlifold, MXScarna, RNAalifold and TurboFold. On the average, the most accurate predictions obtained by single-sequence analyses are generated by CentroidFold, ContextFold and IPknot. The best comparative methods typically outperform the best single-sequence methods if an alignment of homologous RNA sequences is available. This article presents the results of our benchmarks as of 3 October 2012, whereas the rankings presented online are continuously updated. We will gladly include new prediction methods and new measures of accuracy in the new editions of CompaRNA benchmarks. PMID:23435231
The semantic pathfinder: using an authoring metaphor for generic multimedia indexing.

PubMed

Snoek, Cees G M; Worring, Marcel; Geusebroek, Jan-Mark; Koelma, Dennis C; Seinstra, Frank J; Smeulders, Arnold W M

2006-10-01

This paper presents the semantic pathfinder architecture for generic indexing of multimedia archives. The semantic pathfinder extracts semantic concepts from video by exploring different paths through three consecutive analysis steps, which we derive from the observation that produced video is the result of an authoring-driven process. We exploit this authoring metaphor for machine-driven understanding. The pathfinder starts with the content analysis step. In this analysis step, we follow a data-driven approach of indexing semantics. The style analysis step is the second analysis step. Here, we tackle the indexing problem by viewing a video from the perspective of production. Finally, in the context analysis step, we view semantics in context. The virtue of the semantic pathfinder is its ability to learn the best path of analysis steps on a per-concept basis. To show the generality of this novel indexing approach, we develop detectors for a lexicon of 32 concepts and we evaluate the semantic pathfinder against the 2004 NIST TRECVID video retrieval benchmark, using a news archive of 64 hours. Top ranking performance in the semantic concept detection task indicates the merit of the semantic pathfinder for generic indexing of multimedia archives.
Benchmarking a Visual-Basic based multi-component one-dimensional reactive transport modeling tool

NASA Astrophysics Data System (ADS)

Torlapati, Jagadish; Prabhakar Clement, T.

2013-01-01

We present the details of a comprehensive numerical modeling tool, RT1D, which can be used for simulating biochemical and geochemical reactive transport problems. The code can be run within the standard Microsoft EXCEL Visual Basic platform, and it does not require any additional software tools. The code can be easily adapted by others for simulating different types of laboratory-scale reactive transport experiments. We illustrate the capabilities of the tool by solving five benchmark problems with varying levels of reaction complexity. These literature-derived benchmarks are used to highlight the versatility of the code for solving a variety of practical reactive transport problems. The benchmarks are described in detail to provide a comprehensive database, which can be used by model developers to test other numerical codes. The VBA code presented in the study is a practical tool that can be used by laboratory researchers for analyzing both batch and column datasets within an EXCEL platform.
[The use of benchmarking to manage the healthcare supply chain: effects on purchasing cost and quality].

PubMed

Naranjo-Gil, David; Ruiz-Muñoz, David

2015-01-01

Healthcare supply expenses consume a large part of the financial resources allocated to public health. The aim of this study was to analyze the use of a benchmarking process in the management of hospital purchases, as well as its effect on product cost reduction and quality improvement. Data were collected through a survey conducted in 29 primary healthcare districts from 2010 to 2011, and through a healthcare database on the prices, quality, delivery time and supplier characteristics of 5373 products. The use of benchmarking processes reduced or eliminated products with a low quality and high price. These processes increased the quality of products by 10.57% and reduced their purchase price by 28.97%. The use of benchmarking by healthcare centers can reduce expenditure and allow more efficient management of the healthcare supply chain. It also facilitated the acquisition of products at lower prices and higher quality. Copyright © 2014 SESPAS. Published by Elsevier Espana. All rights reserved.
Puncture-proof picture archiving and communication system.

PubMed

Willis, C E; McCluggage, C W; Orand, M R; Parker, B R

2001-06-01

As we become increasingly dependent on our picture archiving and communications system (PACS) for the clinical practice of medicine, the demand for improved reliability becomes urgent. Borrowing principles from the discipline of Reliability Engineering, we have identified components of our system that constitute single points of failure and have endeavored to eliminate these through redundant components and manual work-around procedures. To assess the adequacy of our preparations, we have identified a set of plausible events that could interfere with the function of one or more of our PACS components. These events could be as simple as the loss of the network connection to a single component or as broad as the loss of our central data center. We have identified the need to continue to operate during adverse conditions, as well as the requirement to recover rapidly from major disruptions in service. This assessment led us to modify the physical locations of central PACS components within our physical plant. We are also taking advantage of actual disruptive events coincident with a major expansion of our facility to test our recovery procedures. Based on our recognition of the vital nature of our electronic images for patient care, we are now recording electronic images in two copies on disparate media. The image database is critical to both continued operations and recovery. Restoration of the database from periodic tape backups with a 24-hour cycle time may not support our clinical scenario: acquisition modalities have a limited local storage capacity, some of which will not contain the daily workload. Restoration of the database from the archived media is an exceedingly slow process, that will likely not meet our requirement to restore clinical operations without significant delay. Our PACS vendor is working on concurrent image databases that would be capable of nearly immediate switchover and recovery.
Exoplanet Orbit Database | Exoplanet Data Explorer

Science.gov Websites

, Strasbourg, France, NASA's Astrophysics Data System, the NASA Exoplanet Archive (and, formerly, the NASA/IPAC /Caltech. This research received generous funding from NASA and the NSF.
A comprehensive clinical research database based on CDISC ODM and i2b2.

PubMed

Meineke, Frank A; Stäubert, Sebastian; Löbe, Matthias; Winter, Alfred

2014-01-01

We present a working approach for a clinical research database as part of an archival information system. The CDISC ODM standard is the target for clinical study data and research-relevant routine data, thus decoupling the data ingest process from the access layer. The presented research database is comprehensive in that it covers annotation, mapping and curation of poorly annotated source data. Besides a conventional relational database, the medical data warehouse i2b2 serves as the main frontend for end users. The system we developed is suitable for supporting patient recruitment, cohort identification and quality assurance in daily routine.
High Tech High School Interns Develop a Mid-Ocean Ridge Database for Research and Education

NASA Astrophysics Data System (ADS)

Staudigel, D.; Delaney, R.; Staudigel, H.; Koppers, A. A.; Miller, S. P.

2004-12-01

Mid-ocean ridges (MOR) represent one of the most important geographical and geological features on planet Earth. MORs are the locations where plates spread apart, and they are the locations of the majority of the Earth's volcanoes, which harbor some of the most extreme life forms. These concepts attract much research, but mid-ocean ridges are still effectively underrepresented in Earth science classrooms. As two High Tech High School students, we began an internship at Scripps to develop a database for mid-ocean ridges as a resource for science and education. This Ridge Catalog will be accessible via http://earthref.org/databases/RC/ and applies a similar structure, design and data archival principle as the Seamount Catalog under EarthRef.org. Major research goals of this project include (1) developing an archival structure for multibeam and sidescan data, standard bathymetric maps (including ODP-DSDP drill site and dredge locations) or any other arbitrary digital objects relating to MORs, and (2) compiling a global data set for some of the most defining characteristics of every ridge segment, including ridge segment length, depth and azimuth and half spreading rates. One of the challenges was the need to make MOR data useful to the scientist as well as the teacher in the classroom. Since the basic structure follows the design of the Seamount Catalog closely, we could move our attention to the basic data population of the database. We have pulled together multibeam data for the MOR segments from various public archives (SIOExplorer, SIO-GDC, NGDC, Lamont), and pre-processed it for public use. In particular, we have created individual bathymetric maps for each ridge segment, while merging the multibeam data with global satellite bathymetry data from Smith & Sandwell (1997). The global scale of this database will give it the ability to be used for any number of applications, from cruise planning to data
Human Thermal Model Evaluation Using the JSC Human Thermal Database

NASA Technical Reports Server (NTRS)

Bue, Grant; Makinen, Janice; Cognata, Thomas

2012-01-01

Human thermal modeling has considerable long-term utility to human space flight. Such models provide a tool to predict crew survivability in support of vehicle design and to evaluate crew response in untested space environments. It is to the benefit of any such model not only to collect relevant experimental data to correlate it against, but also to maintain an experimental standard or benchmark for future development in a readily and rapidly searchable and software-accessible format. The Human Thermal Database project is intended to do just that: to collect relevant data from literature and experimentation and to store the data in a database structure for immediate and future use as a benchmark against which to judge human thermal models, to identify model strengths and weaknesses, to support model development and improve correlation, and to statistically quantify a model's predictive quality. The human thermal database developed at the Johnson Space Center (JSC) is intended to evaluate a set of widely used human thermal models. This set includes the Wissler human thermal model, a model that has been widely used to predict the human thermoregulatory response to a variety of cold and hot environments. These models are statistically compared to the current database, which contains experiments of human subjects primarily in air from a literature survey ranging between 1953 and 2004 and from a suited experiment recently performed by the authors, for a quantitative study of relative strength and predictive quality of the models.
DICOM-compliant PACS with CD-based image archival

NASA Astrophysics Data System (ADS)

Cox, Robert D.; Henri, Christopher J.; Rubin, Richard K.; Bret, Patrice M.

1998-07-01

This paper describes the design and implementation of a low-cost PACS conforming to the DICOM 3.0 standard. The goal was to provide an efficient image archival and management solution on a heterogeneous hospital network as a basis for filmless radiology. The system follows a distributed, client/server model and was implemented at a fraction of the cost of a commercial PACS. It provides reliable archiving on recordable CD and allows access to digital images throughout the hospital and on the Internet. Dedicated servers have been designed for short-term storage, CD-based archival, data retrieval and remote data access or teleradiology. The short-term storage devices provide DICOM storage and query/retrieve services to scanners and workstations and approximately twelve weeks of 'on-line' image data. The CD-based archival and data retrieval processes are fully automated with the exception of CD loading and unloading. The system employs lossless compression on both short- and long-term storage devices. All servers communicate via the DICOM protocol in conjunction with both local and 'master' SQL-patient databases. Records are transferred from the local to the master database independently, ensuring that storage devices will still function if the master database server cannot be reached. The system features rules-based work-flow management and WWW servers to provide multi-platform remote data access. The WWW server system is distributed on the storage, retrieval and teleradiology servers allowing viewing of locally stored image data directly in a WWW browser without the need for data transfer to a central WWW server. An independent system monitors disk usage, processes, network and CPU load on each server and reports errors to the image management team via email. The PACS was implemented using a combination of off-the-shelf hardware, freely available software and applications developed in-house. The system has enabled filmless operation in CT, MR and ultrasound within the radiology department and throughout the hospital. The use of WWW technology has enabled the development of an intuitive web-based teleradiology and image management solution that provides complete access to image data.
[Electronic poison information management system].

PubMed

Kabata, Piotr; Waldman, Wojciech; Kaletha, Krystian; Sein Anand, Jacek

2013-01-01

We describe the deployment of an electronic toxicological information database in the poison control center of the Pomeranian Center of Toxicology. The system was based on Google Apps technology, by Google Inc., using electronic, web-based forms and data tables. During the first 6 months after system deployment, we used it to archive 1471 poisoning cases, prepare monthly poisoning reports and facilitate statistical analysis of data. Use of the electronic database made the Poison Center's work much easier.
Illuminating the Depths of the MagIC (Magnetics Information Consortium) Database

NASA Astrophysics Data System (ADS)

Koppers, A. A. P.; Minnett, R.; Jarboe, N.; Jonestrask, L.; Tauxe, L.; Constable, C.

2015-12-01

The Magnetics Information Consortium (http://earthref.org/MagIC/) is a grass-roots cyberinfrastructure effort envisioned by the paleo-, geo-, and rock magnetic scientific community. Its mission is to archive their wealth of peer-reviewed raw data and interpretations from magnetics studies on natural and synthetic samples. Many of these valuable data are legacy datasets that were never published in their entirety, some resided in other databases that are no longer maintained, and others were never digitized from the field notebooks and lab work. Due to the volume of data collected, most studies, modern and legacy, only publish the interpreted results and, occasionally, a subset of the raw data. MagIC is making an extraordinary effort to archive these data in a single data model, including the raw instrument measurements if possible. This facilitates the reproducibility of the interpretations, the re-interpretation of the raw data as the community introduces new techniques, and the compilation of heterogeneous datasets that are otherwise distributed across multiple formats and physical locations. MagIC has developed tools to assist the scientific community in many stages of their workflow. Contributors easily share studies (in a private mode if so desired) in the MagIC Database with colleagues and reviewers prior to publication, publish the data online after the study is peer reviewed, and visualize their data in the context of the rest of the contributions to the MagIC Database. From organizing their data in the MagIC Data Model with an online editable spreadsheet, to validating the integrity of the dataset with automated plots and statistics, MagIC is continually lowering the barriers to transforming dark data into transparent and reproducible datasets. 
Additionally, this web application generalizes to other databases in MagIC's umbrella website (EarthRef.org) so that the Geochemical Earth Reference Model (http://earthref.org/GERM/) portal, Seamount Biogeosciences Network (http://earthref.org/SBN/), EarthRef Digital Archive (http://earthref.org/ERDA/) and EarthRef Reference Database (http://earthref.org/ERR/) benefit from its development.
Ontology for Semantic Data Integration in the Domain of IT Benchmarking.

PubMed

Pfaff, Matthias; Neubig, Stefan; Krcmar, Helmut

2018-01-01

A domain-specific ontology for IT benchmarking has been developed to bridge the gap between a systematic characterization of IT services and their data-based valuation. Since information is generally collected during a benchmark exercise using questionnaires on a broad range of topics, such as employee costs, software licensing costs, and quantities of hardware, it is commonly stored as natural language text; thus, this information is stored in an intrinsically unstructured form. Although these data form the basis for identifying potentials for IT cost reductions, neither a uniform description of any measured parameters nor the relationship between such parameters exists. Hence, this work proposes an ontology for the domain of IT benchmarking, available at https://w3id.org/bmontology. The design of this ontology is based on requirements mainly elicited from a domain analysis, which considers analyzing documents and interviews with representatives from Small- and Medium-Sized Enterprises and Information and Communications Technology companies over the last eight years. The development of the ontology and its main concepts is described in detail (i.e., the conceptualization of benchmarking events, questionnaires, IT services, indicators and their values) together with its alignment with the DOLCE-UltraLite foundational ontology.
Science information systems: Archive, access, and retrieval

NASA Technical Reports Server (NTRS)

Campbell, William J.

1991-01-01

The objective of this research is to develop technology for the automated characterization and interactive retrieval and visualization of very large, complex scientific data sets. Technologies will be developed for the following specific areas: (1) rapidly archiving data sets; (2) automatically characterizing and labeling data in near real-time; (3) providing users with the ability to browse contents of databases efficiently and effectively; (4) providing users with the ability to access and retrieve system independent data sets electronically; and (5) automatically alerting scientists to anomalies detected in data.
Essential Nursing References.

ERIC Educational Resources Information Center

Nursing and Health Care Perspectives, 2000

2000-01-01

This partially annotated bibliography contains these categories: abstract sources, archives, audiovisuals, bibliographies, databases, dictionaries, directories, drugs/toxicology/environmental health, grant resources, histories, indexes, Internet resources, reviews, statistical sources, and writers' manuals and guides. A supplement lists Canadian…
29 CFR 4.3 - Wage determinations.

Code of Federal Regulations, 2014 CFR

2014-07-01

... section 2(a) thereof will be set forth in wage determinations issued by the Administrator. Wage... determinations that are no longer current may be accessed in the “Archived SCA WD” database of WDOL for...
29 CFR 4.3 - Wage determinations.

Code of Federal Regulations, 2011 CFR

2011-07-01

... section 2(a) thereof will be set forth in wage determinations issued by the Administrator. Wage... determinations that are no longer current may be accessed in the “Archived SCA WD” database of WDOL for...
29 CFR 4.3 - Wage determinations.

Code of Federal Regulations, 2012 CFR

2012-07-01

... section 2(a) thereof will be set forth in wage determinations issued by the Administrator. Wage... determinations that are no longer current may be accessed in the “Archived SCA WD” database of WDOL for...

29 CFR 4.3 - Wage determinations.

Code of Federal Regulations, 2013 CFR

2013-07-01

... section 2(a) thereof will be set forth in wage determinations issued by the Administrator. Wage... determinations that are no longer current may be accessed in the “Archived SCA WD” database of WDOL for...
RefSeq microbial genomes database: new representation and annotation strategy.

PubMed

Tatusova, Tatiana; Ciufo, Stacy; Fedorov, Boris; O'Neill, Kathleen; Tolstoy, Igor

2014-01-01

The source of the microbial genomic sequences in the RefSeq collection is the set of primary sequence records submitted to the International Nucleotide Sequence Database public archives. These can be accessed through the Entrez search and retrieval system at http://www.ncbi.nlm.nih.gov/genome. Next-generation sequencing has enabled researchers to perform genomic sequencing at rates that were unimaginable in the past. Microbial genomes can now be sequenced in a matter of hours, which has led to a significant increase in the number of assembled genomes deposited in the public archives. This huge increase in DNA sequence data presents new challenges for the annotation, analysis and visualization bioinformatics tools. New strategies have been developed for the annotation and representation of reference genomes and sequence variations derived from population studies and clinical outbreaks.
SeaWiFS technical report series. Volume 20: The SeaWiFS bio-optical archive and storage system (SeaBASS), part 1

NASA Technical Reports Server (NTRS)

Hooker, Stanford B. (Editor); Mcclain, Charles R.; Firestone, James K.; Westphal, Todd L.; Yeh, Eueng-Nan; Ge, Yuntao; Firestone, Elaine R.

1994-01-01

This document provides an overview of the Sea-viewing Wide Field-of-view Sensor (SeaWiFS) Bio-Optical Archive and Storage System (SeaBASS), which will serve as a repository for numerous data sets of interest to the SeaWiFS Science Team and other approved investigators in the oceanographic community. The data collected will be those data sets suitable for the development and evaluation of bio-optical algorithms which include results from SeaWiFS Intercalibration Round-Robin Experiments (SIRREXs), prelaunch characterization of the SeaWiFS instrument by its manufacturer -- Hughes/Santa Barbara Research Center (SBRC), Marine Optical Characterization Experiment (MOCE) cruises, Marine Optical Buoy (MOBY) deployments and refurbishments, and field studies of other scientists outside of NASA. The primary goal of the data system is to provide a simple mechanism for querying the available archive and requesting specific items, while assuring that the data is made available only to authorized users. The design, construction, and maintenance of SeaBASS is the responsibility of the SeaWiFS Calibration and Validation Team (CVT). This report is concerned with documenting the execution of this task by the CVT and consists of a series of chapters detailing the various data sets involved. The topics presented are as follows: 1) overview of the SeaBASS file architecture, 2) the bio-optical data system, 3) the historical pigment database, 4) the SIRREX database, and 5) the SBRC database.
Query by example video based on fuzzy c-means initialized by fixed clustering center

NASA Astrophysics Data System (ADS)

Hou, Sujuan; Zhou, Shangbo; Siddique, Muhammad Abubakar

2012-04-01

Currently, the high complexity of video contents poses the following major challenges for fast retrieval: (1) efficient similarity measurements, and (2) efficient indexing on the compact representations. A video-retrieval strategy based on fuzzy c-means (FCM) is presented for querying by example. Initially, the query video is segmented and represented by a set of shots; each shot is represented by a key frame, and video processing techniques are used to find visual cues that represent the key frame. Next, because the FCM algorithm is sensitive to initialization, the cluster centers are initialized with the shots of the query video so that appropriate convergence can be achieved. After the FCM clusters are initialized by the query video, each shot of the query video is considered a benchmark point in its cluster, and each shot in the database is assigned a class label. The similarity between a database shot and the benchmark point with the same class label is transformed into the distance between them. Finally, the similarity between the query video and a video in the database is transformed into the number of similar shots. Our experimental results demonstrated the performance of the proposed approach.
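The retrieval idea in this abstract can be illustrated with a minimal sketch. It assumes each shot has already been reduced to a numeric feature vector; the function names, the fuzzifier `m`, the iteration count and the distance `threshold` are all hypothetical choices, not taken from the paper.

```python
import math

def fcm(points, init_centers, m=2.0, iters=50, eps=1e-9):
    """Fuzzy c-means with caller-supplied initial centers (assumed parameters)."""
    centers = [list(c) for c in init_centers]
    memberships = []
    for _ in range(iters):
        # Membership update: u_ij is proportional to d_ij^(-2/(m-1)).
        memberships = []
        for p in points:
            dists = [max(math.dist(p, c), eps) for c in centers]
            weights = [d ** (-2.0 / (m - 1.0)) for d in dists]
            total = sum(weights)
            memberships.append([w / total for w in weights])
        # Center update: mean of the points weighted by u_ij^m.
        for j in range(len(centers)):
            wts = [u[j] ** m for u in memberships]
            denom = sum(wts)
            centers[j] = [
                sum(w * p[k] for w, p in zip(wts, points)) / denom
                for k in range(len(points[0]))
            ]
    return centers, memberships

def count_similar_shots(db_shots, query_shots, threshold):
    """Count database shots close to the query shot that seeded their cluster."""
    _, memberships = fcm(db_shots, query_shots)
    similar = 0
    for shot, u in zip(db_shots, memberships):
        # Class label = cluster with the highest membership.
        label = max(range(len(u)), key=u.__getitem__)
        # The query shot acts as the benchmark point of that cluster.
        if math.dist(shot, query_shots[label]) <= threshold:
            similar += 1
    return similar
```

A database video would then be ranked by the value of `count_similar_shots` for its shots; how the paper sets the similarity cut-off is not stated in the abstract.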
Last Deglacial Sea Level: A Curated Database of Indicators of Past Sea Levels from Biological and Geomorphological Archives

NASA Astrophysics Data System (ADS)

Hibbert, F. D.; Williams, F. H.; Fallon, S.; Rohling, E. J.

2017-12-01

The last deglacial was an interval of rapid climate and sea-level change, including the collapse of large continental ice sheets. This database collates carefully assessed sea-level data from peer-reviewed sources for the interval 0 to 25 thousand years ago (ka), from the last glacial maximum to the present interglacial conditions. In addition to facilitating site-specific reconstructions of past sea levels, the database provides a suite of data beyond the range of modern/instrumental variability that may help hone future sea-level projections. The database is global in scope, internally consistent, and contains U-series and radiocarbon dated indicators from both biological and geomorphological archives. We focus on far-field data (i.e., away from the sites of the former continental ice sheets), but some key intermediate (i.e., from the Caribbean) data are also included. All primary fields (i.e., sample location, elevation, age and context) possess quantified uncertainties, which - in conjunction with available metadata - allows the reconstructed sea levels to be interpreted within both their uncertainties and geological context. Consistent treatment of each of the individual records in the database, and incorporation of fully expressed uncertainties, allows datasets to be easily compared. The compilation contains 145 studies from 40 locations (>2,000 data points) and includes all raw information and metadata.
Auto-consistent test of Galaxy star formation histories derived from resolved stellar population and integral spectroscopy

NASA Astrophysics Data System (ADS)

Rodrigues, M.; Patricio, V.; Rothberg, B.; Sanchez-Janssen, R.; Vale Asari, N.

We present the first results of our observational project 'Starfish' (STellar Population From Integrated Spectrum). The goal of this project is to calibrate, for the first time, the properties of stellar populations derived from integrated spectra with the same properties derived from direct imaging of stellar populations in the same set of galaxies. These properties include the star-formation history (SFH), stellar mass, age, and metallicity. To date, such calibrations have been demonstrated only in star clusters, globular clusters with single stellar populations, not in complex and composite objects such as galaxies. We are currently constructing a library of integrated spectra obtained from a sample of 38 nearby dwarf galaxies obtained with GEMINI/GMOS-N&S (25h) and VLT/VIMOS-IFU (43h). These are to be compared with color magnitude diagrams (CMDs) of the same galaxies constructed from archival HST imaging sensitive to at least 1.5 magnitudes below the tip of the red giant branch. From this comparison we will assess the systematics and uncertainties from integrated spectral techniques. The spectra library will be made publicly available to the community via a dedicated web-page and Vizier database. This dataset will provide a unique benchmark for testing fitting procedures and stellar population models for both nearby and distant galaxies. http://www.sc.eso.org/~marodrig/Starfish/
Chroni - an Android Application for Geochronologists to Access Archived Sample Analyses from the NSF-Funded Geochron.Org Data Repository.

NASA Astrophysics Data System (ADS)

Nettles, J. J.; Bowring, J. F.

2014-12-01

NSF requires data management plans as part of funding proposals and geochronologists, among other scientists, are archiving their data and results to the public cloud archives managed by the NSF-funded Integrated Earth Data Applications, or IEDA. GeoChron is a database for geochronology housed within IEDA. The software application U-Pb_Redux developed at the Cyber Infrastructure Research and Development Lab for the Earth Sciences (CIRDLES.org) at the College of Charleston provides seamless connectivity to GeoChron for uranium-lead (U-Pb) geochronologists to automatically upload and retrieve their data and results. U-Pb_Redux also manages publication-quality documents including report tables and graphs. CHRONI is a lightweight mobile application for Android devices that provides easy access to these archived data and results. With CHRONI, U-Pb geochronologists can view archived data and analyses downloaded from the Geochron database, or any other location, in a customizable format. CHRONI uses the same extensible markup language (XML) schema and documents used by U-Pb_Redux and GeoChron. Report Settings are special XML files that can be customized in U-Pb_Redux, stored in the cloud, and then accessed and used in CHRONI to create the same customized data display on the mobile device. In addition to providing geologists effortless and mobile access to archived data and analyses, CHRONI allows users to manage their GeoChron credentials, quickly download private and public files via a specified IEDA International Geo Sample Number (IGSN) or URL, and view specialized graphics associated with particular IGSNs. Future versions of CHRONI will be developed to support iOS compatible devices. CHRONI is an open source project under the Apache 2 license and is hosted at https://github.com/CIRDLES/CHRONI. We encourage community participation in its continued development.
The Starchive: An open access, open source archive of nearby and young stars and their planets

NASA Astrophysics Data System (ADS)

Tanner, Angelle; Gelino, Chris; Elfeki, Mario

2015-12-01

Historically, astronomers have utilized a piecemeal set of archives such as SIMBAD, the Washington Double Star Catalog, various exoplanet encyclopedias and electronic tables from the literature to cobble together stellar and exo-planetary parameters in the absence of corresponding images and spectra. As the search for planets around young stars through direct imaging, transits and infrared/optical radial velocity surveys blossoms, there is a void in the available set of resources to create comprehensive lists of the stellar parameters of nearby stars, especially for important parameters such as metallicity and stellar activity indicators. For direct imaging surveys, we need better resources for downloading existing high contrast images to help confirm new discoveries and find ideal target stars. Once we have discovered new planets, we need a uniform database of stellar and planetary parameters from which to look for correlations to better understand the formation and evolution of these systems. As a solution to these issues, we are developing the Starchive - an open access stellar archive in the spirit of the open exoplanet catalog, the Kepler Community Follow-up Program and many others. The archive will allow users to download various datasets, upload new images, spectra and metadata and will contain multiple plotting tools to use in presentations and data interpretations. While we will highly regulate and constantly validate the data being placed into our archive, the open nature of its design is intended to allow the database to be expanded efficiently and have a level of versatility which is necessary in today's fast moving, big data community. Finally, the front-end scripts will be placed on GitHub and users will be encouraged to contribute new plotting tools. Here, I will introduce the community to the content and expected capabilities of the archive and query the audience for community feedback.
A Complete Public Archive for the Einstein Imaging Proportional Counter

NASA Technical Reports Server (NTRS)

Helfand, David J.

1996-01-01

Consistent with our proposal to the Astrophysics Data Program in 1992, we have completed the design, construction, documentation, and distribution of a flexible and complete archive of the data collected by the Einstein Imaging Proportional Counter. Along with software and data delivered to the High Energy Astrophysics Science Archive Research Center at Goddard Space Flight Center, we have compiled and, where appropriate, published catalogs of point sources, soft sources, hard sources, extended sources, and transient flares detected in the database along with extensive analyses of the instrument's backgrounds and other anomalies. We include in this document a brief summary of the archive's functionality, a description of the scientific catalogs and other results, a bibliography of publications supported in whole or in part under this contract, and a list of personnel whose pre- and post-doctoral education consisted in part in participation in this project.
The electronic Rothamsted Archive (e-RA), an online resource for data from the Rothamsted long-term experiments

PubMed Central

Perryman, Sarah A. M.; Castells-Brooke, Nathalie I. D.; Glendining, Margaret J.; Goulding, Keith W. T.; Hawkesford, Malcolm J.; Macdonald, Andy J.; Ostler, Richard J.; Poulton, Paul R.; Rawlings, Christopher J.; Scott, Tony; Verrier, Paul J.

2018-01-01

The electronic Rothamsted Archive, e-RA (www.era.rothamsted.ac.uk) provides a permanent managed database to both securely store and disseminate data from Rothamsted Research’s long-term field experiments (since 1843) and meteorological stations (since 1853). Both historical and contemporary data are made available via this online database which provides the scientific community with access to a unique continuous record of agricultural experiments and weather measured since the mid-19th century. Qualitative information, such as treatment and management practices, plans and soil information, accompanies the data and are made available on the e-RA website. e-RA was released externally to the wider scientific community in 2013 and this paper describes its development, content, curation and the access process for data users. Case studies illustrate the diverse applications of the data, including its original intended purposes and recent unforeseen applications. Usage monitoring demonstrates the data are of increasing interest. Future developments, including adopting FAIR data principles, are proposed as the resource is increasingly recognised as a unique archive of data relevant to sustainable agriculture, agroecology and the environment. PMID:29762552
SCOPE - Stellar Classification Online Public Exploration

NASA Astrophysics Data System (ADS)

Harenberg, Steven

2010-01-01

The Astronomical Photographic Data Archive (APDA) has been established to be the primary North American archive for the collections of astronomical photographic plates. Located at the Pisgah Astronomical Research Institute (PARI) in Rosman, NC, the archive contains hundreds of thousands of stellar spectra, many of which have never before been classified. To help classify the vast number of stars, the public is invited to participate in a distributed computing online environment called Stellar Classification Online - Public Exploration (SCOPE). Through a website, the participants will have a tutorial on stellar spectra and practice classifying. After practice, the participants classify spectra on photographic plates uploaded online from APDA. These classifications will be recorded in a database where the results from many users will be statistically analyzed. Stars with known spectral types will be included to test the reliability of classifications. The process of building the database of stars from APDA, which the citizen scientist will be able to classify, includes: scanning the photographic plates, orienting the plate to correct for the change in right ascension/declination using Aladin, stellar HD catalog identification using Simbad, marking the boundaries for each spectrum, and setting up the image for use on the website. We will describe the details of this process.
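One way the combination of many users' classifications with control stars of known type might work is sketched below. This is an assumption for illustration only: the abstract does not describe the statistical method, and the function names, the vote tuple layout and the `default_weight` for users without control votes are hypothetical.

```python
from collections import defaultdict

def user_reliability(votes, known_types):
    """Fraction of each user's votes on control stars that match the known type."""
    hits, totals = defaultdict(int), defaultdict(int)
    for user, star, spectral_type in votes:
        if star in known_types:
            totals[user] += 1
            hits[user] += spectral_type == known_types[star]
    return {user: hits[user] / totals[user] for user in totals}

def consensus(votes, known_types, default_weight=0.5):
    """Reliability-weighted vote for every star without a known type."""
    weights = user_reliability(votes, known_types)
    tallies = defaultdict(lambda: defaultdict(float))
    for user, star, spectral_type in votes:
        if star not in known_types:
            # Users never tested on a control star get the assumed default weight.
            tallies[star][spectral_type] += weights.get(user, default_weight)
    return {star: max(t, key=t.get) for star, t in tallies.items()}
```

Each vote is a `(user, star, spectral_type)` tuple; a user who misclassifies control stars contributes less to the consensus for unclassified stars.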
Commercial Building Energy Saver, Web App

DOE Office of Scientific and Technical Information (OSTI.GOV)

Hong, Tianzhen; Piette, Mary; Lee, Sang Hoon

The CBES App is a web-based toolkit for use by small businesses and building owners and operators of small and medium size commercial buildings to perform energy benchmarking and retrofit analysis for buildings. The CBES App analyzes the energy performance of the user's building pre- and post-retrofit, in conjunction with the user's input data, to identify recommended retrofit measures, energy savings and economic analysis for the selected measures. The CBES App provides energy benchmarking, including getting an EnergyStar score using the EnergyStar API and benchmarking against California peer buildings using the EnergyIQ API. The retrofit analysis includes a preliminary analysis by looking up retrofit measures from a pre-simulated database DEEP, and a detailed analysis creating and running EnergyPlus models to calculate energy savings of retrofit measures. The CBES App builds upon the LBNL CBES API.
A new UKIDSS proper motion survey and key early results, including new benchmark systems

NASA Astrophysics Data System (ADS)

Smith, L.; Lucas, P.; Burningham, B.; Jones, H.; Pinfield, D.; Smart, R.; Andrei, A.

We present a proper motion catalogue for the 1500 deg2 of 2 epoch J-band UKIDSS Large Area Survey (LAS) data, which includes 120,000 stellar sources with motions detected above the 5-sigma level. Our upper limit on proper motion detection is 3.3 arcsec yr-1 and typical uncertainties are of order 10 mas yr-1 for bright sources from data with a modest 1.8-7.0 year epoch baseline. We developed a bespoke proper motion pipeline which applies a source-unique second order polynomial transformation to UKIDSS array coordinates to counter potential local non-uniformity in the focal plane. Our catalogue agrees well with the proper motion data supplied in the current WFCAM Science Archive (WSA) tenth data release (DR10) catalogue where there is overlap, and in various optical catalogues, but it benefits from some improvements, such as a larger matching radius and relative to absolute proper motion correction. We present proper motion results for 128 T dwarfs in the UKIDSS LAS and key early results of projects utilising our catalogue, in particular searches for brown dwarf benchmark systems through cross matches with existing proper motion catalogues. We report the discovery of two new T dwarf benchmark systems.
49 CFR 8.23 - Classified information transferred to the Department of Transportation.

Code of Federal Regulations, 2013 CFR

2013-10-01

... transferred to the National Archives and Records Administration (NARA) will be declassified or downgraded by... with the Archivist in efforts to establish a Government-wide database of information that has been...
49 CFR 8.23 - Classified information transferred to the Department of Transportation.

Code of Federal Regulations, 2012 CFR

2012-10-01

... transferred to the National Archives and Records Administration (NARA) will be declassified or downgraded by... with the Archivist in efforts to establish a Government-wide database of information that has been...
49 CFR 8.23 - Classified information transferred to the Department of Transportation.

Code of Federal Regulations, 2011 CFR

2011-10-01

... transferred to the National Archives and Records Administration (NARA) will be declassified or downgraded by... with the Archivist in efforts to establish a Government-wide database of information that has been...
49 CFR 8.23 - Classified information transferred to the Department of Transportation.

Code of Federal Regulations, 2010 CFR

2010-10-01

... transferred to the National Archives and Records Administration (NARA) will be declassified or downgraded by... with the Archivist in efforts to establish a Government-wide database of information that has been...
Examples of Use of SINBAD Database for Nuclear Data and Code Validation

NASA Astrophysics Data System (ADS)

Kodeli, Ivan; Žerovnik, Gašper; Milocco, Alberto

2017-09-01

The SINBAD database currently contains compilations and evaluations of over 100 shielding benchmark experiments. The SINBAD database is widely used for code and data validation. Materials covered include: Air, N, O, H2O, Al, Be, Cu, graphite, concrete, Fe, stainless steel, Pb, Li, Ni, Nb, SiC, Na, W, V and mixtures thereof. Over 40 organisations from 14 countries and 2 international organisations have contributed data and work in support of SINBAD. Examples of the use of the database in the scope of different international projects, such as the Working Party on Evaluation Cooperation of the OECD and the European Fusion Programme, demonstrate the merit and possible usage of the database for the validation of modern nuclear data evaluations and new computer codes.
Benchmarking desktop and mobile handwriting across COTS devices: The e-BioSign biometric database

PubMed Central

Tolosana, Ruben; Vera-Rodriguez, Ruben; Fierrez, Julian; Morales, Aythami; Ortega-Garcia, Javier

2017-01-01

This paper describes the design, acquisition process and baseline evaluation of the new e-BioSign database, which includes dynamic signature and handwriting information. Data is acquired from 5 different COTS devices: three Wacom devices (STU-500, STU-530 and DTU-1031) specifically designed to capture dynamic signatures and handwriting, and two general purpose tablets (Samsung Galaxy Note 10.1 and Samsung ATIV 7). For the two Samsung tablets, data is collected using both a pen stylus and the finger in order to study the performance of signature verification in a mobile scenario. Data was collected in two sessions for 65 subjects, and includes dynamic information of the signature, the full name and alphanumeric sequences. Skilled forgeries were also performed for signatures and full names. We also report a benchmark evaluation based on e-BioSign for person verification under three different real scenarios: 1) intra-device, 2) inter-device, and 3) mixed writing-tool. We have evaluated the proposed benchmark using the main existing approaches for signature verification: feature- and time functions-based. As a result, new insights into the problem of signature biometrics in sensor-interoperable scenarios have been obtained, namely: the importance of specific methods for dealing with device interoperability, and the necessity of a deeper analysis on signatures acquired using the finger as the writing tool. This e-BioSign public database allows the research community to: 1) further analyse and develop signature verification systems in realistic scenarios, and 2) work towards a better understanding of the nature of human handwriting when captured using electronic COTS devices in realistic conditions. PMID:28475590
Human Thermal Model Evaluation Using the JSC Human Thermal Database

NASA Technical Reports Server (NTRS)

Cognata, T.; Bue, G.; Makinen, J.

2011-01-01

The human thermal database developed at the Johnson Space Center (JSC) is used to evaluate a set of widely used human thermal models. This database will facilitate a more accurate evaluation of human thermoregulatory response in a variety of situations, including those situations that might otherwise prove too dangerous for actual testing, such as extreme hot or cold splashdown conditions. This set includes the Wissler human thermal model, a model that has been widely used to predict the human thermoregulatory response to a variety of cold and hot environments. These models are statistically compared to the current database, which contains experiments of human subjects primarily in air from a literature survey ranging between 1953 and 2004 and from a suited experiment recently performed by the authors, for a quantitative study of relative strength and predictive quality of the models. Human thermal modeling has considerable long term utility to human space flight. Such models provide a tool to predict crew survivability in support of vehicle design and to evaluate crew response in untested environments. It is to the benefit of any such model not only to collect relevant experimental data to correlate it against, but also to maintain an experimental standard or benchmark for future development in a readily and rapidly searchable and software accessible format. The Human Thermal Database project is intended to do just that: to collect relevant data from literature and experimentation and to store the data in a database structure for immediate and future use as a benchmark to judge human thermal models against, in identifying model strengths and weaknesses, to support model development and improve correlation, and to statistically quantify a model's predictive quality.
Text mining facilitates database curation - extraction of mutation-disease associations from Bio-medical literature.

PubMed

Ravikumar, Komandur Elayavilli; Wagholikar, Kavishwar B; Li, Dingcheng; Kocher, Jean-Pierre; Liu, Hongfang

2015-06-06

Advances in next generation sequencing technology have accelerated the pace of individualized medicine (IM), which aims to incorporate genetic/genomic information into medicine. One immediate need in interpreting sequencing data is the assembly of information about genetic variants and their corresponding associations with other entities (e.g., diseases or medications). Even with dedicated effort to capture such information in biological databases, much of this information remains 'locked' in the unstructured text of biomedical publications. There is a substantial lag between the publication and the subsequent abstraction of such information into databases. Multiple text mining systems have been developed, but most of them focus on the sentence level association extraction with performance evaluation based on gold standard text annotations specifically prepared for text mining systems. We developed and evaluated a text mining system, MutD, which extracts protein mutation-disease associations from MEDLINE abstracts by incorporating discourse level analysis, using a benchmark data set extracted from curated database records. MutD achieves an F-measure of 64.3% for reconstructing protein mutation disease associations in curated database records. The discourse level analysis component of MutD contributed to a gain of more than 10% in F-measure when compared against the sentence level association extraction. Our error analysis indicates that 23 of the 64 precision errors are true associations that were not captured by database curators and 68 of the 113 recall errors are caused by the absence of associated disease entities in the abstract. After adjusting for the defects in the curated database, the revised F-measure of MutD in association detection reaches 81.5%. Our quantitative analysis reveals that MutD can effectively extract protein mutation disease associations when benchmarking based on curated database records. The analysis also demonstrates that incorporating discourse level analysis significantly improved the performance of extracting the protein-mutation-disease association. Future work includes the extension of MutD for full text articles.
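For readers unfamiliar with the metric, the sketch below shows how an F-measure responds when errors are reclassified. The error counts (64, 113, 23, 68) come from the abstract, but the true-positive count is not reported there, so `tp = 160` is a made-up placeholder. The adjustment rule is one plausible reading and not necessarily the authors' procedure, so the printed values are not expected to reproduce the reported 64.3% and 81.5%.

```python
def f_measure(tp, fp, fn):
    """Balanced F-measure from true positives, false positives, false negatives."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

def adjusted_counts(tp, fp, fn, fp_actually_true, fn_unrecoverable):
    """Reclassify errors attributed to the reference data rather than the system.

    Assumed rule: false positives that turn out to be true associations become
    true positives; false negatives that cannot be recovered from the abstract
    text are dropped from the evaluation.
    """
    return tp + fp_actually_true, fp - fp_actually_true, fn - fn_unrecoverable

# fp and fn are from the abstract; tp is a hypothetical placeholder.
tp, fp, fn = 160, 64, 113
baseline = f_measure(tp, fp, fn)
revised = f_measure(*adjusted_counts(tp, fp, fn, fp_actually_true=23, fn_unrecoverable=68))
print(f"baseline F = {baseline:.3f}, revised F = {revised:.3f}")
```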
An automated, web-enabled and searchable database system for archiving electrogram and related data from implantable cardioverter defibrillators.

PubMed

Zong, W; Wang, P; Leung, B; Moody, G B; Mark, R G

2002-01-01

The advent of implantable cardioverter defibrillators (ICDs) has resulted in significant reductions in mortality in patients at high risk for sudden cardiac death. Extensive related basic research and clinical investigation continue. ICDs typically record intracardiac electrograms and inter-beat intervals along with device settings during episodes of device delivery of therapy. Researchers wishing to study these data further have until now been limited to viewing paper plots. In support of multi-center clinical studies of patients with ICDs, we have developed a web based searchable ICD data archiving system, which allows users to use a web browser to upload ICD data from diskettes to a server where the data are automatically processed and archived. Users can view and download the archived ICD data directly via the web. The entire system is built from open source software. At present more than 500 patient ICD data sets have been uploaded to and archived in the system. This project will be of value not only to those who wish to conduct research using ICD data, but also to clinicians who need to archive and review ICD data collected from their patients.
Database systems for knowledge-based discovery.

PubMed

Jagarlapudi, Sarma A R P; Kishan, K V Radha

2009-01-01

Several database systems have been developed to provide valuable information from the bench chemist to biologist, medical practitioner to pharmaceutical scientist in a structured format. The advent of information technology and computational power enhanced the ability to access large volumes of data in the form of a database where one could do compilation, searching, archiving, analysis, and finally knowledge derivation. Although, data are of variable types the tools used for database creation, searching and retrieval are similar. GVK BIO has been developing databases from publicly available scientific literature in specific areas like medicinal chemistry, clinical research, and mechanism-based toxicity so that the structured databases containing vast data could be used in several areas of research. These databases were classified as reference centric or compound centric depending on the way the database systems were designed. Integration of these databases with knowledge derivation tools would enhance the value of these systems toward better drug design and discovery.
Organizations challenged by global database development

USGS Publications Warehouse

Sturdevant, J.A.; Eidenshink, J.C.; Loveland, Thomas R.

1991-01-01

Several international programs have identified the need for a global 1-kilometer spatial database for land cover and land characterization studies. In 1992, the US Geological Survey (USGS) EROS Data Center (EDC), the European Space Agency (ESA), the National Oceanic and Atmospheric Administration (NOAA) and the National Aeronautics and Space Administration (NASA) will collect and archive all 1-kilometer Advanced Very High Resolution Radiometer (AVHRR) data acquired during afternoon orbital passes over land.
Globalization, Polanyi, and the Chinese Yuan

DTIC Science & Technology

2007-12-01

international markets. The U.S. Bureau of Economic Analysis database and a doctorate study from the Centre for Strategic Economic Studies provides FDI...Archive/2004/Sep/23-184387.html (accessed June 12, 2007). 99 ŖND LD: APEC Finance Chiefs Paper over Forex Flexibility Issue," Kyodo News...108th Congress (2003): To authorize appropriate action if the negotiations with the People’s Republic of China. GovTrack.us ( database of federal
Leukemia and brain tumors among children after radiation exposure from CT scans: design and methodological opportunities of the Dutch Pediatric CT Study.

PubMed

Meulepas, Johanna M; Ronckers, Cécile M; Smets, Anne M J B; Nievelstein, Rutger A J; Jahnen, Andreas; Lee, Choonsik; Kieft, Mariëtte; Laméris, Johan S; van Herk, Marcel; Greuter, Marcel J W; Jeukens, Cécile R L P N; van Straten, Marcel; Visser, Otto; van Leeuwen, Flora E; Hauptmann, Michael

2014-04-01

Computed tomography (CT) scans are indispensable in modern medicine; however, the spectacular rise in global use coupled with relatively high doses of ionizing radiation per examination have raised radiation protection concerns. Children are of particular concern because they are more sensitive to radiation-induced cancer compared with adults and have a long lifespan to express harmful effects which may offset clinical benefits of performing a scan. This paper describes the design and methodology of a nationwide study, the Dutch Pediatric CT Study, regarding risk of leukemia and brain tumors in children after radiation exposure from CT scans. It is a retrospective record-linkage cohort study with an expected number of 100,000 children who received at least one electronically archived CT scan covering the calendar period since the introduction of digital archiving until 2012. Information on all archived CT scans of these children will be obtained, including date of examination, scanned body part and radiologist's report, as well as the machine settings required for organ dose estimation. We will obtain cancer incidence by record linkage with external databases. In this article, we describe several approaches to the collection of data on archived CT scans, the estimation of radiation doses and the assessment of confounding. The proposed approaches provide useful strategies for data collection and confounder assessment for general retrospective record-linkage studies, particularly those using hospital databases on radiological procedures for the assessment of exposure to ionizing or non-ionizing radiation.
Integration Of An MR Image Network Into A Clinical PACS

NASA Astrophysics Data System (ADS)

Ratib, Osman M.; Mankovich, Nicholas J.; Taira, Ricky K.; Cho, Paul S.; Huang, H. K.

1988-06-01

A direct link between a clinical pediatric PACS module and a FONAR MRI image network was implemented. The original MR network combines the MR scanner, a remote viewing station and a central archiving station. The pediatric PACS directly connects to the archiving unit through an Ethernet TCP/IP network adhering to FONAR's protocol. The PACS communication software developed supports the transfer of patient studies and the patient information directly from the MR archive database to the pediatric PACS. In the first phase of our project we developed a package to transfer data between a VAX-11/750 and the IBM PC/AT-based MR archive database through the Ethernet network. This system served as a model for PACS-to-modality network communication. Once testing was complete on this research network, the software and network hardware were moved to the clinical pediatric VAX for full PACS integration. In parallel to the direct transmission of digital images to the Pediatric PACS, a broadband communication system in video format was developed for real-time broadcasting of images originating from the MR console to 8 remote viewing stations distributed in the radiology department. These analog viewing stations allow the radiologists to directly monitor patient positioning and to select the scan levels during a patient examination from remote locations in the radiology department. This paper reports (1) the technical details of this implementation, (2) the merits of this network development scheme, and (3) the performance statistics of the network-to-PACS interface.
Archival Research Capabilities of the WFIRST Data Set

NASA Astrophysics Data System (ADS)

Szalay, Alexander

WFIRST's unique combination of a large (~0.3 deg²) field of view and HST-like angular resolution and sensitivity in the near infrared will produce spectacular new insights into the origins of stars, galaxies, and structure in the cosmos. We propose a WFIRST Archive Science Investigation Team (SIT-F) to define an archival, query, and analysis system that will enable scientific discovery in all relevant areas of astrophysics and maximize the overall scientific yield of the mission. Guest investigators (GIs), guest observers (GOs), the WFIRST SITs, WFIRST Science Center(s), and astronomers using data from other surveys will all benefit from the extensive, easy, fast and reliable use of the WFIRST archives. We propose to develop the science requirements for the archive and work to understand its interactions with other elements of the WFIRST mission. To accomplish this, we will conduct case studies to derive performance requirements for the WFIRST archives. These will clarify what is needed for GIs to make important scientific discoveries across a broad range of astrophysics. While other SITs will primarily address the science capabilities of the WFIRST instruments, we will look ahead to the science-enabling capabilities of the WFIRST archives. We will demonstrate how the archive can be optimized to take advantage of the extraordinary science capabilities of the WFIRST instruments as well as major space and ground observatories to maximize the science return of the mission. We will use the "20 queries" methodology, formulated by Jim Gray, to cover the most important science analysis patterns and use these to establish the performance required of the WFIRST archive. The case studies will be centered on studying galaxy evolution as a function of cosmic time, environment and intrinsic properties. 
The analyses will require massive angular and spatial cross correlations between key galaxy properties to search for new fundamental scaling relations that may only become apparent when exploring a database of 10^8 galaxies with multiband photometry and grism spectroscopy. The case studies will require (i) the creation of a unified WFIRST object catalog consisting of data cross-matched to external catalogs, (ii) an easy-to-access, scalable database, utilizing the latest data discovery and querying techniques, (iii) in situ analyses of large and/or complex data, (iv) identification of links to supporting data and enabling queries spanning WFIRST and other databases, (v) combining simulations with modeling software. To accomplish these objectives, we will prototype a system capable of executing complex user-defined scripts including database access to a shared computational facility with tools for joining WFIRST to other surveys, also enabling comparisons to physical models. Our organizational plan divides the work into several general areas where our team members have specific expertise: (a) apply the 20 queries methodology to derive performance and functionality requirements, (b) develop a practical interactive server-side query system, built on our SDSS experience, (c) apply advanced cross-matching techniques, (d) create mock WFIRST imaging and grism data, (e) develop high-level cross correlation tools, (f) optimize scripting systems using high-level languages (iPython), (g) perform close integration of cosmological simulations with observational data, (h) apply advanced machine learning techniques. Our efforts will be coordinated with the WFIRST Science Center (WSC), the other SITs, and the broader community in a manner consistent with direction and review of the Project Office. We will publish our results as milestones are reached, and issue progress reports on a regular basis. 
We will represent SIT-F at all relevant meetings including meetings of the other SITs (SITs A-E), and participate in "Big Data" conferences to interact with others in the field and learn new techniques that might be applicable to WFIRST.
Two new computational methods for universal DNA barcoding: a benchmark using barcode sequences of bacteria, archaea, animals, fungi, and land plants.

PubMed

Tanabe, Akifumi S; Toju, Hirokazu

2013-01-01

Taxonomic identification of biological specimens based on DNA sequence information (a.k.a. DNA barcoding) is becoming increasingly common in biodiversity science. Although several methods have been proposed, many of them are not universally applicable due to the need for prerequisite phylogenetic/machine-learning analyses, the need for huge computational resources, or the lack of a firm theoretical background. Here, we propose two new computational methods of DNA barcoding and show a benchmark for bacterial/archaeal 16S, animal COX1, fungal internal transcribed spacer, and three plant chloroplast (rbcL, matK, and trnH-psbA) barcode loci that can be used to compare the performance of existing and new methods. The benchmark was performed under two alternative situations: query sequences were available in the corresponding reference sequence databases in one, but were not available in the other. In the former situation, the commonly used "1-nearest-neighbor" (1-NN) method, which assigns the taxonomic information of the most similar sequences in a reference database (i.e., BLAST-top-hit reference sequence) to a query, displays the highest rate and highest precision of successful taxonomic identification. However, in the latter situation, the 1-NN method produced extremely high rates of misidentification for all the barcode loci examined. In contrast, one of our new methods, the query-centric auto-k-nearest-neighbor (QCauto) method, consistently produced low rates of misidentification for all the loci examined in both situations. These results indicate that the 1-NN method is most suitable if the reference sequences of all potentially observable species are available in databases; otherwise, the QCauto method returns the most reliable identification results. The benchmark results also indicated that the taxon coverage of reference sequences is far from complete for genus or species level identification in all the barcode loci examined. 
Therefore, we need to accelerate the registration of reference barcode sequences to apply high-throughput DNA barcoding to genus or species level identification in biodiversity research.
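The 1-NN assignment described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the position-wise identity score is a toy stand-in for a BLAST top hit, and the function names, reference identifiers, taxon labels, and sequences are all hypothetical. QCauto is not sketched because the abstract does not describe its procedure.

```python
from typing import Dict, Tuple


def identity(a: str, b: str) -> float:
    """Toy similarity: fraction of matching positions over the shorter sequence.

    Assumption: a stand-in for a real alignment score such as a BLAST hit.
    """
    n = min(len(a), len(b))
    if n == 0:
        return 0.0
    return sum(x == y for x, y in zip(a, b)) / n


def one_nn_assign(query: str,
                  references: Dict[str, Tuple[str, str]]) -> Tuple[str, float]:
    """Return (taxon, similarity) of the single most similar reference."""
    best_taxon, best_score = "", -1.0
    for _ref_id, (taxon, seq) in references.items():
        score = identity(query, seq)
        if score > best_score:
            best_taxon, best_score = taxon, score
    return best_taxon, best_score


# Hypothetical reference database: id -> (taxon label, barcode sequence).
references = {
    "ref1": ("Genus_a species_x", "ACGTACGTAC"),
    "ref2": ("Genus_b species_y", "ACGTTTGTAC"),
    "ref3": ("Genus_c species_z", "TTTTACGGGG"),
}

# Query present in the database: exact match.
print(one_nn_assign("ACGTACGTAC", references))
# Query absent from the database: 1-NN still returns a taxon, with low similarity.
print(one_nn_assign("GGGGGGGGGG", references))
```

The second call shows why 1-NN can misidentify: it always returns the nearest reference, even when no close relative of the query has been registered.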
Two New Computational Methods for Universal DNA Barcoding: A Benchmark Using Barcode Sequences of Bacteria, Archaea, Animals, Fungi, and Land Plants

PubMed Central

Tanabe, Akifumi S.; Toju, Hirokazu

2013-01-01

Taxonomic identification of biological specimens based on DNA sequence information (a.k.a. DNA barcoding) is becoming increasingly common in biodiversity science. Although several methods have been proposed, many of them are not universally applicable due to the need for prerequisite phylogenetic/machine-learning analyses, the need for huge computational resources, or the lack of a firm theoretical background. Here, we propose two new computational methods of DNA barcoding and show a benchmark for bacterial/archaeal 16S, animal COX1, fungal internal transcribed spacer, and three plant chloroplast (rbcL, matK, and trnH-psbA) barcode loci that can be used to compare the performance of existing and new methods. The benchmark was performed under two alternative situations: query sequences were available in the corresponding reference sequence databases in one, but were not available in the other. In the former situation, the commonly used “1-nearest-neighbor” (1-NN) method, which assigns the taxonomic information of the most similar sequences in a reference database (i.e., BLAST-top-hit reference sequence) to a query, displays the highest rate and highest precision of successful taxonomic identification. However, in the latter situation, the 1-NN method produced extremely high rates of misidentification for all the barcode loci examined. In contrast, one of our new methods, the query-centric auto-k-nearest-neighbor (QCauto) method, consistently produced low rates of misidentification for all the loci examined in both situations. These results indicate that the 1-NN method is most suitable if the reference sequences of all potentially observable species are available in databases; otherwise, the QCauto method returns the most reliable identification results. The benchmark results also indicated that the taxon coverage of reference sequences is far from complete for genus or species level identification in all the barcode loci examined. 
Therefore, we need to accelerate the registration of reference barcode sequences to apply high-throughput DNA barcoding to genus or species level identification in biodiversity research. PMID:24204702
2011 Tohoku, Japan tsunami data available from the National Oceanic and Atmospheric Administration/National Geophysical Data Center

NASA Astrophysics Data System (ADS)

Dunbar, P. K.; Mccullough, H. L.; Mungov, G.; Harris, E.

2012-12-01

The U.S. National Oceanic and Atmospheric Administration (NOAA) has primary responsibility for providing tsunami warnings to the Nation, and a leadership role in tsunami observations and research. A key component of this effort is easy access to authoritative data on past tsunamis, a responsibility of the National Geophysical Data Center (NGDC) and collocated World Service for Geophysics. Archive responsibilities include the global historical tsunami database, coastal tide-gauge data from US/NOAA operated stations, the Deep-ocean Assessment and Reporting of Tsunami (DART®) data, damage photos, as well as other related hazards data. Taken together, this integrated archive supports tsunami forecast, warning, research, mitigation and education efforts of NOAA and the Nation. Understanding the severity and timing of tsunami effects is important for tsunami hazard mitigation and warning. The global historical tsunami database includes the date, time, and location of the source event, magnitude of the source, event validity, maximum wave height, the total number of fatalities and dollar damage. The database contains additional information on run-ups (locations where tsunami waves were observed by eyewitnesses, field reconnaissance surveys, tide gauges, or deep ocean sensors). The run-up table includes arrival times, distance from the source, measurement type, maximum wave height, and the number of fatalities and damage for the specific run-up location. Tide gauge data are required for modeling the interaction of tsunami waves with the coast and for verifying propagation and inundation models. NGDC is the long-term archive for all NOAA coastal tide gauge data and is currently archiving 15-second to 1-minute water level data from the NOAA Center for Operational Oceanographic Products and Services (CO-OPS) and the NOAA Tsunami Warning Centers. 
DART® buoys, which are essential components of tsunami warning systems, are now deployed in all oceans, giving coastal communities faster and more accurate tsunami warnings. NOAA's National Data Buoy Center disseminates real-time DART® data and NGDC processes and archives post-event 15-second high-resolution bottom pressure time series data. An event-specific archive of DART® observations recorded during recent significant tsunamis, including the March 2011 Tohoku, Japan event, is now available through new tsunami event pages integrated with the NGDC global historical tsunami database. These pages are developed to deliver comprehensive summaries of each tsunami event, including socio-economic impacts, tsunami travel time maps, raw observations, de-tided residuals, spectra of the tsunami signal compared to the energy of the background noise, and wavelets. These data are invaluable to tsunami researchers and educators as they are essential to providing a more thorough understanding of tsunamis and their propagation in the open ocean and subsequent inundation of coastal communities. NGDC has collected 289 tide gauge observations, 34 Deep-ocean Assessment and Reporting of Tsunami (DART®) and bottom pressure recorder (BPR) station observations, and over 5,000 eyewitness reports and post-tsunami field survey measurements for the 2011 Tohoku event.
[Data supporting quality circle management of inpatient depression treatment].

PubMed

Brand, S; Härter, M; Sitta, P; van Calker, D; Menke, R; Heindl, A; Herold, K; Kudling, R; Luckhaus, C; Rupprecht, U; Sanner, Dirk; Schmitz, D; Schramm, E; Berger, M; Gaebel, W; Schneider, F

2005-07-01

Several quality assurance initiatives in health care have been undertaken in recent years. The next step consists of systematically combining single initiatives in order to build up strategic quality management. In a German multicenter study, the quality of inpatient depression treatment was measured in ten psychiatric hospitals. Half of the hospitals received comparative feedback on their individual results in comparison to the other hospitals (benchmarking). These benchmarks were used by each hospital as a statistical basis for in-house quality work, to improve the quality of depression treatment. According to hospital differences concerning procedure and outcome, different goals were chosen. There were also differences with respect to structural characteristics, strategies, and outcome. The feedback from participants about data-based quality circles in general and the availability of benchmarking data was positive. The necessity of carefully choosing quality circle members and professional moderation became obvious. Data-based quality circles including benchmarking have proven to be useful for quality management in inpatient depression care.
32 CFR 2001.42 - Standards for security equipment.

Code of Federal Regulations, 2011 CFR

2011-07-01

... OVERSIGHT OFFICE, NATIONAL ARCHIVES AND RECORDS ADMINISTRATION CLASSIFIED NATIONAL SECURITY INFORMATION... Administration (GSA) shall, in coordination with agency heads originating classified information, establish and publish uniform standards, specifications, qualified product lists or databases, and supply schedules for...
32 CFR 2001.42 - Standards for security equipment.

Code of Federal Regulations, 2012 CFR

2012-07-01

... OVERSIGHT OFFICE, NATIONAL ARCHIVES AND RECORDS ADMINISTRATION CLASSIFIED NATIONAL SECURITY INFORMATION... Administration (GSA) shall, in coordination with agency heads originating classified information, establish and publish uniform standards, specifications, qualified product lists or databases, and supply schedules for...
32 CFR 2001.42 - Standards for security equipment.

Code of Federal Regulations, 2013 CFR

2013-07-01

... OVERSIGHT OFFICE, NATIONAL ARCHIVES AND RECORDS ADMINISTRATION CLASSIFIED NATIONAL SECURITY INFORMATION... Administration (GSA) shall, in coordination with agency heads originating classified information, establish and publish uniform standards, specifications, qualified product lists or databases, and supply schedules for...
32 CFR 2001.42 - Standards for security equipment.

Code of Federal Regulations, 2014 CFR

2014-07-01

... OVERSIGHT OFFICE, NATIONAL ARCHIVES AND RECORDS ADMINISTRATION CLASSIFIED NATIONAL SECURITY INFORMATION... Administration (GSA) shall, in coordination with agency heads originating classified information, establish and publish uniform standards, specifications, qualified product lists or databases, and supply schedules for...
GAUDI: A Preparatory Archive for the COROT Mission

NASA Astrophysics Data System (ADS)

Solano, E.; Catala, C.; Garrido, R.; Poretti, E.; Janot-Pacheco, E.; Gutiérrez, R.; González, R.; Mantegazza, L.; Neiner, C.; Fremat, Y.; Charpinet, S.; Weiss, W.; Amado, P. J.; Rainer, M.; Tsymbal, V.; Lyashko, D.; Ballereau, D.; Bouret, J. C.; Hua, T.; Katz, D.; Lignières, F.; Lüftinger, T.; Mittermayer, P.; Nesvacil, N.; Soubiran, C.; van't Veer-Menneret, C.; Goupil, M. J.; Costa, V.; Rolland, A.; Antonello, E.; Bossi, M.; Buzzoni, A.; Rodrigo, C.; Aerts, C.; Butler, C. J.; Guenther, E.; Hatzes, A.

2005-01-01

The GAUDI database (Ground-based Asteroseismology Uniform Database Interface) is a preparatory archive for the COROT (Convection, Rotation, and Planetary Transits) mission developed at the Laboratorio de Astrofísica Espacial y Física Fundamental (Laboratory for Space Astrophysics and Theoretical Physics, Spain). Its intention is to make the ground-based observations obtained in preparation of the asteroseismology program available in a simple and efficient way. It contains spectroscopic and photometric data together with inferred physical parameters for more than 1500 objects gathered since January 1998 in 6 years of observational campaigns. In this paper, the main functions and characteristics of the system are described. Based on observations collected at La Silla (ESO proposals 67.D-0169, 69.D-0166, and 70.D-0110), Telescopio Nazionale Galileo (proposal 6-20-068), Observatoire de Haute-Provence, the South African Astronomical Observatory, Tautenburg Observatory, and Sierra Nevada Observatory.
COMET Multimedia modules and objects in the digital library system

NASA Astrophysics Data System (ADS)

Spangler, T. C.; Lamos, J. P.

2003-12-01

Over the past ten years of developing Web- and CD-ROM-based training materials, the Cooperative Program for Operational Meteorology, Education and Training (COMET) has created a unique archive of almost 10,000 multimedia objects and some 50 web-based interactive multimedia modules on various aspects of weather and weather forecasting. These objects and modules, containing illustrations, photographs, animations, video sequences, and audio files, are potentially a valuable resource for university faculty and students, forecasters, emergency managers, public school educators, and other individuals and groups needing such materials for educational use. The COMET Modules are available on the COMET educational web site http://www.meted.ucar.edu, and the COMET Multimedia Database (MMDB) makes a collection of the multimedia objects available in a searchable online database for viewing and download over the Internet. Some 3200 objects are already available at the MMDB Website: http://archive.comet.ucar.edu/moria/
Genetically optimizing weather predictions

NASA Astrophysics Data System (ADS)

Potter, S. B.; Staats, Kai; Romero-Colmenero, Encarni

2016-07-01

humidity, air pressure, wind speed and wind direction) into a database. Built upon this database, we have developed a remarkably simple approach to derive a functional weather predictor. The aim is to provide up-to-the-minute local weather predictions in order to, e.g., prepare dome environment conditions ready for night-time operations or plan, prioritize and update weather-dependent observing queues. In order to predict the weather for the next 24 hours, we take the current live weather readings and search the entire archive for similar conditions. Predictions are made against an averaged, subsequent 24 hours of the closest matches for the current readings. We use an Evolutionary Algorithm to optimize our formula through weighted parameters. The accuracy of the predictor is routinely tested and tuned against the full, updated archive to account for seasonal trends and total, climate shifts. The live (updated every 5 minutes) SALT weather predictor can be viewed here: http://www.saao.ac.za/ sbp/suthweather_predict.html
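The archive-search step described in this abstract can be sketched in Python as follows. This is a minimal illustration and not the SALT predictor's code: the function names, the weighted Euclidean distance, the value of k, and the toy readings are assumptions. The weights stand in for the parameters that the abstract says are tuned by an Evolutionary Algorithm; that tuning step is not implemented here.

```python
import math
from typing import List, Sequence

Reading = Sequence[float]  # one archived sample, e.g. (temperature, humidity)


def weighted_distance(a: Reading, b: Reading, weights: Sequence[float]) -> float:
    """Weighted Euclidean distance between two readings (assumed metric)."""
    return math.sqrt(sum(w * (x - y) ** 2 for x, y, w in zip(a, b, weights)))


def predict(archive: List[Reading], current: Reading,
            weights: Sequence[float], horizon: int, k: int) -> List[List[float]]:
    """Average the `horizon` readings that followed the k closest past matches."""
    # Only past readings with a full `horizon` of successors can be used.
    candidates = range(len(archive) - horizon)
    ranked = sorted(candidates,
                    key=lambda i: weighted_distance(archive[i], current, weights))
    matches = ranked[:k]
    if not matches:
        raise ValueError("archive too short for the requested horizon")
    n_vars = len(current)
    forecast = []
    for step in range(1, horizon + 1):
        forecast.append([
            sum(archive[i + step][v] for i in matches) / len(matches)
            for v in range(n_vars)
        ])
    return forecast


# Toy archive of (temperature, humidity) readings at a fixed sampling interval.
archive = [(10.0, 50.0), (11.0, 52.0), (12.0, 54.0),
           (10.0, 50.0), (13.0, 56.0), (14.0, 58.0)]
print(predict(archive, current=(10.0, 50.0), weights=(1.0, 1.0), horizon=2, k=2))
```

With readings every 5 minutes, a 24-hour forecast would correspond to a horizon of 288 steps; the toy call uses 2 steps only to keep the example small.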

LBT Distributed Archive: Status and Features

NASA Astrophysics Data System (ADS)

Knapic, C.; Smareglia, R.; Thompson, D.; Grede, G.

2011-07-01

After the first release of the LBT Distributed Archive, this successful collaboration is continuing within the LBT corporation. The IA2 (Italian Center for Astronomical Archive) team has updated the LBT DA with new features in order to facilitate user data retrieval while abiding by VO standards. To facilitate the integration of data from any new instruments, we have migrated to a new database, developed new data distribution software, and enhanced features in the LBT User Interface. The DBMS engine has been changed to MySQL. Consequently, the data handling software now uses Java thread technology to update and synchronize the main storage archives on Mt. Graham and in Tucson, as well as archives in Trieste and Heidelberg, with all metadata and proprietary data. The LBT UI has been updated with additional features allowing users to search by instrument and some of the more important characteristics of the images. Finally, instead of a simple cone search service over all LBT image data, new instrument specific SIAP and cone search services have been developed. They will be published in the IVOA framework later this fall.
Life Sciences Data Archive (LSDA)

NASA Technical Reports Server (NTRS)

Fitts, M.; Johnson-Throop, Kathy; Thomas, D.; Shackelford, K.

2008-01-01

In the early days of spaceflight, space life sciences data were collected and stored in numerous databases, formats, media-types and geographical locations. While serving the needs of individual research teams, these data were largely unknown/unavailable to the scientific community at large. As a result, the Space Act of 1958 and the Science Data Management Policy mandated that research data collected by the National Aeronautics and Space Administration be made available to the science community at large. The Biomedical Informatics and Health Care Systems Branch of the Space Life Sciences Directorate at JSC and the Data Archive Project at ARC, with funding from the Human Research Program through the Exploration Medical Capability Element, are fulfilling these requirements through the systematic population of the Life Sciences Data Archive. This program constitutes a formal system for the acquisition, archival and distribution of data for Life Sciences-sponsored experiments and investigations. The general goal of the archive is to acquire, preserve, and distribute these data using a variety of media which are accessible and responsive to inquiries from the science communities.
Managing an Archive of Images

NASA Technical Reports Server (NTRS)

Andres, Vince; Walter, David; Hallal, Charles; Jones, Helene; Callac, Chris

2004-01-01

The SSC Multimedia Archive is an automated electronic system to manage images, acquired both by film and digital cameras, for the Public Affairs Office (PAO) at Stennis Space Center (SSC). Previously, the image archive was based on film photography and utilized a manual system that, by today's standards, had become inefficient and expensive. Now, the SSC Multimedia Archive, based on a server at SSC, contains both catalogs and images for pictures taken both digitally and with a traditional, film-based camera, along with metadata about each image. After a "shoot," a photographer downloads the images into the database. Members of the PAO can use a Web-based application to search, view and retrieve images, approve images for publication, and view and edit metadata associated with the images. Approved images are archived and cross-referenced with appropriate descriptions and information. Security is provided by allowing administrators to explicitly grant personnel access to only the components of the system that they need (e.g., only photographers may upload images, and only PAO-designated employees may approve images).
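The access rule described at the end of this abstract amounts to a role-to-permission table. A minimal Python sketch follows; the role names, action names, and the exact split of permissions are assumptions for illustration only and are not taken from the SSC system.

```python
# Hypothetical role -> permitted actions table (names are illustrative).
PERMISSIONS = {
    "photographer": {"upload", "search", "view"},
    "pao_member": {"search", "view", "retrieve", "edit_metadata"},
    "pao_approver": {"search", "view", "retrieve", "edit_metadata", "approve"},
}


def is_allowed(role: str, action: str) -> bool:
    """Return True only if an administrator has granted `action` to `role`."""
    return action in PERMISSIONS.get(role, set())


print(is_allowed("photographer", "upload"))   # granted in the table above
print(is_allowed("photographer", "approve"))  # not granted in the table above
```

Unknown roles receive an empty permission set, so access is denied by default unless explicitly granted.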
BioBenchmark Toyama 2012: an evaluation of the performance of triple stores on biological data

PubMed Central

2014-01-01

Background Biological databases vary enormously in size and data complexity, from small databases that contain a few million Resource Description Framework (RDF) triples to large databases that contain billions of triples. In this paper, we evaluate whether RDF native stores can be used to meet the needs of a biological database provider. Prior evaluations have used synthetic data with a limited database size. For example, the largest BSBM benchmark uses 1 billion synthetic e-commerce knowledge RDF triples on a single node. However, real-world biological data differs considerably from simple synthetic data. It is difficult to determine whether the synthetic e-commerce data is sufficient to represent biological databases. Therefore, for this evaluation, we used five real data sets from biological databases. Results We evaluated five triple stores, 4store, Bigdata, Mulgara, Virtuoso, and OWLIM-SE, with five biological data sets, Cell Cycle Ontology, Allie, PDBj, UniProt, and DDBJ, ranging in size from approximately 10 million to 8 billion triples. For each database, we loaded all the data into our single node and prepared the database for use in a classical data warehouse scenario. Then, we ran a series of SPARQL queries against each endpoint and recorded the execution time and the accuracy of the query response. Conclusions Our paper shows that with appropriate configuration Virtuoso and OWLIM-SE can satisfy the basic requirements to load and query biological data of fewer than about 8 billion triples on a single node, with simultaneous access by 64 clients. OWLIM-SE performs best for databases with approximately 11 million triples. For data sets that contain 94 million and 590 million triples, OWLIM-SE and Virtuoso perform best; neither shows an overwhelming advantage over the other. For data sets over 4 billion triples, Virtuoso works best. 
4store performs well on small data sets with limited features when the number of triples is less than 100 million, and our test shows its scalability is poor; Bigdata demonstrates average performance and is a good open source triple store for mid-sized data sets (around 500 million triples); Mulgara shows some fragility. PMID:25089180
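The measurement procedure in this abstract (run queries against an endpoint, record execution time and response accuracy, with many simultaneous clients) can be sketched as a small Python harness. This is an illustration under stated assumptions, not the BioBenchmark tooling: `run_query` is a hypothetical callable standing in for a real SPARQL client, accuracy is reduced to a row-count check, and `fake_endpoint` returns canned rows so the sketch runs without a triple store.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List, Tuple

QueryRunner = Callable[[str], list]  # hypothetical: takes SPARQL text, returns rows


def time_query(run_query: QueryRunner, query: str, expected_rows: int) -> Dict:
    """Run one query, recording wall-clock time and a simple correctness check."""
    start = time.perf_counter()
    rows = run_query(query)
    elapsed = time.perf_counter() - start
    return {"query": query, "seconds": elapsed,
            "correct": len(rows) == expected_rows}


def benchmark(run_query: QueryRunner, workload: List[Tuple[str, int]],
              clients: int) -> List[Dict]:
    """Have `clients` concurrent workers each run every query in the workload."""
    with ThreadPoolExecutor(max_workers=clients) as pool:
        futures = [pool.submit(time_query, run_query, query, expected)
                   for query, expected in workload
                   for _ in range(clients)]
        return [f.result() for f in futures]


# Stand-in endpoint with canned results (assumption: replaces a real triple store).
QUERY = "SELECT ?s WHERE { ?s ?p ?o } LIMIT 3"
CANNED = {QUERY: [("a",), ("b",), ("c",)]}


def fake_endpoint(query: str) -> list:
    return CANNED.get(query, [])


results = benchmark(fake_endpoint, [(QUERY, 3)], clients=4)
print(len(results), all(r["correct"] for r in results))
```

To mirror the setup in the abstract, `clients` would be set to 64 and `fake_endpoint` replaced by a function that submits the query to the store under test.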
BioBenchmark Toyama 2012: an evaluation of the performance of triple stores on biological data.

PubMed

Wu, Hongyan; Fujiwara, Toyofumi; Yamamoto, Yasunori; Bolleman, Jerven; Yamaguchi, Atsuko

2014-01-01

Biological databases vary enormously in size and data complexity, from small databases that contain a few million Resource Description Framework (RDF) triples to large databases that contain billions of triples. In this paper, we evaluate whether RDF native stores can be used to meet the needs of a biological database provider. Prior evaluations have used synthetic data with a limited database size. For example, the largest BSBM benchmark uses 1 billion synthetic e-commerce knowledge RDF triples on a single node. However, real-world biological data differs considerably from simple synthetic data. It is difficult to determine whether the synthetic e-commerce data is sufficient to represent biological databases. Therefore, for this evaluation, we used five real data sets from biological databases. We evaluated five triple stores, 4store, Bigdata, Mulgara, Virtuoso, and OWLIM-SE, with five biological data sets, Cell Cycle Ontology, Allie, PDBj, UniProt, and DDBJ, ranging in size from approximately 10 million to 8 billion triples. For each database, we loaded all the data into our single node and prepared the database for use in a classical data warehouse scenario. Then, we ran a series of SPARQL queries against each endpoint and recorded the execution time and the accuracy of the query response. Our paper shows that with appropriate configuration Virtuoso and OWLIM-SE can satisfy the basic requirements to load and query biological data of fewer than about 8 billion triples on a single node, with simultaneous access by 64 clients. OWLIM-SE performs best for databases with approximately 11 million triples. For data sets that contain 94 million and 590 million triples, OWLIM-SE and Virtuoso perform best; neither shows an overwhelming advantage over the other. For data sets over 4 billion triples, Virtuoso works best. 
4store performs well on small data sets with limited features when the number of triples is less than 100 million, and our test shows its scalability is poor; Bigdata demonstrates average performance and is a good open source triple store for mid-sized data sets (around 500 million triples); Mulgara shows some fragility.
An Archival COS Study of Multi-phase Galactic Outflows and Their Dependence on Host Galaxy Properties

NASA Astrophysics Data System (ADS)

Chisholm, John

2013-10-01

Galactic outflows have become vital for understanding galaxy evolution. Outflows have been used to explain the mass-metallicity relation, the star formation history of the universe, and the shape of the baryonic mass function. However, few studies have focused on the basic question of how outflow velocities depend upon the physical properties of their host galaxies. Here we propose an archival project utilizing 52 COS spectra of local star-forming galaxies spanning four decades of star formation rate and stellar mass. We will perform a self-consistent analysis of trends between galactic properties {star formation rate, stellar mass, specific star formation rate and star formation rate surface density} and outflow velocities measured from interstellar metal absorption lines {e.g., CII 1335}. We will extend this analysis to different gas phases - cold, warm, and hot - to gain a more comprehensive understanding of the physics of multi-phase outflows. The trends we observe will provide insights into the feedback process and will be crucial new benchmarks for simulations.
Considering Decision Variable Diversity in Multi-Objective Optimization: Application in Hydrologic Model Calibration

NASA Astrophysics Data System (ADS)

Sahraei, S.; Asadzadeh, M.

2017-12-01

Any modern multi-objective global optimization algorithm should be able to archive a well-distributed set of solutions. While the solution diversity in the objective space has been explored extensively in the literature, little attention has been given to the solution diversity in the decision space. Selection metrics such as the hypervolume contribution and crowding distance calculated in the objective space would guide the search toward solutions that are well-distributed across the objective space. In this study, the diversity of solutions in the decision space is used as the main selection criterion alongside the dominance check in multi-objective optimization. To this end, currently archived solutions are clustered in the decision space and the ones in less crowded clusters are given more chance to be selected for generating new solutions. The proposed approach is first tested on benchmark mathematical test problems. Second, it is applied to a hydrologic model calibration problem with more than three objective functions. Results show that the chance of finding a sparser set of high-quality solutions increases, and therefore the analyst would receive a diverse set of options with the maximum amount of information. Pareto Archived-Dynamically Dimensioned Search, which is an efficient and parsimonious multi-objective optimization algorithm for model calibration, is utilized in this study.
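The selection idea in this abstract (cluster archived solutions in the decision space, then favor solutions in less crowded clusters) can be sketched in Python. This is a minimal illustration, not the authors' algorithm: the abstract does not name a clustering method, so a simple fixed-size grid is swapped in, and the inverse-cluster-size weighting, the function names, and the toy archive are assumptions. The dominance check is omitted.

```python
import random
from collections import defaultdict
from typing import Dict, List, Sequence, Tuple

Solution = Sequence[float]  # one point in the decision space


def grid_clusters(solutions: List[Solution], cell: float) -> Dict[Tuple[int, ...], List[int]]:
    """Group solution indices by the grid cell their decision vector falls in."""
    clusters: Dict[Tuple[int, ...], List[int]] = defaultdict(list)
    for idx, x in enumerate(solutions):
        key = tuple(int(v // cell) for v in x)
        clusters[key].append(idx)
    return clusters


def selection_weights(solutions: List[Solution], cell: float) -> List[float]:
    """Weight each solution by 1 / (size of its cluster)."""
    weights = [0.0] * len(solutions)
    for members in grid_clusters(solutions, cell).values():
        for idx in members:
            weights[idx] = 1.0 / len(members)
    return weights


def select_parent(solutions: List[Solution], cell: float, rng: random.Random) -> int:
    """Pick the index of an archived solution, favoring sparse clusters."""
    weights = selection_weights(solutions, cell)
    return rng.choices(range(len(solutions)), weights=weights, k=1)[0]


# Toy archive: three solutions crowded together, one isolated.
archive = [(0.1, 0.1), (0.2, 0.15), (0.15, 0.3), (0.9, 0.8)]
print(selection_weights(archive, cell=0.5))
print(select_parent(archive, cell=0.5, rng=random.Random(0)))
```

Because the weights within each cluster sum to one, every cluster is equally likely to supply the next parent, so the isolated solution is chosen about as often as the three crowded ones combined.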
A Knowledge Database on Thermal Control in Manufacturing Processes

NASA Astrophysics Data System (ADS)

Hirasawa, Shigeki; Satoh, Isao

A prototype version of a knowledge database on thermal control in manufacturing processes, specifically, molding, semiconductor manufacturing, and micro-scale manufacturing has been developed. The knowledge database has search functions for technical data, evaluated benchmark data, academic papers, and patents. The database also displays trends and future roadmaps for research topics. It has quick-calculation functions for basic design. This paper summarizes present research topics and future research on thermal control in manufacturing engineering to collate the information to the knowledge database. In the molding process, the initial mold and melt temperatures are very important parameters. In addition, thermal control is related to many semiconductor processes, and the main parameter is temperature variation in wafers. Accurate in-situ temperature measurement of wafers is important. Many technologies are also being developed to manufacture micro-structures. Accordingly, the knowledge database will help further advance these technologies.
AIRSAR Automated Web-based Data Processing and Distribution System

NASA Technical Reports Server (NTRS)

Chu, Anhua; vanZyl, Jakob; Kim, Yunjin; Lou, Yunling; Imel, David; Tung, Wayne; Chapman, Bruce; Durden, Stephen

2005-01-01

In this paper, we present an integrated, end-to-end synthetic aperture radar (SAR) processing system that accepts data processing requests, submits processing jobs, performs quality analysis, delivers and archives processed data. This fully automated SAR processing system utilizes database and internet/intranet web technologies to allow external users to browse and submit data processing requests and receive processed data. It is a cost-effective way to manage a robust SAR processing and archival system. The integration of these functions has reduced operator errors and increased processor throughput dramatically.
Current Status of an Implementation of a System Monitoring for Seamless Auxiliary Data at the Geodetic Observatory Wettzell

NASA Astrophysics Data System (ADS)

Neidhardt, Alexander; Kirschbauer, Katharina; Plötz, Christian; Schönberger, Matthias; Böer, Armin; Wettzell VLBI Team

2016-12-01

The first test implementation of an auxiliary data archive is tested at the Geodetic Observatory Wettzell. It is software that builds on the Wettzell SysMon, extending the database and data sensors with the functionalities of a professional monitoring environment, named Zabbix. Some extensions to the remote control server on the NASA Field System PC enable the inclusion of data from external antennas. The presentation demonstrates the implementation and discusses the current possibilities to encourage other antennas to join the auxiliary archive.
BAPA Database: a Landslide Inventory in the Principality of Asturias (NW Spain) by Using Press Archives and Free Cartographic Servers

NASA Astrophysics Data System (ADS)

Valenzuela, P.; Domínguez-Cuesta, M. J.; Jiménez-Sánchez, M.; Mora García, M. A.

2015-12-01

Due to the region's geological and climatic conditions, landslides are very common and widespread phenomena in the Principality of Asturias (NW Spain), causing economic losses and, sometimes, human victims. In this scenario, temporal prediction of instabilities becomes particularly important. Although previous knowledge indicates that rainfall is the main trigger, the lack of data hinders the proper temporal forecast of landslides in the region. To resolve this deficiency, a new landslide inventory is being developed: the BAPA (Base de datos de Argayos del Principado de Asturias-Principality of Asturias Landslide Database). Data collection is mainly performed through the gathering of local newspaper archives, with special emphasis on the registration of spatial and temporal information. Moreover, a BAPA App and a BAPA website (http://geol.uniovi.es/BAPA) have been developed to easily obtain additional information from authorities and private individuals. Presently, the dataset covers the period 1980-2015, registering more than 2000 individual landslide events. Fifty-two per cent of the records provide accurate dates, showing the usefulness of press archives as temporal records. The use of free cartographic servers, such as Google Maps, Google Street View and Iberpix (Government of Spain), combined with the spatial descriptions and photographs contained in the press releases, makes it possible to determine the exact location in fifty-eight per cent of the records. Field work performed to date has allowed the validation of the methodology proposed to obtain spatial data. In addition, the BAPA database contains information about source, typology of landslides, triggers, damages and costs.
County Business Patterns: United States, 2002.

ERIC Educational Resources Information Center

US Department of Commerce, 2004

2004-01-01

In this report, subnational economic data by industry, including Educational Services, is provided. County Business Patterns is useful for studying the economic activity of small areas; analyzing economic changes over time; and as a benchmark for statistical series, surveys, and databases between economic censuses. The number of establishments,…
A Guided Tour of Saada

NASA Astrophysics Data System (ADS)

Michel, L.; Motch, C.; Nguyen Ngoc, H.; Pineau, F. X.

2009-09-01

Saada (http://amwdb.u-strasbg.fr/saada) is a tool for helping astronomers build local archives without writing any code (Michel et al. 2004). Databases created by Saada can host collections of heterogeneous data files. These data collections can also be published in the VO. An overview of the main Saada features is presented in this demo: creation of a basic database, creation of relationships, data searches using SaadaQL, metadata tagging, and use of VO services.
Benchmarking Big Data Systems and the BigData Top100 List.

PubMed

Baru, Chaitanya; Bhandarkar, Milind; Nambiar, Raghunath; Poess, Meikel; Rabl, Tilmann

2013-03-01

"Big data" has become a major force of innovation across enterprises of all sizes. New platforms with increasingly more features for managing big datasets are being announced almost on a weekly basis. Yet, there is currently a lack of any means of comparability among such platforms. While the performance of traditional database systems is well understood and measured by long-established institutions such as the Transaction Processing Performance Council (TPC), there is neither a clear definition of the performance of big data systems nor a generally agreed upon metric for comparing these systems. In this article, we describe a community-based effort for defining a big data benchmark. Over the past year, a Big Data Benchmarking Community has become established in order to fill this void. The effort focuses on defining an end-to-end application-layer benchmark for measuring the performance of big data applications, with the ability to easily adapt the benchmark specification to evolving challenges in the big data space. This article describes the efforts that have been undertaken thus far toward the definition of a BigData Top100 List. While highlighting the major technical as well as organizational challenges, through this article, we also solicit community input into this process.
AntiJen: a quantitative immunology database integrating functional, thermodynamic, kinetic, biophysical, and cellular data

PubMed Central

Toseland, Christopher P; Clayton, Debra J; McSparron, Helen; Hemsley, Shelley L; Blythe, Martin J; Paine, Kelly; Doytchinova, Irini A; Guan, Pingping; Hattotuwagama, Channa K; Flower, Darren R

2005-01-01

AntiJen is a database system focused on the integration of kinetic, thermodynamic, functional, and cellular data within the context of immunology and vaccinology. Compared to its progenitor JenPep, the interface has been completely rewritten and redesigned and now offers a wider variety of search methods, including a nucleotide and a peptide BLAST search. In terms of data archived, AntiJen has a richer and more complete breadth, depth, and scope, and this has seen the database increase to over 31,000 entries. AntiJen provides the most complete and up-to-date dataset of its kind. While AntiJen v2.0 retains a focus on both T cell and B cell epitopes, its greatest novelty is the archiving of continuous quantitative data on a variety of immunological molecular interactions. This includes thermodynamic and kinetic measures of peptide binding to TAP and the Major Histocompatibility Complex (MHC), peptide-MHC complexes binding to T cell receptors, antibodies binding to protein antigens and general immunological protein-protein interactions. The database also contains quantitative specificity data from position-specific peptide libraries and biophysical data, in the form of diffusion co-efficients and cell surface copy numbers, on MHCs and other immunological molecules. The uses of AntiJen include the design of vaccines and diagnostics, such as tetramers, and other laboratory reagents, as well as helping parameterize the bioinformatic or mathematical in silico modeling of the immune system. The database is accessible from the URL: . PMID:16305757
gPhoton: The GALEX Photon Data Archive

NASA Astrophysics Data System (ADS)

Million, Chase; Fleming, Scott W.; Shiao, Bernie; Seibert, Mark; Loyd, Parke; Tucker, Michael; Smith, Myron; Thompson, Randy; White, Richard L.

2016-12-01

gPhoton is a new database product and software package that enables analysis of GALEX ultraviolet data at the photon level. The project’s stand-alone, pure-Python calibration pipeline reproduces the functionality of the original mission pipeline to reduce raw spacecraft data to lists of time-tagged, sky-projected photons, which are then hosted in a publicly available database by the Mikulski Archive at Space Telescope. This database contains approximately 130 terabytes of data describing approximately 1.1 trillion sky-projected events with a timestamp resolution of five milliseconds. A handful of Python and command-line modules serve as a front end to interact with the database and to generate calibrated light curves and images from the photon-level data at user-defined temporal and spatial scales. The gPhoton software and source code are in active development and publicly available under a permissive license. We describe the motivation, design, and implementation of the calibration pipeline, database, and tools, with emphasis on divergence from prior work, as well as challenges created by the large data volume. We summarize the astrometric and photometric performance of gPhoton relative to the original mission pipeline. For a brief example of short time-domain science capabilities enabled by gPhoton, we show new flares from the known M-dwarf flare star CR Draconis. The gPhoton software has permanent object identifiers with the ASCL (ascl:1603.004) and DOI (doi:10.17909/T9CC7G). This paper describes the software as of version v1.27.2.
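The step from time-tagged photons to a light curve at a user-defined temporal scale can be sketched as follows. This is not the gPhoton API and it omits all calibration (exposure time, response, background); the function name `bin_photons` and its parameters are hypothetical, and the sketch only shows the underlying binning of event timestamps into fixed-width time bins.

```python
def bin_photons(timestamps, t_start, t_stop, bin_width):
    """Count photon events per time bin and convert counts to raw rates.

    timestamps: iterable of event times (same units as bin_width).
    Returns (counts, rates), each with one entry per complete bin.
    """
    n_bins = int((t_stop - t_start) // bin_width)
    counts = [0] * n_bins
    t_end = t_start + n_bins * bin_width
    for t in timestamps:
        if t_start <= t < t_end:
            idx = int((t - t_start) // bin_width)
            # Guard against floating-point rounding at the last bin edge.
            counts[min(idx, n_bins - 1)] += 1
    rates = [c / bin_width for c in counts]
    return counts, rates


# Hypothetical event list in seconds; the event at 5.0 s lies outside the window.
events = [0.0, 0.4, 1.2, 1.9, 2.5, 5.0]
counts, rates = bin_photons(events, t_start=0.0, t_stop=3.0, bin_width=1.0)
```

Choosing a smaller `bin_width` trades signal per bin for time resolution, which is the trade-off a user makes when selecting a temporal scale for flare searches.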
Depth-area-duration characteristics of storm rainfall in Texas using Multi-Sensor Precipitation Estimates

NASA Astrophysics Data System (ADS)

McEnery, J. A.; Jitkajornwanich, K.

2012-12-01

This presentation will describe the methodology and overall system development by which a benchmark dataset of precipitation information has been used to characterize the depth-area-duration relations in heavy rain storms occurring over regions of Texas. Over the past two years project investigators along with the National Weather Service (NWS) West Gulf River Forecast Center (WGRFC) have developed and operated a gateway data system to ingest, store, and disseminate NWS multi-sensor precipitation estimates (MPE). As a pilot project of the Integrated Water Resources Science and Services (IWRSS) initiative, this testbed uses a Structured Query Language (SQL) server to maintain a full archive of current and historic MPE values within the WGRFC service area. These time series values are made available for public access as web services in the standard WaterML format. Having this volume of information maintained in a comprehensive database now allows the use of relational analysis capabilities within SQL to leverage these multi-sensor precipitation values and produce a valuable derivative product. The area of focus for this study is North Texas and will utilize values that originated from the West Gulf River Forecast Center (WGRFC), one of three River Forecast Centers currently represented in the holdings of this data system. Over the past two decades, NEXRAD radar has dramatically improved the ability to record rainfall. The resulting hourly MPE values, distributed over an approximately 4 km by 4 km grid, are considered by the NWS to be the "best estimate" of rainfall. The data server provides an accepted standard interface for internet access to the largest time-series dataset of NEXRAD based MPE values ever assembled. An automated script has been written to search and extract storms over the 18 year period of record from the contents of this massive historical precipitation database.
Not only can it extract site-specific storms, but also duration-specific storms and storms separated by user defined inter-event periods. A separate storm database has been created to store the selected output. By storing output within tables in a separate database, we can make use of powerful SQL capabilities to perform flexible pattern analysis. Previous efforts have made use of historic data from limited clusters of irregularly spaced physical gauges. Spatial extent of the observational network has been a limiting factor. The relatively dense distribution of MPE provides a virtual mesh of observations stretched over the landscape. This work combines a unique hydrologic data resource with programming and database analysis to characterize storm depth-area-duration relationships.
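Separating storms by a user-defined inter-event period can be sketched as below. This is an illustration under stated assumptions, not the authors' script: it works on a plain list of hourly depths for a single grid cell, and the names `extract_storms`, `min_dry_hours`, and `wet_threshold` are hypothetical. Consecutive wet hours belong to the same storm unless they are separated by at least `min_dry_hours` dry hours.

```python
def extract_storms(hourly_depths, min_dry_hours, wet_threshold=0.0):
    """Split an hourly precipitation series into storms.

    Returns a list of (start_hour, end_hour, total_depth) tuples, where a new
    storm begins only after at least min_dry_hours consecutive dry hours.
    """
    storms = []
    start = None      # index of the first wet hour of the current storm
    last_wet = None   # index of the most recent wet hour
    for hour, depth in enumerate(hourly_depths):
        if depth > wet_threshold:
            if start is None:
                start = hour
            elif hour - last_wet - 1 >= min_dry_hours:
                # The dry gap is long enough: close the previous storm.
                total = sum(hourly_depths[start:last_wet + 1])
                storms.append((start, last_wet, total))
                start = hour
            last_wet = hour
    if start is not None:
        total = sum(hourly_depths[start:last_wet + 1])
        storms.append((start, last_wet, total))
    return storms


# Hypothetical hourly depths (mm) for one 4 km grid cell.
series = [0, 2, 3, 0, 1, 0, 0, 0, 4, 0]
storms = extract_storms(series, min_dry_hours=3)
```

With a three-hour inter-event period, the single dry hour at index 3 does not split the first storm, while the three dry hours at indices 5 to 7 do. The duration of each storm follows from `end_hour - start_hour + 1`, which is the quantity a duration-specific filter would test.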
Mapping the Martian Meteorology

NASA Technical Reports Server (NTRS)

Allison, Michael; Ross, J. D.; Soloman, N.

1999-01-01

The Mars-adapted version of the NASA/GISS general circulation model (GCM) has been applied to the hourly/daily simulation of the planet's meteorology over several seasonal orbits. The current running version of the model includes a diurnal solar cycle, CO2 sublimation, and a mature parameterization of upper level wave drag with a vertical domain extending from the surface up to the 6 μbar level. The benchmark simulations provide a four-dimensional archive for the comparative evaluation of various schemes for the retrieval of winds from anticipated polar orbiter measurements of temperatures by the Pressure Modulator Infrared Radiometer.
Second Annual Conference on Astronomical Data Analysis Software and Systems. Abstracts

NASA Technical Reports Server (NTRS)

1992-01-01

Abstracts from the conference are presented. The topics covered include the following: next generation software systems and languages; databases, catalogs, and archives; user interfaces/visualization; real-time data acquisition/scheduling; and IRAF/STSDAS/PROS status reports.
Continuous quality improvement for the clinical decision unit.

PubMed

Mace, Sharon E

2004-01-01

Clinical decision units (CDUs) are a relatively new and growing area of medicine in which patients undergo rapid evaluation and treatment. Continuous quality improvement (CQI) is important for the establishment and functioning of CDUs. CQI in CDUs has many advantages: better CDU functioning, fulfillment of Joint Commission on Accreditation of Healthcare Organizations mandates, greater efficiency/productivity, increased job satisfaction, better performance improvement, data availability, and benchmarking. Key elements include a database with volume indicators, operational policies, clinical practice protocols (diagnosis specific/condition specific), monitors, benchmarks, and clinical pathways. Examples of these important parameters are given. The CQI process should be individualized for each CDU and hospital.

Recommendations for a service framework to access astronomical archives

NASA Technical Reports Server (NTRS)

Travisano, J. J.; Pollizzi, J.

1992-01-01

There are a large number of astronomical archives and catalogs on-line for network access, with many different user interfaces and features. Some systems are moving towards distributed access, supplying users with client software for their home sites which connects to servers at the archive site. Many of the issues involved in defining a standard framework of services that archive/catalog suppliers can use to achieve a basic level of interoperability are described. Such a framework would simplify the development of client and server programs to access the wide variety of astronomical archive systems. The primary services that are supplied by current systems include: catalog browsing, dataset retrieval, name resolution, and data analysis. The following issues (and probably more) need to be considered in establishing a standard set of client/server interfaces and protocols: Archive Access - dataset retrieval, delivery, file formats, data browsing, analysis, etc.; Catalog Access - database management systems, query languages, data formats, synchronous/asynchronous mode of operation, etc.; Interoperability - transaction/message protocols, distributed processing mechanisms (DCE, ONC/SunRPC, etc), networking protocols, etc.; Security - user registration, authorization/authentication mechanisms, etc.; Service Directory - service registration, lookup, port/task mapping, parameters, etc.; Software - public vs proprietary, client/server software, standard interfaces to client/server functions, software distribution, operating system portability, data portability, etc. Several archive/catalog groups, notably the Astrophysics Data System (ADS), are already working in many of these areas. In the process of developing StarView, which is the user interface to the Space Telescope Data Archive and Distribution Service (ST-DADS), these issues and the work of others were analyzed. 
A framework of standard interfaces for accessing services on any archive system which would benefit archive user and supplier alike is proposed.
Feature maps driven no-reference image quality prediction of authentically distorted images

NASA Astrophysics Data System (ADS)

Ghadiyaram, Deepti; Bovik, Alan C.

2015-03-01

Current blind image quality prediction models rely on benchmark databases comprised of singly and synthetically distorted images, thereby learning image features that are only adequate to predict human perceived visual quality on such inauthentic distortions. However, real world images often contain complex mixtures of multiple distortions. Rather than a) discounting the effect of these mixtures of distortions on an image's perceptual quality and considering only the dominant distortion or b) using features that are only proven to be efficient for singly distorted images, we deeply study the natural scene statistics of authentically distorted images, in different color spaces and transform domains. We propose a feature-maps-driven statistical approach which avoids any latent assumptions about the type of distortion(s) contained in an image, and focuses instead on modeling the remarkable consistencies in the scene statistics of real world images in the absence of distortions. We design a deep belief network that takes model-based statistical image features derived from a very large database of authentically distorted images as input and discovers good feature representations by generalizing over different distortion types, mixtures, and severities, which are later used to learn a regressor for quality prediction. We demonstrate the remarkable competence of our features for improving automatic perceptual quality prediction on a benchmark database and on the newly designed LIVE Authentic Image Quality Challenge Database and show that our approach of combining robust statistical features and the deep belief network dramatically outperforms the state-of-the-art.
Documentation of the U.S. Geological Survey Oceanographic Time-Series Measurement Database

USGS Publications Warehouse

Montgomery, Ellyn T.; Martini, Marinna A.; Lightsom, Frances L.; Butman, Bradford

2008-01-02

This report describes the instrumentation and platforms used to make the measurements; the methods used to process, apply quality-control criteria, and archive the data; the data storage format, and how the data are released and distributed. The report also includes instructions on how to access the data from the online database at http://stellwagen.er.usgs.gov/. As of 2016, the database contains about 5,000 files, which may include observations of current velocity, wave statistics, ocean temperature, conductivity, pressure, and light transmission at one or more depths over some duration of time.
An Investigation of the Fine Spatial Structure of Meteor Streams Using the Relational Database "Meteor"

NASA Astrophysics Data System (ADS)

Karpov, A. V.; Yumagulov, E. Z.

2003-05-01

We have restored and ordered the archive of meteor observations carried out with a meteor radar complex "KGU-M5" since 1986. A relational database has been formed under the control of the Database Management System (DBMS) Oracle 8. We also improved and tested a statistical method for studying the fine spatial structure of meteor streams with allowance for the specific features of application of the DBMS. Statistical analysis of the results of observations made it possible to obtain information about the substance distribution in the Quadrantid, Geminid, and Perseid meteor streams.
[Current status of DNA databases in the forensic field: new progress, new legal needs].

PubMed

Baeta, Miriam; Martínez-Jarreta, Begoña

2009-01-01

One of the most polemic issues regarding the use of deoxyribonucleic acid (DNA) in the legal sphere concerns the creation of DNA databases. Until relatively recently, Spain did not have a law to support the establishment of a national DNA profile bank for forensic purposes and to preserve the fundamental rights of subjects whose data are archived therein. The law regulating police databases of identifiers obtained from DNA, approved in 2007, fills this void in Spanish legislation and responds to the constant need to adapt the laws to continuous scientific and technological progress.
A data and information system for processing, archival, and distribution of data for global change research

NASA Technical Reports Server (NTRS)

Graves, Sara J.

1994-01-01

Work on this project was focused on information management techniques for Marshall Space Flight Center's EOSDIS Version 0 Distributed Active Archive Center (DAAC). The centerpiece of this effort has been participation in EOSDIS catalog interoperability research, the result of which is a distributed Information Management System (IMS) allowing the user to query the inventories of all the DAAC's from a single user interface. UAH has provided the MSFC DAAC database server for the distributed IMS, and has contributed to definition and development of the browse image display capabilities in the system's user interface. Another important area of research has been in generating value-based metadata through data mining. In addition, information management applications for local inventory and archive management, and for tracking data orders were provided.
Author! author!: creating a digital archive of publications in a hospital library setting.

PubMed

Rourke, Diane; Samsundar, Devica Ramjit; Shalini, Channapatna

2005-01-01

Baptist Hospital of Miami has been honoring its staff authors annually during National Library Week since 1979, at the time the library was relocated. Upon "doing the math" and realizing that twenty-five years had passed, a special event was planned to celebrate the occasion in 2004. A merger of four hospitals in 1995 to form Baptist Health South Florida, and the addition of a fifth hospital in 2003, added to the complexity of these publications. Organizing the event led to the conclusion that there had to be a "better way" to manage the publication archive. This paper will include a look back at the event's past, present efforts to develop an archival database, and future plans to make articles available electronically to users, copyright permitting.
Nonsurgical Strategies in Patients With NET Liver Metastases: A Protocol of Four Systematic Reviews.

PubMed

Limani, Perparim; Tschuor, Christoph; Gort, Laura; Balmer, Bettina; Gu, Alexander; Ceresa, Christos; Raptis, Dimitri Aristotle; Lesurtel, Mickael; Puhan, Milo; Breitenstein, Stefan

2014-03-07

Patients diagnosed with neuroendocrine tumors (NETs) with hepatic metastases generally have a worse prognosis as compared with patients with nonmetastasized NETs. Due to tumor location and distant metastases, a surgical approach is often not possible and nonsurgical therapeutic strategies may apply. The aim of these systematic reviews is to evaluate the role of nonsurgical therapy options for patients with nonresectable liver metastases of NETs. An objective group of librarians will provide an electronic search strategy to examine the MEDLINE, EMBASE, and The Cochrane Library (Cochrane Database of Systematic Reviews, Database of Abstracts of Reviews of Effects, Cochrane Central Register of Controlled Trials [CENTRAL]) databases. There will be no restriction concerning language and publication date. The qualitative and quantitative synthesis of the systematic review will be conducted with randomized controlled trials (RCT), prospective, and retrospective comparative cohort, and case-control studies. Case series will be collected in a separate database and only used for descriptive purposes. This study is ongoing and presents a protocol of four systematic reviews to assess the role of nonsurgical treatment options in patients with neuroendocrine liver metastases. These systematic reviews, performed according to this protocol, will assess the value of noninvasive therapy options for patients with nonresectable liver metastases of NETs in combination with invasive techniques, such as percutaneous liver-directed techniques and local ablation techniques. 
International Prospective Register of Systematic Reviews (PROSPERO): CRD42012002657; http://www.metaxis.com/PROSPERO/full_doc.asp?RecordID=2657 (Archived by WebCite at http://www.webcitation.org/6NDlYi37O); CRD42012002658; http://www.metaxis.com/PROSPERO/full_doc.asp?RecordID=2658 (Archived by WebCite at http://www.webcitation.org/6NDlfWSuD); CRD42012002659; www.metaxis.com/PROSPERO/full_doc.asp?RecordID=2659 (Archived by WebCite at http://www.webcitation.org/6NDlmWAFM); and CRD42012002660; http://www.metaxis.com/PROSPERO/full_doc.asp?RecordID=2660 (Archived by WebCite at http://www.webcitation.org/6NDmnylzp).
Enriching Great Britain's National Landslide Database by searching newspaper archives

NASA Astrophysics Data System (ADS)

Taylor, Faith E.; Malamud, Bruce D.; Freeborough, Katy; Demeritt, David

2015-11-01

Our understanding of where landslide hazard and impact will be greatest is largely based on our knowledge of past events. Here, we present a method to supplement existing records of landslides in Great Britain by searching an electronic archive of regional newspapers. In Great Britain, the British Geological Survey (BGS) is responsible for updating and maintaining records of landslide events and their impacts in the National Landslide Database (NLD). The NLD contains records of more than 16,500 landslide events in Great Britain. Data sources for the NLD include field surveys, academic articles, grey literature, news, public reports and, since 2012, social media. We aim to supplement the richness of the NLD by (i) identifying additional landslide events, (ii) acting as an additional source of confirmation of events existing in the NLD and (iii) adding more detail to existing database entries. This is done by systematically searching the Nexis UK digital archive of 568 regional newspapers published in the UK. In this paper, we construct a robust Boolean search criterion by experimenting with landslide terminology for four training periods. We then apply this search to all articles published in 2006 and 2012. This resulted in the addition of 111 records of landslide events to the NLD over the 2 years investigated (2006 and 2012). We also find that we were able to obtain information about landslide impact for 60-90% of landslide events identified from newspaper articles. Spatial and temporal patterns of additional landslides identified from newspaper articles are broadly in line with those existing in the NLD, confirming that the NLD is a representative sample of landsliding in Great Britain. This method could now be applied to more time periods and/or other hazards to add richness to databases and thus improve our ability to forecast future events based on records of past events.
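A Boolean search criterion of the kind described can be sketched as an include/exclude filter over article text. The term lists below are hypothetical placeholders, since the abstract does not state the terms the authors settled on, and the function name `matches` is likewise an assumption. The exclusion list illustrates why experimentation with terminology is needed: "landslide" is often used figuratively in news text.

```python
import re

# Hypothetical term lists; the study's actual search terms are not given here.
INCLUDE_TERMS = ["landslide", "landslip", "mudslide", "rockfall"]
EXCLUDE_TERMS = ["landslide victory", "election"]


def matches(article_text):
    """Return True if the text has any include term AND no exclude term."""
    text = article_text.lower()
    has_include = any(
        re.search(r"\b" + re.escape(term) + r"s?\b", text)
        for term in INCLUDE_TERMS
    )
    has_exclude = any(term in text for term in EXCLUDE_TERMS)
    return has_include and not has_exclude


# Hypothetical article snippets.
articles = [
    "A landslide closed the road after heavy rain.",
    "The party won a landslide victory in the election.",
    "Council budget approved.",
]
hits = [a for a in articles if matches(a)]
```

A filter this simple will still return false positives and miss events described in other words, so in practice each hit would be read and checked before being added to a database.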
Emergency preparedness for the accidental release of radionuclides from the Uljin Nuclear Power Plant in Korea.

PubMed

Park, Soon-Ung; Lee, In-Hye; Joo, Seung Jin; Ju, Jae-Won

2017-12-01

Site-specific radionuclide dispersion databases were archived for the emergency response to hypothetical releases of 137Cs from the Uljin nuclear power plant in Korea. These databases were obtained with a horizontal resolution of 1.5 km in the local domain centered on the power plant site by simulations of the Lagrangian Particle Dispersion Model (LPDM) with the Unified Model (UM)-Local Data Assimilation Prediction System (LDAPS). The Eulerian Dispersion Model-East Asia (EDM-EA) with the UM-Global Data Assimilation Prediction System (UM-GDAPS) meteorological models was used to get dispersion databases in the regional domain. The LPDM model was run for a year with a 5-day interval, yielding 72 synoptic time-scale cases in a year. For each case, hourly mean near-surface concentrations, hourly mean column-integrated concentrations, and hourly total depositions for 5 consecutive days were archived by the LPDM model in the local domain and by the EDM-EA model in the regional domain of Asia. Among the 72 synoptic cases in a year, the worst synoptic case, which showed the highest mean surface concentration averaged over 5 days in the LPDM model domain, was chosen to illustrate the emergency preparedness for the hypothetical accident at the site. The simulated results by the LPDM model with the 137Cs emission rate of the Fukushima nuclear power plant accident for the first 5-day period were found to be able to provide prerequisite information for the emergency response to the early phase of the accident, whereas those of the EDM-EA model could provide information required for the environmental impact assessment of the accident in the regional domain. The archived site-specific database of 72 synoptic cases in a year could have great potential to be used as prognostic information for emergency preparedness in the early phase of an accident. Copyright © 2017 Elsevier Ltd. All rights reserved.
[Does implementation of benchmarking in quality circles improve the quality of care of patients with asthma and reduce drug interaction?].

PubMed

Kaufmann-Kolle, Petra; Szecsenyi, Joachim; Broge, Björn; Haefeli, Walter Emil; Schneider, Antonius

2011-01-01

The purpose of this cluster-randomised controlled trial was to evaluate the efficacy of quality circles (QCs) working either with general data-based feedback or with an open benchmark within the field of asthma care and drug-drug interactions. Twelve QCs, involving 96 general practitioners from 85 practices, were randomised. Six QCs worked with traditional anonymous feedback and six with an open benchmark. Two QC meetings supported with feedback reports were held covering the topics "drug-drug interactions" and "asthma"; in both cases discussions were guided by a trained moderator. Outcome measures included health-related quality of life and patient satisfaction with treatment, asthma severity and number of potentially inappropriate drug combinations as well as the general practitioners' satisfaction in relation to the performance of the QC. A significant improvement in the treatment of asthma was observed in both trial arms. However, there was only a slight improvement regarding inappropriate drug combinations. There were no relevant differences between the group with open benchmark (B-QC) and traditional quality circles (T-QC). The physicians' satisfaction with the QC performance was significantly higher in the T-QCs. General practitioners seem to take a critical perspective about open benchmarking in quality circles. Caution should be used when implementing benchmarking in a quality circle as it did not improve healthcare when compared to the traditional procedure with anonymised comparisons. Copyright © 2011. Published by Elsevier GmbH.
Visual Systems for Interactive Exploration and Mining of Large-Scale Neuroimaging Data Archives

PubMed Central

Bowman, Ian; Joshi, Shantanu H.; Van Horn, John D.

2012-01-01

While technological advancements in neuroimaging scanner engineering have improved the efficiency of data acquisition, electronic data capture methods will likewise significantly expedite the populating of large-scale neuroimaging databases. As they do and these archives grow in size, a particular challenge lies in examining and interacting with the information that these resources contain through the development of compelling, user-driven approaches for data exploration and mining. In this article, we introduce the informatics visualization for neuroimaging (INVIZIAN) framework for the graphical rendering of, and dynamic interaction with the contents of large-scale neuroimaging data sets. We describe the rationale behind INVIZIAN, detail its development, and demonstrate its usage in examining a collection of over 900 T1-anatomical magnetic resonance imaging (MRI) image volumes from across a diverse set of clinical neuroimaging studies drawn from a leading neuroimaging database. Using a collection of cortical surface metrics and means for examining brain similarity, INVIZIAN graphically displays brain surfaces as points in a coordinate space and enables classification of clusters of neuroanatomically similar MRI images and data mining. As an initial step toward addressing the need for such user-friendly tools, INVIZIAN provides a highly unique means to interact with large quantities of electronic brain imaging archives in ways suitable for hypothesis generation and data mining. PMID:22536181
The Global Streamflow Indices and Metadata Archive (GSIM) - Part 1: The production of a daily streamflow archive and metadata

NASA Astrophysics Data System (ADS)

Do, Hong Xuan; Gudmundsson, Lukas; Leonard, Michael; Westra, Seth

2018-04-01

This is the first part of a two-paper series presenting the Global Streamflow Indices and Metadata archive (GSIM), a worldwide collection of metadata and indices derived from more than 35 000 daily streamflow time series. This paper focuses on the compilation of the daily streamflow time series based on 12 free-to-access streamflow databases (seven national databases and five international collections). It also describes the development of three metadata products (freely available at https://doi.pangaea.de/10.1594/PANGAEA.887477): (1) a GSIM catalogue collating basic metadata associated with each time series, (2) catchment boundaries for the contributing area of each gauge, and (3) catchment metadata extracted from 12 gridded global data products representing essential properties such as land cover type, soil type, and climate and topographic characteristics. The quality of the delineated catchment boundary is also made available and should be consulted in GSIM application. The second paper in the series then explores production and analysis of streamflow indices. Having collated an unprecedented number of stations and associated metadata, GSIM can be used to advance large-scale hydrological research and improve understanding of the global water cycle.
Binding Sites Analyser (BiSA): Software for Genomic Binding Sites Archiving and Overlap Analysis

PubMed Central

Khushi, Matloob; Liddle, Christopher; Clarke, Christine L.; Graham, J. Dinny

2014-01-01

Genome-wide mapping of transcription factor binding and histone modification reveals complex patterns of interactions. Identifying overlaps in binding patterns by different factors is a major objective of genomic studies, but existing methods to archive large numbers of datasets in a personalised database lack sophistication and utility. Therefore we have developed transcription factor DNA binding site analyser software (BiSA), for archiving of binding regions and easy identification of overlap with or proximity to other regions of interest. Analysis results can be restricted by chromosome or base pair overlap between regions or maximum distance between binding peaks. BiSA is capable of reporting overlapping regions that share common base pairs; regions that are nearby; regions that are not overlapping; and average region sizes. BiSA can identify genes located near binding regions of interest, genomic features near a gene or locus of interest and statistical significance of overlapping regions can also be reported. Overlapping results can be visualized as Venn diagrams. A major strength of BiSA is that it is supported by a comprehensive database of publicly available transcription factor binding sites and histone modifications, which can be directly compared to user data. The documentation and source code are available on http://bisa.sourceforge.net PMID:24533055
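The overlap and proximity tests described in this abstract can be sketched in a few lines of Python. This is an illustrative sketch only, not BiSA's implementation: the `Region` type, the half-open coordinate convention, and the function names are assumptions introduced here.

```python
from typing import List, NamedTuple, Optional, Tuple


class Region(NamedTuple):
    """A genomic region (hypothetical type; half-open coordinates assumed)."""
    chrom: str
    start: int  # inclusive
    end: int    # exclusive


def overlap_bp(a: Region, b: Region) -> int:
    """Number of base pairs shared by two regions (0 if none)."""
    if a.chrom != b.chrom:
        return 0
    return max(0, min(a.end, b.end) - max(a.start, b.start))


def midpoint_distance(a: Region, b: Region) -> Optional[int]:
    """Distance between region midpoints, or None across chromosomes."""
    if a.chrom != b.chrom:
        return None
    return abs((a.start + a.end) // 2 - (b.start + b.end) // 2)


def find_overlaps(query: List[Region], reference: List[Region],
                  min_bp: int = 1) -> List[Tuple[Region, Region]]:
    """All (query, reference) pairs sharing at least min_bp base pairs."""
    return [(q, r) for q in query for r in reference
            if overlap_bp(q, r) >= min_bp]
```

The pairwise loop is quadratic in the number of regions; for large collections a sorted sweep or an interval tree would be the usual replacement.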
Emissions & Generation Resource Integrated Database (eGRID), eGRID2012

EPA Pesticide Factsheets

The Emissions & Generation Resource Integrated Database (eGRID) is a comprehensive source of data on the environmental characteristics of almost all electric power generated in the United States. These environmental characteristics include air emissions for nitrogen oxides, sulfur dioxide, carbon dioxide, methane, and nitrous oxide; emissions rates; net generation; resource mix; and many other attributes. eGRID2012 Version 1.0 is the eighth edition of eGRID, which contains the complete release of year 2009 data, as well as year 2007, 2005, and 2004 data. For year 2009 data, all the data are contained in a single Microsoft Excel workbook, which contains boiler, generator, plant, state, power control area, eGRID subregion, NERC region, U.S. total and grid gross loss factor tabs. Full documentation, summary data, eGRID subregion and NERC region representational maps, and GHG emission factors are also released in this edition. The fourth edition of eGRID, eGRID2002 Version 2.01, containing year 1996 through 2000 data is located on the eGRID Archive page (http://www.epa.gov/cleanenergy/energy-resources/egrid/archive.html). The current edition of eGRID and the archived edition of eGRID contain the following years of data: 1996 - 2000, 2004, 2005, and 2007. eGRID has no other years of data.
Gramene 2016: comparative plant genomics and pathway resources

PubMed Central

Tello-Ruiz, Marcela K.; Stein, Joshua; Wei, Sharon; Preece, Justin; Olson, Andrew; Naithani, Sushma; Amarasinghe, Vindhya; Dharmawardhana, Palitha; Jiao, Yinping; Mulvaney, Joseph; Kumari, Sunita; Chougule, Kapeel; Elser, Justin; Wang, Bo; Thomason, James; Bolser, Daniel M.; Kerhornou, Arnaud; Walts, Brandon; Fonseca, Nuno A.; Huerta, Laura; Keays, Maria; Tang, Y. Amy; Parkinson, Helen; Fabregat, Antonio; McKay, Sheldon; Weiser, Joel; D'Eustachio, Peter; Stein, Lincoln; Petryszak, Robert; Kersey, Paul J.; Jaiswal, Pankaj; Ware, Doreen

2016-01-01

Gramene (http://www.gramene.org) is an online resource for comparative functional genomics in crops and model plant species. Its two main frameworks are genomes (collaboration with Ensembl Plants) and pathways (The Plant Reactome and archival BioCyc databases). Since our last NAR update, the database website adopted a new Drupal management platform. The genomes section features 39 fully assembled reference genomes that are integrated using ontology-based annotation and comparative analyses, and accessed through both visual and programmatic interfaces. Additional community data, such as genetic variation, expression and methylation, are also mapped for a subset of genomes. The Plant Reactome pathway portal (http://plantreactome.gramene.org) provides a reference resource for analyzing plant metabolic and regulatory pathways. In addition to ∼200 curated rice reference pathways, the portal hosts gene homology-based pathway projections for 33 plant species. Both the genome and pathway browsers interface with the EMBL-EBI's Expression Atlas to enable the projection of baseline and differential expression data from curated expression studies in plants. Gramene's archive website (http://archive.gramene.org) continues to provide previously reported resources on comparative maps, markers and QTL. To further aid our users, we have also introduced a live monthly educational webinar series and a Gramene YouTube channel carrying video tutorials. PMID:26553803
Integrating the IA2 Astronomical Archive in the VO: The VO-Dance Engine

NASA Astrophysics Data System (ADS)

Molinaro, M.; Laurino, O.; Smareglia, R.

2012-09-01

Virtual Observatory (VO) protocols and standards are getting mature and the astronomical community asks for astrophysical data to be easily reachable. This means data centers have to intensify their efforts to provide the data they manage not only through proprietary portals and services but also through interoperable resources developed on the basis of the IVOA (International Virtual Observatory Alliance) recommendations. Here we present the work and ideas developed at the IA2 (Italian Astronomical Archive) data center hosted by the INAF-OATs (Italian Institute for Astrophysics - Trieste Astronomical Observatory) to reach this goal. The core point is the development of an application that from existing DB and archive structures can translate their content to VO compliant resources: VO-Dance (written in Java). This application, in turn, relies on a database (potentially DBMS independent) to store the translation layer information of each resource and auxiliary content (UCDs, field names, authorizations, policies, etc.). The last token is an administrative interface (currently developed using the Django python framework) to allow the data center administrators to set up and maintain resources. This deployment, platform independent, with database and administrative interface highly customizable, means the package, when stable and easily distributable, can be also used by single astronomers or groups to set up their own resources from their public datasets.
The ATLAS TAGS database distribution and management - Operational challenges of a multi-terabyte distributed database

NASA Astrophysics Data System (ADS)

Viegas, F.; Malon, D.; Cranshaw, J.; Dimitrov, G.; Nowak, M.; Nairz, A.; Goossens, L.; Gallas, E.; Gamboa, C.; Wong, A.; Vinek, E.

2010-04-01

The TAG files store summary event quantities that allow a quick selection of interesting events. These data will be produced at a nominal rate of 200 Hz and uploaded into a relational database for access from websites and other tools. The estimated database volume is 6 TB per year, making it the largest application running on the ATLAS relational databases, at CERN and at other voluntary sites. The sheer volume and high rate of production make this application a challenge for data and resource management in many respects. This paper will focus on the operational challenges of this system. These include: uploading the data from files to the databases at CERN and at remote sites; distributing the TAG metadata that is essential to guide the user through event selection; and controlling resource usage of the database, from the user query load to the strategy for cleaning and archiving old TAG data.
Better Living Through Metadata: Examining Archive Usage

NASA Astrophysics Data System (ADS)

Becker, G.; Winkelman, S.; Rots, A.

2013-10-01

The primary purpose of an observatory's archive is to provide access to the data through various interfaces. User interactions with the archive are recorded in server logs, which can be used to answer basic questions like: Who has downloaded dataset X? When did she do this? Which tools did she use? The answers to questions like these fill in patterns of data access (e.g., how many times dataset X has been downloaded in the past three years). Analysis of server logs provides metrics of archive usage and provides feedback on interface use which can be used to guide future interface development. The Chandra X-ray Observatory is fortunate in that a database to track data access and downloads has been continuously recording such transactions for years; however, it is overdue for an update. We will detail changes we hope to effect and the differences the changes may make to our usage metadata picture. We plan to gather more information about the geographic location of users without compromising privacy; create improved archive statistics; and track and assess the impact of web “crawlers” and other scripted access methods on the archive. With the improvements to our download tracking we hope to gain a better understanding of the dissemination of Chandra's data; how effectively it is being done; and perhaps discover ideas for new services.
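The kind of usage metric mentioned here (how often a dataset was downloaded, and through which tools) can be sketched as follows. The log format, field order, tool names, and dataset identifiers are hypothetical and chosen only for illustration; they are not the Chandra archive's actual schema.

```python
from collections import Counter
from datetime import datetime

# Hypothetical log format: ISO timestamp, anonymized user id, dataset id, tool.
SAMPLE_LOG = """\
2013-01-05T10:00:00 u1 obs1234 web_form
2013-02-11T08:30:00 u2 obs1234 script
2013-02-11T08:31:00 u2 obs5678 script
2012-12-01T12:00:00 u3 obs1234 web_form
"""


def parse(line):
    """Split one log line into (timestamp, user, dataset, tool)."""
    ts, user, dataset, tool = line.split()
    return datetime.fromisoformat(ts), user, dataset, tool


def downloads_per_dataset(log_text, since=None):
    """Count downloads per dataset, optionally only those at or after `since`."""
    counts = Counter()
    for line in log_text.splitlines():
        if not line.strip():
            continue
        ts, _user, dataset, _tool = parse(line)
        if since is None or ts >= since:
            counts[dataset] += 1
    return counts


def share_by_tool(log_text):
    """Fraction of all downloads made through each access tool."""
    tools = Counter(parse(line)[3]
                    for line in log_text.splitlines() if line.strip())
    total = sum(tools.values())
    return {tool: n / total for tool, n in tools.items()}
```

Calling `downloads_per_dataset(SAMPLE_LOG, since=datetime(2013, 1, 1))` restricts the count to a time window, and `share_by_tool` gives a rough view of how much traffic comes from scripted access.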
The NASA Navigator Program Ground Based Archives at the Michelson Science Center: Supporting the Search for Habitable Planets

NASA Astrophysics Data System (ADS)

Berriman, G. B.; Ciardi, D. R.; Good, J. C.; Laity, A. C.; Zhang, A.

2006-07-01

At ADASS XIV, we described how the W. M. Keck Observatory Archive (KOA) re-uses and extends the component-based architecture of the NASA/IPAC Infrared Science Archive (IRSA) to ingest and serve level 0 observations made with HIRES, the High Resolution Echelle Spectrometer. Since August 18, the KOA has ingested 325 GB of data from 135 nights of observations. The architecture exploits a service layer between the mass storage layer and the user interface. This service layer consists of standalone utilities, called through a simple executive, that perform generic query and retrieval functions such as query generation, database table sub-setting, and return page generation. It has been extended to implement proprietary access to data through deployment of query management middleware developed for the National Virtual Observatory. The MSC archives have recently extended this design to query and retrieve complex data sets describing the properties of potential target stars for the Terrestrial Planet Finder (TPF) missions. The archives can now support knowledge-based retrieval as well as data retrieval. This paper describes how extensions to the IRSA architecture, which is applicable across all wavelengths and astronomical datatypes, support the design and development of the MSC NP archives at modest cost.

Radio data archiving system

NASA Astrophysics Data System (ADS)

Knapic, C.; Zanichelli, A.; Dovgan, E.; Nanni, M.; Stagni, M.; Righini, S.; Sponza, M.; Bedosti, F.; Orlati, A.; Smareglia, R.

2016-07-01

Radio astronomical data models are becoming very complex because of the huge range of instrumental configurations available with modern radio telescopes. What was once the frontier of data formats in terms of efficiency and flexibility is now evolving, with new strategies and methodologies enabling the persistence of very complex, hierarchical, and multi-purpose information. Such an evolution of data models and data formats requires new data archiving techniques in order to guarantee data preservation following the directives of the Open Archival Information System and the International Virtual Observatory Alliance for data sharing and publication. Currently, various formats (FITS, MBFITS, VLBI's XML description files and ancillary files) of data acquired with the Medicina and Noto Radio Telescopes can be stored and handled by a common Radio Archive, which is planned to be released to the (inter)national community by the end of 2016. This state-of-the-art archiving system for radio astronomical data aims at delegating to the software, as much as possible, the choice of how and where the descriptors (metadata) are saved, while the users perform user-friendly queries that the web interface translates into complex interrogations of the database to retrieve data. In this way, the Archive is ready to be Virtual Observatory compliant and as user-friendly as possible.
ALICE Data Release: A Revaluation of HST-NICMOS Coronagraphic Images

NASA Astrophysics Data System (ADS)

Hagan, J. Brendan; Choquet, Élodie; Soummer, Rémi; Vigan, Arthur

2018-04-01

The Hubble Space Telescope NICMOS instrument was used from 1997 to 2008 to perform coronagraphic observations of about 400 targets. Most of them were part of surveys looking for substellar companions or resolved circumstellar disks around young nearby stars, making the NICMOS coronagraphic archive a valuable database for exoplanet and disk studies. As part of the Archival Legacy Investigations of Circumstellar Environments program, we have consistently reprocessed a large fraction of the NICMOS coronagraphic archive using advanced starlight subtraction methods. We present here the high-level science products of these re-analyzed data, which we delivered back to the community through the Mikulski Archive for Space Telescopes: doi:10.17909/T9W89V. We also present the second version of the HCI-FITS format (for High-Contrast Imaging FITS format), which we developed as a standard format for data exchange of reduced imaging science products. These re-analyzed products are openly available for population statistics studies, characterization of specific targets, or detected point-source identification.
The Challenges in Metadata Management: 20+ Years of ESO Data

NASA Astrophysics Data System (ADS)

Vera, I.; Da Rocha, C.; Dobrzycki, A.; Micol, A.; Vuong, M.

2015-09-01

The European Southern Observatory Science Archive Facility has been in operation for more than 20 years. It contains data produced by ESO telescopes as well as the metadata needed for characterizing and distributing those data. This metadata is used to build the different archive services provided by the Archive. Over these years, services have been added, modified, or even decommissioned, creating a cocktail of new, evolved, and legacy data systems. The challenge for the Archive is to harmonize the differences among those data systems to provide the community with a homogeneous experience when using ESO data. In this paper, we present ESO's experience in three particularly challenging areas. The first is the problem of metadata quality over time, the second is how to integrate obsolete data models into the current services, and the third is the challenge of ever-growing databases. We describe our experience dealing with those issues and the solutions adopted to mitigate them.
A New Archive of UKIRT Legacy Data at CADC

NASA Astrophysics Data System (ADS)

Bell, G. S.; Currie, M. J.; Redman, R. O.; Purves, M.; Jenness, T.

2014-05-01

We describe a new archive of legacy data from the United Kingdom Infrared Telescope (UKIRT) at the Canadian Astronomy Data Centre (CADC) containing all available data from the Cassegrain instruments. The desire was to archive the raw data in as close to the original format as possible, so where the data followed our current convention of having a single data file per observation, it was archived without alteration, except for minor fixes to headers of data in FITS format to allow it to pass fitsverify and be accepted by CADC. Some of the older data comprised multiple integrations in separate files per observation, stored in either Starlink NDF or Figaro DST format. These were placed inside HDS container files, and DST files were rearranged into NDF format. The metadata describing the observations is ingested into the CAOM-2 repository via an intermediate MongoDB header database, which will also be used to guide the ORAC-DR pipeline in generating reduced data products.
FBIS: A regional DNA barcode archival & analysis system for Indian fishes.

PubMed

Nagpure, Naresh Sahebrao; Rashid, Iliyas; Pathak, Ajey Kumar; Singh, Mahender; Singh, Shri Prakash; Sarkar, Uttam Kumar

2012-01-01

DNA barcoding is a new tool for taxon recognition and classification of biological organisms based on the sequence of a fragment of the mitochondrial gene cytochrome c oxidase I (COI). In view of the growing importance of fish DNA barcoding for species identification, molecular taxonomy, and fish diversity conservation, we developed a Fish Barcode Information System (FBIS) for Indian fishes, which will serve as a regional DNA barcode archival and analysis system. The database presently contains 2334 sequence records of the COI gene for 472 aquatic species belonging to 39 orders and 136 families, collected from available published data sources. Additionally, it contains information on phenotype, distribution, and IUCN Red List status of fishes. The web version of FBIS was designed using MySQL, Perl, and PHP under the Linux operating platform to (a) store and manage the acquisition, (b) analyze and explore DNA barcode records, and (c) identify species and estimate genetic divergence. FBIS has also been integrated with appropriate tools for retrieving and viewing information about the database statistics and taxonomy. It is expected that FBIS will be useful as a potent information system in fish molecular taxonomy, phylogeny, and genomics. The database is available for free at http://mail.nbfgr.res.in/fbis/
Korean Variant Archive (KOVA): a reference database of genetic variations in the Korean population.

PubMed

Lee, Sangmoon; Seo, Jihae; Park, Jinman; Nam, Jae-Yong; Choi, Ahyoung; Ignatius, Jason S; Bjornson, Robert D; Chae, Jong-Hee; Jang, In-Jin; Lee, Sanghyuk; Park, Woong-Yang; Baek, Daehyun; Choi, Murim

2017-06-27

Despite efforts to interrogate human genome variation through large-scale databases, systematic preference toward populations of Caucasian descendants has resulted in unintended reduction of power in studying non-Caucasians. Here we report a compilation of coding variants from 1,055 healthy Korean individuals (KOVA; Korean Variant Archive). The samples were sequenced to a mean depth of 75x, yielding 101 singleton variants per individual. Population genetics analysis demonstrates that the Korean population is a distinct ethnic group comparable to other discrete ethnic groups in Africa and Europe, providing a rationale for such independent genomic datasets. Indeed, KOVA conferred 22.8% increased variant filtering power in addition to Exome Aggregation Consortium (ExAC) when used on Korean exomes. Functional assessment of nonsynonymous variant supported the presence of purifying selection in Koreans. Analysis of copy number variants detected 5.2 deletions and 10.3 amplifications per individual with an increased fraction of novel variants among smaller and rarer copy number variable segments. We also report a list of germline variants that are associated with increased tumor susceptibility. This catalog can function as a critical addition to the pre-existing variant databases in pursuing genetic studies of Korean individuals.
The BIG Data Center: from deposition to integration to translation

PubMed Central

2017-01-01

Biological data are generated at unprecedentedly exponential rates, posing considerable challenges in big data deposition, integration and translation. The BIG Data Center, established at Beijing Institute of Genomics (BIG), Chinese Academy of Sciences, provides a suite of database resources, including (i) Genome Sequence Archive, a data repository specialized for archiving raw sequence reads, (ii) Gene Expression Nebulas, a data portal of gene expression profiles based entirely on RNA-Seq data, (iii) Genome Variation Map, a comprehensive collection of genome variations for featured species, (iv) Genome Warehouse, a centralized resource housing genome-scale data with particular focus on economically important animals and plants, (v) Methylation Bank, an integrated database of whole-genome single-base resolution methylomes and (vi) Science Wikis, a central access point for biological wikis developed for community annotations. The BIG Data Center is dedicated to constructing and maintaining biological databases through big data integration and value-added curation, conducting basic research to translate big data into big knowledge and providing freely open access to a variety of data resources in support of worldwide research activities in both academia and industry. All of these resources are publicly available and can be found at http://bigd.big.ac.cn. PMID:27899658
Diagnostic report acquisition unit for the Mayo/IBM PACS project

NASA Astrophysics Data System (ADS)

Brooks, Everett G.; Rothman, Melvyn L.

1991-07-01

The Mayo Clinic and IBM Rochester have jointly developed a picture archive and control system (PACS) for use with Mayo's MRI and Neuro-CT imaging modalities. One of the challenges of developing a useful PACS involves integrating the diagnostic reports with the electronic images so they can be displayed simultaneously. By the time a diagnostic report is generated for a particular case, its images have already been captured and archived by the PACS. To integrate the report with the images, the authors have developed an IBM Personal System/2 computer (PS/2) based diagnostic report acquisition unit (RAU). A typed copy of the report is transmitted via facsimile to the RAU where it is stacked electronically with other reports that have been sent previously but not yet processed. By processing these reports at the RAU, the information they contain is integrated with the image database and a copy of the report is archived electronically on an IBM Application System/400 computer (AS/400). When a user requests a set of images for viewing, the report is automatically integrated with the image data. By using a hot key, the user can toggle on/off the report on the display screen. This report describes process, hardware, and software employed to integrate the diagnostic report information into the PACS, including how the report images are captured, transmitted, and entered into the AS/400 database. Also described is how the archived reports and their associated medical images are located and merged for retrieval and display. The methods used to detect and process error conditions are also discussed.
7 CFR 1.7 - Agency response to requests for records.

Code of Federal Regulations, 2010 CFR

2010-01-01

... Archives and Records Administration (“NARA”), the agency shall inform the requester of this fact and shall...) Database at http://www/nara.gov/nara.nail.html, or by calling NARA at (301) 713-6800. If the agency has no...
A novel method for efficient archiving and retrieval of biomedical images using MPEG-7

NASA Astrophysics Data System (ADS)

Meyer, Joerg; Pahwa, Ash

2004-10-01

Digital archiving and efficient retrieval of radiological scans have become critical steps in contemporary medical diagnostics. Since more and more images and image sequences (single scans or video) from various modalities (CT/MRI/PET/digital X-ray) are now available in digital formats (e.g., DICOM-3), hospitals and radiology clinics need to implement efficient protocols capable of managing the enormous amounts of data generated daily in a typical clinical routine. We present a method that appears to be a viable way to eliminate the tedious step of manually annotating image and video material for database indexing. MPEG-7 is a new framework that standardizes the way images are characterized in terms of color, shape, and other abstract, content-related criteria. A set of standardized descriptors that are automatically generated from an image is used to compare an image to other images in a database, and to compute the distance between two images for a given application domain. Text-based database queries can be replaced with image-based queries using MPEG-7. Consequently, image queries can be conducted without any prior knowledge of the keys that were used as indices in the database. Since the decoding and matching steps are not part of the MPEG-7 standard, this method also enables searches that were not planned by the time the keys were generated.
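The image-based query idea can be illustrated with a small Python sketch. Because the matching step is not part of the MPEG-7 standard, the L1 distance used below is an assumed choice, and the three-bin "descriptors" and image names are toy values rather than real MPEG-7 descriptors.

```python
def l1_distance(a, b):
    """Sum of absolute differences between two equal-length descriptor vectors."""
    if len(a) != len(b):
        raise ValueError("descriptor vectors must have the same length")
    return sum(abs(x - y) for x, y in zip(a, b))


def rank_by_similarity(query, database):
    """Return image names ordered from most to least similar to the query.

    `database` maps an image name to its descriptor vector.
    """
    return sorted(database, key=lambda name: l1_distance(query, database[name]))


# Toy descriptors (hypothetical): e.g. a coarse 3-bin intensity histogram.
descriptors = {
    "scan_a": [0.50, 0.30, 0.20],
    "scan_b": [0.10, 0.10, 0.80],
    "scan_c": [0.45, 0.35, 0.20],
}
ranking = rank_by_similarity([0.50, 0.30, 0.20], descriptors)
```

No text keys are consulted during the query; the ranking depends only on descriptors computed from the images themselves, which is the property the abstract emphasizes.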
Design and Application of a Community Land Benchmarking System for Earth System Models

NASA Astrophysics Data System (ADS)

Mu, M.; Hoffman, F. M.; Lawrence, D. M.; Riley, W. J.; Keppel-Aleks, G.; Koven, C. D.; Kluzek, E. B.; Mao, J.; Randerson, J. T.

2015-12-01

Benchmarking has been widely used to assess the ability of climate models to capture the spatial and temporal variability of observations during the historical era. For the carbon cycle and terrestrial ecosystems, the design and development of an open-source community platform has been an important goal as part of the International Land Model Benchmarking (ILAMB) project. Here we developed a new benchmarking software system that enables the user to specify the models, benchmarks, and scoring metrics, so that results can be tailored to specific model intercomparison projects. Evaluation data sets included soil and aboveground carbon stocks, fluxes of energy, carbon and water, burned area, leaf area, and climate forcing and response variables. We used this system to evaluate simulations from the 5th Phase of the Coupled Model Intercomparison Project (CMIP5) with prognostic atmospheric carbon dioxide levels over the period from 1850 to 2005 (i.e., esmHistorical simulations archived on the Earth System Grid Federation). We found that the multi-model ensemble had a high bias in incoming solar radiation across Asia, likely as a consequence of incomplete representation of aerosol effects in this region, and in South America, primarily as a consequence of a low bias in mean annual precipitation. The reduced precipitation in South America had a larger influence on gross primary production than the high bias in incoming light, and as a consequence gross primary production had a low bias relative to the observations. Although model to model variations were large, the multi-model mean had a positive bias in atmospheric carbon dioxide that has been attributed in past work to weak ocean uptake of fossil emissions. In mid latitudes of the northern hemisphere, most models overestimate latent heat fluxes in the early part of the growing season, and underestimate these fluxes in mid-summer and early fall, whereas sensible heat fluxes show the opposite trend.
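A user-configurable evaluation of this kind, in which models, benchmarks, and scoring metrics are all supplied by the user, might be organized as in the sketch below. The function names and the exponential bias score are assumptions made for illustration; they are not taken from the ILAMB software.

```python
import math


def mean_bias(model, obs):
    """Mean of (model - observation) over paired samples."""
    if len(model) != len(obs) or not obs:
        raise ValueError("model and obs must be non-empty and of equal length")
    return sum(m - o for m, o in zip(model, obs)) / len(obs)


def bias_score(model, obs):
    """Map relative bias to (0, 1]; 1 means no bias (hypothetical scoring rule)."""
    obs_mean = sum(obs) / len(obs)
    relative = abs(mean_bias(model, obs)) / abs(obs_mean)
    return math.exp(-relative)


def evaluate(models, benchmarks, metrics):
    """Apply every metric to every (model, benchmark variable) pair.

    models:     {model_name: {variable: [values]}}
    benchmarks: {variable: [observed values]}
    metrics:    {metric_name: function(model_values, observed_values)}
    """
    results = {}
    for model_name, outputs in models.items():
        for variable, obs in benchmarks.items():
            for metric_name, fn in metrics.items():
                results[(model_name, variable, metric_name)] = fn(outputs[variable], obs)
    return results
```

Keeping the metrics in a dictionary of plain functions means a new scoring rule can be added without touching the evaluation loop.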
RESIS-II: An Updated Version of the Original Reservoir Sedimentation Survey Information System (RESIS) Database

USGS Publications Warehouse

Ackerman, Katherine V.; Mixon, David M.; Sundquist, Eric T.; Stallard, Robert F.; Schwarz, Gregory E.; Stewart, David W.

2009-01-01

The Reservoir Sedimentation Survey Information System (RESIS) database, originally compiled by the Soil Conservation Service (now the Natural Resources Conservation Service) in collaboration with the Texas Agricultural Experiment Station, is the most comprehensive compilation of data from reservoir sedimentation surveys throughout the conterminous United States (U.S.). The database is a cumulative historical archive that includes data from as early as 1755 and as late as 1993. The 1,823 reservoirs included in the database range in size from farm ponds to the largest U.S. reservoirs (such as Lake Mead). Results from 6,617 bathymetric surveys are available in the database. This Data Series provides an improved version of the original RESIS database, termed RESIS-II, and a report describing RESIS-II. The RESIS-II relational database is stored in Microsoft Access and includes more precise location coordinates for most of the reservoirs than the original database but excludes information on reservoir ownership. RESIS-II is anticipated to be a template for further improvements in the database.
Identification of mycotoxins by UHPLC–QTOF MS in airborne fungi and fungi isolated from industrial paper and antique documents from the Archive of Bogotá

DOE Office of Scientific and Technical Information (OSTI.GOV)

Castillo, Nancy I.; Ibáñez, María; Beltrán, Eduardo

Mold deterioration of historical documents in archives and libraries is a frequent and complex phenomenon that may have important economic and cultural consequences. In addition, exposure to toxic fungal metabolites might produce health problems. In this work, samples of broths of fungal species isolated from the documentary material and from indoor environmental samples of the Archive of Bogotá have been analyzed to investigate the presence of mycotoxins. High resolution mass spectrometry made it possible to search for a large number of mycotoxins, even without reference standards available at the laboratory. For this purpose, a screening strategy based on ultra-high pressure liquid chromatography coupled to quadrupole-time of flight mass spectrometry (UHPLC–QTOF MS) under MS^E mode was applied. A customized home-made database containing elemental composition for around 600 mycotoxins was compiled. The presence of the (de)protonated molecule measured at its accurate mass was evaluated in the samples. When a peak was detected, collision induced dissociation fragments and characteristic isotopic ions were also evaluated and used for tentative identification, based on structure compatibility and comparison with literature data (where available). Up to 44 mycotoxins were tentatively identified by UHPLC–QTOF MS. 34 of these tentative compounds were confirmed by subsequent analysis using a targeted LC–MS/MS method, supporting the strong potential of QTOF MS for identification/elucidation purposes. The presence of mycotoxins in these samples might help to reinforce safety measures for researchers and staff who work on reception, restoration and conservation of archival material, not only at the Archive of Bogotá but worldwide.
Highlights:
• Mold deterioration of historical documents is a frequent and complex phenomenon.
• Samples of broths of fungal species isolated from Archive of Bogotá analyzed.
• UHPLC–QTOF MS (MS^E) applied for mycotoxins screening, without reference standards.
• Customized home-made database for around 600 mycotoxins compiled.
• 44 mycotoxins tentatively identified, 34 of which confirmed by LC–MS/MS.
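The accurate-mass screening step described in this record can be illustrated with a minimal sketch. It is not the authors' software: the two-entry database, the peak list, the 5 ppm tolerance, and all function names are assumptions made for illustration. Only the elemental compositions (aflatoxin B1, C17H12O6; zearalenone, C18H22O5) and the monoisotopic element masses are standard values.

```python
# Monoisotopic masses of a few elements (u); standard values, rounded.
MONOISOTOPIC = {"C": 12.0, "H": 1.00782503, "N": 14.00307401, "O": 15.99491462}
PROTON = 1.00727646  # proton mass (u)


def neutral_mass(composition):
    """Monoisotopic mass of a neutral molecule from an element-count dict."""
    return sum(MONOISOTOPIC[element] * count for element, count in composition.items())


def ion_mz(composition, mode):
    """m/z of the singly charged [M+H]+ ('pos') or [M-H]- ('neg') ion."""
    mass = neutral_mass(composition)
    return mass + PROTON if mode == "pos" else mass - PROTON


def ppm_error(measured, theoretical):
    """Relative mass error in parts per million."""
    return (measured - theoretical) / theoretical * 1e6


def screen(peaks, database, mode="pos", tol_ppm=5.0):
    """Return (name, measured m/z) for every peak within tol_ppm of a database ion."""
    hits = []
    for name, composition in database.items():
        target = ion_mz(composition, mode)
        for mz in peaks:
            if abs(ppm_error(mz, target)) <= tol_ppm:
                hits.append((name, mz))
    return hits


# Hypothetical two-entry database; the peak list below is made up.
DATABASE = {
    "aflatoxin B1": {"C": 17, "H": 12, "O": 6},
    "zearalenone": {"C": 18, "H": 22, "O": 5},
}
PEAKS = [313.0712, 250.1000]
HITS = screen(PEAKS, DATABASE, mode="pos")
```

A hit at this stage is only a candidate. As the abstract notes, fragment ions and isotopic patterns are still needed for a tentative identification.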
An Unbiased Method To Build Benchmarking Sets for Ligand-Based Virtual Screening and its Application To GPCRs

PubMed Central

2015-01-01

Benchmarking data sets have become common in recent years for the purpose of virtual screening, though the main focus had been placed on the structure-based virtual screening (SBVS) approaches. Due to the lack of crystal structures, there is great need for unbiased benchmarking sets to evaluate various ligand-based virtual screening (LBVS) methods for important drug targets such as G protein-coupled receptors (GPCRs). To date these ready-to-apply data sets for LBVS are fairly limited, and the direct usage of benchmarking sets designed for SBVS could bring the biases to the evaluation of LBVS. Herein, we propose an unbiased method to build benchmarking sets for LBVS and validate it on a multitude of GPCRs targets. To be more specific, our methods can (1) ensure chemical diversity of ligands, (2) maintain the physicochemical similarity between ligands and decoys, (3) make the decoys dissimilar in chemical topology to all ligands to avoid false negatives, and (4) maximize spatial random distribution of ligands and decoys. We evaluated the quality of our Unbiased Ligand Set (ULS) and Unbiased Decoy Set (UDS) using three common LBVS approaches, with Leave-One-Out (LOO) Cross-Validation (CV) and a metric of average AUC of the ROC curves. Our method has greatly reduced the “artificial enrichment” and “analogue bias” of a published GPCRs benchmarking set, i.e., GPCR Ligand Library (GLL)/GPCR Decoy Database (GDD). In addition, we addressed an important issue about the ratio of decoys per ligand and found that for a range of 30 to 100 it does not affect the quality of the benchmarking set, so we kept the original ratio of 39 from the GLL/GDD. PMID:24749745
An unbiased method to build benchmarking sets for ligand-based virtual screening and its application to GPCRs.

PubMed

Xia, Jie; Jin, Hongwei; Liu, Zhenming; Zhang, Liangren; Wang, Xiang Simon

2014-05-27

Benchmarking data sets have become common in recent years for the purpose of virtual screening, though the main focus had been placed on the structure-based virtual screening (SBVS) approaches. Due to the lack of crystal structures, there is great need for unbiased benchmarking sets to evaluate various ligand-based virtual screening (LBVS) methods for important drug targets such as G protein-coupled receptors (GPCRs). To date these ready-to-apply data sets for LBVS are fairly limited, and the direct usage of benchmarking sets designed for SBVS could bring the biases to the evaluation of LBVS. Herein, we propose an unbiased method to build benchmarking sets for LBVS and validate it on a multitude of GPCRs targets. To be more specific, our methods can (1) ensure chemical diversity of ligands, (2) maintain the physicochemical similarity between ligands and decoys, (3) make the decoys dissimilar in chemical topology to all ligands to avoid false negatives, and (4) maximize spatial random distribution of ligands and decoys. We evaluated the quality of our Unbiased Ligand Set (ULS) and Unbiased Decoy Set (UDS) using three common LBVS approaches, with Leave-One-Out (LOO) Cross-Validation (CV) and a metric of average AUC of the ROC curves. Our method has greatly reduced the "artificial enrichment" and "analogue bias" of a published GPCRs benchmarking set, i.e., GPCR Ligand Library (GLL)/GPCR Decoy Database (GDD). In addition, we addressed an important issue about the ratio of decoys per ligand and found that for a range of 30 to 100 it does not affect the quality of the benchmarking set, so we kept the original ratio of 39 from the GLL/GDD.
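The evaluation metric mentioned in this record (average ROC AUC under leave-one-out) can be sketched as follows. This is one plausible reading, not the authors' protocol: each ligand in turn serves as the query, and the remaining ligands are ranked against the decoys by a similarity score. The Tanimoto similarity on feature sets and the toy data are assumptions for illustration.

```python
def roc_auc(ligand_scores, decoy_scores):
    """AUC as the probability that a random ligand outscores a random decoy.

    Ties count as half a win (the Mann-Whitney formulation of the ROC AUC).
    """
    wins = 0.0
    for ligand in ligand_scores:
        for decoy in decoy_scores:
            if ligand > decoy:
                wins += 1.0
            elif ligand == decoy:
                wins += 0.5
    return wins / (len(ligand_scores) * len(decoy_scores))


def tanimoto(a, b):
    """Tanimoto (Jaccard) similarity between two feature sets."""
    return len(a & b) / len(a | b)


def loo_average_auc(ligands, decoys, similarity=tanimoto):
    """Leave-one-out: each ligand is the query once; average the resulting AUCs."""
    aucs = []
    for i, query in enumerate(ligands):
        others = ligands[:i] + ligands[i + 1:]
        ligand_scores = [similarity(query, x) for x in others]
        decoy_scores = [similarity(query, x) for x in decoys]
        aucs.append(roc_auc(ligand_scores, decoy_scores))
    return sum(aucs) / len(aucs)


# Toy fingerprints as sets of hypothetical feature indices.
LIGANDS = [{1, 2, 3}, {1, 2, 4}, {2, 3, 4}]
DECOYS = [{7, 8, 9}, {1, 8, 9}]
AVERAGE_AUC = loo_average_auc(LIGANDS, DECOYS)
```

An AUC near 1.0 on a real benchmark can signal the "artificial enrichment" the abstract warns about, when decoys are too easy to tell apart from ligands.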
National Weather- RFC Development Management

Science.gov Websites

Map News Organization Search NWS ALL NOAA Go RFC Development Management Presentations Projects & Plans RFC Development Program RFC Archive Database Documentation Outline Workshops Contact Us resources and services. Description Graphic The RFC Development Management component of the Office of
Novel carboxamides as potential mosquito repellents.

USDA-ARS's Scientific Manuscript database

A model was developed using 167 carboxamide compounds, from the US Department of Agriculture archival database, that were tested as arthropod repellents over the past 60 years. An artificial neural network utilizing CODESSA PRO descriptors was used to construct a Quantitative Structure-Activity Re...
NITARP: An Example of Effective Data-Based Research in the Classroom

NASA Astrophysics Data System (ADS)

Orr, Laura; Rowe, Jamie L.; Lineberger, Howard; Duranko, Gary; Gorjian, Varoujan

2018-01-01

The use of data in the classroom is a core component of both project-based learning and STEM-based education. Authentic student-driven research using real-world data is a primary focus of both teaching strategies. To make the educational outcome effective and long lasting, the type and quality of data used in the lessons is important. The NASA/IPAC Teacher Archive Research Project (NITARP) program encapsulates this in very meaningful ways by providing both teachers and students the opportunity to ask deep meaningful questions, collaborate with peers, and arrive at meaningful conclusions. Teachers trained in the use of scientific archives and in applying those archives to authentic research are critical for this type of learning to be successful. In this study we use the NITARP program as an example of effective STEM project-based learning using archived scientific data. We explore the components of the program that are most effective, the effects on teacher competency and ease of use with students, and use in the classroom. For each area we also explore alternate sources of teacher support, data archives, and techniques for implementation in classrooms for various topics and skill levels.
Life Sciences Data Archive (LSDA) in the Post-Shuttle Era

NASA Technical Reports Server (NTRS)

Fitts, Mary A.; Johnson-Throop, Kathy; Havelka, Jacque; Thomas, Diedre

2009-01-01

Now, more than ever before, NASA is realizing the value and importance of its intellectual assets. Principles of knowledge management, the systematic use and reuse of information/experience/expertise to achieve a specific goal, are being applied throughout the agency. LSDA is also applying these solutions, which rely on a combination of content and collaboration technologies, to enable research teams to create, capture, share, and harness knowledge to do the things they do well, even better. In the early days of spaceflight, space life sciences data were collected and stored in numerous databases, formats, media-types and geographical locations. These data were largely unknown/unavailable to the research community. The Biomedical Informatics and Health Care Systems Branch of the Space Life Sciences Directorate at JSC and the Data Archive Project at ARC, with funding from the Human Research Program through the Exploration Medical Capability Element, are fulfilling these requirements through the systematic population of the Life Sciences Data Archive. This project constitutes a formal system for the acquisition, archival and distribution of data for HRP-related experiments and investigations. The general goal of the archive is to acquire, preserve, and distribute these data and be responsive to inquiries from the science communities.
The ESIS query environment pilot project

NASA Technical Reports Server (NTRS)

Fuchs, Jens J.; Ciarlo, Alessandro; Benso, Stefano

1993-01-01

The European Space Information System (ESIS) was originally conceived to provide the European space science community with simple and efficient access to space data archives, facilities with which to examine and analyze the retrieved data, and general information services. To achieve that, ESIS will provide the scientists with a discipline-specific environment for querying, in a uniform and transparent manner, data stored in geographically dispersed archives. Furthermore, it will provide discipline-specific tools for displaying and analyzing the retrieved data. The central concept of ESIS is to achieve a more efficient and wider usage of space scientific data, while maintaining the physical archives at the institutions which created them and which have the best background for ensuring and maintaining the scientific validity and interest of the data. In addition to coping with the physical distribution of data, ESIS must also manage the heterogeneity of the individual archives' data models, formats and data base management systems. Thus the ESIS system shall appear to the user as a single database, while it does in fact consist of a collection of dispersed and locally managed databases and data archives. The work reported in this paper is one of the results of the ESIS Pilot Project which is to be completed in 1993. More specifically it presents the pilot ESIS Query Environment (ESIS QE) system which forms the data retrieval and data dissemination axis of the ESIS system. The others are formed by the ESIS Correlation Environment (ESIS CE) and the ESIS Information Services. The ESIS QE Pilot Project is carried out for the European Space Agency's Research and Information center, ESRIN, by a Consortium consisting of Computer Resources International, Denmark, CISET S.p.a, Italy, the University of Strasbourg, France and the Rutherford Appleton Laboratories in the U.K. Furthermore, numerous scientists both within ESA and in the space science community in Europe have been involved in defining the core concepts of the ESIS system.
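The "single database over dispersed, heterogeneous archives" idea in this record can be sketched as a thin facade that translates a common query vocabulary into each archive's local field names. This is an illustrative sketch only; the class names, field names, and sample records are hypothetical and do not describe the actual ESIS QE design.

```python
from itertools import chain


class Archive:
    """A locally managed archive with its own field names (hypothetical)."""

    def __init__(self, name, records, field_map):
        self.name = name
        self.records = records
        self.field_map = field_map  # common field name -> local field name

    def query(self, **criteria):
        """Match criteria given in common field names; return rows in the common schema."""
        results = []
        for record in self.records:
            if all(record.get(self.field_map[key]) == value
                   for key, value in criteria.items()):
                row = {common: record.get(local)
                       for common, local in self.field_map.items()}
                row["archive"] = self.name
                results.append(row)
        return results


class FederatedCatalog:
    """Presents several dispersed archives to the user as a single database."""

    def __init__(self, archives):
        self.archives = list(archives)

    def query(self, **criteria):
        return list(chain.from_iterable(a.query(**criteria) for a in self.archives))


# Two made-up archives that store the same kind of data under different field names.
ARCHIVE_A = Archive(
    "A",
    [{"obj": "Crab", "inst": "X1"}, {"obj": "Vela", "inst": "X1"}],
    {"target": "obj", "instrument": "inst"},
)
ARCHIVE_B = Archive(
    "B",
    [{"TARGET_NAME": "Crab", "INSTR": "UV2"}],
    {"target": "TARGET_NAME", "instrument": "INSTR"},
)
CATALOG = FederatedCatalog([ARCHIVE_A, ARCHIVE_B])
CRAB_ROWS = CATALOG.query(target="Crab")
```

The data stay where they are. Only the per-archive field mapping has to be maintained centrally, which is the trade-off the abstract describes.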

Amateur Planetary Radio Data Archived for Science and Education: Radio Jove

NASA Astrophysics Data System (ADS)

Thieman, J.; Cecconi, B.; Sky, J.; Garcia, L. N.; King, T. A.; Higgins, C. A.; Fung, S. F.

2015-12-01

The Radio Jove Project is a hands-on educational activity in which students, teachers, and the general public build simple radio telescopes, usually from a kit, to observe single frequency decameter wavelength radio emissions from Jupiter, the Sun, the galaxy, and the Earth usually with simple dipole antennas. Some of the amateur observers have upgraded their receivers to spectrographs and their antennas have become more sophisticated as well. The data records compare favorably to more sophisticated professional radio telescopes such as the Long Wavelength Array (LWA) and the Nancay Decametric Array. Since these data are often carefully calibrated and recorded around the clock in widely scattered locations they represent a valuable database useful not only to amateur radio astronomers but to the professional science community as well. Some interesting phenomena have been noted in the data that are of interest to the professionals familiar with such records. The continuous monitoring of radio emissions from Jupiter could serve as useful "ground truth" data during the coming Juno mission's radio observations of Jupiter. Radio Jove has long maintained an archive for thousands of Radio Jove observations, but the database was intended for use by the Radio Jove participants only. Now, increased scientific interest in the use of these data has resulted in several proposals to translate the data into a science community data format standard and store the data in professional archives. Progress is being made in translating Radio Jove data to the Common Data Format (CDF) and also in generating new observations in that format as well. Metadata describing the Radio Jove data would follow the Space Physics Archive Search and Extract (SPASE) standard. The proposed archive to be used for long term preservation would be the Planetary Data System (PDS). 
Data sharing would be achieved through the PDS and the Paris Astronomical Data Centre (PADC) and the Virtual Wave Observatory (VWO). We believe that Radio Jove represents another fertile area for citizen science to contribute to overall scientific investigation.
Archiving Microgravity Flight Data and Samples

NASA Technical Reports Server (NTRS)

1996-01-01

To obtain help in evaluating its current strategy for archiving data and samples obtained in microgravity research, NASA's Microgravity Science and Applications Division (MSAD) asked the Space Studies Board's Committee on Microgravity Research for guidance on the following questions: What data should be archived and where should it be kept? In what form should the data be maintained (electronic files, photographs, hard copy, samples)? What should the general format of the database be? To what extent should it be universally accessible and through what mechanisms? Should there be a period of time for which principal investigators have proprietary access? If so, how long should proprietary data be stored? What provisions should be made for data obtained from ground-based experiments? What should the deadline be for investigators placing their data in the archive? How long should data be saved? How long should data be easily accessible? As a prelude to making recommendations for optimum selection and storage of microgravity data and samples, the committee in this report briefly describes NASA's past archiving practices and outlines MSAD's current archiving strategy. Although the committee found that only a limited number of experiments have thus far been archived, it concluded that the general archiving strategy, characterized by MSAD as minimalist, appears viable. A central focus of attention is the Experiment Data Management Plan (EDMP), MSAD's recently instituted data management and archiving framework for flight experiments. Many of the report's recommendations are aimed at enhancing the effectiveness of the EDMP approach, which the committee regards as an appropriate data management method for MSAD. Other recommendations provide guidance on broader issues related to the questions listed above. This report does not address statutory or regulatory records retention requirements.
Archiving and Exchange of a Computerized Marine Seismic Database: The ROSE (Rivera Ocean Seismic Experiment) Data Archive System

DTIC Science & Technology

1983-12-01

END IF 322 : C WRITE VAHIABLES, mCLDDING KEYWORDS, TO CATBUF 323: C 324: . CATX Bir70UT(CATLrH,CATBOT,112,rEOF) 325...SCORR.SD.SIZE,RC0RR,RD,RANGE 7012 . FORMAT(6F10.4) 91 92 321 322 323 324 325 326 327 328 329 330 331 332 333 334 335 336 337 338...SIZEMX 320: 2013 FORMATC Max shot size:’,F10.4,’ Enter size:’) 118 321 322 323 14 324 2014 325 326 327 15 328 2015 329 330 331 16
Extended Edited Synoptic Cloud Reports from Ships and Land Stations Over the Globe, 1952-2009 (NDP-026C)

DOE Data Explorer

Hahn, C. J. [University of Arizona; Warren, S. G. [University of Washington; Eastman, R.

1999-08-01

This database contains surface synoptic weather reports for the entire globe, gathered from various available data sets. The reports were processed, edited, and rewritten to provide a single dataset of individual observations of clouds, spanning the 57 years 1952-2008 for ship data and the 39 years 1971-2009 for land station data. In addition to the cloud portion of the synoptic report, each edited report also includes the associated pressure, present weather, wind, air temperature, and dew point (and sea surface temperature over oceans). This data set is called the "Extended Edited Cloud Report Archive" (EECRA). The EECRA is based solely on visual cloud observations from weather stations, reported in the WMO synoptic code (WMO, 1974). Reports must contain cloud-type information to be included in the archive. Past data sources include those from the Fleet Numerical Oceanographic Center (FNOC, 1971-1976) and the National Centers for Environmental Prediction (NCEP, 1977-1996). This update uses data from a new source, the 'Integrated Surface Database' (ISD, 1997-2009; Smith et al., 2011). Our past analyses of the EECRA identified a subset of 5388 weather stations that were determined to produce reliable day and night observations of cloud amount and type. The update contains observations only from this subset of stations. Details concerning processing, previous problems, contents, and comments are available in the archive's original documentation. The EECRA contains about 81 million cloud observations from ships and 380 million from land stations. The data files have been compressed using Unix utilities. Unix/Linux users can "uncompress" or "gunzip" the files after downloading. If you're interested in the NDP-026C database, then you'll also want to explore its related data products, NDP-026D and NDP-026E.
Digital Image Support in the ROADNet Real-time Monitoring Platform

NASA Astrophysics Data System (ADS)

Lindquist, K. G.; Hansen, T. S.; Newman, R. L.; Vernon, F. L.; Nayak, A.; Foley, S.; Fricke, T.; Orcutt, J.; Rajasekar, A.

2004-12-01

The ROADNet real-time monitoring infrastructure has allowed researchers to integrate geophysical monitoring data from a wide variety of signal domains. Antelope-based data transport, relational-database buffering and archiving, backup/replication/archiving through the Storage Resource Broker, and a variety of web-based distribution tools create a powerful monitoring platform. In this work we discuss our use of the ROADNet system for the collection and processing of digital image data. Remote cameras have been deployed at approximately 32 locations as of September 2004, including the SDSU Santa Margarita Ecological Reserve, the Imperial Beach pier, and the Pinon Flats geophysical observatory. Fire monitoring imagery has been obtained through a connection to the HPWREN project. Near-real-time images obtained from the R/V Roger Revelle include records of seafloor operations by the JASON submersible, as part of a maintenance mission for the H2O underwater seismic observatory. We discuss acquisition mechanisms and the packet architecture for image transport via Antelope orbservers, including multi-packet support for arbitrarily large images. Relational database storage supports archiving of timestamped images, image-processing operations, grouping of related images and cameras, support for motion-detect triggers, thumbnail images, pre-computed video frames, support for time-lapse movie generation and storage of time-lapse movies. Available ROADNet monitoring tools include both orbserver-based display of incoming real-time images and web-accessible searching and distribution of images and movies driven by the relational database (http://mercali.ucsd.edu/rtapps/rtimbank.php). An extension to the Kepler Scientific Workflow System also allows real-time image display via the Ptolemy project. Custom time-lapse movies may be made from the ROADNet web pages.
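The multi-packet transport of arbitrarily large images mentioned in this record can be illustrated with a generic split-and-reassemble sketch. The header layout below (image id, packet index, packet count) is an assumption for illustration and is not the actual Antelope packet format.

```python
import struct

# Hypothetical header: image id, packet index, packet count (network byte order).
HEADER = struct.Struct("!III")


def split_image(image_id, data, max_payload):
    """Split image bytes into packets of at most max_payload payload bytes each."""
    chunks = [data[i:i + max_payload] for i in range(0, len(data), max_payload)]
    if not chunks:
        chunks = [b""]  # an empty image still travels as one packet
    return [HEADER.pack(image_id, index, len(chunks)) + chunk
            for index, chunk in enumerate(chunks)]


def reassemble(packets):
    """Rebuild the image from packets that may arrive in any order."""
    parts = {}
    expected = None
    for packet in packets:
        _image_id, index, count = HEADER.unpack_from(packet)
        parts[index] = packet[HEADER.size:]
        expected = count
    if expected is None or len(parts) != expected:
        raise ValueError("image is incomplete: missing packets")
    return b"".join(parts[i] for i in range(expected))


# Made-up 2560-byte "image" sent with a 1000-byte payload limit.
IMAGE = bytes(range(256)) * 10
PACKETS = split_image(42, IMAGE, max_payload=1000)
RESTORED = reassemble(reversed(PACKETS))
```

Carrying the packet count in every header lets the receiver detect an incomplete image without a separate end-of-image marker.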
A Support Database System for Integrated System Health Management (ISHM)

NASA Technical Reports Server (NTRS)

Schmalzel, John; Figueroa, Jorge F.; Turowski, Mark; Morris, John

2007-01-01

The development, deployment, operation and maintenance of Integrated Systems Health Management (ISHM) applications require the storage and processing of tremendous amounts of low-level data. This data must be shared in a secure and cost-effective manner between developers, and processed within several heterogeneous architectures. Modern database technology allows this data to be organized efficiently, while ensuring the integrity and security of the data. The extensibility and interoperability of the current database technologies also allows for the creation of an associated support database system. A support database system provides additional capabilities by building applications on top of the database structure. These applications can then be used to support the various technologies in an ISHM architecture. This presentation and paper propose a detailed structure and application description for a support database system, called the Health Assessment Database System (HADS). The HADS provides a shared context for organizing and distributing data as well as a definition of the applications that provide the required data-driven support to ISHM. This approach provides another powerful tool for ISHM developers, while also enabling novel functionality. This functionality includes: automated firmware updating and deployment, algorithm development assistance and electronic datasheet generation. The architecture for the HADS has been developed as part of the ISHM toolset at Stennis Space Center for rocket engine testing. A detailed implementation has begun for the Methane Thruster Testbed Project (MTTP) in order to assist in developing health assessment and anomaly detection algorithms for ISHM. The structure of this implementation is shown in Figure 1. The database structure consists of three primary components: the system hierarchy model, the historical data archive and the firmware codebase. 
The system hierarchy model replicates the physical relationships between system elements to provide the logical context for the database. The historical data archive provides a common repository for sensor data that can be shared between developers and applications. The firmware codebase is used by the developer to organize the intelligent element firmware into atomic units which can be assembled into complete firmware for specific elements.
DNApod: DNA polymorphism annotation database from next-generation sequence read archives.

PubMed

Mochizuki, Takako; Tanizawa, Yasuhiro; Fujisawa, Takatomo; Ohta, Tazro; Nikoh, Naruo; Shimizu, Tokurou; Toyoda, Atsushi; Fujiyama, Asao; Kurata, Nori; Nagasaki, Hideki; Kaminuma, Eli; Nakamura, Yasukazu

2017-01-01

With the rapid advances in next-generation sequencing (NGS), datasets for DNA polymorphisms among various species and strains have been produced, stored, and distributed. However, reliability varies among these datasets because the experimental and analytical conditions used differ among assays. Furthermore, such datasets have been frequently distributed from the websites of individual sequencing projects. It is desirable to integrate DNA polymorphism data into one database featuring uniform quality control that is distributed from a single platform at a single place. DNA polymorphism annotation database (DNApod; http://tga.nig.ac.jp/dnapod/) is an integrated database that stores genome-wide DNA polymorphism datasets acquired under uniform analytical conditions, and this includes uniformity in the quality of the raw data, the reference genome version, and evaluation algorithms. DNApod genotypic data are re-analyzed whole-genome shotgun datasets extracted from sequence read archives, and DNApod distributes genome-wide DNA polymorphism datasets and known-gene annotations for each DNA polymorphism. This new database was developed for storing genome-wide DNA polymorphism datasets of plants, with crops being the first priority. Here, we describe our analyzed data for 679, 404, and 66 strains of rice, maize, and sorghum, respectively. The analytical methods are available as a DNApod workflow in an NGS annotation system of the DNA Data Bank of Japan and a virtual machine image. Furthermore, DNApod provides tables of links of identifiers between DNApod genotypic data and public phenotypic data. To advance the sharing of organism knowledge, DNApod offers basic and ubiquitous functions for multiple alignment and phylogenetic tree construction by using orthologous gene information.
DNApod: DNA polymorphism annotation database from next-generation sequence read archives

PubMed Central

Mochizuki, Takako; Tanizawa, Yasuhiro; Fujisawa, Takatomo; Ohta, Tazro; Nikoh, Naruo; Shimizu, Tokurou; Toyoda, Atsushi; Fujiyama, Asao; Kurata, Nori; Nagasaki, Hideki; Kaminuma, Eli; Nakamura, Yasukazu

2017-01-01

With the rapid advances in next-generation sequencing (NGS), datasets for DNA polymorphisms among various species and strains have been produced, stored, and distributed. However, reliability varies among these datasets because the experimental and analytical conditions used differ among assays. Furthermore, such datasets have been frequently distributed from the websites of individual sequencing projects. It is desirable to integrate DNA polymorphism data into one database featuring uniform quality control that is distributed from a single platform at a single place. DNA polymorphism annotation database (DNApod; http://tga.nig.ac.jp/dnapod/) is an integrated database that stores genome-wide DNA polymorphism datasets acquired under uniform analytical conditions, and this includes uniformity in the quality of the raw data, the reference genome version, and evaluation algorithms. DNApod genotypic data are re-analyzed whole-genome shotgun datasets extracted from sequence read archives, and DNApod distributes genome-wide DNA polymorphism datasets and known-gene annotations for each DNA polymorphism. This new database was developed for storing genome-wide DNA polymorphism datasets of plants, with crops being the first priority. Here, we describe our analyzed data for 679, 404, and 66 strains of rice, maize, and sorghum, respectively. The analytical methods are available as a DNApod workflow in an NGS annotation system of the DNA Data Bank of Japan and a virtual machine image. Furthermore, DNApod provides tables of links of identifiers between DNApod genotypic data and public phenotypic data. To advance the sharing of organism knowledge, DNApod offers basic and ubiquitous functions for multiple alignment and phylogenetic tree construction by using orthologous gene information. PMID:28234924
gPhoton: THE GALEX PHOTON DATA ARCHIVE

DOE Office of Scientific and Technical Information (OSTI.GOV)

Million, Chase; Fleming, Scott W.; Shiao, Bernie

gPhoton is a new database product and software package that enables analysis of GALEX ultraviolet data at the photon level. The project’s stand-alone, pure-Python calibration pipeline reproduces the functionality of the original mission pipeline to reduce raw spacecraft data to lists of time-tagged, sky-projected photons, which are then hosted in a publicly available database by the Mikulski Archive at Space Telescope. This database contains approximately 130 terabytes of data describing approximately 1.1 trillion sky-projected events with a timestamp resolution of five milliseconds. A handful of Python and command-line modules serve as a front end to interact with the database and to generate calibrated light curves and images from the photon-level data at user-defined temporal and spatial scales. The gPhoton software and source code are in active development and publicly available under a permissive license. We describe the motivation, design, and implementation of the calibration pipeline, database, and tools, with emphasis on divergence from prior work, as well as challenges created by the large data volume. We summarize the astrometric and photometric performance of gPhoton relative to the original mission pipeline. For a brief example of short time-domain science capabilities enabled by gPhoton, we show new flares from the known M-dwarf flare star CR Draconis. The gPhoton software has permanent object identifiers with the ASCL (ascl:1603.004) and DOI (doi:10.17909/T9CC7G). This paper describes the software as of version v1.27.2.
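Turning time-tagged photon events into a light curve at a user-chosen temporal scale comes down to binning timestamps. The sketch below shows only that idea in plain Python. It does not use the gPhoton API, applies no calibration, and the function name and sample timestamps are hypothetical.

```python
def light_curve(photon_times, t_start, t_stop, bin_width):
    """Bin photon arrival times into (bin start time, count rate) pairs.

    Only whole bins between t_start and t_stop are kept. The rate is in
    counts per unit time and is not corrected for exposure or response.
    """
    n_bins = int((t_stop - t_start) // bin_width)
    counts = [0] * n_bins
    t_end = t_start + n_bins * bin_width
    for t in photon_times:
        if t_start <= t < t_end:
            index = min(int((t - t_start) // bin_width), n_bins - 1)
            counts[index] += 1
    return [(t_start + i * bin_width, c / bin_width) for i, c in enumerate(counts)]


# Made-up arrival times in seconds.
TIMES = [0.1, 0.2, 1.5, 2.2, 2.3, 2.9]
CURVE = light_curve(TIMES, t_start=0.0, t_stop=3.0, bin_width=1.0)
```

Because the events are stored individually, the same photon list can be re-binned at any width down to the timestamp resolution without reprocessing the raw data.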
Detailed description of the Mayo/IBM PACS

NASA Astrophysics Data System (ADS)

Gehring, Dale G.; Persons, Kenneth R.; Rothman, Melvyn L.; Salutz, James R.; Morin, Richard L.

1991-07-01

The Mayo Clinic and IBM/Rochester have jointly developed a picture archiving system (PACS) for use with Mayo's MRI and Neuro-CT imaging modalities. The system was developed to replace the imaging system's vendor-supplied magnetic tape archiving capability. The system consists of seven MR imagers and nine CT scanners, each interfaced to the PACS via IBM Personal System/2(tm) (PS/2) computers, which act as gateways from the imaging modality to the PACS network. The PAC system operates on the token-ring component of Mayo's city-wide local area network. Also on the PACS network are four optical storage subsystems used for image archival, three optical subsystems used for image retrieval, an IBM Application System/400(tm) (AS/400) computer used for database management and multiple PS/2-based image display systems and their image servers.
The Legacy Archive for Microwave Background Data Analysis (LAMBDA)

NASA Astrophysics Data System (ADS)

Miller, Nathan; LAMBDA

2018-01-01

The Legacy Archive for Microwave Background Data Analysis (LAMBDA) provides CMB researchers with archival data for cosmology missions, software tools, and links to other sites of interest. LAMBDA is a one-stop shop for CMB researchers. It hosts data from WMAP along with many suborbital experiments. Over the past year, LAMBDA has acquired new data from SPTpol, SPIDER and ACTPol. In addition to the primary CMB, LAMBDA also provides foreground data. LAMBDA has several ongoing efforts to provide tools for CMB researchers. These tools include a web interface for CAMB and a web interface for a CMB survey footprint database and plotting tool. Additionally, we have recently developed a Docker container with standard CMB analysis tools and demonstrations in the form of Jupyter notebooks. These containers will be publicly available through Docker's container repository and the source will be available on GitHub.
Benchmarking B-Cell Epitope Prediction with Quantitative Dose-Response Data on Antipeptide Antibodies: Towards Novel Pharmaceutical Product Development

PubMed Central

Caoili, Salvador Eugenio C.

2014-01-01

B-cell epitope prediction can enable novel pharmaceutical product development. However, a mechanistically framed consensus has yet to emerge on benchmarking such prediction, thus presenting an opportunity to establish standards of practice that circumvent epistemic inconsistencies of casting the epitope prediction task as a binary-classification problem. As an alternative to conventional dichotomous qualitative benchmark data, quantitative dose-response data on antibody-mediated biological effects are more meaningful from an information-theoretic perspective in the sense that such effects may be expressed as probabilities (e.g., of functional inhibition by antibody) for which the Shannon information entropy (SIE) can be evaluated as a measure of informativeness. Accordingly, half-maximal biological effects (e.g., at median inhibitory concentrations of antibody) correspond to maximally informative data while undetectable and maximal biological effects correspond to minimally informative data. This applies to benchmarking B-cell epitope prediction for the design of peptide-based immunogens that elicit antipeptide antibodies with functionally relevant cross-reactivity. Presently, the Immune Epitope Database (IEDB) contains relatively few quantitative dose-response data on such cross-reactivity. Only a small fraction of these IEDB data is maximally informative, and many more of them are minimally informative (i.e., with zero SIE). Nevertheless, the numerous qualitative data in IEDB suggest how to overcome the paucity of informative benchmark data. PMID:24949474
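The informativeness argument in this record follows from the Shannon entropy of a binary outcome, H(p) = -p log2(p) - (1 - p) log2(1 - p). The entropy peaks at 1 bit when p = 0.5 (a half-maximal effect) and falls to zero at p = 0 or p = 1 (undetectable or maximal effect). A minimal sketch, with the function name chosen here for illustration:

```python
import math


def binary_entropy(p):
    """Shannon entropy (bits) of an effect that occurs with probability p."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie in [0, 1]")
    if p in (0.0, 1.0):
        return 0.0  # the limit of p*log2(p) as p -> 0 is 0
    return -(p * math.log2(p) + (1.0 - p) * math.log2(1.0 - p))


# Half-maximal effect versus undetectable and maximal effects.
ENTROPIES = {p: binary_entropy(p) for p in (0.0, 0.5, 1.0)}
```

Here p could be read as, for example, the probability of functional inhibition by antibody at a given concentration, as in the abstract.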
Inventory of Exposure-Related Data Systems Sponsored By Federal Agencies

DTIC Science & Technology

1992-05-01

Health and Nutrition Examination Survey (NHANES) .... 1-152 National Herbicide Use Database .......................... 1-157 National Human Adipose Tissue ...Human Adipose Tissue ) ..................................... National Hydrologic Benchmark Network (see National Water Quality Networks Programs...Inorganic compounds (arsenic, iron, lead, mercury, zinc , cadmium , chromium, copper); pesticides (1982 and 1987 data available for 35 pesticides; original
Building the Foundation for Data-Based Decision Making: Creating Consensus on Language and Concepts

ERIC Educational Resources Information Center

Crum, Karen

2009-01-01

Data Based Decision Making (DBDM), the process of gathering, analyzing, applying, and sharing data in order to promote school improvement, has recently become a prominent process in the quest to assist students in attaining educational success and helping schools meet accountability benchmarks (Wayman, 2005; Poynton & Carey, 2006). This…
Data Intensive Systems (DIS) Benchmark Performance Summary

DTIC Science & Technology

2003-08-01

models assumed by today’s conventional architectures. Such applications include model-based Automatic Target Recognition (ATR), synthetic aperture...radar (SAR) codes, large scale dynamic databases/battlefield integration, dynamic sensor-based processing, high-speed cryptanalysis, high speed...distributed interactive and data intensive simulations, data-oriented problems characterized by pointer-based and other highly irregular data structures
The Application of Instructional Reform in Classrooms: Benchmarking Effective Teacher Behavior.

ERIC Educational Resources Information Center

van Tassel-Baska, Joyce; Avery, Linda D.

This paper describes the Classroom Observation Form (COF), which was used to evaluate elementary and secondary school gifted education programs in Virginia and South Carolina. The study examined the nature of classroom practice and developed a database on the nature of classroom practice in multiple settings serving high ability learners across…
Mega-precovery and data mining of near-Earth asteroids and other Solar System objects

NASA Astrophysics Data System (ADS)

Popescu, M.; Vaduvescu, O.; Char, F.; Curelaru, L.; Euronear Team

2014-07-01

The vast collection of CCD images and photographic plates available from archives and telescopes worldwide is still insufficiently exploited. Within the EURONEAR project we designed two data-mining software tools to search very large collections of archives for images that serendipitously include known asteroids or comets in their field, with the main aims of extending the observed arc and improving the orbits. To this end, "Precovery" (published in 2008, aiming to search all known NEAs in a few archives via IMCCE's SkyBoT server) and "Mega-Precovery" (published in 2010, querying IMCCE's Miriade server) were made available to the community via the EURONEAR website (euronear.imcce.fr). Briefly, Mega-Precovery aims to search for one or a few known asteroids or comets in a mega-collection including millions of images from some of the largest observatory archives: ESO (15 instruments served by the ESO Archive, including VLT), NVO (8 instruments served by the U.S. NVO Archive), CADC (11 instruments, including HST and Gemini), plus other important instrument archives: SDSS, CFHTLS, INT-WFC, Subaru-SuprimeCam and AAT-WFI, adding together 39 instruments and 4.3 million images (Mar 2014), and our Mega-Archive is growing. Here we present some of the most important results obtained with our data-mining software and some new planned search options of Mega-Precovery. In particular, the following capabilities will be added soon: the ING archive (all imaging cameras) will be included, and new search options will be made available (such as query by orbital elements and by observations) to target new Solar System objects such as Virtual Impactors, bolides, planetary satellites and TNOs (besides the comets added recently). In order to better characterize the archives, we introduce the "AOmegaA" factor (archival etendue), proportional to the AOmega (etendue) and the number of images in an archive.
With the aim of enlarging the Mega-Archive database, we invite observatories (particularly those storing their images online and those that own plate archives which could be scanned on request) to contact us in order to add their instrument archives (consisting of an ASCII file with telescope pointings in a simple format) to our Mega-Precovery open project. In the future we intend to synchronise our service with the Virtual Observatory.
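The "AOmegaA" factor described in this abstract can be sketched numerically. The example below is a rough illustration only: it assumes an unobstructed circular aperture, a field of view given in square degrees, and a proportionality constant of 1, none of which are specified in the abstract. The function names and the input values are hypothetical and do not describe any real instrument.

```python
import math


def etendue(aperture_diameter_m: float, fov_deg2: float) -> float:
    """Etendue (A * Omega) in m^2 deg^2.

    Assumption: collecting area A is that of an unobstructed circular aperture.
    """
    collecting_area = math.pi * (aperture_diameter_m / 2.0) ** 2
    return collecting_area * fov_deg2


def archival_etendue(aperture_diameter_m: float, fov_deg2: float,
                     n_images: int, k: float = 1.0) -> float:
    """Archival etendue: proportional to A * Omega and to the number of images.

    Assumption: the proportionality constant k is 1 unless stated otherwise.
    """
    return k * etendue(aperture_diameter_m, fov_deg2) * n_images


if __name__ == "__main__":
    # Hypothetical instrument: 2 m aperture, 1 deg^2 field, 10 archived images.
    print(archival_etendue(2.0, 1.0, 10))
```

Because the factor scales linearly with the number of images, a small telescope with a very large archive can rank alongside a large telescope with few archived frames, which is the kind of comparison the factor appears intended to support.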
Blood and Marrow Transplant Clinical Trials Network Report on the Development of Novel Endpoints and Selection of Promising Approaches for Graft-versus-Host Disease Prevention Trials.

PubMed

Pasquini, Marcelo C; Logan, Brent; Jones, Richard J; Alousi, Amin M; Appelbaum, Frederick R; Bolaños-Meade, Javier; Flowers, Mary E D; Giralt, Sergio; Horowitz, Mary M; Jacobsohn, David; Koreth, John; Levine, John E; Luznik, Leo; Maziarz, Richard; Mendizabal, Adam; Pavletic, Steven; Perales, Miguel-Angel; Porter, David; Reshef, Ran; Weisdorf, Daniel; Antin, Joseph H

2018-06-01

Graft-versus-host disease (GVHD) is a common complication after hematopoietic cell transplantation (HCT) and associated with significant morbidity and mortality. Preventing GVHD without chronic therapy or increasing relapse is a desired goal. Here we report a benchmark analysis to evaluate the performance of 6 GVHD prevention strategies tested at single institutions compared with a large multicenter outcomes database as a control. Each intervention was compared with the control for the incidence of acute and chronic GVHD and overall survival and against novel composite endpoints: acute and chronic GVHD, relapse-free survival (GRFS), and chronic GVHD, relapse-free survival (CRFS). Modeling GRFS and CRFS using the benchmark analysis further informed the design of 2 clinical trials testing GVHD prophylaxis interventions. This study demonstrates the potential benefit of using an outcomes database to select promising interventions for multicenter clinical trials and proposes novel composite endpoints for use in GVHD prevention trials. Copyright © 2018 The American Society for Blood and Marrow Transplantation. Published by Elsevier Inc. All rights reserved.
Parametrization of an Orbital-Based Linear-Scaling Quantum Force Field for Noncovalent Interactions

PubMed Central

2015-01-01

We parametrize a linear-scaling quantum mechanical force field called mDC for the accurate reproduction of nonbonded interactions. We provide a new benchmark database of accurate ab initio interactions between sulfur-containing molecules. A variety of nonbond databases are used to compare the new mDC method with other semiempirical, molecular mechanical, ab initio, and combined semiempirical quantum mechanical/molecular mechanical methods. It is shown that the molecular mechanical force field significantly and consistently reproduces the benchmark results with greater accuracy than the semiempirical models and our mDC model produces errors twice as small as the molecular mechanical force field. The comparisons between the methods are extended to the docking of drug candidates to the Cyclin-Dependent Kinase 2 protein receptor. We correlate the protein–ligand binding energies to their experimental inhibition constants and find that the mDC produces the best correlation. Condensed phase simulation of mDC water is performed and shown to produce O–O radial distribution functions similar to TIP4P-EW. PMID:24803856
Analysis and interpretation of diffuse x-ray emission using data from the Einstein satellite

NASA Technical Reports Server (NTRS)

Helfand, David J.

1991-01-01

An ambitious program to create a powerful and accessible archive of the HEAO-2 Imaging Proportional Counter (IPC) database was outlined. The scientific utility of that database for studies of diffuse x ray emissions was explored. Technical and scientific accomplishments are reviewed. Three papers were presented which have major new scientific findings relevant to the global structure of the interstellar medium and the origin of the cosmic x ray background. An all-sky map of diffuse x ray emission was constructed.

[Filing and reuse of research data].

PubMed

Osler, Merete; Bredahl, Lone; Ousager, Steen

2008-02-25

Several scientific journals currently publish data only from randomised clinical trials that are registered in a public database. Similar data-sharing requirements now accompany grants from agencies such as the National Institutes of Health. In Denmark, the Health unit at the Danish Data Archive (DDA/Health) offers Danish researchers free storage of their data under conditions that fulfil these requirements. DDA/Health also passes on research data for reuse, and at present more than 300 studies are available in a database at sundhed.dda.dk.
Psychotherapy-based supervision models in an emerging competency-based era: a commentary.

PubMed

Falender, Carol A; Shafranske, Edward P

2010-03-01

As psychology engages in a cultural shift to competency-based education and training, supervision practice is being transformed to the use of competency frames and the application of benchmark competencies. In this issue, psychotherapy-based models of supervision are conceptualized in a competency framework. This paper reflects on the translation of key components of each psychotherapy-based supervision approach in terms of foundational and functional competencies articulated in the Competencies Benchmarks (Fouad et al., 2009). The commentary concludes with a discussion of implications for supervision practice and identifies directions for future articulation and development, including evidence-based psychotherapy supervision. PsycINFO Database Record (c) 2010 APA, all rights reserved
ePCR: an R-package for survival and time-to-event prediction in advanced prostate cancer, applied to real-world patient cohorts.

PubMed

Laajala, Teemu D; Murtojärvi, Mika; Virkki, Arho; Aittokallio, Tero

2018-06-15

Prognostic models are widely used in clinical decision-making, such as risk stratification and tailoring treatment strategies, with the aim of improving patient outcomes while reducing overall healthcare costs. While prognostic models have been adopted into clinical use, benchmarking their performance has been difficult due to a lack of open clinical datasets. The recent DREAM 9.5 Prostate Cancer Challenge carried out an extensive benchmarking of prognostic models for metastatic Castration-Resistant Prostate Cancer (mCRPC), based on multiple cohorts of open clinical trial data. We make available an open-source implementation of the top-performing model, ePCR, along with an extended toolbox for its further re-use and development, and demonstrate how to best apply the implemented model to real-world data cohorts of advanced prostate cancer patients. The open-source R-package ePCR and its reference documentation are available at the Comprehensive R Archive Network (CRAN): https://CRAN.R-project.org/package=ePCR. An R vignette provides step-by-step examples of ePCR usage. Supplementary data are available at Bioinformatics online.
The HyperLeda project en route to the astronomical virtual observatory

NASA Astrophysics Data System (ADS)

Golev, V.; Georgiev, V.; Prugniel, Ph.

2002-07-01

HyperLeda (Hyper-Linked Extragalactic Databases and Archives) aims to study the evolution of galaxies, their kinematics and stellar populations, and the structure of the Local Universe. HyperLeda is involved in catalogue and software production, data mining and massive data processing. The products are served to the community through web mirrors. The development of HyperLeda is distributed between different sites and builds on the experience of the LEDA and Hypercat databases. The HyperLeda project serves both the European iAstro collaboration and, as a unique database, studies of the physics of extragalactic objects.
Quality control of EUVE databases

NASA Technical Reports Server (NTRS)

John, L. M.; Drake, J.

1992-01-01

The publicly accessible databases for the Extreme Ultraviolet Explorer include: the EUVE Archive mailserver; the CEA ftp site; the EUVE Guest Observer Mailserver; and the Astronomical Data System node. The EUVE Performance Assurance team is responsible for verifying that these public EUVE databases are working properly, and that the public availability of EUVE data contained therein does not infringe any data rights which may have been assigned. In this poster, we describe the Quality Assurance (QA) procedures we have developed from the approach of QA as a service organization, thus reflecting the overall EUVE philosophy of Quality Assurance integrated into normal operating procedures, rather than imposed as an external, post facto, control mechanism.
Berkeley Lab - Materials Sciences Division

Science.gov Websites

sponsors. Distinguish by scope/specific aspects of research; or by institution; or by individual. Example Sciences Division About Organization Contact Research Core Programs Materials Discovery, Design and Postdoc Forum Research Highlights Awards Publications Database Events Calendar Newsletter Archive People
Lamont-Doherty Earth Observatory |

Science.gov Websites

; Graduate Students Academic Calendar Contact Us LDEO Alumni News & Events Research News News Archive ; Tectonophysics Media Inquiries Publications Database Education K-12 Students Educators Undergraduate & Support Program About History of Lamont Alumni Map & Contacts Office of the Director Open Positions
Point of Entry

ERIC Educational Resources Information Center

Manzo, Kathleen Kennedy

2007-01-01

As part of a professional development program organized by the Save Ellis Island Foundation, the exhibits, databases, photo archives, and recorded interviews at the island's museum helps put the nation's current immigration debate into a broader historical context. Teachers at these sessions learn from scholars and park personnel about early…
37 CFR 102.3 - Records under FOIA.

Code of Federal Regulations, 2013 CFR

2013-07-01

... COMMERCE ADMINISTRATION DISCLOSURE OF GOVERNMENT INFORMATION Freedom of Information Act § 102.3 Records... extracted from an existing database is not always apparent. When responding to FOIA requests for electronic... Title 44 of the United States Code or a National Archives and Records Administration's General Records...
37 CFR 102.3 - Records under FOIA.

Code of Federal Regulations, 2012 CFR

2012-07-01

... COMMERCE ADMINISTRATION DISCLOSURE OF GOVERNMENT INFORMATION Freedom of Information Act § 102.3 Records... extracted from an existing database is not always apparent. When responding to FOIA requests for electronic... Title 44 of the United States Code or a National Archives and Records Administration's General Records...
37 CFR 102.3 - Records under FOIA.

Code of Federal Regulations, 2014 CFR

2014-07-01

... COMMERCE ADMINISTRATION DISCLOSURE OF GOVERNMENT INFORMATION Freedom of Information Act § 102.3 Records... extracted from an existing database is not always apparent. When responding to FOIA requests for electronic... Title 44 of the United States Code or a National Archives and Records Administration's General Records...
37 CFR 102.3 - Records under FOIA.

Code of Federal Regulations, 2011 CFR

2011-07-01

... COMMERCE ADMINISTRATION DISCLOSURE OF GOVERNMENT INFORMATION Freedom of Information Act § 102.3 Records... extracted from an existing database is not always apparent. When responding to FOIA requests for electronic... Title 44 of the United States Code or a National Archives and Records Administration's General Records...
37 CFR 102.3 - Records under FOIA.

Code of Federal Regulations, 2010 CFR

2010-07-01

... COMMERCE ADMINISTRATION DISCLOSURE OF GOVERNMENT INFORMATION Freedom of Information Act § 102.3 Records... extracted from an existing database is not always apparent. When responding to FOIA requests for electronic... Title 44 of the United States Code or a National Archives and Records Administration's General Records...
Mapping the Martian Meteorology

NASA Technical Reports Server (NTRS)

Allison, M.; Ross, J. D.; Solomon, N.

1999-01-01

The Mars-adapted version of the NASA/GISS general circulation model (GCM) has been applied to the hourly/daily simulation of the planet's meteorology over several seasonal orbits. The current running version of the model includes a diurnal solar cycle, CO2 sublimation, and a mature parameterization of upper level wave drag with a vertical domain extending from the surface up to the 6 μbar level. The benchmark simulations provide a four-dimensional archive for the comparative evaluation of various schemes for the retrieval of winds from anticipated polar orbiter measurements of temperatures by the Pressure Modulator Infrared Radiometer. Additional information is contained in the original extended abstract.
The Primate Life History Database: A unique shared ecological data resource

PubMed Central

Strier, Karen B.; Altmann, Jeanne; Brockman, Diane K.; Bronikowski, Anne M.; Cords, Marina; Fedigan, Linda M.; Lapp, Hilmar; Liu, Xianhua; Morris, William F.; Pusey, Anne E.; Stoinski, Tara S.; Alberts, Susan C.

2011-01-01

Summary: The importance of data archiving, data sharing, and public access to data has received considerable attention. Awareness is growing among scientists that collaborative databases can facilitate these activities. We provide a detailed description of the collaborative life history database developed by our Working Group at the National Evolutionary Synthesis Center (NESCent) to address questions about life history patterns and the evolution of mortality and demographic variability in wild primates. Examples from each of the seven primate species included in our database illustrate the range of data incorporated and the challenges, decision-making processes, and criteria applied to standardize data across diverse field studies. In addition to the descriptive and structural metadata associated with our database, we also describe the process metadata (how the database was designed and delivered) and the technical specifications of the database. Our database provides a useful model for other researchers interested in developing similar types of databases for other organisms, while our process metadata may be helpful to other groups of researchers interested in developing databases for other types of collaborative analyses. PMID:21698066
Defining the Core Archive Data Standards of the International Planetary Data Alliance (IPDA)

NASA Technical Reports Server (NTRS)

Hughes, J. Steven; Crichton, Dan; Beebe, Reta; Guinness, Ed; Heather, David; Zender, Joe

2007-01-01

A goal of the International Planetary Data Alliance (IPDA) is to develop a set of archive data standards that enable the sharing of scientific data across international agencies and missions. To help achieve this goal, the IPDA steering committee initiated a six month project to write requirements for and draft an information model based on the Planetary Data System (PDS) archive data standards. The project had a special emphasis on data formats. A set of use case scenarios were first developed from which a set of requirements were derived for the IPDA archive data standards. The special emphasis on data formats was addressed by identifying data formats that have been used by PDS nodes and other agencies in the creation of successful data sets for the Planetary Data System (PDS). The dependency of the IPDA information model on the PDS archive standards required the compilation of a formal specification of the archive standards currently in use by the PDS. An ontology modelling tool was chosen to capture the information model from various sources including the Planetary Science Data Dictionary [1] and the PDS Standards Reference [2]. Exports of the modelling information from the tool database were used to produce the information model document using an object-oriented notation for presenting the model. The tool exports can also be used for software development and are directly accessible by semantic web applications.
Benchmarking comparison and validation of MCNP photon interaction data

NASA Astrophysics Data System (ADS)

Colling, Bethany; Kodeli, I.; Lilley, S.; Packer, L. W.

2017-09-01

The objective of the research was to test available photoatomic data libraries for fusion relevant applications, comparing against experimental and computational neutronics benchmarks. Photon flux and heating were compared using the photon interaction data libraries (mcplib 04p, 05t, 84p and 12p). Suitable benchmark experiments (iron and water) were selected from the SINBAD database and analysed to compare experimental values with MCNP calculations using mcplib 04p, 84p and 12p. In both the computational and experimental comparisons, the majority of results with the 04p, 84p and 12p photon data libraries were within 1σ of the mean MCNP statistical uncertainty. Larger differences were observed when comparing computational results with the 05t test photon library. The Doppler broadening sampling bug in MCNP-5 is shown to be corrected for fusion relevant problems through use of the 84p photon data library. The recommended libraries for fusion neutronics are 84p (or 04p) with MCNP6 and 84p if using MCNP-5.
Development of a website and biobank database for the Nanosized Cancer Polymarker Biochip Project: a Multicenter Italian Experience.

PubMed

Leon, Antonette E; Fabricio, Aline S C; Benvegnù, Fabio; Michilin, Silvia; Secco, Annamaria; Spangaro, Omar; Meo, Sabrina; Gion, Massimo

2011-01-01

The Nanosized Cancer Polymarker Biochip Project (RBLA03S4SP) funded by an Italian MIUR-FIRB grant (Italian Ministry of University and Research - Investment Funds for Basic Research) has led to the creation of a free-access dynamic website, available at the web address https://serviziweb.ulss12.ve.it/firbabo, and of a centralized database with password-restricted access. The project network is composed of 9 research units (RUs) and has been active since 2005. The aim of the FIRB project was the design, production and validation of optoelectronic and chemoelectronic biosensors for the simultaneous detection of a novel class of cancer biomarkers associated with immunoglobulins of the M class (IgM) for early diagnosis of cancer. Biomarker immune complexes (BM-ICs) were assessed on samples of clinical cases and matched controls for breast, colorectal, liver, ovarian and prostate malignancies. This article describes in detail the architecture of the project website, the central database application, and the biobank developed for the FIRB Nanosized Cancer Polymarker Biochip Project. The article also illustrates many unique aspects that should be considered when developing a database within a multidisciplinary scenario. The main deliverables of the project were numerous, including the development of an online database which archived 1400 case report forms (700 cases and 700 matched controls) and more than 2700 experimental results relative to the BM-ICs assayed. The database also allowed for the traceability and retrieval of 21,000 aliquots archived in the centralized bank and stored as backup in the RUs, and for the development of a centralized biological bank in the coordinating unit with 6300 aliquots of serum. The constitution of the website and biobank database enabled optimal coordination of the RUs involved, highlighting the importance of sharing samples and scientific data in a multicenter setting for the achievement of the project goals.
The state of the art of medical imaging technology: from creation to archive and back.

PubMed

Gao, Xiaohong W; Qian, Yu; Hui, Rui

2011-01-01

Medical imaging has lent itself well to modern medicine and revolutionized the medical industry over the last 30 years. Stemming from the discovery of X-rays by Nobel laureate Wilhelm Roentgen, radiology was born, leading to the creation of large quantities of digital images as opposed to film-based media. While this rich supply of images provides immeasurable information that would otherwise not be possible to obtain, medical images pose great challenges: archiving them safe from corruption, loss and misuse; keeping them retrievable from databases of huge sizes with varying forms of metadata; and keeping them reusable when new tools for data mining and new media for data storage become available. This paper provides a summative account of the creation of medical imaging tomography, the development of image archiving systems and the innovation from the existing acquired image data pools. The focus of this paper is on content-based image retrieval (CBIR), in particular for 3D images, which is exemplified by our online e-learning system, MIRAGE, home to a repository of medical images covering a variety of domains and different dimensions. In terms of novelties, the facilities of CBIR for 3D images coupled with fully automatic image annotation have been developed and implemented in the system, resonating with future versatile, flexible and sustainable medical image databases that can reap new innovations.
The State of the Art of Medical Imaging Technology: from Creation to Archive and Back

PubMed Central

Gao, Xiaohong W; Qian, Yu; Hui, Rui

2011-01-01

Medical imaging has lent itself well to modern medicine and revolutionized the medical industry over the last 30 years. Stemming from the discovery of X-rays by Nobel laureate Wilhelm Roentgen, radiology was born, leading to the creation of large quantities of digital images as opposed to film-based media. While this rich supply of images provides immeasurable information that would otherwise not be possible to obtain, medical images pose great challenges: archiving them safe from corruption, loss and misuse; keeping them retrievable from databases of huge sizes with varying forms of metadata; and keeping them reusable when new tools for data mining and new media for data storage become available. This paper provides a summative account of the creation of medical imaging tomography, the development of image archiving systems and the innovation from the existing acquired image data pools. The focus of this paper is on content-based image retrieval (CBIR), in particular for 3D images, which is exemplified by our online e-learning system, MIRAGE, home to a repository of medical images covering a variety of domains and different dimensions. In terms of novelties, the facilities of CBIR for 3D images coupled with fully automatic image annotation have been developed and implemented in the system, resonating with future versatile, flexible and sustainable medical image databases that can reap new innovations. PMID:21915232

Observatory Bibliographies as Research Tools

NASA Astrophysics Data System (ADS)

Rots, Arnold H.; Winkelman, S. L.

2013-01-01

Traditionally, observatory bibliographies were maintained to provide insight into how successful an observatory is, as measured by its prominence in the (refereed) literature. When we set up the bibliographic database for the Chandra X-ray Observatory (http://cxc.harvard.edu/cgi-gen/cda/bibliography) as part of the Chandra Data Archive (http://cxc.harvard.edu/cda/), very early in the mission, our objective was to make it primarily a useful tool for our user community. To achieve this we are: (1) casting a very wide net in collecting Chandra-related publications; (2) including for each literature reference in the database a wealth of metadata that is useful for the users; and (3) providing specific links between the articles and the datasets in the archive that they use. As a result our users are able to browse the literature and the data archive simultaneously. As an added bonus, the rich metadata content and data links have also allowed us to assemble more meaningful statistics about the scientific efficacy of the observatory. In all this we collaborate closely with the Astrophysics Data System (ADS). Among the plans for future enhancement are the inclusion of press releases and the Chandra image gallery, linking with ADS semantic searching tools, full-text metadata mining, and linking with other observatories' bibliographies. This work is supported by NASA contract NAS8-03060 (CXC) and depends critically on the services provided by the ADS.
Glossary | STORET Legacy Data Center | US EPA

EPA Pesticide Factsheets

2014-06-06

The U.S. Environmental Protection Agency (EPA) maintains two data management systems containing water quality information for the nation's waters: the Legacy Data Center (LDC), and STORET. The LDC is a static, archived database and STORET is an operational system actively being populated with water quality data.
Gramene 2016: comparative plant genomics and pathway resources

USDA-ARS's Scientific Manuscript database

Gramene (http://www.gramene.org) is an online resource for comparative functional genomics in crops and model plant species. Its two main frameworks are genomes (collaboration with Ensembl Plants) and pathways (The Plant Reactome and archival BioCyc databases). Since our last NAR update, the data...
Unobtrusive Social Network Data From Email

DTIC Science & Technology

2008-12-01

...Outlook archived files and stores that data into an SQL database. Communication...Applications (VBA) program was installed on the personal computers (PC) of all participants, in the session window of their Microsoft Outlook. Details of
THE SAN PEDRO RIVER SPATIAL DATA ARCHIVE, A DATABASE BROWSER FOR COMMUNITY-BASED ENVIRONMENTAL PROTECTION

EPA Science Inventory

It is currently possible to measure landscape change over large areas and determine trends in ecological and hydrological condition using advanced space-based technologies accompanied by geospatial data. Specifically, this process is being tested in a community-based watershed in...
THE SAN PEDRO SPATIAL DATA ARCHIVE, A DATABASE BROWSER FOR COMMUNITY-BASED ENVIRONMENTAL PROTECTION

EPA Science Inventory

It is currently possible to measure landscape change over large areas and determine trends in ecological and hydrological condition using advanced space-based technologies accompanied by geospatial data. Specifically, this process is being tested in a community-based watershed in...
The Internet Compendium: Subject Guides to Humanities Resources.

ERIC Educational Resources Information Center

Rosenfeld, Louis; And Others

This guide describes and evaluates the Internet's humanities resources by subject. It offers information on a multitude of listservs; Usenet newsgroups; forums; electronic journals; topical mailing lists; text archives; Freenets; bulletin boards; FAQs; newsletters; real-time chats; databases; and library catalogs. Internet users can draw upon…
Berkeley Lab - Materials Sciences Division

Science.gov Websites

Postdoc Forum Research Highlights Awards Publications Database Events Calendar Newsletter Archive People ; Finance Templates Travel One-Stop Investigators Division Staff Facilities and Centers Staff Jobs People Division, please use the links here. An outline of the Division structure is available at the Organization
Organizations - I | STORET Legacy Data Center | US EPA

EPA Pesticide Factsheets

2007-05-16

The U.S. Environmental Protection Agency (EPA) maintains two data management systems containing water quality information for the nation's waters: the Legacy Data Center (LDC), and STORET. The LDC is a static, archived database and STORET is an operational system actively being populated with water quality data.
Glossary | STORET Legacy Data Center | US EPA

EPA Pesticide Factsheets

2011-02-14

The U.S. Environmental Protection Agency (EPA) maintains two data management systems containing water quality information for the nation's waters: the Legacy Data Center (LDC), and STORET. The LDC is a static, archived database and STORET is an operational system actively being populated with water quality data.
Contacts | STORET Legacy Data Center | US EPA

EPA Pesticide Factsheets

2007-05-16

The U.S. Environmental Protection Agency (EPA) maintains two data management systems containing water quality information for the nation's waters: the Legacy Data Center (LDC), and STORET. The LDC is a static, archived database and STORET is an operational system actively being populated with water quality data.
Archiving and Distributing Seismic Data at the Southern California Earthquake Data Center (SCEDC)

NASA Astrophysics Data System (ADS)

Appel, V. L.

2002-12-01

The Southern California Earthquake Data Center (SCEDC) archives and provides public access to earthquake parametric and waveform data gathered by the Southern California Seismic Network and, since January 1, 2001, the TriNet seismic network, southern California's earthquake monitoring network. The parametric data in the archive include earthquake locations, magnitudes, moment-tensor solutions and phase picks. The SCEDC waveform archive prior to TriNet consists primarily of short-period, 100-samples-per-second waveforms from the SCSN. The addition of the TriNet array added continuous recordings of 155 broadband stations (20 samples per second or less), and triggered seismograms from 200 accelerometers and 200 short-period instruments. Since the Data Center and TriNet use the same Oracle database system, new earthquake data are available to the seismological community in near real-time. Primary access to the database and waveforms is through the Seismogram Transfer Program (STP) interface. The interface enables users to search the database for earthquake information, phase picks, and continuous and triggered waveform data. Output is available in SAC, miniSEED, and other formats. Both the raw counts format (V0) and the gain-corrected format (V1) of COSMOS (Consortium of Organizations for Strong-Motion Observation Systems) are now supported by STP. EQQuest is an interface to prepackaged waveform data sets for select earthquakes in Southern California stored at the SCEDC. Waveform data for large-magnitude events have been prepared and new data sets will be available for download in near real-time following major events. The parametric data from 1981 to present have been loaded into the Oracle 9.2.0.1 database system and the waveforms for that time period have been converted to mSEED format and are accessible through the STP interface.
The DISC optical-disk system (the "jukebox") that currently serves as the mass-storage for the SCEDC is in the process of being replaced with a series of inexpensive high-capacity (1.6 Tbyte) magnetic-disk RAIDs. These systems are built with PC-technology components, using 16 120-Gbyte IDE disks, hot-swappable disk trays, two RAID controllers, dual redundant power supplies and a Linux operating system. The system is configured over a private gigabit network that connects to the two Data Center servers and spans between the Seismological Lab and the USGS. To ensure data integrity, each RAID disk system constantly checks itself against its twin and verifies file integrity using 128-bit MD5 file checksums that are stored separately from the system. The final level of data protection is a Sony AIT-3 tape backup of the files. The primary advantage of the magnetic-disk approach is faster data access because magnetic disk drives have almost no latency. This means that the SCEDC can provide better "on-demand" interactive delivery of the seismograms in the archive.
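The checksum verification described above can be sketched roughly as follows. This is a minimal illustration, not the SCEDC's actual tooling: the function names `md5_of_file` and `verify_against_manifest`, and the idea of a manifest mapping relative paths to stored MD5 digests, are assumptions introduced here for the example.

```python
import hashlib
from pathlib import Path


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_against_manifest(root, manifest):
    """Check files under `root` against a {relative_path: md5_hex} manifest.

    The manifest is assumed (hypothetically) to be kept apart from the
    disk system being checked. Returns the sorted list of relative paths
    that are missing or whose checksum differs from the stored value.
    """
    root = Path(root)
    bad = []
    for rel_path, expected in manifest.items():
        target = root / rel_path
        if not target.is_file() or md5_of_file(target) != expected:
            bad.append(rel_path)
    return sorted(bad)
```

Running the same check against each twin with one shared manifest would flag a file that has silently changed on either copy. MD5 is adequate for detecting accidental corruption, although it is not considered safe against deliberate tampering.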
Archaeological Feature Detection from Archive Aerial Photography with a Sfm-Mvs and Image Enhancement Pipeline

NASA Astrophysics Data System (ADS)

Peppa, M. V.; Mills, J. P.; Fieber, K. D.; Haynes, I.; Turner, S.; Turner, A.; Douglas, M.; Bryan, P. G.

2018-05-01

Understanding and protecting cultural heritage involves the detection and long-term documentation of archaeological remains alongside the spatio-temporal analysis of their landscape evolution. Archive aerial photography can illuminate traces of ancient features which typically appear with different brightness values from their surrounding environment, but are not always well defined. This research investigates the implementation of the Structure-from-Motion - Multi-View Stereo image matching approach with an image enhancement algorithm to derive three epochs of orthomosaics and digital surface models from visible and near infrared historic aerial photography. The enhancement algorithm uses decorrelation stretching to improve the contrast of the orthomosaics so that archaeological features are better detected. Results include 2D / 3D locations of detected archaeological traces stored in a geodatabase for further archaeological interpretation and correlation with benchmark observations. The study also discusses the merits and difficulties of the process involved. This research is based on a European-wide project, entitled "Cultural Heritage Through Time", and the case study research was carried out as a component of the project in the UK.
Interdisciplinary Collaboration amongst Colleagues and between Initiatives with the Magnetics Information Consortium (MagIC) Database

NASA Astrophysics Data System (ADS)

Minnett, R.; Koppers, A. A. P.; Jarboe, N.; Tauxe, L.; Constable, C.; Jonestrask, L.; Shaar, R.

2014-12-01

Earth science grand challenges often require interdisciplinary and geographically distributed scientific collaboration to make significant progress. However, this organic collaboration between researchers, educators, and students only flourishes with the reduction or elimination of technological barriers. The Magnetics Information Consortium (http://earthref.org/MagIC/) is a grass-roots cyberinfrastructure effort envisioned by the geo-, paleo-, and rock magnetic scientific community to archive their wealth of peer-reviewed raw data and interpretations from studies on natural and synthetic samples. MagIC is dedicated to facilitating scientific progress towards several highly multidisciplinary grand challenges and the MagIC Database team is currently beta testing a new MagIC Search Interface and API designed to be flexible enough for the incorporation of large heterogeneous datasets and for horizontal scalability to tens of millions of records and hundreds of requests per second. In an effort to reduce the barriers to effective collaboration, the search interface includes a simplified data model and upload procedure, support for online editing of datasets amongst team members, commenting by reviewers and colleagues, and automated contribution workflows and data retrieval through the API. This web application has been designed to generalize to other databases in MagIC's umbrella website (EarthRef.org) so the Geochemical Earth Reference Model (http://earthref.org/GERM/) portal, Seamount Biogeosciences Network (http://earthref.org/SBN/), EarthRef Digital Archive (http://earthref.org/ERDA/) and EarthRef Reference Database (http://earthref.org/ERR/) will benefit from its development.
FBIS: A regional DNA barcode archival & analysis system for Indian fishes

PubMed Central

Nagpure, Naresh Sahebrao; Rashid, Iliyas; Pathak, Ajey Kumar; Singh, Mahender; Singh, Shri Prakash; Sarkar, Uttam Kumar

2012-01-01

DNA barcoding is a new tool for taxon recognition and classification of biological organisms based on the sequence of a fragment of the mitochondrial gene cytochrome c oxidase I (COI). In view of the growing importance of fish DNA barcoding for species identification, molecular taxonomy and fish diversity conservation, we developed a Fish Barcode Information System (FBIS) for Indian fishes, which will serve as a regional DNA barcode archival and analysis system. The database presently contains 2334 sequence records of the COI gene for 472 aquatic species belonging to 39 orders and 136 families, collected from available published data sources. Additionally, it contains information on phenotype, distribution and IUCN Red List status of fishes. The web version of FBIS was designed using MySQL, Perl and PHP under the Linux operating platform to (a) store and manage the acquisition, (b) analyze and explore DNA barcode records, and (c) identify species and estimate genetic divergence. FBIS has also been integrated with appropriate tools for retrieving and viewing information about the database statistics and taxonomy. FBIS is expected to be useful as an information system in fish molecular taxonomy, phylogeny and genomics. Availability: The database is available for free at http://mail.nbfgr.res.in/fbis/. PMID: 22715304
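As a rough illustration of what estimating genetic divergence between two barcode records can involve, the sketch below computes the uncorrected p-distance, the simplest such measure. The abstract does not say which distance model FBIS uses, so this choice, the function name `p_distance`, and the handling of gaps and ambiguous bases are all assumptions made for the example.

```python
def p_distance(seq_a, seq_b):
    """Uncorrected p-distance between two aligned sequences.

    Sites where either sequence has a gap ('-') or an ambiguous base ('N')
    are skipped; the result is the fraction of the remaining sites at
    which the two sequences differ.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differences = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in "-N" or b in "-N":
            continue
        compared += 1
        if a != b:
            differences += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differences / compared
```

For two aligned eight-base fragments that differ at one site, `p_distance("ACGTACGT", "ACGTACGA")` gives 0.125. Barcoding studies often prefer a corrected model such as Kimura two-parameter, which this sketch does not implement.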
The BIG Data Center: from deposition to integration to translation.

PubMed

2017-01-04

Biological data are generated at unprecedentedly exponential rates, posing considerable challenges in big data deposition, integration and translation. The BIG Data Center, established at Beijing Institute of Genomics (BIG), Chinese Academy of Sciences, provides a suite of database resources, including (i) Genome Sequence Archive, a data repository specialized for archiving raw sequence reads, (ii) Gene Expression Nebulas, a data portal of gene expression profiles based entirely on RNA-Seq data, (iii) Genome Variation Map, a comprehensive collection of genome variations for featured species, (iv) Genome Warehouse, a centralized resource housing genome-scale data with particular focus on economically important animals and plants, (v) Methylation Bank, an integrated database of whole-genome single-base resolution methylomes and (vi) Science Wikis, a central access point for biological wikis developed for community annotations. The BIG Data Center is dedicated to constructing and maintaining biological databases through big data integration and value-added curation, conducting basic research to translate big data into big knowledge and providing freely open access to a variety of data resources in support of worldwide research activities in both academia and industry. All of these resources are publicly available and can be found at http://bigd.big.ac.cn. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.
Archive for the history of psychology in Spain: The Archivo Histórico, Bibliográfico y Documental de Psicología of the Universidad Autónoma de Madrid.

PubMed

Quintana, José; Sáiz, Milagros; Balltondre, Mónica; Sáiz, Dolors

2012-11-01

In this article, we describe the content, sources, and history of the Archivo Histórico de la Facultad de Psicología (the Historical Archive for the History of Psychology) at the Universidad Autónoma de Madrid (UAM, Spain). This archive is the result of work carried out by some professors of the Faculty of Psychology at UAM to preserve and expand the sources for a history of psychology in Spain. Collections from the 19th to the 20th century were recovered thanks to the UAM effort and some other contributions. Most of the sources for a history of psychology in the Spanish context were unknown and nearly lost before the Faculty of Psychology's work. Among other projects, the UAM archive is acquiring classical texts of psychology by buying facsimiles from different publishing houses and, more importantly, it guarantees access to the sources for research purposes. (PsycINFO Database Record (c) 2012 APA, all rights reserved).
Provenance Store Evaluation

DOE Office of Scientific and Technical Information (OSTI.GOV)

Paulson, Patrick R.; Gibson, Tara D.; Schuchardt, Karen L.

2008-03-01

Requirements for the provenance store and access API are developed. Existing RDF stores and APIs are evaluated against the requirements and performance benchmarks. The team’s conclusion is to use MySQL as a database backend, with a possible move to Oracle in the near-term future. Both Jena and Sesame’s APIs will be supported, but new code will use the Jena API.
MAKER-P: a tool-kit for the creation, management, and quality control of plant genome annotations

USDA-ARS's Scientific Manuscript database

We have optimized and extended the widely used annotation-engine MAKER for use on plant genomes. We have benchmarked the resulting software, MAKER-P, using the A. thaliana genome and the TAIR10 gene models. Here we demonstrate the ability of the MAKER-P toolkit to generate de novo repeat databases, ...
Sport and Social Inclusion: Evidence from the Performance of Public Leisure Facilities

ERIC Educational Resources Information Center

Liu, Yi-De

2009-01-01

In the UK, sport is increasingly recognized as a means for promoting social inclusion. However, evaluation, to date, is limited with regard to the achievement of social inclusion through sport. Based on the database of Sport England's National Benchmarking Service, this paper aims to investigate the extent to which public leisure facilities were…

Conflation and integration of archived geologic maps and associated uncertainties

USGS Publications Warehouse

Shoberg, Thomas G.

2016-01-01

Old, archived geologic maps are often available with little or no associated metadata. This creates special problems in terms of extracting their data to use with a modern database. This research focuses on some problems and uncertainties associated with conflating older geologic maps in regions where modern geologic maps are, as yet, non-existent as well as vertically integrating the conflated maps with layers of modern GIS data (in this case, The National Map of the U.S. Geological Survey). Ste. Genevieve County, Missouri was chosen as the test area. It is covered by six archived geologic maps constructed in the years between 1928 and 1994. Conflating these maps results in a map that is internally consistent with these six maps, is digitally integrated with hydrography, elevation and orthoimagery data, and has a 95% confidence interval useful for further data set integration.
CARMENES. Mining public archives for stellar parameters and spectra of M dwarfs with master thesis students

NASA Astrophysics Data System (ADS)

Caballero, J. A.; Montes, D.; Alonso-Floriano, F. J.; Cortés-Contreras, M.; González-Álvarez, E.; Hidalgo, D.; Holgado, G.; Martínez-Rodríguez, H.; Sanz-Forcada, J.; López-Santiago, J.

2015-05-01

We are compiling the most comprehensive database of M dwarfs ever built, CARMENCITA, the CARMENES Cool dwarf Information and daTa Archive, which will be the CARMENES 'input catalogue'. In addition to the science preparation with low- and high-resolution spectrographs and lucky imagers, we compile a huge pile of public data on over 2200 M dwarfs, and analyse them, mostly using virtual-observatory tools. Here we describe four specific actions carried out by master students. They mine public archives for additional high-resolution spectroscopy (UVES, FEROS and HARPS), multi-band photometry (FUV-NUV-u-B-g-V-r-R-i-J-H-Ks-W1-W2-W3-W4), X-ray data (ROSAT, XMM-Newton and Chandra), and periods, rotational velocities and Hα pseudo-equivalent widths. As described, there are many interdependences between all these data.
The MATISSE analysis of large spectral datasets from the ESO Archive

NASA Astrophysics Data System (ADS)

Worley, C.; de Laverny, P.; Recio-Blanco, A.; Hill, V.; Vernisse, Y.; Ordenovic, C.; Bijaoui, A.

2010-12-01

The automated stellar classification algorithm, MATISSE, has been developed at the Observatoire de la Côte d'Azur (OCA) in order to determine stellar temperatures, gravities and chemical abundances for large datasets of stellar spectra. The Gaia Data Processing and Analysis Consortium (DPAC) has selected MATISSE as one of the key programmes to be used in the analysis of the Gaia Radial Velocity Spectrometer (RVS) spectra. MATISSE is currently being used to analyse large datasets of spectra from the ESO archive with the primary goal of producing advanced data products to be made available in the ESO database via the Virtual Observatory. This is also an invaluable opportunity to identify and address issues that can be encountered in the analysis of large samples of real spectra prior to the launch of Gaia in 2012. The analysis of the archived spectra of the FEROS spectrograph is currently underway and preliminary results are presented.
Epworth HealthCare cardiac surgery audit report 2011.

PubMed

Chorley, T; Baker, L

2012-10-01

2011 is the first year Epworth has contributed to the Australian and New Zealand Society of Cardiac and Thoracic Surgeons cardiac surgery database. There are now 30-day follow-up data for all cardiac surgical patients, as well as benchmarking of our results against 19 public hospitals and 6 private hospitals contributing data to the Australian and New Zealand Society of Cardiac and Thoracic Surgeons. This is an extension of the John Fuller Melbourne University database that has compiled cardiac surgery data for the last 30 years. © 2012 The Authors; Internal Medicine Journal © 2012 Royal Australasian College of Physicians.
The GEISA 2009 Spectroscopic Database System and its CNES/CNRS Ether Products and Services Center Interactive Distribution

NASA Astrophysics Data System (ADS)

Jacquinet-Husson, Nicole; Crépeau, Laurent; Capelle, Virginie; Scott, Noëlle; Armante, Raymond; Chédin, Alain; Boonne, Cathy; Poulet-Crovisier, Nathalie

2010-05-01

The GEISA (1) (Gestion et Etude des Informations Spectroscopiques Atmosphériques: Management and Study of Atmospheric Spectroscopic Information) computer-accessible database, initiated in 1976, is developed and maintained at LMD (Laboratoire de Météorologie Dynamique, France) as a system comprising three independent sub-databases devoted respectively to line transition parameters, infrared and ultraviolet/visible absorption cross-sections, and microphysical and optical properties of atmospheric aerosols. The updated 2009 edition (GEISA-09) archives, in its line transition parameters sub-section, 50 molecules, corresponding to 111 isotopes, for a total of 3,807,997 entries, in the spectral range from 10^-6 to 35,877.031 cm^-1. A detailed description of the whole database contents will be documented. GEISA and GEISA/IASI are implemented on the CNES/CNRS Ether Products and Services Centre web site (http://ether.ipsl.jussieu.fr), where all archived spectroscopic data can be handled through general and user-friendly associated management software facilities. These facilities will be described and widely illustrated as well. Interactive demonstrations will be given if technically feasible at the time of the Poster Display Session. More than 350 researchers are registered for online use of GEISA on Ether. Currently, GEISA is involved in activities (2) related to the remote sensing of the terrestrial atmosphere thanks to the sounding performances of the new generation of hyperspectral Earth atmospheric sounders, like AIRS (Atmospheric Infrared Sounder - http://www-airs.jpl.nasa.gov/) in the USA and IASI (Infrared Atmospheric Sounding Interferometer - http://earth-sciences.cnes.fr/IASI/) in Europe, using the 4A radiative transfer model (3) (4A/LMD http://ara.lmd.polytechnique.fr; 4A/OP co-developed by LMD and NOVELTIS - http://www.noveltis.fr/) with the support of CNES (2006). Refs: (1) Jacquinet-Husson N., N.A. Scott, A. Chédin, L. Crépeau, R. Armante, V. Capelle, J. Orphal, A. Coustenis, C. Boonne, N. Poulet-Crovisier, et al.: The GEISA spectroscopic database: Current and future archive for Earth and planetary atmosphere studies. JQSRT 109 (2008) 1043-1059. (2) Jacquinet-Husson N., N.A. Scott, A. Chédin, K. Garceran, R. Armante, et al.: The 2003 edition of the GEISA/IASI spectroscopic database. JQSRT 95 (2005) 429-467. (3) Scott, N.A. and A. Chedin: A fast line-by-line method for atmospheric absorption computations: The Automatized Atmospheric Absorption Atlas. J. Appl. Meteor. 20 (1981) 556-564.
Aided generation of search interfaces to astronomical archives

NASA Astrophysics Data System (ADS)

Zorba, Sonia; Bignamini, Andrea; Cepparo, Francesco; Knapic, Cristina; Molinaro, Marco; Smareglia, Riccardo

2016-07-01

Astrophysical data provider organizations that host web-based interfaces to provide access to data resources have to cope with possible changes in data management that imply partial rewrites of web applications. To avoid doing this manually, it was decided to develop a dynamically configurable Java EE web application that can set itself up by reading the needed information from configuration files. Specification of what information the astronomical archive database has to expose is managed using the TAP_SCHEMA schema from the IVOA TAP recommendation, which can be edited using a graphical interface. When the configuration steps are done, the tool builds a WAR file to allow easy deployment of the application.
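The general idea of deriving a search interface from column metadata can be sketched as below. This is only an illustration in Python with SQLite, not the Java EE tool the abstract describes: the table name `tap_schema_columns` stands in for `TAP_SCHEMA.columns`, only a few of that table's columns are modelled, and the function names and the widget rule are hypothetical.

```python
import sqlite3


def demo_connection():
    """Build an in-memory stand-in for a TAP_SCHEMA.columns table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE tap_schema_columns ("
        "table_name TEXT, column_name TEXT, datatype TEXT, "
        "description TEXT, principal INTEGER)"
    )
    conn.executemany(
        "INSERT INTO tap_schema_columns VALUES (?, ?, ?, ?, ?)",
        [
            ("obs", "ra", "double", "Right ascension (deg)", 1),
            ("obs", "target", "char", "Target name", 1),
            ("obs", "internal_id", "long", None, 0),
        ],
    )
    return conn


def build_search_fields(conn, table_name):
    """Return one form-field description per principal column of a table."""
    rows = conn.execute(
        "SELECT column_name, datatype, description "
        "FROM tap_schema_columns "
        "WHERE table_name = ? AND principal = 1 "
        "ORDER BY column_name",
        (table_name,),
    ).fetchall()
    # Assumed rule: numeric columns get a min/max range input, others a text box.
    numeric = {"int", "long", "float", "double"}
    return [
        {
            "name": name,
            "label": description or name,
            "widget": "range" if datatype in numeric else "text",
        }
        for name, datatype, description in rows
    ]
```

Because the form is derived from metadata rows rather than hard-coded, adding or hiding a searchable column becomes a metadata edit instead of a change to application code, which appears to be the motivation given in the abstract.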
Internet Services for Professional Astronomy

NASA Astrophysics Data System (ADS)

Andernach, H.

A (subjective) overview of Internet resources relevant to professional astronomers is given. Special emphasis is put on databases of astronomical objects and servers providing general information, e.g. on astronomical catalogues, finding charts from sky surveys, bibliographies, directories, browsers through multi-wavelength observational archives, etc. Archives of specific observational data will be discussed in more detail in other chapters of this book, dealing with the corresponding part of the electromagnetic spectrum. About 200 different links are mentioned, and every attempt was made to make this report as up-to-date as possible. As the field is rapidly growing with improved network technology, it will be just a snapshot of the situation in mid-1998.
NCBI GEO: archive for functional genomics data sets--update.

PubMed

Barrett, Tanya; Wilhite, Stephen E; Ledoux, Pierre; Evangelista, Carlos; Kim, Irene F; Tomashevsky, Maxim; Marshall, Kimberly A; Phillippy, Katherine H; Sherman, Patti M; Holko, Michelle; Yefanov, Andrey; Lee, Hyeseung; Zhang, Naigong; Robertson, Cynthia L; Serova, Nadezhda; Davis, Sean; Soboleva, Alexandra

2013-01-01

The Gene Expression Omnibus (GEO, http://www.ncbi.nlm.nih.gov/geo/) is an international public repository for high-throughput microarray and next-generation sequence functional genomic data sets submitted by the research community. The resource supports archiving of raw data, processed data and metadata which are indexed, cross-linked and searchable. All data are freely available for download in a variety of formats. GEO also provides several web-based tools and strategies to assist users to query, analyse and visualize data. This article reports current status and recent database developments, including the release of GEO2R, an R-based web application that helps users analyse GEO data.
Information Retrieval Using ADABAS-NATURAL (with Applications for Television and Radio).

ERIC Educational Resources Information Center

Silbergeld, I.; Kutok, P.

1984-01-01

Describes use of the software ADABAS (general purpose database management system) and NATURAL (interactive programming language) in development and implementation of an information retrieval system for the National Television and Radio Network of Israel. General design considerations, files contained in each archive, search strategies, and keywords…
The Internet Compendium: Subject Guides to Health and Science Resources.

ERIC Educational Resources Information Center

Rosenfeld, Louis; And Others

This guide describes and evaluates the Internet's health and science resources by subject. It offers information on a multitude of listservs; Usenet newsgroups; forums; electronic journals; topical mailing lists; text archives; Freenets; bulletin boards; FAQs; newsletters; real-time chats; databases; and library catalogs. From alternative medicine…
The ICPSR and Social Science Research

ERIC Educational Resources Information Center

Johnson, Wendell G.

2008-01-01

The Inter-university Consortium for Political and Social Research (ICPSR), a unit within the Institute for Social Research at the University of Michigan, is the world's largest social science data archive. The data sets in the ICPSR database give the social sciences librarian/subject specialist an opportunity of providing value-added bibliographic…
77 FR 1066 - Information Collection Being Reviewed by the Federal Communications Commission

Federal Register 2010, 2011, 2012, 2013, 2014

2012-01-09

... provide a pending File Number for an Antenna Structure Registration (ASR). Previously ULS would only...: Application for Antenna Structure Registration. Form Number: FCC Form 854. Type of Review: Revision of a... will be archived after being keyed or scanned into the Antenna Structure Registration (ASR) database...
Nucleic Acid Database (NDB)

Science.gov Websites

the NDB archive or in the Non-Redundant list Advanced Search Search for structures based on structural features, chemical features, binding modes, citation and experimental information Featured Tools RNA 3D Motif Atlas, a representative collection of RNA 3D internal and hairpin loop motifs Non-redundant Lists
National Operational Hydrologic Remote Sensing Center - The ultimate source

Science.gov Websites

Analysis Satellite Obs Forecasts Data Archive SHEF Products Observations near City, ST Go Science Database Airborne Snow Surveys Satellite Snow Cover Mapping Snow Modeling and Data Assimilation Analyses polar-orbiting and geostationary satellite imagery. Maps are provided for the U.S. and the northern
Legacy STORET Level 5 | STORET Legacy Data Center | US ...

EPA Pesticide Factsheets

2007-05-16

The U.S. Environmental Protection Agency (EPA) maintains two data management systems containing water quality information for the nation's waters: the Legacy Data Center (LDC), and STORET. The LDC is a static, archived database and STORET is an operational system actively being populated with water quality data.
Lycoming County Women's History Curriculum Guide

ERIC Educational Resources Information Center

Rogers, Amy

2011-01-01

This project plays upon technology and an innovative way to research history -- the use of online primary source data that highlight the history of women in volunteer and reform organizations, education, the arts, the workplace, and private lives. Using an archival database from the Lycoming County Women's History collection, this curriculum…
Tank 241-AY-102 Leak Assessment Supporting Documentation: Miscellaneous Reports, Letters, Memoranda, And Data

DOE Office of Scientific and Technical Information (OSTI.GOV)

Engeman, J. K.; Girardot, C. L.; Harlow, D. G.

2012-12-20

This report contains reference materials cited in RPP-ASMT-53793, Tank 241-AY-102 Leak Assessment Report, that were obtained from the National Archives Federal Records Repository in Seattle, Washington, or from other sources including the Hanford Site's Integrated Data Management System database (IDMS).
In-database processing of a large collection of remote sensing data: applications and implementation

NASA Astrophysics Data System (ADS)

Kikhtenko, Vladimir; Mamash, Elena; Chubarov, Dmitri; Voronina, Polina

2016-04-01

Large archives of remote sensing data are now available to scientists, yet the need to work with individual satellite scenes or product files constrains studies that span a wide temporal range or spatial extent. The resources (storage capacity, computing power and network bandwidth) required for such studies are often beyond the capabilities of individual geoscientists. This problem has been tackled before in remote sensing research and inspired several information systems. Some of them, such as NASA Giovanni [1] and Google Earth Engine, have already proved their utility for science. Analysis tasks involving large volumes of numerical data are not unique to Earth Sciences. Recent advances in data science are enabled by the development of in-database processing engines that bring processing closer to storage, use declarative query languages to facilitate parallel scalability and provide high-level abstraction of the whole dataset. We build on the idea of bridging the gap between file archives containing remote sensing data and databases by integrating files into a relational database as foreign data sources and performing analytical processing inside the database engine. A higher-level query language can thereby efficiently address problems of arbitrary size: from accessing the data associated with a specific pixel or a grid cell to complex aggregation over spatial or temporal extents over a large number of individual data files. This approach was implemented using PostgreSQL for a Siberian regional archive of satellite data products holding hundreds of terabytes of measurements from multiple sensors and missions taken over a decade-long span. While preserving the original storage layout and therefore compatibility with existing applications, the in-database processing engine provides a toolkit for provisioning remote sensing data in scientific workflows and applications.
The use of SQL - a widely used higher-level declarative query language - simplifies interoperability between desktop GIS, web applications, geographic web services and interactive scientific applications (MATLAB, IPython). The system also automatically ingests direct readout data from meteorological and research satellites in near-real time, with distributed acquisition workflows managed by the Taverna workflow engine [2]. The system has demonstrated its utility in performing non-trivial analytic processing such as the computation of the Robust Satellite Technique (RST) indices [3]. It has been useful in different tasks such as studying urban heat islands, analyzing patterns in the distribution of wildfire occurrences, and detecting phenomena related to seismic and earthquake activity. Initial experience has highlighted several limitations of the proposed approach, yet it has demonstrated an ability to facilitate the use of large archives of remote sensing data by geoscientists. 1. J.G. Acker, G. Leptoukh, Online analysis enhances use of NASA Earth science data. EOS Trans. AGU, 2007, 88(2), P. 14-17. 2. D. Hull, K. Wolstencroft, R. Stevens, C. Goble, M.R. Pocock, P. Li and T. Oinn, Taverna: a tool for building and running workflows of services. Nucleic Acids Research. 2006. V. 34. P. W729-W732. 3. V. Tramutoli, G. Di Bello, N. Pergola, S. Piscitelli, Robust satellite techniques for remote sensing of seismically active areas // Annals of Geophysics. 2001. no. 44(2). P. 295-312.
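The kind of declarative aggregation over a spatial and temporal extent that the abstract describes might look like the sketch below. It is an illustration only: SQLite stands in for PostgreSQL, the measurements sit in an ordinary table rather than in foreign data sources backed by product files, and the `measurements` schema and the function names are hypothetical.

```python
import sqlite3


def demo_connection():
    """Build a tiny in-memory table of gridded measurements."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE measurements ("
        "acquired TEXT, x INTEGER, y INTEGER, value REAL)"
    )
    conn.executemany(
        "INSERT INTO measurements VALUES (?, ?, ?, ?)",
        [
            ("2010-06-01", 0, 0, 10.0),
            ("2010-07-01", 1, 0, 20.0),
            ("2011-06-01", 0, 0, 30.0),
            ("2011-06-01", 5, 5, 99.0),  # outside the queried extent
        ],
    )
    return conn


def mean_by_year(conn, x_min, x_max, y_min, y_max):
    """Average the measured value per year over a rectangle of grid cells."""
    return conn.execute(
        "SELECT substr(acquired, 1, 4) AS year, AVG(value) "
        "FROM measurements "
        "WHERE x BETWEEN ? AND ? AND y BETWEEN ? AND ? "
        "GROUP BY year ORDER BY year",
        (x_min, x_max, y_min, y_max),
    ).fetchall()
```

The caller states only the extent and the aggregate. Which scenes or files hold the matching cells is left to the database engine, which is the abstraction the in-database approach is meant to provide.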
JAMSTEC DARWIN Database Assimilates GANSEKI and COEDO

NASA Astrophysics Data System (ADS)

Tomiyama, T.; Toyoda, Y.; Horikawa, H.; Sasaki, T.; Fukuda, K.; Hase, H.; Saito, H.

2017-12-01

Introduction: The Japan Agency for Marine-Earth Science and Technology (JAMSTEC) archives data and samples obtained by JAMSTEC research vessels and submersibles. As a common property of human society, the JAMSTEC archive is open to public users for scientific and educational purposes [1]. To publicize its data and samples online, JAMSTEC operates the NUUNKUI data sites [2], a group of several databases for various data and sample types. For years, data and metadata of JAMSTEC rock samples, sediment core samples and cruise/dive observations were publicized through databases named GANSEKI, COEDO, and DARWIN, respectively. However, because they had different user interfaces and data structures, these services were somewhat confusing for unfamiliar users. The cost of maintaining multiple hardware and software systems also made it difficult to sustain the services and improve them continuously. Database Integration: In 2017, GANSEKI, COEDO and DARWIN were integrated into DARWIN+ [3]. The update also included implementation of a map-search function as a substitute for the closed portal site. Major functions of the previous systems were incorporated into the new system; users can perform complex searches by thumbnail browsing, map area, keyword filtering, and metadata constraints. As for data handling, the new system is more flexible, allowing the entry of a variety of additional data types. Data Management: Since the major DARWIN update, the JAMSTEC data and sample team has been dealing with minor issues in individual sample data and metadata, which sometimes need manual modification to be transferred to the new system. Some new data sets, such as onboard sample photos and surface close-up photos of rock samples, are becoming available online. Geochemical data of sediment core samples are expected to be added in the near future. Reference: [1] http://www.jamstec.go.jp/e/database/data_policy.html [2] http://www.godac.jamstec.go.jp/jmedia/portal/e/ [3] http://www.godac.jamstec.go.jp/darwin/e/
An environmental database for Venice and tidal zones

NASA Astrophysics Data System (ADS)

Macaluso, L.; Fant, S.; Marani, A.; Scalvini, G.; Zane, O.

2003-04-01

The natural environment is a complex, highly variable system that cannot be physically reproduced, either in the laboratory or in a confined territory. Environmental experimental studies are thus necessarily based on field measurements distributed in time and space. Only extensive data collections can provide the representative samples of system behavior that are essential for scientific advancement. Assimilating large data collections into accessible archives requires electronic databases. In the case of tidal environments in general, and of the Venice lagoon in particular, it is useful to establish a database, freely accessible to the scientific community, documenting the dynamics of such systems and their response to anthropic pressures and climatic variability. At the Istituto Veneto di Scienze, Lettere ed Arti in Venice (Italy), two internet environmental databases have been developed: one collects detailed information on the Venice lagoon; the other coordinates the research consortium of the "TIDE" EU RTD project, which covers three tidal areas: Venice Lagoon (Italy), Morecambe Bay (England), and Forth Estuary (Scotland). The archives may be accessed through the URL: www.istitutoveneto.it. The first database is freely available to anyone interested. It is continuously updated and is structured to promote documentation concerning the Venetian environment and to disseminate this information for educational purposes (see "Dissemination" section). The second is supplied by scientists and engineers working on this tidal system for various purposes (scientific, management, conservation, etc.); it is intended for interested researchers and grows with their contributions. Both aim to promote scientific communication, to contribute to a distributed information system collecting homogeneous themes, and to initiate interconnection among databases covering different kinds of environments.
