Sample records for fastq file format

  1. KungFQ: a simple and powerful approach to compress fastq files.

    PubMed

    Grassi, Elena; Di Gregorio, Federico; Molineris, Ivan

    2012-01-01

    Storing data derived from deep sequencing experiments has become pivotal, and standard compression algorithms do not exploit the structure of these data satisfactorily. A number of reference-based compression algorithms have been developed, but they are less adequate when approaching new species without fully sequenced genomes or nongenomic data. We developed a tool that takes advantage of fastq characteristics and encodes them in a binary format optimized to be further compressed with standard tools (such as gzip or lzma). The algorithm is straightforward and does not need any external reference file; it scans the fastq only once and has a constant memory requirement. Moreover, we added the possibility to perform lossy compression, losing some of the original information (IDs and/or qualities) but resulting in smaller files; it is also possible to define a quality cutoff under which corresponding base calls are converted to N. We achieve compression ratios of 2.82 to 7.77 on various fastq files without losing information and 5.37 to 8.77 when losing IDs, which are often not used in common analysis pipelines. In this paper, we compare the algorithm performance with known tools, usually obtaining higher compression levels.
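
    The quality cutoff described in this record can be illustrated with a minimal Python sketch. It is not KungFQ's code: the function name mask_low_quality, the Phred+33 offset, and the example cutoff of 20 are all assumptions made for illustration.

```python
def mask_low_quality(seq: str, qual: str, cutoff: int, offset: int = 33) -> str:
    """Replace base calls whose Phred score is below `cutoff` with 'N'.

    Assumes Phred+33 encoded qualities (an assumption, not stated in the abstract).
    """
    if len(seq) != len(qual):
        raise ValueError("sequence and quality strings differ in length")
    return "".join(
        base if ord(q) - offset >= cutoff else "N"
        for base, q in zip(seq, qual)
    )


# 'I' encodes Q40, '5' encodes Q20, '#' encodes Q2 and '!' encodes Q0.
masked = mask_low_quality("ACGTAC", "II#5I!", cutoff=20)
print(masked)
```

    Bases whose score equals the cutoff are kept here; whether the boundary value is kept or masked is a design choice the abstract does not specify.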

  2. biobambam: tools for read pair collation based algorithms on BAM files

    PubMed Central

    2014-01-01

    Background Sequence alignment data is often ordered by coordinate (id of the reference sequence plus position on the sequence where the fragment was mapped) when stored in BAM files, as this simplifies the extraction of variants between the mapped data and the reference or of variants within the mapped data. In this order paired reads are usually separated in the file, which complicates some other applications like duplicate marking or conversion to the FastQ format which require to access the full information of the pairs. Results In this paper we introduce biobambam, a set of tools based on the efficient collation of alignments in BAM files by read name. The employed collation algorithm avoids time and space consuming sorting of alignments by read name where this is possible without using more than a specified amount of main memory. Using this algorithm tasks like duplicate marking in BAM files and conversion of BAM files to the FastQ format can be performed very efficiently with limited resources. We also make the collation algorithm available in the form of an API for other projects. This API is part of the libmaus package. Conclusions In comparison with previous approaches to problems involving the collation of alignments by read name like the BAM to FastQ or duplication marking utilities our approach can often perform an equivalent task more efficiently in terms of the required main memory and run-time. Our BAM to FastQ conversion is faster than all widely known alternatives including Picard and bamUtil. Our duplicate marking is about as fast as the closest competitor bamUtil for small data sets and faster than all known alternatives on large and complex data sets.
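
    The idea of collating alignments by read name without a full sort can be sketched in Python as follows. This is only an illustration under simplifying assumptions: records are hypothetical (name, payload) tuples rather than BAM alignments, and the pending table grows without bound, whereas the abstract describes an algorithm that stays within a specified amount of main memory.

```python
from typing import Dict, Iterable, Iterator, Tuple

# Hypothetical stand-in for a BAM alignment: (read name, payload).
Record = Tuple[str, str]


def collate_pairs(records: Iterable[Record]) -> Iterator[Tuple[Record, Record]]:
    """Yield both mates of a pair as soon as the second one is seen."""
    pending: Dict[str, Record] = {}
    for rec in records:
        name = rec[0]
        mate = pending.pop(name, None)
        if mate is None:
            pending[name] = rec   # first mate: park it until its partner arrives
        else:
            yield mate, rec       # second mate: emit the completed pair


# Coordinate-ordered input in which the mates of r1 and r2 are interleaved.
stream = [("r1", "chr1:100"), ("r2", "chr1:150"), ("r1", "chr1:400"), ("r2", "chr1:420")]
pairs = list(collate_pairs(stream))
```

    A single pass suffices because each name is looked up in a hash table; unpaired reads simply remain in the table at the end.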

  3. TagDigger: user-friendly extraction of read counts from GBS and RAD-seq data.

    PubMed

    Clark, Lindsay V; Sacks, Erik J

    2016-01-01

    In genotyping-by-sequencing (GBS) and restriction site-associated DNA sequencing (RAD-seq), read depth is important for assessing the quality of genotype calls and estimating allele dosage in polyploids. However, existing pipelines for GBS and RAD-seq do not provide read counts in formats that are both accurate and easy to access. Additionally, although existing pipelines allow previously-mined SNPs to be genotyped on new samples, they do not allow the user to manually specify a subset of loci to examine. Pipelines that do not use a reference genome assign arbitrary names to SNPs, making meta-analysis across projects difficult. We created the software TagDigger, which includes three programs for analyzing GBS and RAD-seq data. The first script, tagdigger_interactive.py, rapidly extracts read counts and genotypes from FASTQ files using user-supplied sets of barcodes and tags. Input and output is in CSV format so that it can be opened by spreadsheet software. Tag sequences can also be imported from the Stacks, TASSEL-GBSv2, TASSEL-UNEAK, or pyRAD pipelines, and a separate file can be imported listing the names of markers to retain. A second script, tag_manager.py, consolidates marker names and sequences across multiple projects. A third script, barcode_splitter.py, assists with preparing FASTQ data for deposit in a public archive by splitting FASTQ files by barcode and generating MD5 checksums for the resulting files. TagDigger is open-source and freely available software written in Python 3. It uses a scalable, rapid search algorithm that can process over 100 million FASTQ reads per hour. TagDigger will run on a laptop with any operating system, does not consume hard drive space with intermediate files, and does not require programming skill to use.
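
    Counting reads per sample and marker from user-supplied barcodes and tags can be sketched as below. This is a naive prefix scan written for illustration, not TagDigger's search algorithm; the function name count_tags and the toy barcode and tag sequences are assumptions.

```python
from collections import Counter
from typing import Dict, Iterable


def count_tags(reads: Iterable[str], barcodes: Dict[str, str], tags: Dict[str, str]) -> Counter:
    """Count reads per (sample, marker).

    `barcodes` maps barcode sequence -> sample name and `tags` maps
    tag sequence -> marker name. A read is counted when it begins with
    a known barcode immediately followed by a known tag.
    """
    counts: Counter = Counter()
    for read in reads:
        for barcode, sample in barcodes.items():
            if read.startswith(barcode):
                rest = read[len(barcode):]
                for tag_seq, marker in tags.items():
                    if rest.startswith(tag_seq):
                        counts[(sample, marker)] += 1
                        break
                break
    return counts


reads = ["AACCTGCAGGT", "AACCTGCAGGT", "GGTTTGCAGCA", "TTTTTGCAGGT"]
barcodes = {"AACC": "sample1", "GGTT": "sample2"}
tags = {"TGCAGG": "marker1_A", "TGCAGC": "marker1_B"}
counts = count_tags(reads, barcodes, tags)
```

    The resulting counter could be written out with the csv module to match the CSV-based input and output the abstract describes.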

  4. GTZ: a fast compression and cloud transmission tool optimized for FASTQ files.

    PubMed

    Xing, Yuting; Li, Gen; Wang, Zhenguo; Feng, Bolun; Song, Zhuo; Wu, Chengkun

    2017-12-28

    The dramatic development of DNA sequencing technology is generating truly big data, demanding more storage and bandwidth. To speed up data sharing and bring data to computing resources faster and more cheaply, it is necessary to develop a compression tool that can support efficient compression and transmission of sequencing data onto cloud storage. This paper presents GTZ, a compression and transmission tool optimized for FASTQ files. As a reference-free lossless FASTQ compressor, GTZ treats different lines of FASTQ separately, utilizes adaptive context modelling to estimate their characteristic probabilities, and compresses data blocks with arithmetic coding. GTZ can also be used to compress multiple files or directories at once. Furthermore, as a tool to be used in the cloud computing era, it is capable of either saving compressed data locally or transmitting data directly into the cloud. We evaluated the performance of GTZ on some diverse FASTQ benchmarks. Results show that in most cases, it outperforms many other tools in terms of compression ratio, speed and stability. GTZ is a tool that enables efficient lossless FASTQ data compression and simultaneous data transmission onto the cloud. It emerges as a useful tool for NGS data storage and transmission in the cloud environment. GTZ is freely available online at: https://github.com/Genetalks/gtz .
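
    Treating the different lines of a FASTQ record separately amounts to splitting the file into homogeneous streams before modelling each one. The Python sketch below shows only that splitting step, assuming strict four-line records without wrapped sequences; the function name split_streams is hypothetical, and the context modelling and arithmetic coding stages mentioned in the abstract are not reproduced.

```python
from typing import Iterable, List, Tuple


def split_streams(lines: Iterable[str]) -> Tuple[List[str], List[str], List[str]]:
    """Split four-line FASTQ records into identifier, sequence and quality streams."""
    rows = [line.rstrip("\n") for line in lines]
    if len(rows) % 4 != 0:
        raise ValueError("FASTQ input must contain four lines per record")
    ids, seqs, quals = [], [], []
    for i in range(0, len(rows), 4):
        header, seq, plus, qual = rows[i:i + 4]
        if not header.startswith("@") or not plus.startswith("+"):
            raise ValueError(f"malformed record starting at line {i + 1}")
        ids.append(header[1:])   # drop the leading '@'
        seqs.append(seq)
        quals.append(qual)
    return ids, seqs, quals


fastq = "@r1\nACGT\n+\nIIII\n@r2\nTTGA\n+\n!!5I\n"
ids, seqs, quals = split_streams(fastq.splitlines())
```

    Each stream has its own alphabet and statistics, which is why a compressor can benefit from modelling them independently.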

  5. Compression of next-generation sequencing quality scores using memetic algorithm

    PubMed Central

    2014-01-01

    Background The exponential growth of next-generation sequencing (NGS) derived DNA data poses great challenges to data storage and transmission. Although many compression algorithms have been proposed for DNA reads in NGS data, few methods are designed specifically to handle the quality scores. Results In this paper we present a memetic algorithm (MA) based NGS quality score data compressor, namely MMQSC. The algorithm extracts raw quality score sequences from FASTQ formatted files, and designs compression codebook using MA based multimodal optimization. The input data is then compressed in a substitutional manner. Experimental results on five representative NGS data sets show that MMQSC obtains higher compression ratio than the other state-of-the-art methods. Particularly, MMQSC is a lossless reference-free compression algorithm, yet obtains an average compression ratio of 22.82% on the experimental data sets. Conclusions The proposed MMQSC compresses NGS quality score data effectively. It can be utilized to improve the overall compression ratio on FASTQ formatted files. PMID:25474747

  6. FQC Dashboard: integrates FastQC results into a web-based, interactive, and extensible FASTQ quality control tool

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Brown, Joseph; Pirrung, Meg; McCue, Lee Ann

    FQC is software that facilitates large-scale quality control of FASTQ files by carrying out a QC protocol, parsing results, and aggregating quality metrics within and across experiments into an interactive dashboard. The dashboard utilizes human-readable configuration files to manipulate the pages and tabs, and is extensible with CSV data.

  7. FASTdoop: a versatile and efficient library for the input of FASTA and FASTQ files for MapReduce Hadoop bioinformatics applications.

    PubMed

    Ferraro Petrillo, Umberto; Roscigno, Gianluca; Cattaneo, Giuseppe; Giancarlo, Raffaele

    2017-05-15

    MapReduce Hadoop bioinformatics applications require the availability of special-purpose routines to manage the input of sequence files. Unfortunately, the Hadoop framework does not provide any built-in support for the most popular sequence file formats like FASTA or BAM. Moreover, the development of these routines is not easy, both because of the diversity of these formats and the need for managing efficiently sequence datasets that may count up to billions of characters. We present FASTdoop, a generic Hadoop library for the management of FASTA and FASTQ files. We show that, with respect to analogous input management routines that have appeared in the Literature, it offers versatility and efficiency. That is, it can handle collections of reads, with or without quality scores, as well as long genomic sequences while the existing routines concentrate mainly on NGS sequence data. Moreover, in the domain where a comparison is possible, the routines proposed here are faster than the available ones. In conclusion, FASTdoop is a much needed addition to Hadoop-BAM. The software and the datasets are available at http://www.di.unisa.it/FASTdoop/ . umberto.ferraro@uniroma1.it. Supplementary data are available at Bioinformatics online. © The Author 2017. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com

  8. FQC Dashboard: integrates FastQC results into a web-based, interactive, and extensible FASTQ quality control tool

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Brown, Joseph; Pirrung, Meg; McCue, Lee Ann

    FQC is software that facilitates quality control of FASTQ files by carrying out a QC protocol using FastQC, parsing results, and aggregating quality metrics into an interactive dashboard designed to richly summarize individual sequencing runs. The dashboard groups samples in dropdowns for navigation among the data sets, utilizes human-readable configuration files to manipulate the pages and tabs, and is extensible with CSV data.

  9. FQC Dashboard: integrates FastQC results into a web-based, interactive, and extensible FASTQ quality control tool

    DOE PAGES

    Brown, Joseph; Pirrung, Meg; McCue, Lee Ann

    2017-06-09

    FQC is software that facilitates quality control of FASTQ files by carrying out a QC protocol using FastQC, parsing results, and aggregating quality metrics into an interactive dashboard designed to richly summarize individual sequencing runs. The dashboard groups samples in dropdowns for navigation among the data sets, utilizes human-readable configuration files to manipulate the pages and tabs, and is extensible with CSV data.

  10. LFQC: a lossless compression algorithm for FASTQ files

    PubMed Central

    Nicolae, Marius; Pathak, Sudipta; Rajasekaran, Sanguthevar

    2015-01-01

    Motivation: Next Generation Sequencing (NGS) technologies have revolutionized genomic research by reducing the cost of whole genome sequencing. One of the biggest challenges posed by modern sequencing technology is economic storage of NGS data. Storing raw data is infeasible because of its enormous size and high redundancy. In this article, we address the problem of storage and transmission of large FASTQ files using innovative compression techniques. Results: We introduce a new lossless non-reference based FASTQ compression algorithm named Lossless FASTQ Compressor. We have compared our algorithm with other state of the art big data compression algorithms namely gzip, bzip2, fastqz (Bonfield and Mahoney, 2013), fqzcomp (Bonfield and Mahoney, 2013), Quip (Jones et al., 2012), DSRC2 (Roguski and Deorowicz, 2014). This comparison reveals that our algorithm achieves better compression ratios on LS454 and SOLiD datasets. Availability and implementation: The implementations are freely available for non-commercial purposes. They can be downloaded from http://engr.uconn.edu/rajasek/lfqc-v1.1.zip. Contact: rajasek@engr.uconn.edu PMID:26093148
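
    A baseline comparison against general-purpose compressors, of the kind this record mentions, can be approximated with the Python standard library. The sketch below is an assumption-laden illustration: it defines compression ratio as original size divided by compressed size, uses the library bindings rather than the gzip and bzip2 command-line tools, and runs on a synthetic, highly repetitive record rather than real sequencing data.

```python
import bz2
import gzip
import lzma
from typing import Dict


def compression_ratios(data: bytes) -> Dict[str, float]:
    """Return original_size / compressed_size for three general-purpose codecs."""
    codecs = {"gzip": gzip.compress, "bzip2": bz2.compress, "lzma": lzma.compress}
    return {name: len(data) / len(compress(data)) for name, compress in codecs.items()}


# Synthetic input: one record repeated many times (far more compressible than real data).
record = b"@read1\nACGTACGTACGT\n+\nIIIIIIIIIIII\n"
ratios = compression_ratios(record * 1000)
for name, ratio in ratios.items():
    print(f"{name}: {ratio:.1f}x")
```

    Note that other records in this list report ratios with different conventions (for example as a percentage), so figures are only comparable once the definition is fixed.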

  11. XS: a FASTQ read simulator.

    PubMed

    Pratas, Diogo; Pinho, Armando J; Rodrigues, João M O S

    2014-01-16

    The emerging next-generation sequencing (NGS) is bringing, besides the natural huge amounts of data, an avalanche of new specialized tools (for analysis, compression, alignment, among others) and large public and private network infrastructures. Therefore, a direct necessity of specific simulation tools for testing and benchmarking is rising, such as a flexible and portable FASTQ read simulator, without the need of a reference sequence, yet correctly prepared for producing approximately the same characteristics as real data. We present XS, a skilled FASTQ read simulation tool, flexible, portable (does not need a reference sequence) and tunable in terms of sequence complexity. It has several running modes, depending on the time and memory available, and is aimed at testing computing infrastructures, namely cloud computing of large-scale projects, and testing FASTQ compression algorithms. Moreover, XS offers the possibility of simulating the three main FASTQ components individually (headers, DNA sequences and quality-scores). XS provides an efficient and convenient method for fast simulation of FASTQ files, such as those from Ion Torrent (currently uncovered by other simulators), Roche-454, Illumina and ABI-SOLiD sequencing machines. This tool is publicly available at http://bioinformatics.ua.pt/software/xs/.
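
    Simulating the three FASTQ components (headers, DNA sequences and quality scores) without a reference can be sketched as follows. This is a toy generator that draws bases and qualities uniformly at random; it does not implement XS's tunable sequence complexity or platform-specific profiles, and the header format sim_read_N is invented for the example.

```python
import random


def simulate_fastq(n_reads: int, read_len: int, seed: int = 0) -> str:
    """Generate `n_reads` synthetic four-line FASTQ records as one string."""
    rng = random.Random(seed)  # seeded for reproducible test data
    records = []
    for i in range(n_reads):
        header = f"@sim_read_{i}"
        seq = "".join(rng.choice("ACGT") for _ in range(read_len))
        # Phred+33 qualities drawn uniformly from Q2..Q40 (an arbitrary choice).
        qual = "".join(chr(33 + rng.randint(2, 40)) for _ in range(read_len))
        records.append(f"{header}\n{seq}\n+\n{qual}\n")
    return "".join(records)


text = simulate_fastq(n_reads=3, read_len=8, seed=1)
print(text, end="")
```

    Because each component is produced by its own expression, any one of them can be swapped for a more realistic model independently of the others.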

  12. DSRC 2--Industry-oriented compression of FASTQ files.

    PubMed

    Roguski, Lukasz; Deorowicz, Sebastian

    2014-08-01

    Modern sequencing platforms produce huge amounts of data. Archiving them raises major problems but is crucial for reproducibility of results, one of the most fundamental principles of science. The widely used gzip compressor, used for reduction of storage and transfer costs, is not a perfect solution, so a few specialized FASTQ compressors were proposed recently. Unfortunately, they are often impractical because of slow processing, lack of support for some variants of FASTQ files or instability. We propose DSRC 2 that offers compression ratios comparable with the best existing solutions, while being a few times faster and more flexible. DSRC 2 is freely available at http://sun.aei.polsl.pl/dsrc. The package contains command-line compressor, C and Python libraries for easy integration with existing software and technical documentation with examples of usage. sebastian.deorowicz@polsl.pl Supplementary data are available at Bioinformatics online. © The Author 2014. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.

  13. FQC Dashboard: integrates FastQC results into a web-based, interactive, and extensible FASTQ quality control tool.

    PubMed

    Brown, Joseph; Pirrung, Meg; McCue, Lee Ann

    2017-06-09

    FQC is software that facilitates quality control of FASTQ files by carrying out a QC protocol using FastQC, parsing results, and aggregating quality metrics into an interactive dashboard designed to richly summarize individual sequencing runs. The dashboard groups samples in dropdowns for navigation among the data sets, utilizes human-readable configuration files to manipulate the pages and tabs, and is extensible with CSV data. FQC is implemented in Python 3 and Javascript, and is maintained under an MIT license. Documentation and source code is available at: https://github.com/pnnl/fqc . joseph.brown@pnnl.gov. © The Author(s) 2017. Published by Oxford University Press.

  14. AfterQC: automatic filtering, trimming, error removing and quality control for fastq data.

    PubMed

    Chen, Shifu; Huang, Tanxiao; Zhou, Yanqing; Han, Yue; Xu, Mingyan; Gu, Jia

    2017-03-14

    Some applications, especially those clinical applications requiring high accuracy of sequencing data, usually have to face the troubles caused by unavoidable sequencing errors. Several tools have been proposed to profile the sequencing quality, but few of them can quantify or correct the sequencing errors. This unmet requirement motivated us to develop AfterQC, a tool with functions to profile sequencing errors and correct most of them, plus highly automated quality control and data filtering features. Different from most tools, AfterQC analyses the overlapping of paired sequences for pair-end sequencing data. Based on overlapping analysis, AfterQC can detect and cut adapters, and furthermore it gives a novel function to correct wrong bases in the overlapping regions. Another new feature is to detect and visualise sequencing bubbles, which can be commonly found on the flowcell lanes and may raise sequencing errors. Besides normal per cycle quality and base content plotting, AfterQC also provides features like polyX (a long sub-sequence of a same base X) filtering, automatic trimming and K-MER based strand bias profiling. For each single or pair of FastQ files, AfterQC filters out bad reads, detects and eliminates sequencer's bubble effects, trims reads at front and tail, detects the sequencing errors and corrects part of them, and finally outputs clean data and generates HTML reports with interactive figures. AfterQC can run in batch mode with multiprocess support, it can run with a single FastQ file, a single pair of FastQ files (for pair-end sequencing), or a folder for all included FastQ files to be processed automatically. Based on overlapping analysis, AfterQC can estimate the sequencing error rate and profile the error transform distribution. The results of our error profiling tests show that the error distribution is highly platform dependent. 
    Much more than just another new quality control (QC) tool, AfterQC is able to perform quality control, data filtering, error profiling and base correction automatically. Experimental results show that AfterQC can help to eliminate the sequencing errors for pair-end sequencing data to provide much cleaner outputs, and consequently help to reduce the false-positive variants, especially for the low-frequency somatic mutations. While providing rich configurable options, AfterQC can detect and set all the options automatically and require no argument in most cases.
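
    The polyX filtering mentioned in this record (rejecting reads that contain a long run of a single base X) can be sketched with a regular expression. The function name has_polyx and the default run length of 10 are assumptions for illustration; AfterQC's actual thresholds and implementation are not given in the abstract.

```python
import re


def has_polyx(seq: str, min_len: int = 10) -> bool:
    """Return True if `seq` contains a run of at least `min_len` identical bases."""
    if min_len < 2:
        raise ValueError("min_len must be at least 2")
    # One captured base followed by at least (min_len - 1) repeats of it.
    pattern = r"([ACGTN])\1{%d,}" % (min_len - 1)
    return re.search(pattern, seq) is not None


reads = ["ACG" + "A" * 10 + "T", "ACGT" * 5, "A" * 9]
kept = [r for r in reads if not has_polyx(r)]
```

    Only the first read is filtered out here, since the other two contain no run of ten or more identical bases.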

  15. Leading edge analysis of transcriptomic changes during pseudorabies virus infection

    USDA-ARS's Scientific Manuscript database

    Eight RNA samples taken from the tracheobronchial lymph nodes (TBLN) of pigs that were either infected or non-infected with a feral isolate of porcine pseudorabies virus (PRV) were used to investigate changes in gene expression related to the pathogen. The RNA was processed into fastq files for each...

  16. A FASTQ compressor based on integer-mapped k-mer indexing for biologists.

    PubMed

    Zhang, Yeting; Patel, Khyati; Endrawis, Tony; Bowers, Autumn; Sun, Yazhou

    2016-03-15

    Next generation sequencing (NGS) technologies have gained considerable popularity among biologists. For example, RNA-seq, which provides both genomic and functional information, has been widely used by recent functional and evolutionary studies, especially in non-model organisms. However, storing and transmitting these large data sets (primarily in FASTQ format) have become genuine challenges, especially for biologists with little informatics experience. Data compression is thus a necessity. KIC, a FASTQ compressor based on a new integer-mapped k-mer indexing method, was developed (available at http://www.ysunlab.org/kic.jsp). It offers high compression ratio on sequence data, outstanding user-friendliness with graphic user interfaces, and proven reliability. Evaluated on multiple large RNA-seq data sets from both human and plants, it was found that the compression ratio of KIC had exceeded all major generic compressors, and was comparable to those of the latest dedicated compressors. KIC enables researchers with minimal informatics training to take advantage of the latest sequence compression technologies, easily manage large FASTQ data sets, and reduce storage and transmission cost. Copyright © 2015 Elsevier B.V. All rights reserved.
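
    The general idea of mapping a k-mer to an integer can be illustrated with the common 2-bit-per-base encoding. This sketch is not KIC's indexing method, whose details the abstract does not give; the encoding A=0, C=1, G=2, T=3 and the function names are assumptions.

```python
# Assumed 2-bit code for each base; any fixed bijection would work.
CODE = {"A": 0, "C": 1, "G": 2, "T": 3}
BASES = "ACGT"


def kmer_to_int(kmer: str) -> int:
    """Pack a k-mer into an integer using two bits per base."""
    value = 0
    for base in kmer:
        value = (value << 2) | CODE[base]
    return value


def int_to_kmer(value: int, k: int) -> str:
    """Invert kmer_to_int for a k-mer of known length `k`."""
    out = []
    for _ in range(k):
        out.append(BASES[value & 3])  # lowest two bits hold the last base
        value >>= 2
    return "".join(reversed(out))


index = kmer_to_int("ACGT")
print(index, int_to_kmer(index, 4))
```

    The length k must be stored separately, because leading A bases map to leading zero bits and are otherwise lost.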

  17. NGSCheckMate: software for validating sample identity in next-generation sequencing studies within and across data types.

    PubMed

    Lee, Sejoon; Lee, Soohyun; Ouellette, Scott; Park, Woong-Yang; Lee, Eunjung A; Park, Peter J

    2017-06-20

    In many next-generation sequencing (NGS) studies, multiple samples or data types are profiled for each individual. An important quality control (QC) step in these studies is to ensure that datasets from the same subject are properly paired. Given the heterogeneity of data types, file types and sequencing depths in a multi-dimensional study, a robust program that provides a standardized metric for genotype comparisons would be useful. Here, we describe NGSCheckMate, a user-friendly software package for verifying sample identities from FASTQ, BAM or VCF files. This tool uses a model-based method to compare allele read fractions at known single-nucleotide polymorphisms, considering depth-dependent behavior of similarity metrics for identical and unrelated samples. Our evaluation shows that NGSCheckMate is effective for a variety of data types, including exome sequencing, whole-genome sequencing, RNA-seq, ChIP-seq, targeted sequencing and single-cell whole-genome sequencing, with a minimal requirement for sequencing depth (>0.5X). An alignment-free module can be run directly on FASTQ files for a quick initial check. We recommend using this software as a QC step in NGS studies. https://github.com/parklab/NGSCheckMate. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.
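
    Comparing allele read fractions at known SNPs between two samples can be illustrated with a plain Pearson correlation. This is a deliberately simplified stand-in: the abstract describes a model-based method that accounts for sequencing depth, which is not reproduced here, and the allele fractions below are made-up numbers.

```python
from math import sqrt
from typing import Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation between two equal-length vectors of allele fractions."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two vectors of equal length >= 2")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant vector")
    return cov / sqrt(var_x * var_y)


# Hypothetical alternate-allele fractions at five SNPs.
sample_a = [0.0, 0.5, 1.0, 0.5, 0.0]
sample_b = [0.02, 0.48, 0.97, 0.55, 0.01]   # resembles sample_a
sample_c = [1.0, 0.0, 0.5, 0.0, 1.0]        # a different genotype pattern
r_same = pearson(sample_a, sample_b)
r_diff = pearson(sample_a, sample_c)
```

    A high correlation suggests the two datasets come from the same individual; where to place the decision threshold depends on depth, which is the part the real tool models.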

  18. QualComp: a new lossy compressor for quality scores based on rate distortion theory

    PubMed Central

    2013-01-01

    Background Next Generation Sequencing technologies have revolutionized many fields in biology by reducing the time and cost required for sequencing. As a result, large amounts of sequencing data are being generated. A typical sequencing data file may occupy tens or even hundreds of gigabytes of disk space, prohibitively large for many users. This data consists of both the nucleotide sequences and per-base quality scores that indicate the level of confidence in the readout of these sequences. Quality scores account for about half of the required disk space in the commonly used FASTQ format (before compression), and therefore the compression of the quality scores can significantly reduce storage requirements and speed up analysis and transmission of sequencing data. Results In this paper, we present a new scheme for the lossy compression of the quality scores, to address the problem of storage. Our framework allows the user to specify the rate (bits per quality score) prior to compression, independent of the data to be compressed. Our algorithm can work at any rate, unlike other lossy compression algorithms. We envisage our algorithm as being part of a more general compression scheme that works with the entire FASTQ file. Numerical experiments show that we can achieve a better mean squared error (MSE) for small rates (bits per quality score) than other lossy compression schemes. For the organism PhiX, whose assembled genome is known and assumed to be correct, we show that it is possible to achieve a significant reduction in size with little compromise in performance on downstream applications (e.g., alignment). Conclusions QualComp is an open source software package, written in C and freely available for download at https://sourceforge.net/projects/qualcomp. PMID:23758828
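
    The trade-off between rate and mean squared error for quality scores can be illustrated with a simple uniform quantizer. This is not QualComp's rate-distortion scheme; the Phred+33 offset, the bin width, and the function names are assumptions chosen to keep the sketch small.

```python
from typing import List


def decode_phred(qual: str, offset: int = 33) -> List[int]:
    """Convert a FASTQ quality string to integer Phred scores (Phred+33 assumed)."""
    return [ord(c) - offset for c in qual]


def quantize(scores: List[int], step: int) -> List[int]:
    """Map each score to the midpoint of its bin of width `step` (lossy)."""
    return [(s // step) * step + step // 2 for s in scores]


def mse(original: List[int], approx: List[int]) -> float:
    """Mean squared error between original and reconstructed scores."""
    return sum((a - b) ** 2 for a, b in zip(original, approx)) / len(original)


scores = decode_phred("!5I")            # Q0, Q20, Q40
coarse = quantize(scores, step=8)       # fewer distinct values, so fewer bits needed
error = mse(scores, coarse)
```

    Wider bins leave fewer distinct symbols to encode and so lower the rate, at the cost of a larger error.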

  19. MutScan: fast detection and visualization of target mutations by scanning FASTQ data.

    PubMed

    Chen, Shifu; Huang, Tanxiao; Wen, Tiexiang; Li, Hong; Xu, Mingyan; Gu, Jia

    2018-01-22

    Some types of clinical genetic tests, such as cancer testing using circulating tumor DNA (ctDNA), require sensitive detection of known target mutations. However, conventional next-generation sequencing (NGS) data analysis pipelines typically involve different steps of filtering, which may cause miss-detection of key mutations with low frequencies. Variant validation is also indicated for key mutations detected by bioinformatics pipelines. Typically, this process can be executed using alignment visualization tools such as IGV or GenomeBrowse. However, these tools are too heavy and therefore unsuitable for validating mutations in ultra-deep sequencing data. We developed MutScan to address problems of sensitive detection and efficient validation for target mutations. MutScan involves highly optimized string-searching algorithms, which can scan input FASTQ files to grab all reads that support target mutations. The collected supporting reads for each target mutation will be piled up and visualized using web technologies such as HTML and JavaScript. Algorithms such as rolling hash and bloom filter are applied to accelerate scanning and make MutScan applicable to detect or visualize target mutations in a very fast way. MutScan is a tool for the detection and visualization of target mutations by only scanning FASTQ raw data directly. Compared to conventional pipelines, this offers a very high performance, executing about 20 times faster, and offering maximal sensitivity since it can grab mutations with even one single supporting read. MutScan visualizes detected mutations by generating interactive pile-ups using web technologies. These can serve to validate target mutations, thus avoiding false positives. Furthermore, MutScan can visualize all mutation records in a VCF file to HTML pages for cloud-friendly VCF validation. MutScan is an open source tool available at GitHub: https://github.com/OpenGene/MutScan.
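
    Rolling-hash string searching, one of the techniques named in this record, can be sketched as a textbook Rabin-Karp scan. This is a generic illustration and not MutScan's optimized implementation; the base, the modulus, and the function name find_all are assumptions.

```python
from typing import List


def find_all(pattern: str, text: str, base: int = 256, mod: int = (1 << 61) - 1) -> List[int]:
    """Return every start index of `pattern` in `text` using a rolling hash."""
    m, n = len(pattern), len(text)
    if m == 0 or m > n:
        return []
    high = pow(base, m - 1, mod)          # weight of the character leaving the window
    p_hash = t_hash = 0
    for i in range(m):
        p_hash = (p_hash * base + ord(pattern[i])) % mod
        t_hash = (t_hash * base + ord(text[i])) % mod
    hits = []
    for i in range(n - m + 1):
        # Compare the strings on a hash match to rule out collisions.
        if p_hash == t_hash and text[i:i + m] == pattern:
            hits.append(i)
        if i + m < n:
            # Slide the window: drop text[i], append text[i + m].
            t_hash = ((t_hash - ord(text[i]) * high) * base + ord(text[i + m])) % mod
    return hits


hits = find_all("GAT", "AGATTGATC")
print(hits)
```

    Updating the hash takes constant time per position, so a read is scanned in time proportional to its length regardless of the pattern length.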

  20. The standard operating procedure of the DOE-JGI Metagenome Annotation Pipeline (MAP v.4)

    DOE PAGES

    Huntemann, Marcel; Ivanova, Natalia N.; Mavromatis, Konstantinos; ...

    2016-02-24

    The DOE-JGI Metagenome Annotation Pipeline (MAP v.4) performs structural and functional annotation for metagenomic sequences that are submitted to the Integrated Microbial Genomes with Microbiomes (IMG/M) system for comparative analysis. The pipeline runs on nucleotide sequences provided via the IMG submission site. Users must first define their analysis projects in GOLD and then submit the associated sequence datasets consisting of scaffolds/contigs with optional coverage information and/or unassembled reads in fasta and fastq file formats. The MAP processing consists of feature prediction including identification of protein-coding genes, non-coding RNAs and regulatory RNAs, as well as CRISPR elements. Structural annotation is followed by functional annotation including assignment of protein product names and connection to various protein family databases.

  1. The standard operating procedure of the DOE-JGI Metagenome Annotation Pipeline (MAP v.4)

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Huntemann, Marcel; Ivanova, Natalia N.; Mavromatis, Konstantinos

    The DOE-JGI Metagenome Annotation Pipeline (MAP v.4) performs structural and functional annotation for metagenomic sequences that are submitted to the Integrated Microbial Genomes with Microbiomes (IMG/M) system for comparative analysis. The pipeline runs on nucleotide sequences provided via the IMG submission site. Users must first define their analysis projects in GOLD and then submit the associated sequence datasets consisting of scaffolds/contigs with optional coverage information and/or unassembled reads in fasta and fastq file formats. The MAP processing consists of feature prediction including identification of protein-coding genes, non-coding RNAs and regulatory RNAs, as well as CRISPR elements. Structural annotation is followed by functional annotation including assignment of protein product names and connection to various protein family databases.

  2. The Genetics of Chemoreception in the Labella and Tarsi of Aedes aegypti

    DTIC Science & Technology

    2014-01-01

    molecular pathway also involved in DEET perception in the dipteran relative Drosophila melanogaster (Ditzen et al., 2008; DeGennaro et al., 2013). Transgenic...Downloads/). Output Fastq Illumina files were mapped to the reference genome with TopHat (Trapnell et al., 2009). The unambiguous sequence alignment...Q., Pikielny, C.W., 2002. Novel genes expressing in subsets of chemosensory sensilla on the front legs of male Drosophila melanogaster. Cell. Tissue

  3. IonGAP: integrative bacterial genome analysis for Ion Torrent sequence data.

    PubMed

    Baez-Ortega, Adrian; Lorenzo-Diaz, Fabian; Hernandez, Mariano; Gonzalez-Vila, Carlos Ignacio; Roda-Garcia, Jose Luis; Colebrook, Marcos; Flores, Carlos

    2015-09-01

    We introduce IonGAP, a publicly available Web platform designed for the analysis of whole bacterial genomes using Ion Torrent sequence data. Besides assembly, it integrates a variety of comparative genomics, annotation and bacterial classification routines, based on the widely used FASTQ, BAM and SRA file formats. Benchmarking with different datasets evidenced that IonGAP is a fast, powerful and simple-to-use bioinformatics tool. By releasing this platform, we aim to translate low-cost bacterial genome analysis for microbiological prevention and control in healthcare, agroalimentary and pharmaceutical industry applications. IonGAP is hosted by the ITER's Teide-HPC supercomputer and is freely available on the Web for non-commercial use at http://iongap.hpc.iter.es. mcolesan@ull.edu.es or cflores@ull.edu.es Supplementary data are available at Bioinformatics online. © The Author 2015. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.

  4. FASTQ quality control dashboard

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    2016-07-25

    FQCDB builds upon existing open source software, FastQC, implementing a modern web interface for the parsed output of FastQC. In addition, FQCDB is extensible as a web service to include additional plots of type line, boxplot, or heatmap, for data formatted according to guidelines. The interface is also configurable via a more readable JSON format, enabling customization by non-web programmers.

  5. Canary: an atomic pipeline for clinical amplicon assays.

    PubMed

    Doig, Kenneth D; Ellul, Jason; Fellowes, Andrew; Thompson, Ella R; Ryland, Georgina; Blombery, Piers; Papenfuss, Anthony T; Fox, Stephen B

    2017-12-15

    High throughput sequencing requires bioinformatics pipelines to process large volumes of data into meaningful variants that can be translated into a clinical report. These pipelines often suffer from a number of shortcomings: they lack robustness and have many components written in multiple languages, each with a variety of resource requirements. Pipeline components must be linked together with a workflow system to achieve the processing of FASTQ files through to a VCF file of variants. Crafting these pipelines requires considerable bioinformatics and IT skills beyond the reach of many clinical laboratories. Here we present Canary, a single program that can be run on a laptop, which takes FASTQ files from amplicon assays through to an annotated VCF file ready for clinical analysis. Canary can be installed and run with a single command using Docker containerization or run as a single JAR file on a wide range of platforms. Although it is a single utility, Canary performs all the functions present in more complex and unwieldy pipelines. All variants identified by Canary are 3' shifted and represented in their most parsimonious form to provide a consistent nomenclature, irrespective of sequencing variation. Further, proximate in-phase variants are represented as a single HGVS 'delins' variant. This allows for correct nomenclature and consequences to be ascribed to complex multi-nucleotide polymorphisms (MNPs), which are otherwise difficult to represent and interpret. Variants can also be annotated with hundreds of attributes sourced from MyVariant.info to give up to date details on pathogenicity, population statistics and in-silico predictors. Canary has been used at the Peter MacCallum Cancer Centre in Melbourne for the last 2 years for the processing of clinical sequencing data. By encapsulating clinical features in a single, easily installed executable, Canary makes sequencing more accessible to all pathology laboratories. 
Canary is available for download as source or a Docker image at https://github.com/PapenfussLab/Canary under a GPL-3.0 License.

  6. PuLSE: Quality control and quantification of peptide sequences explored by phage display libraries.

    PubMed

    Shave, Steven; Mann, Stefan; Koszela, Joanna; Kerr, Alastair; Auer, Manfred

    2018-01-01

    The design of highly diverse phage display libraries is based on the assumption that DNA bases are incorporated at similar rates within the randomized sequence. As library complexity increases and expected copy numbers of unique sequences decrease, the exploration of library space becomes sparser and the presence of truly random sequences becomes critical. We present the program PuLSE (Phage Library Sequence Evaluation) as a tool for assessing randomness and therefore diversity of phage display libraries. PuLSE runs on a collection of sequence reads in the fastq file format and generates tables profiling the library in terms of unique DNA sequence counts and positions, translated peptide sequences, and normalized 'expected' occurrences from base to residue codon frequencies. The output allows at-a-glance quantitative quality control of a phage library in terms of sequence coverage both at the DNA base and translated protein residue level, which has been missing from toolsets and literature. The open source program PuLSE is available in two formats, a C++ source code package for compilation and integration into existing bioinformatics pipelines and precompiled binaries for ease of use.
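
    The unique-sequence counting described in this abstract can be illustrated with a short Python sketch. It is not PuLSE's implementation (PuLSE is distributed as C++ source and binaries); the function names and the tiny in-memory example are assumptions made for illustration, and the parser assumes the common four-lines-per-record FASTQ layout.

```python
from collections import Counter
from itertools import islice


def read_fastq_sequences(lines):
    """Yield the sequence line of each 4-line FASTQ record.

    Assumes unwrapped records (header, sequence, '+', qualities).
    A trailing incomplete record is silently ignored.
    """
    it = iter(lines)
    while True:
        record = list(islice(it, 4))
        if len(record) < 4:
            return
        header, seq, plus, _qual = (x.rstrip("\n") for x in record)
        if not header.startswith("@") or not plus.startswith("+"):
            raise ValueError(f"malformed FASTQ record: {header!r}")
        yield seq


def count_unique_sequences(lines):
    """Return a Counter mapping each distinct read sequence to its copy number."""
    return Counter(read_fastq_sequences(lines))


# Hypothetical three-read library used only to exercise the functions.
EXAMPLE_FASTQ = """@r1
ACGT
+
IIII
@r2
ACGT
+
IIII
@r3
TTGA
+
IIII
""".splitlines()

counts = count_unique_sequences(EXAMPLE_FASTQ)
```

    For a real file, the same function could be called on an open file handle, since it only iterates over lines.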

  7. Automated Sanger Analysis Pipeline (ASAP): A Tool for Rapidly Analyzing Sanger Sequencing Data with Minimum User Interference.

    PubMed

    Singh, Aditya; Bhatia, Prateek

    2016-12-01

    Sanger sequencing platforms, such as Applied Biosystems instruments, generate chromatogram files. Generally, for one region of a sequence, both forward and reverse primers are used to sequence that area, which yields 2 sequences that need to be aligned and a consensus generated before mutation detection studies. This work is cumbersome and takes time, especially if the gene is large with many exons. Hence, we devised a rapid automated command system to filter, build, and align consensus sequences and also optionally extract exonic regions, translate them in all frames, and perform an amino acid alignment starting from raw sequence data within a very short time. At its full capabilities, the Automated Sanger Analysis Pipeline (ASAP) is able to read "*.ab1" chromatogram files through a command line interface, convert them to the FASTQ format, trim the low-quality regions, reverse-complement the reverse sequence, create a consensus sequence, extract the exonic regions using a reference exonic sequence, translate the sequence in all frames, and align the nucleic acid and amino acid sequences to reference nucleic acid and amino acid sequences, respectively. All files are created and can be used for further analysis. ASAP is available as Python 3.x executable at https://github.com/aditya-88/ASAP. The version described in this paper is 0.28.
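
    Two of the steps listed in this abstract, reverse-complementing a read and translating it in all frames, can be sketched in Python as follows. This is an illustrative sketch, not code from ASAP; the function names are hypothetical, the standard genetic code is assumed, and any codon containing an ambiguous base is reported as "X".

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (upper-cased)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def translate(seq, offset=0):
    """Translate complete codons starting at `offset`; unknown codons become 'X'."""
    seq = seq.upper()
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X")
        for i in range(offset, len(seq) - 2, 3)
    )


def six_frame_translation(seq):
    """Return translations of frames +1, +2, +3, then -1, -2, -3."""
    return [
        translate(strand, offset)
        for strand in (seq, reverse_complement(seq))
        for offset in range(3)
    ]


frames = six_frame_translation("ATGGCCTAA")
```

    Stop codons are kept as "*" rather than truncating the peptide, so each frame can be inspected in full.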

  8. Transcriptomic data of pre-meiotic stage of floret development in apomictic and sexual types of guinea grass (Panicum maximum Jacq.).

    PubMed

    Radhakrishna, Auji; Dwivedi, Krishna Kumar; Srivastava, Manoj Kumar; Roy, A K; Malaviya, D R; Kaushal, P

    2018-06-01

    Guinea grass (Panicum maximum Jacq.), an important fodder crop of humid and sub-humid tropical regions, reproduces through apomixis, a method of clonal propagation through seeds. Lack of knowledge of the genetic and molecular control of this phenomenon has hindered the genetic improvement of this crop. The dataset provided here represents the first RNA-Seq based assembly and analysis of florets at pre-meiotic stage from the apomictic and sexual genotypes of guinea grass. The raw sequence files in FASTQ format were deposited in the NCBI SRA database with accession number SRP115883. A total of 24.8 Gb raw sequence data, corresponding to 179,665,827 raw reads, was obtained by paired end sequencing. We used Trinity for de-novo assembly and identified 57,647 transcripts in sexual and 49,093 transcripts in apomictic type. This transcriptome data will be useful for identification and comparative analysis of genes regulating the mode of reproduction in grasses.

  9. MAGI: a Node.js web service for fast microRNA-Seq analysis in a GPU infrastructure.

    PubMed

    Kim, Jihoon; Levy, Eric; Ferbrache, Alex; Stepanowsky, Petra; Farcas, Claudiu; Wang, Shuang; Brunner, Stefan; Bath, Tyler; Wu, Yuan; Ohno-Machado, Lucila

    2014-10-01

    MAGI is a web service for fast MicroRNA-Seq data analysis in a graphics processing unit (GPU) infrastructure. Using just a browser, users have access to results as web reports in just a few hours, a >600% end-to-end performance improvement over the state of the art. MAGI's salient features are (i) transfer of large input files in native FASTA with Qualities (FASTQ) format through drag-and-drop operations, (ii) rapid prediction of microRNA target genes leveraging parallel computing with GPU devices, (iii) all-in-one analytics with novel feature extraction, statistical test for differential expression and diagnostic plot generation for quality control and (iv) interactive visualization and exploration of results in web reports that are readily available for publication. MAGI relies on the Node.js JavaScript framework, along with NVIDIA CUDA C, PHP: Hypertext Preprocessor (PHP), Perl and R. It is freely available at http://magi.ucsd.edu. © The Author 2014. Published by Oxford University Press.

  10. RNA-Seq workflow: gene-level exploratory analysis and differential expression

    PubMed Central

    Love, Michael I.; Anders, Simon; Kim, Vladislav; Huber, Wolfgang

    2015-01-01

    Here we walk through an end-to-end gene-level RNA-Seq differential expression workflow using Bioconductor packages. We will start from the FASTQ files, show how these were aligned to the reference genome, and prepare a count matrix which tallies the number of RNA-seq reads/fragments within each gene for each sample. We will perform exploratory data analysis (EDA) for quality assessment and to explore the relationship between samples, perform differential gene expression analysis, and visually explore the results. PMID:26674615

  11. mod_bio: Apache modules for Next-Generation sequencing data.

    PubMed

    Lindenbaum, Pierre; Redon, Richard

    2015-01-01

    We describe mod_bio, a set of modules for the Apache HTTP server that allows the users to access and query fastq, tabix, fasta and bam files through a Web browser. Those data are made available in plain text, HTML, XML, JSON and JSON-P. A javascript-based genome browser using the JSON-P communication technique is provided as an example of cross-domain Web service. https://github.com/lindenb/mod_bio.

  12. Enabling the democratization of the genomics revolution with a fully integrated web-based bioinformatics platform, Version 1.5 and 1.x.

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Chain, Patrick; Lo, Chien-Chi; Li, Po-E

    EDGE bioinformatics was developed to help biologists process Next Generation Sequencing data (in the form of raw FASTQ files), even if they have little to no bioinformatics expertise. EDGE is a highly integrated and interactive web-based platform that is capable of running many of the standard analyses that biologists require for viral, bacterial/archaeal, and metagenomic samples. EDGE provides the following analytical workflows: quality trimming and host removal, assembly and annotation, comparisons against known references, taxonomy classification of reads and contigs, whole genome SNP-based phylogenetic analysis, and PCR analysis. EDGE provides an intuitive web-based interface for user input, allows users to visualize and interact with selected results (e.g. JBrowse genome browser), and generates a final detailed PDF report. Results in the form of tables, text files, graphic files, and PDFs can be downloaded. A user management system allows tracking of an individual’s EDGE runs, along with the ability to share, post publicly, delete, or archive their results.

  13. Regulatory changes raise troubling questions for genomic testing.

    PubMed

    Evans, Barbara J; Dorschner, Michael O; Burke, Wylie; Jarvik, Gail P

    2014-11-01

    By 6 October 2014, many laboratories in the United States must begin honoring new individual data access rights created by recent changes to federal privacy and laboratory regulations. These access rights are more expansive than has been widely understood and pose complex challenges for genomic testing laboratories. This article analyzes regulatory texts and guidances to explore which laboratories are affected. It offers the first published analysis of which parts of the vast trove of data generated during next-generation sequencing will be accessible to patients and research subjects. Persons tested at affected laboratories seemingly will have access, upon request, to uninterpreted gene variant information contained in their stored variant call format, binary alignment/map, and FASTQ files. A defect in the regulations will subject some non-CLIA-regulated research laboratories to these new access requirements unless the Department of Health and Human Services takes swift action to avert this apparently unintended consequence. More broadly, all affected laboratories face a long list of daunting operational, business, compliance, and bioethical issues as they adapt to this change and to the Food and Drug Administration's recently announced plan to publish draft guidance outlining a new oversight framework for lab-developed tests.

  14. Grape RNA-Seq analysis pipeline environment

    PubMed Central

    Knowles, David G.; Röder, Maik; Merkel, Angelika; Guigó, Roderic

    2013-01-01

    Motivation: The avalanche of data arriving since the development of NGS technologies has prompted the need for developing fast, accurate and easily automated bioinformatic tools capable of dealing with massive datasets. Among the most productive applications of NGS technologies is the sequencing of cellular RNA, known as RNA-Seq. Although RNA-Seq provides a dynamic range similar or superior to that of microarrays at similar or lower cost, the lack of standard and user-friendly pipelines is a bottleneck preventing RNA-Seq from becoming the standard for transcriptome analysis. Results: In this work we present a pipeline for processing and analyzing RNA-Seq data, which we have named Grape (Grape RNA-Seq Analysis Pipeline Environment). Grape supports raw sequencing reads produced by a variety of technologies, either in FASTA or FASTQ format, or as prealigned reads in SAM/BAM format. A minimal Grape configuration consists of the file location of the raw sequencing reads, the genome of the species and the corresponding gene and transcript annotation. Grape first runs a set of quality control steps, and then aligns the reads to the genome, a step that is omitted for prealigned read formats. Grape next estimates gene and transcript expression levels, calculates exon inclusion levels and identifies novel transcripts. Grape can be run on a single computer or in parallel on a computer cluster. It is distributed with specific mapping and quantification tools, but given its modular design, any tool supporting popular data interchange formats can be integrated. Availability: Grape can be obtained from the Bioinformatics and Genomics website at: http://big.crg.cat/services/grape. Contact: david.gonzalez@crg.eu or roderic.guigo@crg.eu PMID:23329413

  15. A System Architecture for Efficient Transmission of Massive DNA Sequencing Data.

    PubMed

    Sağiroğlu, Mahmut Şamil; Külekci, M Oğuzhan

    2017-11-01

    The DNA sequencing data analysis pipelines require significant computational resources. In that sense, cloud computing infrastructures appear as a natural choice for this processing. However, the first practical difficulty in reaching the cloud computing services is the transmission of the massive DNA sequencing data from where they are produced to where they will be processed. The daily practice here begins with compressing the data in FASTQ file format, and then sending these data via fast data transmission protocols. In this study, we address the weaknesses in that daily practice and present a new system architecture that incorporates the computational resources available on the client side while dynamically adapting itself to the available bandwidth. Our proposal considers the real-life scenarios, where the bandwidth of the connection between the parties may fluctuate, and also the computing power on the client side may be of any size ranging from moderate personal computers to powerful workstations. The proposed architecture aims at utilizing both the communication bandwidth and the computing resources for satisfying the ultimate goal of reaching the results as early as possible. We present a prototype implementation of the proposed architecture, and analyze several real-life cases, which provide useful insights for the sequencing centers, especially on deciding when to use a cloud service and in what conditions.

  16. ExpEdit: a webserver to explore human RNA editing in RNA-Seq experiments.

    PubMed

    Picardi, Ernesto; D'Antonio, Mattia; Carrabino, Danilo; Castrignanò, Tiziana; Pesole, Graziano

    2011-05-01

    ExpEdit is a web application for assessing RNA editing in human at known or user-specified sites supported by transcript data obtained by RNA-Seq experiments. Mapping data (in SAM/BAM format) or directly sequence reads [in FASTQ/short read archive (SRA) format] can be provided as input to carry out a comparative analysis against a large collection of known editing sites collected in DARNED database as well as other user-provided potentially edited positions. Results are shown as dynamic tables containing University of California, Santa Cruz (UCSC) links for a quick examination of the genomic context. ExpEdit is freely available on the web at http://www.caspur.it/ExpEdit/.

  17. CARGO: effective format-free compressed storage of genomic information

    PubMed Central

    Roguski, Łukasz; Ribeca, Paolo

    2016-01-01

    The recent super-exponential growth in the amount of sequencing data generated worldwide has put techniques for compressed storage into the focus. Most available solutions, however, are strictly tied to specific bioinformatics formats, sometimes inheriting from them suboptimal design choices; this hinders flexible and effective data sharing. Here, we present CARGO (Compressed ARchiving for GenOmics), a high-level framework to automatically generate software systems optimized for the compressed storage of arbitrary types of large genomic data collections. Straightforward applications of our approach to FASTQ and SAM archives require a few lines of code, produce solutions that match and sometimes outperform specialized format-tailored compressors and scale well to multi-TB datasets. All CARGO software components can be freely downloaded for academic and non-commercial use from http://bio-cargo.sourceforge.net. PMID:27131376

  18. FAST: FAST Analysis of Sequences Toolbox

    PubMed Central

    Lawrence, Travis J.; Kauffman, Kyle T.; Amrine, Katherine C. H.; Carper, Dana L.; Lee, Raymond S.; Becich, Peter J.; Canales, Claudia J.; Ardell, David H.

    2015-01-01

    FAST (FAST Analysis of Sequences Toolbox) provides simple, powerful open source command-line tools to filter, transform, annotate and analyze biological sequence data. Modeled after the GNU (GNU's Not Unix) Textutils such as grep, cut, and tr, FAST tools such as fasgrep, fascut, and fastr make it easy to rapidly prototype expressive bioinformatic workflows in a compact and generic command vocabulary. Compact combinatorial encoding of data workflows with FAST commands can simplify the documentation and reproducibility of bioinformatic protocols, supporting better transparency in biological data science. Interface self-consistency and conformity with conventions of GNU, Matlab, Perl, BioPerl, R, and GenBank help make FAST easy and rewarding to learn. FAST automates numerical, taxonomic, and text-based sorting, selection and transformation of sequence records and alignment sites based on content, index ranges, descriptive tags, annotated features, and in-line calculated analytics, including composition and codon usage. Automated content- and feature-based extraction of sites and support for molecular population genetic statistics make FAST useful for molecular evolutionary analysis. FAST is portable, easy to install and secure thanks to the relative maturity of its Perl and BioPerl foundations, with stable releases posted to CPAN. Development as well as a publicly accessible Cookbook and Wiki are available on the FAST GitHub repository at https://github.com/tlawrence3/FAST. The default data exchange format in FAST is Multi-FastA (specifically, a restriction of BioPerl FastA format). Sanger and Illumina 1.8+ FastQ formatted files are also supported. FAST makes it easier for non-programmer biologists to interactively investigate and control biological data at the speed of thought. PMID:26042145

  19. A ChIP-Seq Data Analysis Pipeline Based on Bioconductor Packages.

    PubMed

    Park, Seung-Jin; Kim, Jong-Hwan; Yoon, Byung-Ha; Kim, Seon-Young

    2017-03-01

    Nowadays, huge volumes of chromatin immunoprecipitation-sequencing (ChIP-Seq) data are generated to increase the knowledge on DNA-protein interactions in the cell, and accordingly, many tools have been developed for ChIP-Seq analysis. Here, we provide an example of a streamlined workflow for ChIP-Seq data analysis composed of only four packages in Bioconductor: dada2, QuasR, mosaics, and ChIPseeker. 'dada2' performs trimming of the high-throughput sequencing data. 'QuasR' and 'mosaics' perform quality control and mapping of the input reads to the reference genome and peak calling, respectively. Finally, 'ChIPseeker' performs annotation and visualization of the called peaks. This workflow runs well independently of operating systems (e.g., Windows, Mac, or Linux) and processes the input fastq files into various results in one run. R code is available at github: https://github.com/ddhb/Workflow_of_Chipseq.git.

  20. A ChIP-Seq Data Analysis Pipeline Based on Bioconductor Packages

    PubMed Central

    Park, Seung-Jin; Kim, Jong-Hwan; Yoon, Byung-Ha; Kim, Seon-Young

    2017-01-01

    Nowadays, huge volumes of chromatin immunoprecipitation-sequencing (ChIP-Seq) data are generated to increase the knowledge on DNA-protein interactions in the cell, and accordingly, many tools have been developed for ChIP-Seq analysis. Here, we provide an example of a streamlined workflow for ChIP-Seq data analysis composed of only four packages in Bioconductor: dada2, QuasR, mosaics, and ChIPseeker. ‘dada2’ performs trimming of the high-throughput sequencing data. ‘QuasR’ and ‘mosaics’ perform quality control and mapping of the input reads to the reference genome and peak calling, respectively. Finally, ‘ChIPseeker’ performs annotation and visualization of the called peaks. This workflow runs well independently of operating systems (e.g., Windows, Mac, or Linux) and processes the input fastq files into various results in one run. R code is available at github: https://github.com/ddhb/Workflow_of_Chipseq.git. PMID:28416945

  1. LongISLND: in silico sequencing of lengthy and noisy datatypes

    PubMed Central

    Lau, Bayo; Mohiyuddin, Marghoob; Mu, John C.; Fang, Li Tai; Bani Asadi, Narges; Dallett, Carolina; Lam, Hugo Y. K.

    2016-01-01

    Summary: LongISLND is a software package designed to simulate sequencing data according to the characteristics of third generation, single-molecule sequencing technologies. The general software architecture is easily extendable, as demonstrated by the emulation of Pacific Biosciences (PacBio) multi-pass sequencing with P5 and P6 chemistries, producing data in FASTQ, H5, and the latest PacBio BAM format. We demonstrate its utility by downstream processing with consensus building and variant calling. Availability and Implementation: LongISLND is implemented in Java and available at http://bioinform.github.io/longislnd Contact: hugo.lam@roche.com Supplementary information: Supplementary data are available at Bioinformatics online. PMID:27667791

  2. LongISLND: in silico sequencing of lengthy and noisy datatypes.

    PubMed

    Lau, Bayo; Mohiyuddin, Marghoob; Mu, John C; Fang, Li Tai; Bani Asadi, Narges; Dallett, Carolina; Lam, Hugo Y K

    2016-12-15

    LongISLND is a software package designed to simulate sequencing data according to the characteristics of third generation, single-molecule sequencing technologies. The general software architecture is easily extendable, as demonstrated by the emulation of Pacific Biosciences (PacBio) multi-pass sequencing with P5 and P6 chemistries, producing data in FASTQ, H5, and the latest PacBio BAM format. We demonstrate its utility by downstream processing with consensus building and variant calling. LongISLND is implemented in Java and available at http://bioinform.github.io/longislnd. Contact: hugo.lam@roche.com. Supplementary information: Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press.

  3. SSR_pipeline--computer software for the identification of microsatellite sequences from paired-end Illumina high-throughput DNA sequence data

    USGS Publications Warehouse

    Miller, Mark P.; Knaus, Brian J.; Mullins, Thomas D.; Haig, Susan M.

    2013-01-01

    SSR_pipeline is a flexible set of programs designed to efficiently identify simple sequence repeats (SSRs; for example, microsatellites) from paired-end high-throughput Illumina DNA sequencing data. The program suite contains three analysis modules along with a fourth control module that can be used to automate analyses of large volumes of data. The modules are used to (1) identify the subset of paired-end sequences that pass quality standards, (2) align paired-end reads into a single composite DNA sequence, and (3) identify sequences that possess microsatellites conforming to user specified parameters. Each of the three separate analysis modules also can be used independently to provide greater flexibility or to work with FASTQ or FASTA files generated from other sequencing platforms (Roche 454, Ion Torrent, etc). All modules are implemented in the Python programming language and can therefore be used from nearly any computer operating system (Linux, Macintosh, Windows). The program suite relies on a compiled Python extension module to perform paired-end alignments. Instructions for compiling the extension from source code are provided in the documentation. Users who do not have Python installed on their computers or who do not have the ability to compile software also may choose to download packaged executable files. These files include all Python scripts, a copy of the compiled extension module, and a minimal installation of Python in a single binary executable. See program documentation for more information.
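
    The third module's task, finding sequences that contain microsatellites matching user-specified parameters, can be sketched with a regular expression. This is an assumption-laden illustration rather than SSR_pipeline's own algorithm: the function name, parameter names and default thresholds are hypothetical, and homopolymer runs are deliberately excluded.

```python
import re


def find_microsatellites(seq, min_motif=2, max_motif=6, min_repeats=4):
    """Return (start, motif, copies) for each tandem repeat found in `seq`.

    A hit is a motif of min_motif..max_motif bases repeated at least
    `min_repeats` times in a row. The lazy quantifier prefers the shortest
    motif, so ACACACAC is reported as (AC)4 rather than (ACAC)2.
    """
    pattern = re.compile(
        r"([ACGT]{%d,%d}?)\1{%d,}" % (min_motif, max_motif, min_repeats - 1)
    )
    hits = []
    for match in pattern.finditer(seq.upper()):
        motif = match.group(1)
        if len(set(motif)) == 1:
            # Skip homopolymer runs such as AAAAAAAA.
            continue
        copies = len(match.group(0)) // len(motif)
        hits.append((match.start(), motif, copies))
    return hits


# Hypothetical read containing five copies of the dinucleotide AC.
hits = find_microsatellites("TTGACACACACACAGGT")
```

    Positions are 0-based offsets into the input sequence.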

  4. DistMap: a toolkit for distributed short read mapping on a Hadoop cluster.

    PubMed

    Pandey, Ram Vinay; Schlötterer, Christian

    2013-01-01

    With the rapid and steady increase of next generation sequencing data output, the mapping of short reads has become a major data analysis bottleneck. On a single computer, it can take several days to map the vast quantity of reads produced from a single Illumina HiSeq lane. In an attempt to ameliorate this bottleneck we present a new tool, DistMap - a modular, scalable and integrated workflow to map reads in the Hadoop distributed computing framework. DistMap is easy to use, currently supports nine different short read mapping tools and can be run on all Unix-based operating systems. It accepts reads in FASTQ format as input and provides mapped reads in a SAM/BAM format. DistMap supports both paired-end and single-end reads thereby allowing the mapping of read data produced by different sequencing platforms. DistMap is available from http://code.google.com/p/distmap/

  5. DistMap: A Toolkit for Distributed Short Read Mapping on a Hadoop Cluster

    PubMed Central

    Pandey, Ram Vinay; Schlötterer, Christian

    2013-01-01

    With the rapid and steady increase of next generation sequencing data output, the mapping of short reads has become a major data analysis bottleneck. On a single computer, it can take several days to map the vast quantity of reads produced from a single Illumina HiSeq lane. In an attempt to ameliorate this bottleneck we present a new tool, DistMap - a modular, scalable and integrated workflow to map reads in the Hadoop distributed computing framework. DistMap is easy to use, currently supports nine different short read mapping tools and can be run on all Unix-based operating systems. It accepts reads in FASTQ format as input and provides mapped reads in a SAM/BAM format. DistMap supports both paired-end and single-end reads thereby allowing the mapping of read data produced by different sequencing platforms. DistMap is available from http://code.google.com/p/distmap/ PMID:24009693

  6. FaStore - a space-saving solution for raw sequencing data.

    PubMed

    Roguski, Lukasz; Ochoa, Idoia; Hernaez, Mikel; Deorowicz, Sebastian

    2018-03-29

    The affordability of DNA sequencing has led to the generation of unprecedented volumes of raw sequencing data. These data must be stored, processed, and transmitted, which poses significant challenges. To facilitate this effort, we introduce FaStore, a specialized compressor for FASTQ files. FaStore does not use any reference sequences for compression, and permits the user to choose from several lossy modes to improve the overall compression ratio, depending on the specific needs. FaStore in the lossless mode achieves a significant improvement in compression ratio with respect to previously proposed algorithms. We perform an analysis on the effect that the different lossy modes have on variant calling, the most widely used application for clinical decision making, especially important in the era of precision medicine. We show that lossy compression can offer significant compression gains, while preserving the essential genomic information and without affecting the variant calling performance. FaStore can be downloaded from https://github.com/refresh-bio/FaStore. Contact: sebastian.deorowicz@polsl.pl. Supplementary data are available at Bioinformatics online.

  7. ImmuneDB: a system for the analysis and exploration of high-throughput adaptive immune receptor sequencing data.

    PubMed

    Rosenfeld, Aaron M; Meng, Wenzhao; Luning Prak, Eline T; Hershberg, Uri

    2017-01-15

    As high-throughput sequencing of B cells becomes more common, the need for tools to analyze the large quantity of data also increases. This article introduces ImmuneDB, a system for analyzing vast amounts of heavy chain variable region sequences and exploring the resulting data. It can take as input raw FASTA/FASTQ data, identify genes, determine clones, construct lineages, as well as provide information such as selection pressure and mutation analysis. It uses an industry leading database, MySQL, to provide fast analysis and avoid the complexities of using error prone flat-files. ImmuneDB is freely available at http://immunedb.com. A demo of the ImmuneDB web interface is available at: http://immunedb.com/demo. Contact: Uh25@drexel.edu. Supplementary information: Supplementary data are available at Bioinformatics online.

  8. Data of first de-novo transcriptome assembly of a non-model species, hawksbill sea turtle, Eretmochelys imbricata, nesting of the Colombian Caribbean.

    PubMed

    Hernández-Fernández, Javier

    2017-12-01

    The hawksbill sea turtle, Eretmochelys imbricata, is an endangered species of the Caribbean Colombian coast due to anthropic and natural factors that have decreased its population levels. Little is known about the genes that are involved in its immune system, sex determination, aging and other important functions. The data generated represent RNA sequencing and the first de-novo assembly of transcripts expressed in the blood of the hawksbill sea turtle. The raw FASTQ files were deposited in the NCBI SRA database with accession number SRX2653641. A total of 5.7 Gb raw sequence data were obtained, corresponding to 47,555,108 raw reads. Trinity was used to perform a first de-novo assembly, and we were able to identify 47,586 transcripts of the female hawksbill turtle transcriptome with an N50 of 1100 bp. The obtained transcriptome data will be useful for further studies of the physiology, biochemistry and evolution in this species.

  9. Arkas: Rapid reproducible RNAseq analysis

    PubMed Central

    Colombo, Anthony R.; J. Triche Jr, Timothy; Ramsingh, Giridharan

    2017-01-01

    The recently introduced Kallisto pseudoaligner has radically simplified the quantification of transcripts in RNA-sequencing experiments. We offer cloud-scale RNAseq pipelines, Arkas-Quantification and Arkas-Analysis, available within Illumina’s BaseSpace cloud application platform, which expedite Kallisto preparatory routines, reliably calculate differential expression, and perform gene-set enrichment of REACTOME pathways. Due to inherent inefficiencies of scale, Illumina's BaseSpace computing platform offers a massively parallel distributive environment improving data management services and data importing. Arkas-Quantification deploys Kallisto for parallel cloud computations and is conveniently integrated downstream from the BaseSpace Sequence Read Archive (SRA) import/conversion application titled SRA Import. Arkas-Analysis annotates the Kallisto results by extracting structured information directly from source FASTA files with per-contig metadata, and calculates the differential expression and gene-set enrichment analysis on both coding genes and transcripts. The Arkas cloud pipeline supports ENSEMBL transcriptomes and can be used downstream from the SRA Import facilitating raw sequencing importing, SRA FASTQ conversion, RNA quantification and analysis steps. PMID:28868134

  10. Mycobacterium tuberculosis and whole genome sequencing: a practical guide and online tools available for the clinical microbiologist.

    PubMed

    Satta, G; Atzeni, A; McHugh, T D

    2017-02-01

    Whole genome sequencing (WGS) has the potential to revolutionize the diagnosis of Mycobacterium tuberculosis infection but the lack of bioinformatic expertise among clinical microbiologists is a barrier for adoption. Software products for analysis should be simple, free of charge, able to accept data directly from the sequencer (FASTQ files) and to provide the basic functionalities all-in-one. The main aim of this narrative review is to provide a practical guide for the clinical microbiologist, with little or no practical experience of WGS analysis, with a specific focus on software products tailor-made for M. tuberculosis analysis. With sequencing performed by an external provider, it is now feasible to implement WGS analysis in the routine clinical practice of any microbiology laboratory, with the potential to detect resistance weeks before traditional phenotypic culture methods, but the clinical microbiologist should be aware of the limitations of this approach. Copyright © 2016 European Society of Clinical Microbiology and Infectious Diseases. Published by Elsevier Ltd. All rights reserved.

  11. Rapid evaluation and quality control of next generation sequencing data with FaQCs.

    PubMed

    Lo, Chien-Chi; Chain, Patrick S G

    2014-11-19

    Next generation sequencing (NGS) technologies that parallelize the sequencing process and produce thousands to millions, or even hundreds of millions of sequences in a single sequencing run, have revolutionized genomic and genetic research. Because of the vagaries of any platform's sequencing chemistry, the experimental processing, machine failure, and so on, the quality of sequencing reads is never perfect, and often declines as the read is extended. These errors invariably affect downstream analysis/application and should therefore be identified early on to mitigate any unforeseen effects. Here we present a novel FastQ Quality Control Software (FaQCs) that can rapidly process large volumes of data, and which improves upon previous solutions to monitor the quality and remove poor quality data from sequencing runs. Both the speed of processing and the memory footprint of storing all required information have been optimized via algorithmic and parallel processing solutions. The trimmed output compared side-by-side with the original data is part of the automated PDF output. We show how this tool can help data analysis by providing a few examples, including an increased percentage of reads recruited to references, improved single nucleotide polymorphism identification as well as de novo sequence assembly metrics. FaQCs combines several features of currently available applications into a single, user-friendly process, and includes additional unique capabilities such as filtering the PhiX control sequences, conversion of FASTQ formats, and multi-threading. The original data and trimmed summaries are reported within a variety of graphics and reports, providing a simple way to do data quality control and assurance.
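
    To make two of the operations named in this abstract concrete (conversion of FASTQ quality encodings and removal of poor-quality data), here is a minimal Python sketch. It is not FaQCs's implementation: the function names and the default threshold of 20 are hypothetical, and the conversion assumes Phred+64 scores that start at 0 rather than older Solexa-scaled scores.

```python
def phred64_to_phred33(qual: str) -> str:
    """Re-encode a Phred+64 quality string as Phred+33."""
    scores = [ord(c) - 64 for c in qual]
    if any(s < 0 for s in scores):
        raise ValueError("quality string is not Phred+64 encoded")
    return "".join(chr(s + 33) for s in scores)


def trim_3prime(seq: str, qual: str, min_q: int = 20, offset: int = 33):
    """Drop trailing bases whose Phred score is below min_q (hypothetical rule)."""
    end = len(seq)
    while end > 0 and ord(qual[end - 1]) - offset < min_q:
        end -= 1
    return seq[:end], qual[:end]
```

    Trimming only from the 3' end reflects the abstract's remark that quality often declines as the read is extended; real tools typically apply more elaborate rules.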

  12. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Lo, Chien -Chi; Chain, Patrick S. G.

    Background: Next generation sequencing (NGS) technologies that parallelize the sequencing process and produce thousands to millions, or even hundreds of millions of sequences in a single sequencing run, have revolutionized genomic and genetic research. Because of the vagaries of any platform's sequencing chemistry, the experimental processing, machine failure, and so on, the quality of sequencing reads is never perfect, and often declines as the read is extended. These errors invariably affect downstream analysis/application and should therefore be identified early on to mitigate any unforeseen effects. Results: Here we present a novel FastQ Quality Control Software (FaQCs) that can rapidly process large volumes of data, and which improves upon previous solutions to monitor the quality and remove poor quality data from sequencing runs. Both the speed of processing and the memory footprint of storing all required information have been optimized via algorithmic and parallel processing solutions. The trimmed output compared side-by-side with the original data is part of the automated PDF output. We show how this tool can help data analysis by providing a few examples, including an increased percentage of reads recruited to references, improved single nucleotide polymorphism identification as well as de novo sequence assembly metrics. Conclusion: FaQCs combines several features of currently available applications into a single, user-friendly process, and includes additional unique capabilities such as filtering the PhiX control sequences, conversion of FASTQ formats, and multi-threading. The original data and trimmed summaries are reported within a variety of graphics and reports, providing a simple way to do data quality control and assurance.

  13. Light-weight reference-based compression of FASTQ data.

    PubMed

    Zhang, Yongpeng; Li, Linsen; Yang, Yanli; Yang, Xiao; He, Shan; Zhu, Zexuan

    2015-06-09

    The exponential growth of next generation sequencing (NGS) data has posed big challenges to data storage, management and archive. Data compression is one of the effective solutions, where reference-based compression strategies can typically achieve superior compression ratios compared to the ones not relying on any reference. This paper presents a lossless, light-weight reference-based compression algorithm, LW-FQZip, to compress FASTQ data. The three components of any given input, i.e., metadata, short reads and quality score strings, are first parsed into three data streams in which redundant information is identified and eliminated independently. In particular, well-designed incremental and run-length-limited encoding schemes are utilized to compress the metadata and quality score streams, respectively. To handle the short reads, LW-FQZip uses a novel light-weight mapping model to quickly map them against external reference sequence(s) and produce concise alignment results for storage. The three processed data streams are then packed together with some general purpose compression algorithms like LZMA. LW-FQZip was evaluated on eight real-world NGS data sets and achieved compression ratios in the range of 0.111-0.201. This is comparable or superior to other state-of-the-art lossless NGS data compression algorithms. LW-FQZip is a program that enables efficient lossless FASTQ data compression. It contributes to the state-of-the-art applications for NGS data storage and transmission. LW-FQZip is freely available online at: http://csse.szu.edu.cn/staff/zhuzx/LWFQZip.
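
    The stream separation described in this abstract can be sketched in Python as follows. This is only an illustration under stated assumptions: records are taken to be exactly four lines each (no wrapped sequences), the function names are hypothetical, and plain run-length encoding stands in for the paper's run-length-limited scheme, whose details the abstract does not give.

```python
from itertools import groupby


def split_streams(fastq_text: str):
    """Split 4-line FASTQ records into (ids, reads, quals) lists."""
    lines = fastq_text.strip().splitlines()
    if len(lines) % 4 != 0:
        raise ValueError("expected 4 lines per FASTQ record")
    ids, reads, quals = [], [], []
    for i in range(0, len(lines), 4):
        header, seq, plus, qual = lines[i:i + 4]
        if not header.startswith("@") or not plus.startswith("+"):
            raise ValueError(f"malformed record at line {i + 1}")
        ids.append(header[1:])
        reads.append(seq)
        quals.append(qual)
    return ids, reads, quals


def run_length_encode(qual: str):
    """Collapse runs of identical quality characters into (char, count) pairs."""
    return [(ch, len(list(run))) for ch, run in groupby(qual)]


def run_length_decode(pairs) -> str:
    """Invert run_length_encode."""
    return "".join(ch * n for ch, n in pairs)
```

    Keeping the three streams apart lets each be handled by an encoder suited to its statistics before a general-purpose compressor is applied.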

  14. The draft genomes and investigation of serotype distribution, antimicrobial resistance of group B Streptococcus strains isolated from urine in Suzhou, China.

    PubMed

    Guo, Yong; Deng, Xiao; Liang, Yuan; Zhang, Liang; Zhao, Guo-Ping; Zhou, Yan

    2018-06-26

    The group B Streptococcus (GBS) is a human commensal bacterium, which is capable of causing several infectious diseases in infants and people with chronic diseases. GBS has been the most common cause of urinary tract infections in the elderly, but relatively few studies have reported on urine-isolated GBS and their antimicrobial susceptibilities. Hence, we decided to investigate GBS specifically isolated from urine in Suzhou, China. Twenty-seven GBS samples were isolated from urine in Suzhou, China. PCR and agarose gel electrophoresis were used to identify the serotype distribution. Susceptibility tests were based on the MIC test and the Kirby-Bauer test. Genomes were sequenced via the Illumina Hiseq platform and assembled by SPAdes. Genomes of five isolates were sequenced and submitted to the NCBI genome database. The sequencing files in fastq format were submitted to the NCBI SRA database. Five serotypes were identified. The resistance rates measured for tetracycline, erythromycin, clindamycin and fluoroquinolones were 74.1, 63.0, 44.4 and 48.1%, respectively. 18.5% of the isolates were nonsusceptible to nitrofurantoin. The resistance to tetracycline was mainly associated with the gene tetM. The erythromycin resistance was mainly associated with the genes ermB and mefE. The genes ermB and lnuB were the prevalent genes in the cMLSB type. No known nitrofurantoin resistance gene was found in nitrofurantoin-nonsusceptible GBS. Five serotypes were identified in our study. High rates of GBS isolates were resistant to tetracycline, erythromycin, clindamycin and fluoroquinolones. The genes ermB and lnuB occurred at high rates in the cMLSB phenotype.

  15. Rapid evaluation and quality control of next generation sequencing data with FaQCs

    DOE PAGES

    Lo, Chien -Chi; Chain, Patrick S. G.

    2014-12-01

    Background: Next generation sequencing (NGS) technologies that parallelize the sequencing process and produce thousands to millions, or even hundreds of millions of sequences in a single sequencing run, have revolutionized genomic and genetic research. Because of the vagaries of any platform's sequencing chemistry, the experimental processing, machine failure, and so on, the quality of sequencing reads is never perfect, and often declines as the read is extended. These errors invariably affect downstream analysis/application and should therefore be identified early on to mitigate any unforeseen effects. Results: Here we present a novel FastQ Quality Control Software (FaQCs) that can rapidly process large volumes of data, and which improves upon previous solutions to monitor the quality and remove poor quality data from sequencing runs. Both the speed of processing and the memory footprint of storing all required information have been optimized via algorithmic and parallel processing solutions. The trimmed output compared side-by-side with the original data is part of the automated PDF output. We show how this tool can help data analysis by providing a few examples, including an increased percentage of reads recruited to references, improved single nucleotide polymorphism identification as well as de novo sequence assembly metrics. Conclusion: FaQCs combines several features of currently available applications into a single, user-friendly process, and includes additional unique capabilities such as filtering the PhiX control sequences, conversion of FASTQ formats, and multi-threading. The original data and trimmed summaries are reported within a variety of graphics and reports, providing a simple way to do data quality control and assurance.

  16. ParDRe: faster parallel duplicated reads removal tool for sequencing studies.

    PubMed

    González-Domínguez, Jorge; Schmidt, Bertil

    2016-05-15

    Current next generation sequencing technologies often generate duplicated or near-duplicated reads that (depending on the application scenario) do not provide any interesting biological information but can increase memory requirements and computational time of downstream analysis. In this work we present ParDRe, a de novo parallel tool to remove duplicated and near-duplicated reads through the clustering of Single-End or Paired-End sequences from fasta or fastq files. It uses a novel bitwise approach to compare the suffixes of DNA strings and employs hybrid MPI/multithreading to reduce runtime on multicore systems. We show that ParDRe is up to 27.29 times faster than Fulcrum (a representative state-of-the-art tool) on a platform with two 8-core Sandy-Bridge processors. Source code in C++ and MPI running on Linux systems as well as a reference manual are available at https://sourceforge.net/projects/pardre/. Contact: jgonzalezd@udc.es. © The Author 2016. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
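
    One common way to compare DNA strings bitwise is to pack each base into two bits and count mismatches with XOR followed by a population count. The Python sketch below illustrates that general idea only; the abstract does not describe ParDRe's actual bitwise scheme or its suffix clustering, the names are hypothetical, and only the bases A, C, G and T are handled.

```python
_CODE = {"A": 0, "C": 1, "G": 2, "T": 3}


def pack(seq: str) -> int:
    """Pack a DNA string into an integer, two bits per base."""
    value = 0
    for base in seq:
        value = (value << 2) | _CODE[base]
    return value


def mismatches(a: str, b: str) -> int:
    """Count positions where two equal-length DNA strings differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    diff = pack(a) ^ pack(b)
    # A base differs if either of its two bits differs: fold the high bit
    # of each pair onto the low bit, then keep only the low bits.
    low_bits = int("01" * len(a), 2) if a else 0
    return bin((diff | (diff >> 1)) & low_bits).count("1")


def is_near_duplicate(a: str, b: str, max_mismatches: int) -> bool:
    """Hypothetical near-duplicate test based on a mismatch budget."""
    return mismatches(a, b) <= max_mismatches
```

    With this encoding, a whole read is compared in a handful of integer operations instead of a character-by-character loop.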

  17. Compression of next-generation sequencing reads aided by highly efficient de novo assembly

    PubMed Central

    Jones, Daniel C.; Ruzzo, Walter L.; Peng, Xinxia

    2012-01-01

    We present Quip, a lossless compression algorithm for next-generation sequencing data in the FASTQ and SAM/BAM formats. In addition to implementing reference-based compression, we have developed, to our knowledge, the first assembly-based compressor, using a novel de novo assembly algorithm. A probabilistic data structure is used to dramatically reduce the memory required by traditional de Bruijn graph assemblers, allowing millions of reads to be assembled very efficiently. Read sequences are then stored as positions within the assembled contigs. This is combined with statistical compression of read identifiers, quality scores, alignment information and sequences, effectively collapsing very large data sets to <15% of their original size with no loss of information. Availability: Quip is freely available under the 3-clause BSD license from http://cs.washington.edu/homes/dcjones/quip. PMID:22904078

  18. NaviSE: superenhancer navigator integrating epigenomics signal algebra.

    PubMed

    Ascensión, Alex M; Arrospide-Elgarresta, Mikel; Izeta, Ander; Araúzo-Bravo, Marcos J

    2017-06-06

    Superenhancers are crucial structural genomic elements determining cell fate, and they are also involved in the determination of several diseases, such as cancer or neurodegeneration. Although there are pipelines which use independent pieces of software to predict the presence of superenhancers from genome-wide chromatin marks or DNA-interaction protein binding sites, there is not yet an integrated software tool that processes automatically algebra combinations of raw data sequencing into a comprehensive final annotated report of predicted superenhancers. We have developed NaviSE, a user-friendly streamlined tool which performs a fully-automated parallel processing of genome-wide epigenomics data from sequencing files into a final report, built with a comprehensive set of annotated files that are navigated through a graphic user interface dynamically generated by NaviSE. NaviSE also implements an 'epigenomics signal algebra' that allows the combination of multiple activation and repression epigenomics signals. NaviSE provides an interactive chromosomal landscaping of the locations of superenhancers, which can be navigated to obtain annotated information about superenhancer signal profile, associated genes, gene ontology enrichment analysis, motifs of transcription factor binding sites enriched in superenhancers, graphs of the metrics evaluating the superenhancers quality, protein-protein interaction networks and enriched metabolic pathways among other features. We have parallelised the most time-consuming tasks achieving a reduction up to 30% for a 15 CPUs machine. We have optimized the default parameters of NaviSE to facilitate its use. NaviSE allows different entry levels of data processing, from sra-fastq files to bed files; and unifies the processing of multiple replicates. NaviSE outperforms the more time-consuming processes required in a non-integrated pipeline. 
    Alongside its high performance, NaviSE is able to provide biological insights, predicting cell type specific markers, such as SOX2 and ZIC3 in embryonic stem cells, CDK5R1 and REST in neurons and CD86 and TLR2 in monocytes. NaviSE is a user-friendly streamlined solution for superenhancer analysis, annotation and navigation, requiring only basic computer and next generation sequencing knowledge. NaviSE binaries and documentation are available at: https://sourceforge.net/projects/navise-superenhancer/.

  19. 76 FR 43679 - Filing via the Internet; Notice of Additional File Formats for efiling

    Federal Register 2010, 2011, 2012, 2013, 2014

    2011-07-21

    ... DEPARTMENT OF ENERGY Federal Energy Regulatory Commission [Docket No. RM07-16-000] Filing via the Internet; Notice of Additional File Formats for efiling Take notice that the Commission has added to its list of acceptable file formats the four-character file extensions for Microsoft Office 2007/2010...

  20. Heads Up

    MedlinePlus

    ... HEADS UP Resources Training Custom PDFs Mobile Apps Videos Graphics Podcasts Social Media File Formats Help: How do I view different file formats (PDF, DOC, PPT, MPEG) on this site? Adobe PDF file Microsoft PowerPoint ... file Apple Quicktime file RealPlayer file Text file ...

  1. w4CSeq: software and web application to analyze 4C-seq data.

    PubMed

    Cai, Mingyang; Gao, Fan; Lu, Wange; Wang, Kai

    2016-11-01

    Circularized Chromosome Conformation Capture followed by deep sequencing (4C-Seq) is a powerful technique to identify genome-wide partners interacting with a pre-specified genomic locus. Here, we present a computational and statistical approach to analyze 4C-Seq data generated from both enzyme digestion and sonication fragmentation-based methods. We implemented a command line software tool and a web interface called w4CSeq, which takes in the raw 4C sequencing data (FASTQ files) as input, performs automated statistical analysis and presents results in a user-friendly manner. Besides providing users with the list of candidate interacting sites/regions, w4CSeq generates figures showing genome-wide distribution of interacting regions, and sketches the enrichment of key features such as TSSs, TTSs, CpG sites and DNA replication timing around 4C sites. Users can establish their own web server by downloading source codes at https://github.com/WGLab/w4CSeq. Additionally, a demo web server is available at http://w4cseq.wglab.org. Contact: kaiwang@usc.edu or wangelu@usc.edu. Supplementary information: Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.

  2. A comprehensive iterative approach is highly effective in diagnosing individuals who are exome negative.

    PubMed

    Shashi, Vandana; Schoch, Kelly; Spillmann, Rebecca; Cope, Heidi; Tan, Queenie K-G; Walley, Nicole; Pena, Loren; McConkie-Rosell, Allyn; Jiang, Yong-Hui; Stong, Nicholas; Need, Anna C; Goldstein, David B

    2018-06-15

    Sixty to seventy-five percent of individuals with rare and undiagnosed phenotypes remain undiagnosed after exome sequencing (ES). With standard ES reanalysis resolving 10-15% of the ES negatives, further approaches are necessary to maximize diagnoses in these individuals. In 38 ES negative patients an individualized genomic-phenotypic approach was employed utilizing (1) phenotyping; (2) reanalyses of FASTQ files, with innovative bioinformatics; (3) targeted molecular testing; (4) genome sequencing (GS); and (5) conferring of clinical diagnoses when pathognomonic clinical findings occurred. Certain and highly likely diagnoses were made in 18/38 (47%) individuals, including identifying two new developmental disorders. The majority of diagnoses (>70%) were due to our bioinformatics, phenotyping, and targeted testing identifying variants that were undetected or not prioritized on prior ES. GS diagnosed 3/18 individuals with structural variants not amenable to ES. Additionally, tentative diagnoses were made in 3 (8%), and in 5 individuals (13%) candidate genes were identified. Overall, diagnoses/potential leads were identified in 26/38 (68%). Our comprehensive approach to ES negatives maximizes the ES and clinical data for both diagnoses and candidate gene identification, without GS in the majority. This iterative approach is cost-effective and is pertinent to the current conundrum of ES negatives.

  3. Interoperability format translation and transformation between IFC architectural design file and simulation file formats

    DOEpatents

    Chao, Tian-Jy; Kim, Younghun

    2015-02-03

    Automatically translating a building architecture file format (Industry Foundation Class) to a simulation file, in one aspect, may extract data and metadata used by a target simulation tool from a building architecture file. Interoperability data objects may be created and the extracted data is stored in the interoperability data objects. A model translation procedure may be prepared to identify a mapping from a Model View Definition to a translation and transformation function. The extracted data may be transformed using the data stored in the interoperability data objects, an input Model View Definition template, and the translation and transformation function to convert the extracted data to correct geometric values needed for a target simulation file format used by the target simulation tool. The simulation file in the target simulation file format may be generated.

  4. Interoperability format translation and transformation between IFC architectural design file and simulation file formats

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Chao, Tian-Jy; Kim, Younghun

    Automatically translating a building architecture file format (Industry Foundation Class) to a simulation file, in one aspect, may extract data and metadata used by a target simulation tool from a building architecture file. Interoperability data objects may be created and the extracted data is stored in the interoperability data objects. A model translation procedure may be prepared to identify a mapping from a Model View Definition to a translation and transformation function. The extracted data may be transformed using the data stored in the interoperability data objects, an input Model View Definition template, and the translation and transformation function to convert the extracted data to correct geometric values needed for a target simulation file format used by the target simulation tool. The simulation file in the target simulation file format may be generated.

  5. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Temple, Brian Allen; Armstrong, Jerawan Chudoung

    This document is a mid-year report on a deliverable for the PYTHON Radiography Analysis Tool (PyRAT) for project LANL12-RS-107J in FY15. The deliverable is deliverable number 2 in the work package and is titled “Add the ability to read in more types of image file formats in PyRAT”. Right now PyRAT can only read in uncompressed TIF files (tiff files). It is planned to expand the file formats that can be read by PyRAT, making it easier to use in more situations. The file formats added include jpeg, jpg, png and formatted ASCII files.

  6. PCF File Format.

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Thoreson, Gregory G

    PCF files are binary files designed to contain gamma spectra and neutron count rates from radiation sensors. It is the native format for the GAmma Detector Response and Analysis Software (GADRAS) package [1]. It can contain multiple spectra and information about each spectrum such as energy calibration. This document outlines the format of the file that would allow one to write a computer program to parse and write such files.

  7. Data files from the Grays Harbor Sediment Transport Experiment Spring 2001

    USGS Publications Warehouse

    Landerman, Laura A.; Sherwood, Christopher R.; Gelfenbaum, Guy; Lacy, Jessica; Ruggiero, Peter; Wilson, Douglas; Chisholm, Tom; Kurrus, Keith

    2005-01-01

    This publication consists of two DVD-ROMs, both of which are presented here. This report describes data collected during the Spring 2001 Grays Harbor Sediment Transport Experiment, and provides additional information needed to interpret the data. Two DVDs accompany this report; both contain documentation in html format that assist the user in navigating through the data. DVD-ROM-1 contains a digital version of this report in .pdf format, raw Aquatec acoustic backscatter (ABS) data in .zip format, Sonar data files in .avi format, and coastal processes and morphology data in ASCII format. ASCII data files are provided in .zip format; bundled coastal processes ASCII files are separated by deployment and instrument; bundled morphology ASCII files are separated into monthly data collection efforts containing the beach profiles collected (or extracted from the surface map) at that time; weekly surface maps are also bundled together. DVD-ROM-2 contains a digital version of this report in .pdf format, the binary data files collected by the SonTek instrumentation, calibration files for the pressure sensors, and Matlab m-files for loading the ABS data into Matlab and cleaning-up the optical backscatter (OBS) burst time-series data.

  8. File format for normalizing radiological concentration exposure rate and dose rate data for the effects of radioactive decay and weathering processes

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Kraus, Terrence D.

    2017-04-01

    This report specifies the electronic file format that was agreed upon to be used as the file format for normalized radiological data produced by the software tool developed under this TI project. The NA-84 Technology Integration (TI) Program project (SNL17-CM-635, Normalizing Radiological Data for Analysis and Integration into Models) investigators held a teleconference on December 7, 2017 to discuss the tasks to be completed under the TI program project. During this teleconference, the TI project investigators determined that the comma-separated values (CSV) file format is the most suitable file format for the normalized radiological data that will be output from the normalizing tool developed under this TI project. The CSV file format was selected because it provides the requisite flexibility to manage different types of radiological data (i.e., activity concentration, exposure rate, dose rate) from other sources [e.g., Radiological Assessment and Monitoring System (RAMS), Aerial Measuring System (AMS), Monitoring and Sampling]. The CSV file format also is suitable for the file format of the normalized radiological data because this normalized data can then be ingested by other software [e.g., RAMS, Visual Sampling Plan (VSP)] used by the NA-84’s Consequence Management Program.

  9. 77 FR 59692 - 2014 Diversity Immigrant Visa Program

    Federal Register 2010, 2011, 2012, 2013, 2014

    2012-09-28

    ... the E-DV system. The entry will not be accepted and must be resubmitted. Group or family photographs... must be in the Joint Photographic Experts Group (JPEG) format. Image File Size: The maximum file size...). Image File Format: The image must be in the Joint Photographic Experts Group (JPEG) format. Image File...

  10. NAVAIR Portable Source Initiative (NPSI) Standard for Reusable Source Dataset Metadata (RSDM) V2.4

    DTIC Science & Technology

    2012-09-26

    defining a raster file format: <RasterFileFormat> <FormatName>TIFF</FormatName> <Order>BIP</Order> < DataType >8-BIT_UNSIGNED</ DataType ...interleaved by line (BIL); Band interleaved by pixel (BIP). element RasterFileFormatType/ DataType diagram type restriction of xsd:string facets

  11. An Efficient Format for Nearly Constant-Time Access to Arbitrary Time Intervals in Large Trace Files

    DOE PAGES

    Chan, Anthony; Gropp, William; Lusk, Ewing

    2008-01-01

    A powerful method to aid in understanding the performance of parallel applications uses log or trace files containing time-stamped events and states (pairs of events). These trace files can be very large, often hundreds or even thousands of megabytes. Because of the cost of accessing and displaying such files, other methods are often used that reduce the size of the tracefiles at the cost of sacrificing detail or other information. This paper describes a hierarchical trace file format that provides for display of an arbitrary time window in a time independent of the total size of the file and roughly proportional to the number of events within the time window. This format eliminates the need to sacrifice data to achieve a smaller trace file size (since storage is inexpensive, it is necessary only to make efficient use of bandwidth to that storage). The format can be used to organize a trace file or to create a separate file of annotations that may be used with conventional trace files. We present an analysis of the time to access all of the events relevant to an interval of time and we describe experiments demonstrating the performance of this file format.
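
    The idea of a hierarchical layout that lets a viewer skip everything outside a requested time window can be sketched with a small in-memory tree. This Python example is an assumption-laden illustration, not the file format from the paper: the abstract gives no layout details, and the names Node, build, insert and query are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    start: float
    end: float
    events: list = field(default_factory=list)  # (t0, t1, label) tuples
    left: "Node | None" = None
    right: "Node | None" = None


def build(start: float, end: float, depth: int) -> Node:
    """Create an empty binary tree of time ranges covering start..end."""
    node = Node(start, end)
    if depth > 0:
        mid = (start + end) / 2
        node.left = build(start, mid, depth - 1)
        node.right = build(mid, end, depth - 1)
    return node


def insert(node: Node, event: tuple) -> None:
    """Store an event in the deepest node whose range fully contains it."""
    t0, t1, _ = event
    for child in (node.left, node.right):
        if child is not None and child.start <= t0 and t1 <= child.end:
            insert(child, event)
            return
    node.events.append(event)


def query(node, w0: float, w1: float) -> list:
    """Return events overlapping the window, skipping subtrees outside it."""
    if node is None or node.end < w0 or node.start > w1:
        return []
    hits = [e for e in node.events if e[0] <= w1 and e[1] >= w0]
    return hits + query(node.left, w0, w1) + query(node.right, w0, w1)
```

    Long-lived states stay near the root while short events sink to the leaves, so a narrow window touches only one root-to-leaf path plus the few nodes it overlaps.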

  12. Rainbow: a tool for large-scale whole-genome sequencing data analysis using cloud computing.

    PubMed

    Zhao, Shanrong; Prenger, Kurt; Smith, Lance; Messina, Thomas; Fan, Hongtao; Jaeger, Edward; Stephens, Susan

    2013-06-27

    Technical improvements have decreased sequencing costs and, as a result, the size and number of genomic datasets have increased rapidly. Because of the lower cost, large amounts of sequence data are now being produced by small to midsize research groups. Crossbow is a software tool that can detect single nucleotide polymorphisms (SNPs) in whole-genome sequencing (WGS) data from a single subject; however, Crossbow has a number of limitations when applied to multiple subjects from large-scale WGS projects. The data storage and CPU resources that are required for large-scale whole genome sequencing data analyses are too large for many core facilities and individual laboratories to provide. To help meet these challenges, we have developed Rainbow, a cloud-based software package that can assist in the automation of large-scale WGS data analyses. Here, we evaluated the performance of Rainbow by analyzing 44 different whole-genome-sequenced subjects. Rainbow has the capacity to process genomic data from more than 500 subjects in two weeks using cloud computing provided by the Amazon Web Service. The time includes the import and export of the data using Amazon Import/Export service. The average cost of processing a single sample in the cloud was less than 120 US dollars. Compared with Crossbow, the main improvements incorporated into Rainbow include the ability: (1) to handle BAM as well as FASTQ input files; (2) to split large sequence files for better load balance downstream; (3) to log the running metrics in data processing and monitoring multiple Amazon Elastic Compute Cloud (EC2) instances; and (4) to merge SOAPsnp outputs for multiple individuals into a single file to facilitate downstream genome-wide association studies. Rainbow is a scalable, cost-effective, and open-source tool for large-scale WGS data analysis. For human WGS data sequenced by either the Illumina HiSeq 2000 or HiSeq 2500 platforms, Rainbow can be used straight out of the box. 
    Rainbow is available for third-party implementation and use, and can be downloaded from http://s3.amazonaws.com/jnj_rainbow/index.html.
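
    Improvement (2) in this abstract, splitting large sequence files for better load balance, can be illustrated with a short Python sketch. It is not Rainbow's code: the function name and the chunk size parameter are hypothetical, and records are assumed to occupy exactly four lines each.

```python
from itertools import islice


def split_fastq(lines, records_per_chunk: int):
    """Yield lists of FASTQ lines, each holding up to records_per_chunk records."""
    if records_per_chunk < 1:
        raise ValueError("records_per_chunk must be positive")
    it = iter(lines)
    while True:
        chunk = list(islice(it, 4 * records_per_chunk))
        if not chunk:
            return
        if len(chunk) % 4 != 0:
            raise ValueError("truncated FASTQ record")
        yield chunk
```

    Because the function consumes any iterable of lines lazily, it can be fed an open file handle, and each yielded chunk can be written to its own file or sent to a separate worker.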

  13. Rainbow: a tool for large-scale whole-genome sequencing data analysis using cloud computing

    PubMed Central

    2013-01-01

    Background Technical improvements have decreased sequencing costs and, as a result, the size and number of genomic datasets have increased rapidly. Because of the lower cost, large amounts of sequence data are now being produced by small to midsize research groups. Crossbow is a software tool that can detect single nucleotide polymorphisms (SNPs) in whole-genome sequencing (WGS) data from a single subject; however, Crossbow has a number of limitations when applied to multiple subjects from large-scale WGS projects. The data storage and CPU resources that are required for large-scale whole genome sequencing data analyses are too large for many core facilities and individual laboratories to provide. To help meet these challenges, we have developed Rainbow, a cloud-based software package that can assist in the automation of large-scale WGS data analyses. Results Here, we evaluated the performance of Rainbow by analyzing 44 different whole-genome-sequenced subjects. Rainbow has the capacity to process genomic data from more than 500 subjects in two weeks using cloud computing provided by the Amazon Web Service. The time includes the import and export of the data using Amazon Import/Export service. The average cost of processing a single sample in the cloud was less than 120 US dollars. Compared with Crossbow, the main improvements incorporated into Rainbow include the ability: (1) to handle BAM as well as FASTQ input files; (2) to split large sequence files for better load balance downstream; (3) to log the running metrics in data processing and monitoring multiple Amazon Elastic Compute Cloud (EC2) instances; and (4) to merge SOAPsnp outputs for multiple individuals into a single file to facilitate downstream genome-wide association studies. Conclusions Rainbow is a scalable, cost-effective, and open-source tool for large-scale WGS data analysis. 
    For human WGS data sequenced by either the Illumina HiSeq 2000 or HiSeq 2500 platforms, Rainbow can be used straight out of the box. Rainbow is available for third-party implementation and use, and can be downloaded from http://s3.amazonaws.com/jnj_rainbow/index.html. PMID:23802613

  14. Isostatic gravity map and principal facts for 694 gravity stations in Yellowstone National Park and vicinity, Wyoming, Montana, and Idaho

    USGS Publications Warehouse

    Carle, S.F.; Glen, J.M.; Langenheim, V.E.; Smith, R.B.; Oliver, H.W.

    1990-01-01

    The report presents the principal facts for gravity stations compiled for Yellowstone National Park and vicinity. The gravity data were compiled from three sources: Defense Mapping Agency, University of Utah, and U.S. Geological Survey. Part A of the report is a paper copy describing how the compilation was done and presenting the data in tabular format as well as a map; part B is a 5-1/4 inch floppy diskette containing only the data files in ASCII format. Requirements for part B: IBM PC or compatible, DOS v. 2.0 or higher. Files contained on this diskette: DOD.ISO -- File containing the principal facts of the 514 gravity stations obtained from the Defense Mapping Agency. The data are in Plouff format* (see file PFTAB.TEX). UTAH.ISO -- File containing the principal facts of 153 gravity stations obtained from the University of Utah. Data are in Plouff format. USGS.ISO -- File containing the principal facts of 27 gravity stations collected by the U.S. Geological Survey in July 1987. Data are in Plouff format. PFTAB.TXT -- File containing explanation of principal fact format. ACC.TXT -- File containing explanation of accuracy codes.

  15. Mapping DICOM to OpenDocument format

    NASA Astrophysics Data System (ADS)

    Yu, Cong; Yao, Zhihong

    2009-02-01

    In order to enhance the readability, extensibility and sharing of DICOM files, we have introduced XML into the DICOM file system (SPIE Volume 5748)[1] and the multilayer tree structure into DICOM (SPIE Volume 6145)[2]. In this paper, we propose mapping DICOM to ODF (OpenDocument Format), since it is also based on XML. As a result, the new format separates content (including text content and images) from display style. Meanwhile, since OpenDocument files take the format of a ZIP compressed archive, the new kind of DICOM files can benefit from ZIP's lossless compression to reduce file size. Moreover, this open format can also guarantee long-term access to data without legal or technical barriers, making medical images accessible to various fields.

  16. 18 CFR 50.3 - Applications/pre-filing; rules and format.

    Code of Federal Regulations, 2010 CFR

    2010-04-01

    ... filings must be signed in compliance with § 385.2005 of this chapter. (e) The Commission will conduct a... 18 Conservation of Power and Water Resources 1 2010-04-01 2010-04-01 false Applications/pre-filing... INTERSTATE ELECTRIC TRANSMISSION FACILITIES § 50.3 Applications/pre-filing; rules and format. (a) Filings are...

  17. Manual for Getdata Version 3.1: a FORTRAN Utility Program for Time History Data

    NASA Technical Reports Server (NTRS)

    Maine, Richard E.

    1987-01-01

    This report documents version 3.1 of the GetData computer program. GetData is a utility program for manipulating files of time history data, i.e., data giving the values of parameters as functions of time. The most fundamental capability of GetData is extracting selected signals and time segments from an input file and writing the selected data to an output file. Other capabilities include converting file formats, merging data from several input files, time skewing, interpolating to common output times, and generating calculated output signals as functions of the input signals. This report also documents the interface standards for the subroutines used by GetData to read and write the time history files. All interface to the data files is through these subroutines, keeping the main body of GetData independent of the precise details of the file formats. Different file formats can be supported by changes restricted to these subroutines. Other computer programs conforming to the interface standards can call the same subroutines to read and write files in compatible formats.

  18. Arkansas and Louisiana Aeromagnetic and Gravity Maps and Data - A Website for Distribution of Data

    USGS Publications Warehouse

    Bankey, Viki; Daniels, David L.

    2008-01-01

    This report contains digital data, image files, and text files describing data formats for aeromagnetic and gravity data used to compile the State aeromagnetic and gravity maps of Arkansas and Louisiana. The digital files include grids, images, ArcInfo, and Geosoft compatible files. In some of the data folders, ASCII files with the extension 'txt' describe the format and contents of the data files. Read the 'txt' files before using the data files.

  19. A mass spectrometry proteomics data management platform.

    PubMed

    Sharma, Vagisha; Eng, Jimmy K; Maccoss, Michael J; Riffle, Michael

    2012-09-01

    Mass spectrometry-based proteomics is increasingly being used in biomedical research. These experiments typically generate a large volume of highly complex data, and the volume and complexity are only increasing with time. There exist many software pipelines for analyzing these data (each typically with its own file formats), and as technology improves, these file formats change and new formats are developed. Files produced from these myriad software programs may accumulate on hard disks or tape drives over time, with older files being rendered progressively more obsolete and unusable with each successive technical advancement and data format change. Although initiatives exist to standardize the file formats used in proteomics, they do not address the core failings of a file-based data management system: (1) files are typically poorly annotated experimentally, (2) files are "organically" distributed across laboratory file systems in an ad hoc manner, (3) file formats become obsolete, and (4) searching the data and comparing and contrasting results across separate experiments is very inefficient (if possible at all). Here we present a relational database architecture and accompanying web application dubbed Mass Spectrometry Data Platform that is designed to address the failings of the file-based mass spectrometry data management approach. The database is designed such that the output of disparate software pipelines may be imported into a core set of unified tables, with these core tables being extended to support data generated by specific pipelines. Because the data are unified, they may be queried, viewed, and compared across multiple experiments using a common web interface. Mass Spectrometry Data Platform is open source and freely available at http://code.google.com/p/msdapl/.

  20. Mass spectrometer output file format mzML.

    PubMed

    Deutsch, Eric W

    2010-01-01

    Mass spectrometry is an important technique for analyzing proteins and other biomolecular compounds in biological samples. Each of the vendors of these mass spectrometers uses a different proprietary binary output file format, which has hindered data sharing and the development of open source software for downstream analysis. The solution has been to develop, with the full participation of academic researchers as well as software and hardware vendors, an open XML-based format for encoding mass spectrometer output files, and then to write software to use this format for archiving, sharing, and processing. This chapter presents the various components and information available for this format, mzML. In addition to the XML schema that defines the file structure, a controlled vocabulary provides clear terms and definitions for the spectral metadata, and a semantic validation rules mapping file allows the mzML semantic validator to ensure that an mzML document complies with one of several levels of requirements. Complete documentation and example files ensure that the format may be uniformly implemented. At the time of release, there already existed several implementations of the format and vendors have committed to supporting the format in their products.

  1. An easy and effective approach to manage radiologic portable document format (PDF) files using iTunes.

    PubMed

    Qian, Li Jun; Zhou, Mi; Xu, Jian Rong

    2008-07-01

    The objective of this article is to explain an easy and effective approach for managing radiologic files in portable document format (PDF) using iTunes. PDF files are widely used as a standard file format for electronic publications as well as for medical online documents. Unfortunately, there is a lack of powerful software to manage numerous PDF documents. In this article, we explain how to use the hidden function of iTunes (Apple Computer) to manage PDF documents as easily as managing music files.

  2. 76 FR 10045 - Notice of Proposed Information Collection: Comment Request; “eLogic Model” Grant Performance...

    Federal Register 2010, 2011, 2012, 2013, 2014

    2011-02-23

    ... recommends not more than 32 characters). DO NOT convert Word files or Excel files into PDF format. Converting... not allow HUD to enter data from the Excel files into a database. DO NOT save your logic model in .xlsm format. If necessary save as an Excel 97-2003 .xls format. Using the .xlsm format can result in a...

  3. A Mass Spectrometry Proteomics Data Management Platform*

    PubMed Central

    Sharma, Vagisha; Eng, Jimmy K.; MacCoss, Michael J.; Riffle, Michael

    2012-01-01

    Mass spectrometry-based proteomics is increasingly being used in biomedical research. These experiments typically generate a large volume of highly complex data, and the volume and complexity are only increasing with time. There exist many software pipelines for analyzing these data (each typically with its own file formats), and as technology improves, these file formats change and new formats are developed. Files produced from these myriad software programs may accumulate on hard disks or tape drives over time, with older files being rendered progressively more obsolete and unusable with each successive technical advancement and data format change. Although initiatives exist to standardize the file formats used in proteomics, they do not address the core failings of a file-based data management system: (1) files are typically poorly annotated experimentally, (2) files are “organically” distributed across laboratory file systems in an ad hoc manner, (3) file formats become obsolete, and (4) searching the data and comparing and contrasting results across separate experiments is very inefficient (if possible at all). Here we present a relational database architecture and accompanying web application dubbed Mass Spectrometry Data Platform that is designed to address the failings of the file-based mass spectrometry data management approach. The database is designed such that the output of disparate software pipelines may be imported into a core set of unified tables, with these core tables being extended to support data generated by specific pipelines. Because the data are unified, they may be queried, viewed, and compared across multiple experiments using a common web interface. Mass Spectrometry Data Platform is open source and freely available at http://code.google.com/p/msdapl/. PMID:22611296

  4. 12 CFR 335.801 - Inapplicable SEC regulations; FDIC substituted regulations; additional information.

    Code of Federal Regulations, 2013 CFR

    2013-01-01

    ... a continuing hardship exemption under these rules may file the forms with the FDIC in paper format... these rules may file the appropriate forms with the FDIC in paper format. Instructions for continuing...) Previously filed exhibits, whether in paper or electronic format, may be incorporated by reference into an...

  5. 12 CFR 335.801 - Inapplicable SEC regulations; FDIC substituted regulations; additional information.

    Code of Federal Regulations, 2014 CFR

    2014-01-01

    ... a continuing hardship exemption under these rules may file the forms with the FDIC in paper format... these rules may file the appropriate forms with the FDIC in paper format. Instructions for continuing...) Previously filed exhibits, whether in paper or electronic format, may be incorporated by reference into an...

  6. 12 CFR 335.801 - Inapplicable SEC regulations; FDIC substituted regulations; additional information.

    Code of Federal Regulations, 2012 CFR

    2012-01-01

    ... a continuing hardship exemption under these rules may file the forms with the FDIC in paper format... these rules may file the appropriate forms with the FDIC in paper format. Instructions for continuing...) Previously filed exhibits, whether in paper or electronic format, may be incorporated by reference into an...

  7. 12 CFR 335.801 - Inapplicable SEC regulations; FDIC substituted regulations; additional information.

    Code of Federal Regulations, 2011 CFR

    2011-01-01

    ... a continuing hardship exemption under these rules may file the forms with the FDIC in paper format... these rules may file the appropriate forms with the FDIC in paper format. Instructions for continuing...) Previously filed exhibits, whether in paper or electronic format, may be incorporated by reference into an...

  8. Transferable Output ASCII Data (TOAD) gateway: Version 1.0 user's guide

    NASA Technical Reports Server (NTRS)

    Bingel, Bradford D.

    1991-01-01

    The Transferable Output ASCII Data (TOAD) Gateway, release 1.0 is described. This is a software tool for converting tabular data from one format into another via the TOAD format. This initial release of the Gateway allows free data interchange among the following file formats: TOAD; Standard Interface File (SIF); Program to Optimize Simulated Trajectories (POST) input; Comma Separated Value (CSV); and a general free-form file format. As required, additional formats can be accommodated quickly and easily.

  9. Sensitivity Data File Formats

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Rearden, Bradley T.

    2016-04-01

    The format of the TSUNAMI-A sensitivity data file produced by SAMS for cases with deterministic transport solutions is given in Table 6.3.A.1. The occurrence of each entry in the data file is followed by an identification of the data contained on each line of the file and the FORTRAN edit descriptor denoting the format of each line. A brief description of each line is also presented. A sample of the TSUNAMI-A data file for the Flattop-25 sample problem is provided in Figure 6.3.A.1. Here, only two profiles out of the 130 computed are shown.

  10. TOAD Editor

    NASA Technical Reports Server (NTRS)

    Bingle, Bradford D.; Shea, Anne L.; Hofler, Alicia S.

    1993-01-01

    Transferable Output ASCII Data (TOAD) computer program (LAR-13755), implements format designed to facilitate transfer of data across communication networks and dissimilar host computer systems. Any data file conforming to TOAD format standard called TOAD file. TOAD Editor is interactive software tool for manipulating contents of TOAD files. Commonly used to extract filtered subsets of data for visualization of results of computation. Also offers such user-oriented features as on-line help, clear English error messages, startup file, macroinstructions defined by user, command history, user variables, UNDO features, and full complement of mathematical, statistical, and conversion functions. Companion program, TOAD Gateway (LAR-14484), converts data files from variety of other file formats to that of TOAD. TOAD Editor written in FORTRAN 77.

  11. 78 FR 17233 - Notice of Opportunity To File Amicus Briefs

    Federal Register 2010, 2011, 2012, 2013, 2014

    2013-03-20

    .... Any commonly-used word processing format or PDF format is acceptable; text formats are preferable to image formats. Briefs may also be filed with the Office of the Clerk of the Board, Merit Systems...

  12. Displaying Composite and Archived Soundings in the Advanced Weather Interactive Processing System

    NASA Technical Reports Server (NTRS)

    Barrett, Joe H., III; Volkmer, Matthew R.; Blottman, Peter F.; Sharp, David W.

    2008-01-01

    In a previous task, the Applied Meteorology Unit (AMU) developed spatial and temporal climatologies of lightning occurrence based on eight atmospheric flow regimes. The AMU created climatological, or composite, soundings of wind speed and direction, temperature, and dew point temperature at four rawinsonde observation stations at Jacksonville, Tampa, Miami, and Cape Canaveral Air Force Station, for each of the eight flow regimes. The composite soundings were delivered to the National Weather Service (NWS) Melbourne (MLB) office for display using the National version of the Skew-T Hodograph analysis and Research Program (NSHARP) software program. The NWS MLB requested the AMU make the composite soundings available for display in the Advanced Weather Interactive Processing System (AWIPS), so they could be overlaid on current observed soundings. This will allow the forecasters to compare the current state of the atmosphere with climatology. This presentation describes how the AMU converted the composite soundings from NSHARP Archive format to Network Common Data Form (NetCDF) format, so that the soundings could be displayed in AWIPS. NetCDF is a set of data formats, programming interfaces, and software libraries used to read and write scientific data files. In AWIPS, each meteorological data type, such as soundings or surface observations, has a unique NetCDF format. Each format is described by a NetCDF template file. Although NetCDF files are in binary format, they can be converted to a text format called network Common data form Description Language (CDL). A software utility called ncgen is used to create a NetCDF file from a CDL file, while the ncdump utility is used to create a CDL file from a NetCDF file. An AWIPS receives soundings in Binary Universal Form for the Representation of Meteorological data (BUFR) format (http://dss.ucar.edu/docs/formats/bufr/), and then decodes them into NetCDF format. Only two sounding files are generated in AWIPS per day. One file contains all of the soundings received worldwide between 0000 UTC and 1200 UTC, and the other includes all soundings between 1200 UTC and 0000 UTC. In order to add the composite soundings into AWIPS, a procedure was created to configure, or localize, AWIPS. This involved modifying and creating several configuration text files. A unique four-character site identifier was created for each of the 32 soundings so each could be viewed separately. The first three characters were based on the site identifier of the observed sounding, while the last character was based on the flow regime. While researching the localization process for soundings, the AMU discovered a method of archiving soundings so old soundings would not get purged automatically by AWIPS. This method could provide an alternative way of localizing AWIPS for composite soundings. In addition, this would allow forecasters to use archived soundings in AWIPS for case studies. A test sounding file in NetCDF format was written in order to verify the correct format for soundings in AWIPS. After the file was viewed successfully in AWIPS, the AMU wrote a software program in the Tool Command Language/Tool Kit (Tcl/Tk) language to convert the 32 composite soundings from NSHARP Archive to CDL format. The ncgen utility was then used to convert the CDL file to a NetCDF file. The NetCDF file could then be read and displayed in AWIPS.

  13. SEGY to ASCII Conversion and Plotting Program 2.0

    USGS Publications Warehouse

    Goldman, Mark R.

    2005-01-01

    INTRODUCTION SEGY has long been a standard format for storing seismic data and header information. Almost every seismic processing package can read and write seismic data in SEGY format. In the data processing world, however, ASCII format is the 'universal' standard format. Very few general-purpose plotting or computation programs will accept data in SEGY format. The software presented in this report, referred to as SEGY to ASCII (SAC), converts seismic data written in SEGY format (Barry et al., 1975) to an ASCII data file, and then creates a postscript file of the seismic data using a general plotting package (GMT, Wessel and Smith, 1995). The resulting postscript file may be plotted by any standard postscript plotting program. There are two versions of SAC: one version for plotting a SEGY file that contains a single gather, such as a stacked CDP or migrated section, and a second version for plotting multiple gathers from a SEGY file containing more than one gather, such as a collection of shot gathers. Note that if a SEGY file has multiple gathers, then each gather must have the same number of traces per gather, and each trace must have the same sample interval and number of samples per trace. SAC will read several common standards of SEGY data, including SEGY files with sample values written in either IBM or IEEE floating-point format. In addition, utility programs are present to convert non-standard Seismic Unix (.sux) SEGY files and PASSCAL (.rsy) SEGY files to standard SEGY files. SAC allows complete user control over all plotting parameters including label size and font, tick mark intervals, trace scaling, and the inclusion of a title and descriptive text. SAC shell scripts create a postscript image of the seismic data in vector rather than bitmap format, using GMT's pswiggle command. 
Although this can produce a very large postscript file, the image quality is generally superior to that of a bitmap image, and commercial programs such as Adobe Illustrator can manipulate the image more efficiently.
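
    The abstract notes that SEGY sample values may be stored in either IBM or IEEE floating-point format. As a minimal sketch of what reading the IBM variant involves (this is not SAC's code, and the function name `ibm32_to_float` is hypothetical), a big-endian 32-bit IBM hexadecimal float can be decoded as a sign bit, a 7-bit base-16 exponent biased by 64, and a 24-bit fraction:

```python
import struct


def ibm32_to_float(raw):
    """Decode 4 big-endian bytes holding an IBM single-precision hex float."""
    (word,) = struct.unpack(">I", raw)
    sign = -1.0 if word >> 31 else 1.0
    exponent = (word >> 24) & 0x7F   # base-16 exponent, biased by 64
    fraction = word & 0x00FFFFFF     # 24-bit fraction, no hidden leading bit
    return sign * (fraction / float(1 << 24)) * 16.0 ** (exponent - 64)
```

    For example, `ibm32_to_float(bytes.fromhex("42640000"))` evaluates to 100.0, since the fraction 0x640000 / 2^24 = 0.390625 is scaled by 16^2.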

  14. Tools for Requirements Management: A Comparison of Telelogic DOORS and the HiVe

    DTIC Science & Technology

    2006-07-01

    types DOORS deals with are text files, spreadsheets, FrameMaker, rich text, Microsoft Word and Microsoft Project. 2.5.1 Predefined file formats DOORS...during the export. DOORS exports FrameMaker files in an incomplete format, meaning DOORS exported files will have to be opened in FrameMaker and saved

  15. 76 FR 10405 - Federal Copyright Protection of Sound Recordings Fixed Before February 15, 1972

    Federal Register 2010, 2011, 2012, 2013, 2014

    2011-02-24

    ... file in either the Adobe Portable Document File (PDF) format that contains searchable, accessible text (not an image); Microsoft Word; WordPerfect; Rich Text Format (RTF); or ASCII text file format (not a..., comments may be delivered in hard copy. If hand delivered by a private party, an original [[Page 10406...

  16. The prevalence of encoded digital trace evidence in the nonfile space of computer media.

    PubMed

    Garfinkel, Simson L

    2014-09-01

    Forensically significant digital trace evidence is frequently present in sectors of digital media not associated with allocated or deleted files. Modern digital forensic tools generally do not decompress such data unless a specific file with a recognized file type is first identified, potentially resulting in missed evidence. Email addresses are encoded differently for different file formats. As a result, trace evidence can be categorized as Plain in File (PF), Encoded in File (EF), Plain Not in File (PNF), or Encoded Not in File (ENF). The tool bulk_extractor finds all of these formats, but other forensic tools do not. A study of 961 storage devices purchased on the secondary market shows that 474 contained encoded email addresses that were not in files (ENF). Different encoding formats are the result of different application programs that processed different kinds of digital trace evidence. Specific encoding formats explored include BASE64, GZIP, PDF, HIBER, and ZIP. Published 2014. This article is a U.S. Government work and is in the public domain in the USA. Journal of Forensic Sciences published by Wiley Periodicals, Inc. on behalf of American Academy of Forensic Sciences.
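
    The plain versus encoded distinction can be illustrated with a small sketch that scans a raw byte buffer for email addresses both directly and inside BASE64 runs. This is not bulk_extractor's algorithm: the regular expressions, the 16-character minimum run length, and the function name `scan_buffer` are assumptions chosen for illustration, and the sketch does not decide whether a sector belongs to a file (the "in file" versus "not in file" axis).

```python
import base64
import binascii
import re

# Simplified patterns, assumed for illustration only.
EMAIL = re.compile(rb"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
B64_RUN = re.compile(rb"[A-Za-z0-9+/]{16,}={0,2}")


def scan_buffer(buf):
    """Return (plain, encoded) sets of email addresses found in raw bytes."""
    plain = set(EMAIL.findall(buf))
    encoded = set()
    for run in B64_RUN.findall(buf):
        try:
            decoded = base64.b64decode(run, validate=True)
        except binascii.Error:
            continue  # run is not cleanly aligned BASE64; skip it
        encoded.update(EMAIL.findall(decoded))
    # Report as "encoded" only addresses that were not also visible in plain form.
    return plain, encoded - plain
```

    A tool that searches only the plain bytes would miss every address in the second set, which is the kind of missed evidence the abstract describes.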

  17. Highway Safety Information System guidebook for the Minnesota state data files. Volume 1 : SAS file formats

    DOT National Transportation Integrated Search

    2001-02-01

    The Minnesota data system includes the following basic files: Accident data (Accident File, Vehicle File, Occupant File); Roadlog File; Reference Post File; Traffic File; Intersection File; Bridge (Structures) File; and RR Grade Crossing File. For ea...

  18. PDB explorer -- a web based algorithm for protein annotation viewer and 3D visualization.

    PubMed

    Nayarisseri, Anuraj; Shardiwal, Rakesh Kumar; Yadav, Mukesh; Kanungo, Neha; Singh, Pooja; Shah, Pratik; Ahmed, Sheaza

    2014-12-01

    The PDB file format is a text format describing the three-dimensional structures of macromolecules available in the Protein Data Bank (PDB). Determined protein structures are often found together with other molecules or ions, such as nucleic acids, water, ions, and drug molecules, which can therefore also be described in the PDB format and have been deposited in the PDB database. A PDB file is machine generated and not in a human-readable format; a computational tool is needed to read and understand it. The objective of the present study is to develop free online software for retrieval, visualization, and reading of the annotation of a protein 3D structure available in the PDB database. The main aim is to present a PDB file in human-readable form, i.e., the information in the PDB file is converted into readable sentences. The tool displays all available information from a PDB file, including the 3D structure. Programming and scripting languages such as Perl, CSS, JavaScript, Ajax, and HTML were used to develop PDB Explorer. PDB Explorer parses the PDB file directly, calling methods for each parsed element (secondary structure elements, atoms, coordinates, etc.). PDB Explorer is freely available at http://www.pdbexplorer.eminentbio.com/home with no log-in required.
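
    As a minimal sketch of turning PDB records into readable sentences (this is not PDB Explorer's code, which the abstract says was written with Perl and web technologies; the function names `parse_atom_line` and `describe` are hypothetical), the fixed-column ATOM/HETATM records can be sliced by position:

```python
def parse_atom_line(line):
    """Extract selected fields from one fixed-column ATOM/HETATM record."""
    if not line.startswith(("ATOM", "HETATM")):
        return None
    return {
        "serial": int(line[6:11]),        # columns 7-11
        "name": line[12:16].strip(),      # columns 13-16
        "res_name": line[17:20].strip(),  # columns 18-20
        "chain": line[21],                # column 22
        "res_seq": int(line[22:26]),      # columns 23-26
        "x": float(line[30:38]),          # columns 31-38
        "y": float(line[38:46]),          # columns 39-46
        "z": float(line[46:54]),          # columns 47-54
    }


def describe(atom):
    """Render a parsed atom as a readable sentence."""
    return (f"Atom {atom['serial']} ({atom['name']}) of residue "
            f"{atom['res_name']} {atom['res_seq']} in chain {atom['chain']} "
            f"is at ({atom['x']:.3f}, {atom['y']:.3f}, {atom['z']:.3f}).")
```

    Slicing by column rather than splitting on whitespace matters because adjacent fields in PDB records may touch without a separating space.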

  19. NoSQL: collection document and cloud by using a dynamic web query form

    NASA Astrophysics Data System (ADS)

    Abdalla, Hemn B.; Lin, Jinzhao; Li, Guoquan

    2015-07-01

    MongoDB (from "humongous") is an open-source document database and the leading NoSQL database. NoSQL ("Not Only SQL") refers to next-generation databases that are non-relational, distributed, open-source, and horizontally scalable, and that provide a mechanism for storing and retrieving documents. Previously, we stored and retrieved the data using SQL queries. Here, we use MongoDB, which means we do not use MySQL or SQL queries. Documents are imported directly into our drives and retrieved from them without SQL queries, using I/O BufferReader and BufferWriter: BufferReader for importing document files into a folder (drive), and BufferWriter for retrieving document files from a particular folder or drive. We also provide security for the stored files: if documents are stored in a local folder, anyone can view and modify them, so we secure the files to prevent this. The original document files are converted to another format; in this paper, a binary format is used. Documents are converted to binary format and then stored directly in one of our folders, at which time the storage space provides a private key for accessing the file. When any user tries to open the document files, the file data are in binary format; the document's owner views the original format using the private key received from the cloud.

  20. The Design and Usage of the New Data Management Features in NASTRAN

    NASA Technical Reports Server (NTRS)

    Pamidi, P. R.; Brown, W. K.

    1984-01-01

    Two new data management features are installed in the April 1984 release of NASTRAN. These two features are the Rigid Format Data Base and the READFILE capability. The Rigid Format Data Base is stored on external files in card image format and can be easily maintained and expanded by the use of standard text editors. This data base provides the user and the NASTRAN maintenance contractor with an easy means for making changes to a Rigid Format or for generating new Rigid Formats without unnecessary compilations and link editing of NASTRAN. Each Rigid Format entry in the data base contains the Direct Matrix Abstraction Program (DMAP), along with the associated restart, DMAP sequence subset and substructure control flags. The READFILE capability allows a user to reference an external secondary file from the NASTRAN primary input file and to read data from this secondary file. There is no limit to the number of external secondary files that may be referenced and read.

  1. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Sublet, J.-Ch.; Koning, A.J.; Forrest, R.A.

    The reasons for the conversion of the European Activation File, EAF into ENDF-6 format are threefold. First, it significantly enhances the JEFF-3.0 release by the addition of an activation file. Second, to considerably increase its usage by using a recognized, official file format, allowing existing plug-in processes to be effective; and third, to move towards a universal nuclear data file in contrast to the current separate general and special-purpose files. The format chosen for the JEFF-3.0/A file uses reaction cross sections (MF-3), cross sections (MF-10), and multiplicities (MF-9). Having the data in ENDF-6 format allows the ENDF suite of utilities and checker codes to be used alongside many other utility, visualizing, and processing codes. It is based on the EAF activation file used for many applications from fission to fusion, including dosimetry, inventories, depletion-transmutation, and geophysics. JEFF-3.0/A takes advantage of four generations of EAF files. Extensive benchmarking activities on these files provide feedback and validation with integral measurements. These, in parallel with a detailed graphical analysis based on EXFOR, have been applied stimulating new measurements, significantly increasing the quality of this activation file. The next step is to include the EAF uncertainty data for all channels into JEFF-3.0/A.

  2. The "grep" command but not FusionMap, FusionFinder or ChimeraScan captures the CIC-DUX4 fusion gene from whole transcriptome sequencing data on a small round cell tumor with t(4;19)(q35;q13).

    PubMed

    Panagopoulos, Ioannis; Gorunova, Ludmila; Bjerkehagen, Bodil; Heim, Sverre

    2014-01-01

    Whole transcriptome sequencing was used to study a small round cell tumor in which a t(4;19)(q35;q13) was part of the complex karyotype but where the initial reverse transcriptase PCR (RT-PCR) examination did not detect a CIC-DUX4 fusion transcript previously described as the crucial gene-level outcome of this specific translocation. The RNA sequencing data were analysed using the FusionMap, FusionFinder, and ChimeraScan programs which are specifically designed to identify fusion genes. FusionMap, FusionFinder, and ChimeraScan identified 1017, 102, and 101 fusion transcripts, respectively, but CIC-DUX4 was not among them. Since the RNA sequencing data are in the fastq text-based format, we searched the files using the "grep" command-line utility. The "grep" command searches the text for specific expressions and displays, by default, the lines where matches occur. The "specific expression" was a sequence of 20 nucleotides from the coding part of the last exon 20 of CIC (Reference Sequence: NM_015125.3) chosen since all the so far reported CIC breakpoints have occurred here. Fifteen chimeric CIC-DUX4 cDNA sequences were captured and the fusion between the CIC and DUX4 genes was mapped precisely. New primer combinations were constructed based on these findings and were used together with a polymerase suitable for amplification of GC-rich DNA templates to amplify CIC-DUX4 cDNA fragments which had the same fusion point found with "grep". In conclusion, FusionMap, FusionFinder, and ChimeraScan generated a plethora of fusion transcripts but did not detect the biologically important CIC-DUX4 chimeric transcript; they are generally useful but evidently suffer from imperfect sensitivity and specificity. The "grep" command is an excellent tool to capture chimeric transcripts from RNA sequencing data when the pathological and/or cytogenetic information strongly indicates the presence of a specific fusion gene.
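
    The grep-style search described above can be sketched in a few lines. This is an illustration rather than the authors' procedure: the function names are hypothetical, the reads are assumed to be unwrapped four-line FASTQ records, and checking the reverse complement as well as the query itself is an added assumption that plain "grep" on a single expression would not do.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A/C/G/T/N)."""
    return seq.translate(str.maketrans("ACGTN", "TGCAN"))[::-1]


def find_reads(fastq_lines, query):
    """Yield (read_id, sequence) for reads containing query on either strand.

    Assumes four lines per record: header, sequence, '+', quality.
    """
    targets = {query, reverse_complement(query)}
    lines = iter(fastq_lines)
    for header in lines:
        sequence = next(lines, "").strip()
        next(lines, "")  # '+' separator line
        next(lines, "")  # quality line
        if any(target in sequence for target in targets):
            yield header.strip()[1:], sequence
```

    Passing an open file handle as `fastq_lines` streams the file one record at a time; unlike grep, the sketch inspects only sequence lines, so a quality string that happens to spell the query cannot produce a false match.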

  3. FRS Geospatial Return File Format

    EPA Pesticide Factsheets

    The Geospatial Return File Format describes the format that must be used to submit latitude and longitude coordinates for use in Envirofacts mapping applications. These coordinates are stored in the Geospatial Reference Tables.

  4. SEDIMENT DATA - COMMENCEMENT BAY HYLEBOS WATERWAY - TACOMA, WA - PRE-REMEDIAL DESIGN PROGRAM

    EPA Science Inventory

    Event 1A/1B Data Files URL address: http://www.epa.gov/r10earth/datalib/superfund/hybos1ab.htm. Sediment Chemistry Data (Database Format): HYBOS1AB.EXE is a self-extracting file which expands to the single-value per record .DBF format database file HYBOS1AB.DBF. This file contai...

  5. 76 FR 5431 - Released Rates of Motor Common Carriers of Household Goods

    Federal Register 2010, 2011, 2012, 2013, 2014

    2011-01-31

    ... may be submitted either via the Board's e-filing format or in traditional paper format. Any person using e-filing should attach a document and otherwise comply with the instructions at the E-FILING link on the Board's website at http://www.stb.dot.gov . Any person submitting a filing in the traditional...

  6. 75 FR 52054 - Assessment of Mediation and Arbitration Procedures

    Federal Register 2010, 2011, 2012, 2013, 2014

    2010-08-24

    ...: Comments may be submitted either via the Board's e-filing format or in the traditional paper format. Any person using e-filing should attach a document and otherwise comply with the instructions at the E-FILING link on the Board's Web site, at http://www.stb.dot.gov . Any person submitting a filing in the...

  7. 75 FR 60846 - Bureau of Consular Affairs; Registration for the Diversity Immigrant (DV-2012) Visa Program

    Federal Register 2010, 2011, 2012, 2013, 2014

    2010-10-01

    ... need to submit a photo for a child who is already a U.S. citizen or a Legal Permanent Resident. Group... Joint Photographic Experts Group (JPEG) format; it must have a maximum image file size of two hundred... (dpi); the image file format in Joint Photographic Experts Group (JPEG) format; the maximum image file...

  8. 78 FR 59743 - Bureau of Consular Affairs; Registration for the Diversity Immigrant (DV-2015) Visa Program

    Federal Register 2010, 2011, 2012, 2013, 2014

    2013-09-27

    ... already a U.S. citizen or a Lawful Permanent Resident, but you will not be penalized if you do. Group... specifications: Image File Format: The image must be in the Joint Photographic Experts Group (JPEG) format. Image... in the Joint Photographic Experts Group (JPEG) format. Image File Size: The maximum image file size...

  9. Photon-HDF5: An Open File Format for Timestamp-Based Single-Molecule Fluorescence Experiments.

    PubMed

    Ingargiola, Antonino; Laurence, Ted; Boutelle, Robert; Weiss, Shimon; Michalet, Xavier

    2016-01-05

    We introduce Photon-HDF5, an open and efficient file format to simplify exchange and long-term accessibility of data from single-molecule fluorescence experiments based on photon-counting detectors such as single-photon avalanche diode, photomultiplier tube, or arrays of such detectors. The format is based on HDF5, a widely used platform- and language-independent hierarchical file format for which user-friendly viewers are available. Photon-HDF5 can store raw photon data (timestamp, channel number, etc.) from any acquisition hardware, but also setup and sample description, information on provenance, authorship and other metadata, and is flexible enough to include any kind of custom data. The format specifications are hosted on a public website, which is open to contributions by the biophysics community. As an initial resource, the website provides code examples to read Photon-HDF5 files in several programming languages and a reference Python library (phconvert), to create new Photon-HDF5 files and convert several existing file formats into Photon-HDF5. To encourage adoption by the academic and commercial communities, all software is released under the MIT open source license. Copyright © 2016 Biophysical Society. Published by Elsevier Inc. All rights reserved.

  10. Photon-HDF5: An Open File Format for Timestamp-Based Single-Molecule Fluorescence Experiments

    PubMed Central

    Ingargiola, Antonino; Laurence, Ted; Boutelle, Robert; Weiss, Shimon; Michalet, Xavier

    2016-01-01

    We introduce Photon-HDF5, an open and efficient file format to simplify exchange and long-term accessibility of data from single-molecule fluorescence experiments based on photon-counting detectors such as single-photon avalanche diode, photomultiplier tube, or arrays of such detectors. The format is based on HDF5, a widely used platform- and language-independent hierarchical file format for which user-friendly viewers are available. Photon-HDF5 can store raw photon data (timestamp, channel number, etc.) from any acquisition hardware, but also setup and sample description, information on provenance, authorship and other metadata, and is flexible enough to include any kind of custom data. The format specifications are hosted on a public website, which is open to contributions by the biophysics community. As an initial resource, the website provides code examples to read Photon-HDF5 files in several programming languages and a reference Python library (phconvert), to create new Photon-HDF5 files and convert several existing file formats into Photon-HDF5. To encourage adoption by the academic and commercial communities, all software is released under the MIT open source license. PMID:26745406

  11. Photon-HDF5: an open file format for single-molecule fluorescence experiments using photon-counting detectors

    DOE PAGES

    Ingargiola, A.; Laurence, T. A.; Boutelle, R.; ...

    2015-12-23

    We introduce Photon-HDF5, an open and efficient file format to simplify exchange and long term accessibility of data from single-molecule fluorescence experiments based on photon-counting detectors such as single-photon avalanche diode (SPAD), photomultiplier tube (PMT) or arrays of such detectors. The format is based on HDF5, a widely used platform- and language-independent hierarchical file format for which user-friendly viewers are available. Photon-HDF5 can store raw photon data (timestamp, channel number, etc) from any acquisition hardware, but also setup and sample description, information on provenance, authorship and other metadata, and is flexible enough to include any kind of custom data. The format specifications are hosted on a public website, which is open to contributions by the biophysics community. As an initial resource, the website provides code examples to read Photon-HDF5 files in several programming languages and a reference python library (phconvert), to create new Photon-HDF5 files and convert several existing file formats into Photon-HDF5. As a result, to encourage adoption by the academic and commercial communities, all software is released under the MIT open source license.

  12. OMERO and Bio-Formats 5: flexible access to large bioimaging datasets at scale

    NASA Astrophysics Data System (ADS)

    Moore, Josh; Linkert, Melissa; Blackburn, Colin; Carroll, Mark; Ferguson, Richard K.; Flynn, Helen; Gillen, Kenneth; Leigh, Roger; Li, Simon; Lindner, Dominik; Moore, William J.; Patterson, Andrew J.; Pindelski, Blazej; Ramalingam, Balaji; Rozbicki, Emil; Tarkowska, Aleksandra; Walczysko, Petr; Allan, Chris; Burel, Jean-Marie; Swedlow, Jason

    2015-03-01

    The Open Microscopy Environment (OME) has built and released Bio-Formats, a Java-based proprietary file format conversion tool and OMERO, an enterprise data management platform under open source licenses. In this report, we describe new versions of Bio-Formats and OMERO that are specifically designed to support large, multi-gigabyte or terabyte scale datasets that are routinely collected across most domains of biological and biomedical research. Bio-Formats reads image data directly from native proprietary formats, bypassing the need for conversion into a standard format. It implements the concept of a file set, a container that defines the contents of multi-dimensional data comprised of many files. OMERO uses Bio-Formats to read files natively, and provides a flexible access mechanism that supports several different storage and access strategies. These new capabilities of OMERO and Bio-Formats make them especially useful in imaging applications like digital pathology, high content screening and light sheet microscopy that routinely create large datasets that must be managed and analyzed.

  13. Keemei: cloud-based validation of tabular bioinformatics file formats in Google Sheets.

    PubMed

    Rideout, Jai Ram; Chase, John H; Bolyen, Evan; Ackermann, Gail; González, Antonio; Knight, Rob; Caporaso, J Gregory

    2016-06-13

    Bioinformatics software often requires human-generated tabular text files as input and has specific requirements for how those data are formatted. Users frequently manage these data in spreadsheet programs, which is convenient for researchers who are compiling the requisite information because the spreadsheet programs can easily be used on different platforms including laptops and tablets, and because they provide a familiar interface. It is increasingly common for many different researchers to be involved in compiling these data, including study coordinators, clinicians, lab technicians and bioinformaticians. As a result, many research groups are shifting toward using cloud-based spreadsheet programs, such as Google Sheets, which support the concurrent editing of a single spreadsheet by different users working on different platforms. Most of the researchers who enter data are not familiar with the formatting requirements of the bioinformatics programs that will be used, so validating and correcting file formats is often a bottleneck prior to beginning bioinformatics analysis. We present Keemei, a Google Sheets Add-on, for validating tabular files used in bioinformatics analyses. Keemei is available free of charge from Google's Chrome Web Store. Keemei can be installed and run on any web browser supported by Google Sheets. Keemei currently supports the validation of two widely used tabular bioinformatics formats, the Quantitative Insights into Microbial Ecology (QIIME) sample metadata mapping file format and the Spatially Referenced Genetic Data (SRGD) format, but is designed to easily support the addition of others. Keemei will save researchers time and frustration by providing a convenient interface for tabular bioinformatics file format validation. 
    By allowing everyone involved with data entry for a project to easily validate their data, it will reduce the validation and formatting bottlenecks that are commonly encountered when human-generated data files are first used with a bioinformatics system. Simplifying the validation of essential tabular data files, such as sample metadata, will reduce common errors and thereby improve the quality and reliability of research outcomes.

  14. A Python library for FAIRer access and deposition to the Metabolomics Workbench Data Repository.

    PubMed

    Smelter, Andrey; Moseley, Hunter N B

    2018-01-01

    The Metabolomics Workbench Data Repository is a public repository of mass spectrometry and nuclear magnetic resonance data and metadata derived from a wide variety of metabolomics studies. The data and metadata for each study are deposited, stored, and accessed via files in the domain-specific 'mwTab' flat file format. In order to improve the accessibility, reusability, and interoperability of the data and metadata stored in 'mwTab' formatted files, we implemented a Python library and package. This Python package, named 'mwtab', is a parser for the domain-specific 'mwTab' flat file format, which provides facilities for reading, accessing, and writing 'mwTab' formatted files. Furthermore, the package provides facilities to validate both the format and required metadata elements of a given 'mwTab' formatted file. In order to develop the 'mwtab' package we used the official 'mwTab' format specification. We used Git version control along with the Python unit-testing framework and a continuous integration service to run those tests on multiple versions of Python. Package documentation was developed using the Sphinx documentation generator. The 'mwtab' package provides both Python programmatic library interfaces and command-line interfaces for reading, writing, and validating 'mwTab' formatted files. Data and associated metadata are stored within Python dictionary- and list-based data structures, enabling straightforward, 'pythonic' access and manipulation of data and metadata. Also, the package provides facilities to convert 'mwTab' files into a JSON formatted equivalent, enabling easy reusability of the data by all modern programming languages that implement JSON parsers. The 'mwtab' package implements its metadata validation functionality based on a pre-defined JSON schema that can be easily specialized for specific types of metabolomics studies. The library also provides a command-line interface for interconversion between 'mwTab' and JSONized formats in raw text and a variety of compressed binary file formats. The 'mwtab' package is an easy-to-use Python package that provides FAIRer utilization of the Metabolomics Workbench Data Repository. The source code is freely available on GitHub and via the Python Package Index. Documentation includes a 'User Guide', 'Tutorial', and 'API Reference'. The GitHub repository also provides 'mwtab' package unit-tests via a continuous integration service.

  15. Accelerating Malware Detection via a Graphics Processing Unit

    DTIC Science & Technology

    2010-09-01

    Processing Unit ... 4 PE Portable Executable ... 4 COFF Common Object File Format...operating systems for the future [Szo05]. The PE format is an updated version of the common object file format (COFF) [Mic06]. Microsoft released a new...NAs02]. These alerts can be costly in terms of time and resources for individuals and organizations to investigate each misidentified file [YWL07] [Vak10

  16. A fast and efficient python library for interfacing with the Biological Magnetic Resonance Data Bank.

    PubMed

    Smelter, Andrey; Astra, Morgan; Moseley, Hunter N B

    2017-03-17

    The Biological Magnetic Resonance Data Bank (BMRB) is a public repository of Nuclear Magnetic Resonance (NMR) spectroscopic data of biological macromolecules. It is an important resource for many researchers using NMR to study structural, biophysical, and biochemical properties of biological macromolecules. It is primarily maintained and accessed in a flat file ASCII format known as NMR-STAR. While the format is human readable, the size of most BMRB entries makes computer readability and explicit representation a practical requirement for almost any rigorous systematic analysis. To aid in the use of this public resource, we have developed a package called nmrstarlib in the popular open-source programming language Python. The nmrstarlib's implementation is very efficient, both in design and execution. The library has facilities for reading and writing both NMR-STAR version 2.1 and 3.1 formatted files, parsing them into usable Python dictionary- and list-based data structures, making access and manipulation of the experimental data very natural within Python programs (i.e. "saveframe" and "loop" records represented as individual Python dictionary data structures). Another major advantage of this design is that data stored in original NMR-STAR can be easily converted into its equivalent JavaScript Object Notation (JSON) format, a lightweight data interchange format, facilitating data access and manipulation using Python and any other programming language that implements a JSON parser/generator (i.e., all popular programming languages). We have also developed tools to visualize assigned chemical shift values and to convert between NMR-STAR and JSONized NMR-STAR formatted files. Full API Reference Documentation, User Guide and Tutorial with code examples are also available. We have tested this new library on all current BMRB entries: 100% of all entries are parsed without any errors for both NMR-STAR version 2.1 and version 3.1 formatted files. 
    We also compared our software to three currently available Python libraries for parsing NMR-STAR formatted files: PyStarLib, NMRPyStar, and PyNMRSTAR. The nmrstarlib package is a simple, fast, and efficient library for accessing data from the BMRB. The library provides an intuitive dictionary-based interface with which Python programs can read, edit, and write NMR-STAR formatted files and their equivalent JSONized NMR-STAR files. The nmrstarlib package can be used as a library for accessing and manipulating data stored in NMR-STAR files and as a command-line tool to convert from NMR-STAR file format into its equivalent JSON file format and vice versa, and to visualize chemical shift values. Furthermore, the nmrstarlib implementation provides a guide for effectively JSONizing other older scientific formats, improving the FAIRness of data in these formats.
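
    The idea of holding records in dictionary- and list-based structures and converting them to JSON can be illustrated with the Python standard library alone. The sketch below does not use the nmrstarlib API and does not reproduce the NMR-STAR layout; every key and value in it is a hypothetical placeholder chosen to show why such a representation round-trips cleanly through JSON.

```python
import json

# Hypothetical, heavily simplified entry: one "saveframe" dictionary holding
# a plain tag plus one "loop" stored as a list of row dictionaries.
entry = {
    "entry_id": "demo",
    "saveframe_shifts": {
        "category": "assigned_chemical_shifts",
        "loop_0": [
            {"atom": "CA", "residue": 1, "shift_ppm": 58.1},
            {"atom": "CB", "residue": 1, "shift_ppm": 30.4},
        ],
    },
}

# Serialize to JSON text and parse it back into Python structures.
as_json = json.dumps(entry, indent=2, sort_keys=True)
restored = json.loads(as_json)

# Ordinary dictionary and list access works on the restored data.
ca_shifts = [
    row["shift_ppm"]
    for row in restored["saveframe_shifts"]["loop_0"]
    if row["atom"] == "CA"
]
```

    Because the structure contains only dictionaries, lists, strings, and numbers, any language with a JSON parser could read the same text.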

  17. Digital geologic map of the Butler Peak 7.5' quadrangle, San Bernardino County, California

    USGS Publications Warehouse

    Miller, Fred K.; Matti, Jonathan C.; Brown, Howard J.; digital preparation by Cossette, P. M.

    2000-01-01

    Open-File Report 00-145, is a digital geologic map database of the Butler Peak 7.5' quadrangle that includes (1) ARC/INFO (Environmental Systems Research Institute) version 7.2.1 Patch 1 coverages, and associated tables, (2) a Portable Document Format (.pdf) file of the Description of Map Units, Correlation of Map Units chart, and an explanation of symbols used on the map, btlrpk_dcmu.pdf, (3) a Portable Document Format file of this Readme, btlrpk_rme.pdf (the Readme is also included as an ascii file in the data package), and (4) a PostScript plot file of the map, Correlation of Map Units, and Description of Map Units on a single sheet, btlrpk.ps. No paper map is included in the Open-File report, but the PostScript plot file (number 4 above) can be used to produce one. The PostScript plot file generates a map, peripheral text, and diagrams in the editorial format of USGS Geologic Investigation Series (I-series) maps.

  18. MXA: a customizable HDF5-based data format for multi-dimensional data sets

    NASA Astrophysics Data System (ADS)

    Jackson, M.; Simmons, J. P.; De Graef, M.

    2010-09-01

    A new digital file format is proposed for the long-term archival storage of experimental data sets generated by serial sectioning instruments. The format is known as the multi-dimensional eXtensible Archive (MXA) format and is based on the public domain Hierarchical Data Format (HDF5). The MXA data model, its description by means of an eXtensible Markup Language (XML) file with associated Document Type Definition (DTD) are described in detail. The public domain MXA package is available through a dedicated web site (mxa.web.cmu.edu), along with implementation details and example data files.

  19. A Summary of Proposed Changes to the Current ICARTT Format Standards and their Implications to Future Airborne Studies

    NASA Astrophysics Data System (ADS)

    Northup, E. A.; Kusterer, J.; Quam, B.; Chen, G.; Early, A. B.; Beach, A. L., III

    2015-12-01

    The current ICARTT file format standards were developed for the purpose of fulfilling the data management needs for the International Consortium for Atmospheric Research on Transport and Transformation (ICARTT) campaign in 2004. The goal of the ICARTT file format was to establish a common and simple to use data file format to promote data exchange and collaboration among science teams with similar science objectives. ICARTT has been the NASA standard since 2010, and is widely used by NOAA, NSF, and international partners (DLR, FAAM). Despite its level of acceptance, there are a number of issues with the current ICARTT format, especially concerning the machine readability. To enhance usability, the ICARTT Refresh Earth Science Data Systems Working Group (ESDSWG) was established to enable a platform for atmospheric science data producers, users (e.g. modelers) and data managers to collaborate on developing criteria for this file format. Ultimately, this is a cross agency effort to improve and aggregate the metadata records being produced. After conducting a survey to identify deficiencies in the current format, we determined which are considered most important to the various communities. Numerous recommendations were made to improve upon the file format while maintaining backward compatibility. The recommendations made to date and their advantages and limitations will be discussed.

  20. SSR_pipeline: a bioinformatic infrastructure for identifying microsatellites from paired-end Illumina high-throughput DNA sequencing data

    USGS Publications Warehouse

    Miller, Mark P.; Knaus, Brian J.; Mullins, Thomas D.; Haig, Susan M.

    2013-01-01

    SSR_pipeline is a flexible set of programs designed to efficiently identify simple sequence repeats (e.g., microsatellites) from paired-end high-throughput Illumina DNA sequencing data. The program suite contains 3 analysis modules along with a fourth control module that can automate analyses of large volumes of data. The modules are used to 1) identify the subset of paired-end sequences that pass Illumina quality standards, 2) align paired-end reads into a single composite DNA sequence, and 3) identify sequences that possess microsatellites (both simple and compound) conforming to user-specified parameters. The microsatellite search algorithm is extremely efficient, and we have used it to identify repeats with motifs from 2 to 25bp in length. Each of the 3 analysis modules can also be used independently to provide greater flexibility or to work with FASTQ or FASTA files generated from other sequencing platforms (Roche 454, Ion Torrent, etc.). We demonstrate use of the program with data from the brine fly Ephydra packardi (Diptera: Ephydridae) and provide empirical timing benchmarks to illustrate program performance on a common desktop computer environment. We further show that the Illumina platform is capable of identifying large numbers of microsatellites, even when using unenriched sample libraries and a very small percentage of the sequencing capacity from a single DNA sequencing run. All modules from SSR_pipeline are implemented in the Python programming language and can therefore be used from nearly any computer operating system (Linux, Macintosh, and Windows).
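
    As a rough illustration of what the third step involves, the sketch below finds simple tandem repeats with a regular expression. It is not the SSR_pipeline search algorithm, which the abstract does not describe; the function name find_ssrs, its default thresholds, and the toy sequence are assumptions made for this example. The motif-length bounds are parameters, so a wider range such as 2 to 25 bp can be requested.

```python
import re

def find_ssrs(seq, min_motif=2, max_motif=6, min_repeats=4):
    """Return (start, motif, repeat_count) for simple tandem repeats in seq."""
    # A motif of min_motif..max_motif bases (shortest tried first) followed
    # by at least min_repeats - 1 further copies of itself.
    pattern = re.compile(
        r"([ACGT]{%d,%d}?)\1{%d,}" % (min_motif, max_motif, min_repeats - 1)
    )
    hits = []
    for m in pattern.finditer(seq.upper()):
        motif = m.group(1)
        hits.append((m.start(), motif, len(m.group(0)) // len(motif)))
    return hits

# Toy sequence: a (CA)8 repeat and an (AGT)5 repeat in invented flanking bases.
toy = "GGATC" + "CA" * 8 + "TTGAC" + "AGT" * 5 + "CC"
ssrs = find_ssrs(toy)
```

    This naive pattern reports a homopolymer run as a repeat of a two-base motif such as "AA" and does not detect compound microsatellites, so a real tool would need additional filtering.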

  1. De novo transcriptome analysis of an imminent biofuel crop, Camelina sativa L. using Illumina GAIIX sequencing platform and identification of SSR markers.

    PubMed

    Mudalkar, Shalini; Golla, Ramesh; Ghatty, Sreenivas; Reddy, Attipalli Ramachandra

    2014-01-01

    Camelina sativa L. is an emerging biofuel crop with potential applications in industry, medicine, cosmetics and human nutrition. The crop is unexploited owing to very limited availability of transcriptome and genomic data. In order to analyse the various metabolic pathways, we performed de novo assembly of the transcriptome on the Illumina GAIIX platform with paired end sequencing for obtaining short reads. The sequencing output generated a FastQ file size of 2.97 GB with 10.83 million reads having a maximum read length of 101 nucleotides. The number of contigs generated was 53,854 with maximum and minimum lengths of 10,086 and 200 nucleotides respectively. These transcripts were annotated using BLAST search against the Aracyc, Swiss-Prot, TrEMBL, gene ontology and clusters of orthologous groups (KOG) databases. The genes involved in lipid metabolism were studied and the transcription factors were identified. Sequence similarity studies of Camelina with the other related organisms indicated the close relatedness of Camelina with Arabidopsis. In addition, bioinformatics analysis revealed the presence of a total of 19,379 simple sequence repeats. This is the first report on Camelina sativa L. in which the transcriptome of the entire plant, including seedlings, seed, root, leaves and stem, was analysed. Our data establish an excellent resource for gene discovery and provide useful information for functional and comparative genomic studies in this promising biofuel crop.

  2. Disk-based k-mer counting on a PC

    PubMed Central

    2013-01-01

    Background The k-mer counting problem, which is to build the histogram of occurrences of every k-symbol long substring in a given text, is important for many bioinformatics applications. They include developing de Bruijn graph genome assemblers, fast multiple sequence alignment and repeat detection. Results We propose a simple, yet efficient, parallel disk-based algorithm for counting k-mers. Experiments show that it usually offers the fastest solution to the considered problem, while demanding a relatively small amount of memory. In particular, it is capable of counting the statistics for short-read human genome data, given as a gzipped FASTQ input file, in less than 40 minutes on a PC with 16 GB of RAM and 6 CPU cores, and for long-read human genome data in less than 70 minutes. On a more powerful machine, using 32 GB of RAM and 32 CPU cores, the tasks are accomplished in less than half the time. No other algorithm for most tested settings of this problem and mammalian-size data can accomplish this task in comparable time. Our solution is also among the memory-frugal ones; most competitive algorithms cannot efficiently work on a PC with 16 GB of memory for such massive data. Conclusions By making use of cheap disk space and exploiting CPU and I/O parallelism we propose a very competitive k-mer counting procedure, called KMC. Our results suggest that judicious resource management may make it possible to solve at least some bioinformatics problems with massive data on a commodity personal computer. PMID:23679007
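
    The problem statement itself (counting every k-symbol substring) is easy to show on a toy scale. The sketch below is a naive in-memory counter, not the disk-based parallel KMC algorithm; the function name count_kmers and the choice to skip k-mers containing "N" are assumptions for this example. Holding all counts in one dictionary is exactly what becomes infeasible for human-genome data on a small machine, which is the gap the abstract's approach addresses.

```python
from collections import Counter

def count_kmers(reads, k):
    """Count every k-symbol substring across an iterable of read sequences."""
    counts = Counter()
    for read in reads:
        read = read.upper()
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            if "N" not in kmer:  # assumption: skip k-mers with ambiguous bases
                counts[kmer] += 1
    return counts

# Toy reads, invented for illustration.
toy_reads = ["ACGTACGT", "CGTAN"]
kmer_counts = count_kmers(toy_reads, k=3)
```

    On the toy reads, "CGT" occurs three times and "ACG" and "GTA" twice each, while "TAN" is skipped.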

  3. SSR_pipeline: a bioinformatic infrastructure for identifying microsatellites from paired-end Illumina high-throughput DNA sequencing data.

    PubMed

    Miller, Mark P; Knaus, Brian J; Mullins, Thomas D; Haig, Susan M

    2013-01-01

    SSR_pipeline is a flexible set of programs designed to efficiently identify simple sequence repeats (e.g., microsatellites) from paired-end high-throughput Illumina DNA sequencing data. The program suite contains 3 analysis modules along with a fourth control module that can automate analyses of large volumes of data. The modules are used to 1) identify the subset of paired-end sequences that pass Illumina quality standards, 2) align paired-end reads into a single composite DNA sequence, and 3) identify sequences that possess microsatellites (both simple and compound) conforming to user-specified parameters. The microsatellite search algorithm is extremely efficient, and we have used it to identify repeats with motifs from 2 to 25 bp in length. Each of the 3 analysis modules can also be used independently to provide greater flexibility or to work with FASTQ or FASTA files generated from other sequencing platforms (Roche 454, Ion Torrent, etc.). We demonstrate use of the program with data from the brine fly Ephydra packardi (Diptera: Ephydridae) and provide empirical timing benchmarks to illustrate program performance on a common desktop computer environment. We further show that the Illumina platform is capable of identifying large numbers of microsatellites, even when using unenriched sample libraries and a very small percentage of the sequencing capacity from a single DNA sequencing run. All modules from SSR_pipeline are implemented in the Python programming language and can therefore be used from nearly any computer operating system (Linux, Macintosh, and Windows).

  4. Evaluation of sampling and storage procedures on preserving the community structure of stool microbiota: A simple at-home toilet-paper collection method.

    PubMed

    Al, Kait F; Bisanz, Jordan E; Gloor, Gregory B; Reid, Gregor; Burton, Jeremy P

    2018-01-01

    The increasing interest in the impact of the gut microbiota on health and disease has resulted in multiple human microbiome-related studies emerging. However, multiple sampling methods are being used, making cross-comparison of results difficult. To avoid additional clinic visits and increase patient recruitment to these studies, there is the potential to utilize at-home stool sampling. The aim of this pilot study was to compare simple self-sampling collection and storage methods. To simulate storage conditions, stool samples from three volunteers were freshly collected, placed on toilet tissue, and stored at four temperatures (-80, 7, 22 and 37°C), either dry or in the presence of a stabilization agent (RNAlater®) for 3 or 7 days. Using 16S rRNA gene sequencing by Illumina, the effect of storage variations for each sample was compared to a reference community from fresh, unstored counterparts. Fastq files may be accessed in the NCBI Sequence Read Archive: Bioproject ID PRJNA418287. Microbial diversity and composition were not significantly altered by any storage method. Samples were always separable based on participant, regardless of storage method, suggesting there was no need for sample preservation by a stabilization agent. In summary, if immediate sample processing is not feasible, short term storage of unpreserved stool samples on toilet paper offers a reliable way to assess the microbiota composition by 16S rRNA gene sequencing. Copyright © 2017 Elsevier B.V. All rights reserved.

  5. NASA Standard for Airborne Data: ICARTT Format ESDS-RFC-019

    NASA Astrophysics Data System (ADS)

    Thornhill, A.; Brown, C.; Aknan, A.; Crawford, J. H.; Chen, G.; Williams, E. J.

    2011-12-01

    Airborne field studies generate a plethora of data products in the effort to study atmospheric composition and processes. Data file formats for airborne field campaigns are designed to present data in an understandable and organized way to support collaboration and to document relevant and important meta data. The ICARTT file format was created to facilitate data management during the International Consortium for Atmospheric Research on Transport and Transformation (ICARTT) campaign in 2004 that involved government-agencies and university participants from five countries. Since this mission the ICARTT format has been used in subsequent field campaigns such as Polar Study Using Aircraft Remote Sensing, Surface Measurements and Models of Climates, Chemistry, Aerosols, and Transport (POLARCAT) and the first phase of Deriving Information on Surface Conditions from COlumn and VERtically Resolved Observations Relevant to Air Quality (DISCOVER-AQ). The ICARTT file format has been endorsed as a standard format for airborne data by the Standard Process Group (SPG), one of the Earth Science Data Systems Working Groups (ESDSWG) in 2010. The detailed description of the ICARTT format can be found at http://www-air.larc.nasa.gov/missions/etc/ESDS-RFC-019-v1.00.pdf. The ICARTT data format is an ASCII, comma delimited format that was based on the NASA Ames and GTE file formats. The file header is detailed enough to fully describe the data for users outside of the instrument group and includes a description of the meta data. The ICARTT scanning tools, format structure, implementations, and examples will be presented.

  6. Viewing Files | Smokefree 60+

    Cancer.gov

    In addition to standard HTML webpages, our website contains files in other formats. You may need additional software or browser plug-ins to view some of these files. The following list shows each format along with links to the corresponding freely available plug-ins or viewers. Documents  Adobe Acrobat Reader (.pdf)

  7. Dependency Tree Annotation Software

    DTIC Science & Technology

    2015-11-01

    formats, and it provides numerous options for customizing how dependency trees are displayed. Built entirely in Java, it can run on a wide range of...tree can be saved as an image, .mxe (a mxGraph editing file), a .conll file, and several other file formats. DTE uses the open source Java version

  8. Representation of thermal infrared imaging data in the DICOM using XML configuration files.

    PubMed

    Ruminski, Jacek

    2007-01-01

    The DICOM standard has become a widely accepted and implemented format for the exchange and storage of medical imaging data. Different imaging modalities are supported; however, there is no dedicated solution for thermal infrared imaging in medicine. In this article we propose new ideas and improvements to the final proposal of the new DICOM Thermal Infrared Imaging structures and services. Additionally, we designed, implemented and tested software packages for universal conversion of existing thermal imaging files to the DICOM format using XML configuration files. The proposed solution works fast and requires a minimal number of user interactions. The XML configuration file makes it possible to compose a set of attributes for any source file format of a thermal imaging camera.

  9. Transported Geothermal Energy Technoeconomic Screening Tool - Calculation Engine

    DOE Data Explorer

    Liu, Xiaobing

    2016-09-21

    This calculation engine estimates technoeconomic feasibility for transported geothermal energy projects. The TGE screening tool (geotool.exe) takes input from an input file (input.txt) and lists results in an output file (output.txt). Both the input and output files are in the same folder as geotool.exe. To use the tool, the input file containing adequate information on the case should be prepared in the format explained below and placed in the same folder as geotool.exe. geotool.exe can then be executed, which generates an output.txt file in the same folder containing all key calculation results. The format and content of the output file are explained below as well.

  10. Effect of Instrumentation Length and Instrumentation Systems: Hand Versus Rotary Files on Apical Crack Formation – An In vitro Study

    PubMed Central

    Mahesh, MC; Bhandary, Shreetha

    2017-01-01

    Introduction Stresses generated during root canal instrumentation have been reported to cause apical cracks. The smaller, less pronounced defects like cracks can later propagate into vertical root fracture, when the tooth is subjected to repeated stresses from endodontic or restorative procedures. Aim This study evaluated occurrence of apical cracks with stainless steel hand files, rotary NiTi RaCe and K3 files at two different instrumentation lengths. Materials and Methods In the present in vitro study, 60 mandibular premolars were mounted in resin blocks with simulated periodontal ligament. Apical 3 mm of the root surfaces were exposed and stained using India ink. Preoperative images of root apices were obtained at 100x using stereomicroscope. The teeth were divided into six groups of 10 each. First two groups were instrumented with stainless steel files, next two groups with rotary NiTi RaCe files and the last two groups with rotary NiTi K3 files. The instrumentation was carried out till the apical foramen (Working Length-WL) and 1 mm short of the apical foramen (WL-1) with each file system. After root canal instrumentation, postoperative images of root apices were obtained. Preoperative and postoperative images were compared and the occurrence of cracks was recorded. Descriptive statistical analysis and Chi-square tests were used to analyze the results. Results Apical root cracks were seen in 30%, 35% and 20% of teeth instrumented with K-files, RaCe files and K3 files respectively. There was no statistical significance among three instrumentation systems in the formation of apical cracks (p=0.563). Apical cracks were seen in 40% and 20% of teeth instrumented with K-files; 60% and 10% of teeth with RaCe files and 40% and 0% of teeth with K3 files at WL and WL-1 respectively. For groups instrumented with hand files there was no statistical significance in number of cracks at WL and WL-1 (p=0.628). 
But for teeth instrumented with RaCe files and K3 files, significantly more cracks were seen at WL than at WL-1 (p=0.057 for RaCe files and p=0.087 for K3 files). Conclusion There was no statistically significant difference between stainless steel hand files and rotary files in terms of crack formation. Instrumentation length had a significant effect on the formation of cracks when rotary files were used. Using rotary instruments 1 mm short of the apical foramen caused less crack formation. However, there was no statistically significant difference in the number of cracks formed with hand files at the two instrumentation levels. PMID:28274036

  11. Effect of Instrumentation Length and Instrumentation Systems: Hand Versus Rotary Files on Apical Crack Formation - An In vitro Study.

    PubMed

    Devale, Madhuri R; Mahesh, M C; Bhandary, Shreetha

    2017-01-01

    Stresses generated during root canal instrumentation have been reported to cause apical cracks. The smaller, less pronounced defects like cracks can later propagate into vertical root fracture, when the tooth is subjected to repeated stresses from endodontic or restorative procedures. This study evaluated occurrence of apical cracks with stainless steel hand files, rotary NiTi RaCe and K3 files at two different instrumentation lengths. In the present in vitro study, 60 mandibular premolars were mounted in resin blocks with simulated periodontal ligament. Apical 3 mm of the root surfaces were exposed and stained using India ink. Preoperative images of root apices were obtained at 100x using stereomicroscope. The teeth were divided into six groups of 10 each. First two groups were instrumented with stainless steel files, next two groups with rotary NiTi RaCe files and the last two groups with rotary NiTi K3 files. The instrumentation was carried out till the apical foramen (Working Length-WL) and 1 mm short of the apical foramen (WL-1) with each file system. After root canal instrumentation, postoperative images of root apices were obtained. Preoperative and postoperative images were compared and the occurrence of cracks was recorded. Descriptive statistical analysis and Chi-square tests were used to analyze the results. Apical root cracks were seen in 30%, 35% and 20% of teeth instrumented with K-files, RaCe files and K3 files respectively. There was no statistical significance among three instrumentation systems in the formation of apical cracks (p=0.563). Apical cracks were seen in 40% and 20% of teeth instrumented with K-files; 60% and 10% of teeth with RaCe files and 40% and 0% of teeth with K3 files at WL and WL-1 respectively. For groups instrumented with hand files there was no statistical significance in number of cracks at WL and WL-1 (p=0.628). 
But for teeth instrumented with RaCe files and K3 files, significantly more cracks were seen at WL than at WL-1 (p=0.057 for RaCe files and p=0.087 for K3 files). There was no statistically significant difference between stainless steel hand files and rotary files in terms of crack formation. Instrumentation length had a significant effect on the formation of cracks when rotary files were used. Using rotary instruments 1 mm short of the apical foramen caused less crack formation. However, there was no statistically significant difference in the number of cracks formed with hand files at the two instrumentation levels.

  12. 15 CFR 995.26 - Conversion of NOAA ENC ® files to other formats.

    Code of Federal Regulations, 2011 CFR

    2011-01-01

    ...) Conversion of NOAA ENC files to other formats—(1) Content. CEVAD may provide NOAA ENC data in forms other... data files without degradation to positional accuracy or informational content. (2) Software certification. Conversion of NOAA ENC data to other formats must be accomplished within the constraints of IHO...

  13. Early Detection | Division of Cancer Prevention

    Cancer.gov


  14. Image Size Variation Influence on Corrupted and Non-viewable BMP Image

    NASA Astrophysics Data System (ADS)

    Azmi, Tengku Norsuhaila T.; Azma Abdullah, Nurul; Rahman, Nurul Hidayah Ab; Hamid, Isredza Rahmi A.; Chai Wen, Chuah

    2017-08-01

    Images are one of the evidence components sought in digital forensics. The Joint Photographic Experts Group (JPEG) format is the most popular on the Internet because JPEG files are lossy and easy to compress, which can speed up Internet transmission. However, corrupted JPEG images are hard to recover because of the complexity of determining the corruption point. Nowadays Bitmap (BMP) images are preferred in image processing over other formats because a BMP image contains all the image information in a simple format. Therefore, in order to investigate the corruption point in a JPEG, the file must be converted into BMP format. Nevertheless, many things can corrupt a BMP image, such as changes in image size that make the file non-viewable. In this paper, the experiment indicates that the size of a BMP file influences changes in the image itself under three conditions: deletion, replacement and insertion. From the experiment, we learnt that correcting the file size can produce a viewable file, though only partially. The file can then be investigated further to identify the corruption point.
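
    One way to picture "correcting the file size" is to rewrite the size field in the BMP file header so that it matches the actual byte length. The sketch below does only that; it is not the procedure used in the paper. The helper names fix_bmp_size_field and declared_size are hypothetical, and the toy byte string carries a BMP signature but no real pixel data.

```python
import struct

def declared_size(data: bytes) -> int:
    """Read the file size stored in bytes 2..5 of the BMP file header."""
    return struct.unpack_from("<I", data, 2)[0]

def fix_bmp_size_field(data: bytes) -> bytes:
    """Return a copy whose header size field equals the real byte length."""
    if data[:2] != b"BM":
        raise ValueError("not a BMP file (missing 'BM' signature)")
    # The size field is a little-endian unsigned 32-bit integer at offset 2.
    return data[:2] + struct.pack("<I", len(data)) + data[6:]

# Toy input: 'BM' signature, a deliberately wrong size, then 48 zero bytes.
broken = b"BM" + struct.pack("<I", 9999) + bytes(48)
fixed = fix_bmp_size_field(broken)
print(declared_size(broken), declared_size(fixed), len(fixed))
```

    A fuller repair would also have to reconcile the width, height and pixel-data offset fields, which this sketch deliberately leaves untouched.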

  15. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Ingargiola, A.; Laurence, T. A.; Boutelle, R.

    We introduce Photon-HDF5, an open and efficient file format to simplify exchange and long-term accessibility of data from single-molecule fluorescence experiments based on photon-counting detectors such as single-photon avalanche diode (SPAD), photomultiplier tube (PMT) or arrays of such detectors. The format is based on HDF5, a widely used platform- and language-independent hierarchical file format for which user-friendly viewers are available. Photon-HDF5 can store raw photon data (timestamp, channel number, etc.) from any acquisition hardware, but also setup and sample description, information on provenance, authorship and other metadata, and is flexible enough to include any kind of custom data. The format specifications are hosted on a public website, which is open to contributions by the biophysics community. As an initial resource, the website provides code examples to read Photon-HDF5 files in several programming languages and a reference Python library (phconvert), to create new Photon-HDF5 files and convert several existing file formats into Photon-HDF5. As a result, to encourage adoption by the academic and commercial communities, all software is released under the MIT open source license.

  16. UFO (UnFold Operator) default data format

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Kissel, L.; Biggs, F.; Marking, T.R.

    The default format for the storage of x,y data for use with the UFO code is described. The format assumes that the data stored in a file is a matrix of values; two columns of this matrix are selected to define a function of the form y = f(x). This format is specifically designed to allow for easy importation of data obtained from other sources, or easy entry of data using a text editor, with a minimum of reformatting. This format is flexible and extensible through the use of inline directives stored in the optional header of the file. A special extension of the format implements encoded data, which significantly reduces the storage required as compared with the unencoded form. UFO supports several extensions to the file specification that implement execute-time operations, such as transformation of the x and/or y values, selection of specific columns of the matrix for association with the x and y values, input of data directly from other formats (e.g., DAMP and PFF), and a simple type of library-structured file format. Several examples of the use of the format are given.
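
    The idea of storing a matrix and picking two of its columns to define y = f(x) can be sketched in a few lines. The example assumes whitespace-separated numeric rows and does not attempt to reproduce UFO's header directives or encoded form, whose syntax is not given here; the function name select_xy and its parameters are hypothetical.

```python
def select_xy(text, x_col=0, y_col=1):
    """Parse rows of whitespace-separated numbers and return two chosen columns."""
    xs, ys = [], []
    for line in text.splitlines():
        fields = line.split()
        if not fields:  # skip blank lines
            continue
        row = [float(f) for f in fields]
        xs.append(row[x_col])
        ys.append(row[y_col])
    return xs, ys

# A small made-up matrix with three columns.
matrix = """
1.0 10.0 100.0
2.0 20.0 200.0
3.0 30.0 300.0
"""

# Use the first column as x and the third column as y.
x, y = select_xy(matrix, x_col=0, y_col=2)
print(list(zip(x, y)))
```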

  17. The Biological Observation Matrix (BIOM) format or: how I learned to stop worrying and love the ome-ome.

    PubMed

    McDonald, Daniel; Clemente, Jose C; Kuczynski, Justin; Rideout, Jai Ram; Stombaugh, Jesse; Wendel, Doug; Wilke, Andreas; Huse, Susan; Hufnagle, John; Meyer, Folker; Knight, Rob; Caporaso, J Gregory

    2012-07-12

    We present the Biological Observation Matrix (BIOM, pronounced "biome") format: a JSON-based file format for representing arbitrary observation by sample contingency tables with associated sample and observation metadata. As the number of categories of comparative omics data types (collectively, the "ome-ome") grows rapidly, a general format to represent and archive this data will facilitate the interoperability of existing bioinformatics tools and future meta-analyses. The BIOM file format is supported by an independent open-source software project (the biom-format project), which initially contains Python objects that support the use and manipulation of BIOM data in Python programs, and is intended to be an open development effort where developers can submit implementations of these objects in other programming languages. The BIOM file format and the biom-format project are steps toward reducing the "bioinformatics bottleneck" that is currently being experienced in diverse areas of biological sciences, and will help us move toward the next phase of comparative omics where basic science is translated into clinical and environmental applications. The BIOM file format is currently recognized as an Earth Microbiome Project Standard, and as a Candidate Standard by the Genomic Standards Consortium.
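
    To make the notion of a JSON-based observation-by-sample contingency table concrete, the sketch below expands a small sparse table into a dense matrix using only the standard json module. The field names (rows, columns, shape, data) and the [row, column, value] triplet layout are simplified assumptions modeled on the general idea; they are not a complete or validated BIOM document, and the helper to_dense is hypothetical.

```python
import json

# Made-up table: two observations (rows) by two samples (columns).
doc = json.loads("""
{
  "rows": [{"id": "OTU_1", "metadata": null}, {"id": "OTU_2", "metadata": null}],
  "columns": [{"id": "Sample_A", "metadata": null}, {"id": "Sample_B", "metadata": null}],
  "matrix_type": "sparse",
  "shape": [2, 2],
  "data": [[0, 0, 5], [1, 1, 3]]
}
""")

def to_dense(table):
    """Expand [row, column, value] triplets into a dense list of lists."""
    n_rows, n_cols = table["shape"]
    dense = [[0] * n_cols for _ in range(n_rows)]
    for r, c, value in table["data"]:
        dense[r][c] = value
    return dense

dense = to_dense(doc)
sample_ids = [col["id"] for col in doc["columns"]]
print(sample_ids, dense)
```

    Keeping row and column metadata next to the counts is what lets a single file carry both the table and its annotations.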

  18. 76 FR 23222 - Electric Reliability Organization Interpretation of Transmission Operations Reliability

    Federal Register 2010, 2011, 2012, 2013, 2014

    2011-04-26

    ... applications or print-to-PDF format, and not in a scanned format, at http://www.ferc.gov/docs-filing/efiling....3d 1342 (DC Cir. 2009). \\5\\ Mandatory Reliability Standards for the Bulk-Power System, Order No. 693... applications or print-to-PDF format and not in a scanned format. Commenters filing electronically do not need...

  19. NetpathXL - An Excel Interface to the Program NETPATH

    USGS Publications Warehouse

    Parkhurst, David L.; Charlton, Scott R.

    2008-01-01

    NetpathXL is a revised version of NETPATH that runs under Windows operating systems. NETPATH is a computer program that uses inverse geochemical modeling techniques to calculate net geochemical reactions that can account for changes in water composition between initial and final evolutionary waters in hydrologic systems. The inverse models also can account for the isotopic composition of waters and can be used to estimate radiocarbon ages of dissolved carbon in ground water. NETPATH relies on an auxiliary database program, DB, to enter the chemical analyses and to perform speciation calculations that define total concentrations of elements, charge balance, and redox state of aqueous solutions that are then used in inverse modeling. Instead of DB, NetpathXL relies on Microsoft Excel to enter the chemical analyses. The speciation calculation formerly included in DB is implemented within the program NetpathXL. A program DBXL can be used to translate files from the old DB format (.lon files) to NetpathXL spreadsheets, or to create new NetpathXL spreadsheets. Once users have a NetpathXL spreadsheet with the proper format, new spreadsheets can be generated by copying or saving NetpathXL spreadsheets. In addition, DBXL can convert NetpathXL spreadsheets to PHREEQC input files. New capabilities in PHREEQC (version 2.15) allow solution compositions to be written to a .lon file, and inverse models developed in PHREEQC to be written as NetpathXL .pat and model files. NetpathXL can open NetpathXL spreadsheets, NETPATH-format path files (.pat files), and NetpathXL-format path files (.pat files). Once the speciation calculations have been performed on a spreadsheet file or a .pat file has been opened, the NetpathXL calculation engine is identical to the original NETPATH. Development of models and viewing results in NetpathXL rely on keyboard entry as in NETPATH.

  20. 17 CFR 232.202 - Continuing hardship exemption.

    Code of Federal Regulations, 2010 CFR

    2010-04-01

    ... electronic format or post the Interactive Data File on its corporate Web site, as applicable, on the required... Interactive Data File, the electronic filer need not post on its Web site any statement with regard to the... submitted in electronic format or, in the case of an Interactive Data File (§ 232.11), to be posted on the...

  1. 17 CFR 232.202 - Continuing hardship exemption.

    Code of Federal Regulations, 2013 CFR

    2013-04-01

    ... electronic format or post the Interactive Data File on its corporate Web site, as applicable, on the required... Interactive Data File, the electronic filer need not post on its Web site any statement with regard to the... submitted in electronic format or, in the case of an Interactive Data File (§ 232.11), to be posted on the...

  2. 17 CFR 232.202 - Continuing hardship exemption.

    Code of Federal Regulations, 2012 CFR

    2012-04-01

    ... electronic format or post the Interactive Data File on its corporate Web site, as applicable, on the required... Interactive Data File, the electronic filer need not post on its Web site any statement with regard to the... submitted in electronic format or, in the case of an Interactive Data File (§ 232.11), to be posted on the...

  3. 17 CFR 232.202 - Continuing hardship exemption.

    Code of Federal Regulations, 2014 CFR

    2014-04-01

    ... electronic format or post the Interactive Data File on its corporate Web site, as applicable, on the required... Interactive Data File, the electronic filer need not post on its Web site any statement with regard to the... submitted in electronic format or, in the case of an Interactive Data File (§ 232.11), to be posted on the...

  4. 17 CFR 232.202 - Continuing hardship exemption.

    Code of Federal Regulations, 2011 CFR

    2011-04-01

    ... electronic format or post the Interactive Data File on its corporate Web site, as applicable, on the required... Interactive Data File, the electronic filer need not post on its Web site any statement with regard to the... submitted in electronic format or, in the case of an Interactive Data File (§ 232.11), to be posted on the...

  5. Data Science Bowl Launched to Improve Lung Cancer Screening | Division of Cancer Prevention

    Cancer.gov


  6. 19 CFR 351.303 - Filing, document identification, format, translation, service, and certification of documents.

    Code of Federal Regulations, 2014 CFR

    2014-04-01

    ... submit a public version of a database in pdf format. The public version of the database must be publicly... interested party that files with the Department a request for an expedited antidumping review, an..., whichever is later. If the interested party that files the request is unable to locate a particular exporter...

  7. 47 CFR 1.10008 - What are IBFS file numbers?

    Code of Federal Regulations, 2010 CFR

    2010-10-01

    ... Bureau Filing System § 1.10008 What are IBFS file numbers? (a) We assign file numbers to electronic... information, see The International Bureau Filing System File Number Format Public Notice, DA-04-568 (released... 47 Telecommunication 1 2010-10-01 2010-10-01 false What are IBFS file numbers? 1.10008 Section 1...

  8. 47 CFR 1.10008 - What are IBFS file numbers?

    Code of Federal Regulations, 2011 CFR

    2011-10-01

    ... Bureau Filing System § 1.10008 What are IBFS file numbers? (a) We assign file numbers to electronic... information, see The International Bureau Filing System File Number Format Public Notice, DA-04-568 (released... 47 Telecommunication 1 2011-10-01 2011-10-01 false What are IBFS file numbers? 1.10008 Section 1...

  9. 78 FR 30245 - Electric Reliability Organization Interpretation of Specific Requirements of the Disturbance...

    Federal Register 2010, 2011, 2012, 2013, 2014

    2013-05-22

    ... print-to-PDF format and not in a scanned format. Mail/Hand Delivery: Commenters unable to file comments.... FERC, 564 F.3d 1342 (DC Cir. 2009). 3. In March 2007, the Commission issued Order No. 693, evaluating... should be filed in native applications or print-to-PDF format and not in a scanned format. Commenters...

  10. 46 CFR 67.218 - Optional filing of instruments in portable document format as attachments to electronic mail.

    Code of Federal Regulations, 2010 CFR

    2010-10-01

    ... recording under § 67.200 may be submitted in portable document format (.pdf) as an attachment to electronic... submitted for filing in .pdf format pertains to a vessel that is not a currently documented vessel, a... with the National Vessel Documentation Center or must be submitted in .pdf format with the instrument...

  11. NIH Seeks Input on In-patient Clinical Research Areas | Division of Cancer Prevention

    Cancer.gov


  12. Pancreatic Cancer Detection Consortium (PCDC) | Division of Cancer Prevention

    Cancer.gov


  13. Reprocessing of multi-channel seismic-reflection data collected in the Beaufort Sea

    USGS Publications Warehouse

    Agena, W.F.; Lee, Myung W.; Hart, P.E.

    2000-01-01

    Contained on this set of two CD-ROMs are stacked and migrated multi-channel seismic-reflection data for 65 lines recorded in the Beaufort Sea by the United States Geological Survey in 1977. All data were reprocessed by the USGS using updated processing methods resulting in improved interpretability. Each of the two CD-ROMs contains the following files: 1) 65 files containing the digital seismic data in standard, SEG-Y format; 2) 1 file containing navigation data for the 65 lines in standard SEG-P1 format; 3) an ASCII text file with cross-reference information for relating the sequential trace numbers on each line to cdp numbers and shotpoint numbers; 4) 2 small scale graphic images (stacked and migrated) of a segment of line 722 in Adobe Acrobat (R) PDF format; 5) a graphic image of the location map, generated from the navigation file; 6) PlotSeis, an MS-DOS Application that allows PC users to interactively view the SEG-Y files; 7) a PlotSeis documentation file; and 8) an explanation of the processing used to create the final seismic sections (this document).

  14. FORMATOMATIC: a program for converting diploid allelic data between common formats for population genetic analysis.

    PubMed

    Manoukis, Nicholas C

    2007-07-01

    There has been a great increase in both the number of population genetic analysis programs and the size of data sets being studied with them. Since the file formats required by the most popular and useful programs are variable, automated reformatting or conversion between them is desirable. formatomatic is an easy to use program that can read allelic data files in genepop, raw (csv) or convert formats and create data files in nine formats: raw (csv), arlequin, genepop, immanc/bayesass +, migrate, newhybrids, msvar, baps and structure. Use of formatomatic should greatly reduce time spent reformatting data sets and avoid unnecessary errors.

  15. File formats commonly used in mass spectrometry proteomics.

    PubMed

    Deutsch, Eric W

    2012-12-01

    The application of mass spectrometry (MS) to the analysis of proteomes has enabled the high-throughput identification and abundance measurement of hundreds to thousands of proteins per experiment. However, the formidable informatics challenge associated with analyzing MS data has required a wide variety of data file formats to encode the complex data types associated with MS workflows. These formats encompass the encoding of input instruction for instruments, output products of the instruments, and several levels of information and results used by and produced by the informatics analysis tools. A brief overview of the most common file formats in use today is presented here, along with a discussion of related topics.

  16. Simple Ontology Format (SOFT)

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Sorokine, Alexandre

    2011-10-01

    The Simple Ontology Format (SOFT) library and file format specification provide a set of simple tools for developing and maintaining ontologies. The library, implemented as a Perl module, supports parsing and verification of files in SOFT format, operations on ontologies (adding, removing, or filtering entities), and conversion of ontologies into other formats. SOFT allows users to quickly create an ontology using only a basic text editor, verify it, and portray it in a graph layout system using customized styles.

  17. The Biological Observation Matrix (BIOM) format or: how I learned to stop worrying and love the ome-ome

    PubMed Central

    2012-01-01

    Background We present the Biological Observation Matrix (BIOM, pronounced “biome”) format: a JSON-based file format for representing arbitrary observation by sample contingency tables with associated sample and observation metadata. As the number of categories of comparative omics data types (collectively, the “ome-ome”) grows rapidly, a general format to represent and archive this data will facilitate the interoperability of existing bioinformatics tools and future meta-analyses. Findings The BIOM file format is supported by an independent open-source software project (the biom-format project), which initially contains Python objects that support the use and manipulation of BIOM data in Python programs, and is intended to be an open development effort where developers can submit implementations of these objects in other programming languages. Conclusions The BIOM file format and the biom-format project are steps toward reducing the “bioinformatics bottleneck” that is currently being experienced in diverse areas of biological sciences, and will help us move toward the next phase of comparative omics where basic science is translated into clinical and environmental applications. The BIOM file format is currently recognized as an Earth Microbiome Project Standard, and as a Candidate Standard by the Genomic Standards Consortium. PMID:23587224

  18. 76 FR 47606 - Sport Fishing and Boating Partnership Council

    Federal Register 2010, 2011, 2012, 2013, 2014

    2011-08-05

    ... the following formats: One hard copy with original signature, and one electronic copy via e-mail (acceptable file formats are Adobe Acrobat PDF, WordPerfect, MS Word, MS PowerPoint, or rich text file...

  19. Measles, Mumps, and Rubella (MMR) Vaccination: What Everyone Should Know

    MedlinePlus

    ... rubella combination vaccine Measles=Rubeola Measles=”10-day”, “hard” and “red” measles MMRV=measles, mumps, rubella, and varicella combination vaccine File Formats Help: How do I view different file formats ( ...

  20. 78 FR 19152 - Revisions to Modeling, Data, and Analysis Reliability Standard

    Federal Register 2010, 2011, 2012, 2013, 2014

    2013-03-29

    ... processing software should be filed in native applications or print-to-PDF format and not in a scanned format...,126 (2006), aff'd sub nom. Alcoa, Inc. v. FERC, 564 F.3d 1342 (D.C. Cir. 2009). 3. In March 2007, the... print-to-PDF format and not in a scanned format. Commenters filing electronically do not need to make a...

  1. 76 FR 75898 - Sport Fishing and Boating Partnership Council

    Federal Register 2010, 2011, 2012, 2013, 2014

    2011-12-05

    ... following formats: One hard copy with original signature, and one electronic copy via email (acceptable file format: Adobe Acrobat PDF, WordPerfect, MS Word, MS PowerPoint, or Rich Text files in IBM-PC/Windows 98/2000/XP format). Please submit your statement to Douglas Hobbs, Council Coordinator (see FOR FURTHER...

  2. 14 CFR 221.195 - Requirement for filing printed material.

    Code of Federal Regulations, 2010 CFR

    2010-01-01

    ... (AVIATION PROCEEDINGS) ECONOMIC REGULATIONS TARIFFS Electronically Filed Tariffs § 221.195 Requirement for filing printed material. (a) Any tariff, or revision thereto, filed in paper format which accompanies....190(b). Further, such paper tariff, or revision thereto, shall be filed in accordance with the...

  3. 18 CFR 35.7 - Electronic filing requirements.

    Code of Federal Regulations, 2011 CFR

    2011-04-01

    ... 18 Conservation of Power and Water Resources 1 2011-04-01 2011-04-01 false Electronic filing... § 35.7 Electronic filing requirements. (a) General rule. All filings made in proceedings initiated... declarations or statements and electronic signatures. (c) Format requirements for electronic filing. The...

  4. 18 CFR 35.7 - Electronic filing requirements.

    Code of Federal Regulations, 2012 CFR

    2012-04-01

    ... 18 Conservation of Power and Water Resources 1 2012-04-01 2012-04-01 false Electronic filing... § 35.7 Electronic filing requirements. (a) General rule. All filings made in proceedings initiated... declarations or statements and electronic signatures. (c) Format requirements for electronic filing. The...

  5. 18 CFR 35.7 - Electronic filing requirements.

    Code of Federal Regulations, 2013 CFR

    2013-04-01

    ... 18 Conservation of Power and Water Resources 1 2013-04-01 2013-04-01 false Electronic filing... § 35.7 Electronic filing requirements. (a) General rule. All filings made in proceedings initiated... declarations or statements and electronic signatures. (c) Format requirements for electronic filing. The...

  6. 18 CFR 35.7 - Electronic filing requirements.

    Code of Federal Regulations, 2014 CFR

    2014-04-01

    ... 18 Conservation of Power and Water Resources 1 2014-04-01 2014-04-01 false Electronic filing... § 35.7 Electronic filing requirements. (a) General rule. All filings made in proceedings initiated... declarations or statements and electronic signatures. (c) Format requirements for electronic filing. The...

  7. NIMBUS 7 Earth Radiation Budget (ERB) Matrix User's Guide. Volume 2: Tape Specifications

    NASA Technical Reports Server (NTRS)

    Ray, S. N.; Vasanth, K. L.

    1984-01-01

    The ERB MATRIX tape is generated by an IBM 3081 computer program and is a 9 track, 1600 BPI tape. The gross format of the tape, given on Page 1, shows an initial standard header file followed by data files. The standard header file contains two standard header records. A trailing documentation file (TDF) is the last file on the tape. Pages 9 through 17 describe, in detail, the standard header file and the TDF. The data files contain data for 37 different ERB parameters. Each file has data based on either a daily, 6 day cyclic, or monthly time interval. There are three types of physical records in the data files, namely the world grid physical record, the documentation mercator/polar map projection physical record, and the monthly calibration physical record. The manner in which the data for the 37 ERB parameters are stored in the physical records comprising the data files is given in the gross format section.

  8. Extracting the Data From the LCM vk4 Formatted Output File

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Wendelberger, James G.

    These are slides about extracting the data from the LCM vk4 formatted output file. The following is covered: vk4 file produced by Keyence VK Software, custom analysis, no off-the-shelf way to read the file, reading the binary data in a vk4 file, various offsets in decimal lines, finding the height image data, directly in MATLAB, binary output beginning of height image data, color image information, color image binary data, color image decimal and binary data, MATLAB code to read vk4 file (choose a file, read the file, compute offsets, read optical image, laser optical image, read and compute laser intensity image, read height image, timing, display height image, display laser intensity image, display RGB laser optical images, display RGB optical images, display beginning data and save images to workspace, gamma correction subroutine), reading intensity from the vk4 file, linear in the low range, linear in the high range, gamma correction for vk4 files, computing the gamma intensity correction, observations.

  9. Experimental Directory Structure (Exdir): An Alternative to HDF5 Without Introducing a New File Format

    PubMed Central

    Dragly, Svenn-Arne; Hobbi Mobarhan, Milad; Lepperød, Mikkel E.; Tennøe, Simen; Fyhn, Marianne; Hafting, Torkel; Malthe-Sørenssen, Anders

    2018-01-01

    Natural sciences generate an increasing amount of data in a wide range of formats developed by different research groups and commercial companies. At the same time there is a growing desire to share data along with publications in order to enable reproducible research. Open formats have publicly available specifications which facilitate data sharing and reproducible research. Hierarchical Data Format 5 (HDF5) is a popular open format widely used in neuroscience, often as a foundation for other, more specialized formats. However, drawbacks related to HDF5's complex specification have initiated a discussion for an improved replacement. We propose a novel alternative, the Experimental Directory Structure (Exdir), an open specification for data storage in experimental pipelines which amends drawbacks associated with HDF5 while retaining its advantages. HDF5 stores data and metadata in a hierarchy within a complex binary file which, among other things, is not human-readable, not optimal for version control systems, and lacks support for easy access to raw data from external applications. Exdir, on the other hand, uses file system directories to represent the hierarchy, with metadata stored in human-readable YAML files, datasets stored in binary NumPy files, and raw data stored directly in subdirectories. Furthermore, storing data in multiple files makes it easier to track for version control systems. Exdir is not a file format in itself, but a specification for organizing files in a directory structure. Exdir uses the same abstractions as HDF5 and is compatible with the HDF5 Abstract Data Model. Several research groups are already using data stored in a directory hierarchy as an alternative to HDF5, but no common standard exists. This complicates and limits the opportunity for data sharing and development of common tools for reading, writing, and analyzing data. 
Exdir facilitates improved data storage, data sharing, reproducible research, and novel insight from interdisciplinary collaboration. With the publication of Exdir, we invite the scientific community to join the development to create an open specification that will serve as many needs as possible and as a foundation for open access to and exchange of data. PMID:29706879

  10. Experimental Directory Structure (Exdir): An Alternative to HDF5 Without Introducing a New File Format.

    PubMed

    Dragly, Svenn-Arne; Hobbi Mobarhan, Milad; Lepperød, Mikkel E; Tennøe, Simen; Fyhn, Marianne; Hafting, Torkel; Malthe-Sørenssen, Anders

    2018-01-01

    Natural sciences generate an increasing amount of data in a wide range of formats developed by different research groups and commercial companies. At the same time there is a growing desire to share data along with publications in order to enable reproducible research. Open formats have publicly available specifications which facilitate data sharing and reproducible research. Hierarchical Data Format 5 (HDF5) is a popular open format widely used in neuroscience, often as a foundation for other, more specialized formats. However, drawbacks related to HDF5's complex specification have initiated a discussion for an improved replacement. We propose a novel alternative, the Experimental Directory Structure (Exdir), an open specification for data storage in experimental pipelines which amends drawbacks associated with HDF5 while retaining its advantages. HDF5 stores data and metadata in a hierarchy within a complex binary file which, among other things, is not human-readable, not optimal for version control systems, and lacks support for easy access to raw data from external applications. Exdir, on the other hand, uses file system directories to represent the hierarchy, with metadata stored in human-readable YAML files, datasets stored in binary NumPy files, and raw data stored directly in subdirectories. Furthermore, storing data in multiple files makes it easier to track for version control systems. Exdir is not a file format in itself, but a specification for organizing files in a directory structure. Exdir uses the same abstractions as HDF5 and is compatible with the HDF5 Abstract Data Model. Several research groups are already using data stored in a directory hierarchy as an alternative to HDF5, but no common standard exists. This complicates and limits the opportunity for data sharing and development of common tools for reading, writing, and analyzing data. 
Exdir facilitates improved data storage, data sharing, reproducible research, and novel insight from interdisciplinary collaboration. With the publication of Exdir, we invite the scientific community to join the development to create an open specification that will serve as many needs as possible and as a foundation for open access to and exchange of data.

  11. Five Tips to Help Prevent Infections

    MedlinePlus

    ... Information For… Media Policy Makers 5 Tips to Help Prevent Infections Language: English (US) Español (Spanish) Recommend ... Makers Language: English (US) Español (Spanish) File Formats Help: How do I view different file formats (PDF, ...

  12. ISA-TAB-Nano: a specification for sharing nanomaterial research data in spreadsheet-based format.

    PubMed

    Thomas, Dennis G; Gaheen, Sharon; Harper, Stacey L; Fritts, Martin; Klaessig, Fred; Hahn-Dantona, Elizabeth; Paik, David; Pan, Sue; Stafford, Grace A; Freund, Elaine T; Klemm, Juli D; Baker, Nathan A

    2013-01-14

    The high-throughput genomics communities have been successfully using standardized spreadsheet-based formats to capture and share data within labs and among public repositories. The nanomedicine community has yet to adopt similar standards to share the diverse and multi-dimensional types of data (including metadata) pertaining to the description and characterization of nanomaterials. Owing to the lack of standardization in representing and sharing nanomaterial data, most of the data currently shared via publications and data resources are incomplete, poorly-integrated, and not suitable for meaningful interpretation and re-use of the data. Specifically, in its current state, data cannot be effectively utilized for the development of predictive models that will inform the rational design of nanomaterials. We have developed a specification called ISA-TAB-Nano, which comprises four spreadsheet-based file formats for representing and integrating various types of nanomaterial data. Three file formats (Investigation, Study, and Assay files) have been adapted from the established ISA-TAB specification; while the Material file format was developed de novo to more readily describe the complexity of nanomaterials and associated small molecules. In this paper, we have discussed the main features of each file format and how to use them for sharing nanomaterial descriptions and assay metadata. The ISA-TAB-Nano file formats provide a general and flexible framework to record and integrate nanomaterial descriptions, assay data (metadata and endpoint measurements) and protocol information. Like ISA-TAB, ISA-TAB-Nano supports the use of ontology terms to promote standardized descriptions and to facilitate search and integration of the data. The ISA-TAB-Nano specification has been submitted as an ASTM work item to obtain community feedback and to provide a nanotechnology data-sharing standard for public development and adoption.

  13. ISA-TAB-Nano: A Specification for Sharing Nanomaterial Research Data in Spreadsheet-based Format

    PubMed Central

    2013-01-01

    Background and motivation The high-throughput genomics communities have been successfully using standardized spreadsheet-based formats to capture and share data within labs and among public repositories. The nanomedicine community has yet to adopt similar standards to share the diverse and multi-dimensional types of data (including metadata) pertaining to the description and characterization of nanomaterials. Owing to the lack of standardization in representing and sharing nanomaterial data, most of the data currently shared via publications and data resources are incomplete, poorly-integrated, and not suitable for meaningful interpretation and re-use of the data. Specifically, in its current state, data cannot be effectively utilized for the development of predictive models that will inform the rational design of nanomaterials. Results We have developed a specification called ISA-TAB-Nano, which comprises four spreadsheet-based file formats for representing and integrating various types of nanomaterial data. Three file formats (Investigation, Study, and Assay files) have been adapted from the established ISA-TAB specification; while the Material file format was developed de novo to more readily describe the complexity of nanomaterials and associated small molecules. In this paper, we have discussed the main features of each file format and how to use them for sharing nanomaterial descriptions and assay metadata. Conclusion The ISA-TAB-Nano file formats provide a general and flexible framework to record and integrate nanomaterial descriptions, assay data (metadata and endpoint measurements) and protocol information. Like ISA-TAB, ISA-TAB-Nano supports the use of ontology terms to promote standardized descriptions and to facilitate search and integration of the data. The ISA-TAB-Nano specification has been submitted as an ASTM work item to obtain community feedback and to provide a nanotechnology data-sharing standard for public development and adoption. 
PMID:23311978

  14. 77 FR 12108 - Denver & Rio Grande Railway Historical Foundation d/b/a Denver & Rio Grande Railroad, L.L.C...

    Federal Register 2010, 2011, 2012, 2013, 2014

    2012-02-28

    ... via the Board's e-filing format or in the traditional paper format. Any person using e-filing should attach a document and otherwise comply with the instructions at the E-FILING link on the Board's Web site....S.C. 554(e). DRGHF requests that the Board issue an order declaring that municipal zoning law is...

  15. A user-friendly application for the extraction of kubios hrv output to an optimal format for statistical analysis - biomed 2011.

    PubMed

    Johnsen Lind, Andreas; Helge Johnsen, Bjorn; Hill, Labarron K; Sollers Iii, John J; Thayer, Julian F

    2011-01-01

    The aim of the present manuscript is to present a user-friendly and flexible platform for transforming Kubios HRV output files to an .xls-file format, used by MS Excel. The program utilizes either native or bundled Java and is platform-independent and mobile. This means that it can run without being installed on a computer. It also has an option of continuous transferring of data indicating that it can run in the background while Kubios produces output files. The program checks for changes in the file structure and automatically updates the .xls- output file.

  16. 5 CFR 1201.14 - Electronic filing procedures.

    Code of Federal Regulations, 2010 CFR

    2010-01-01

    ... (PDF), and image files (files created by scanning). A list of formats allowed can be found at e-Appeal..., or by uploading the supporting documents in the form of one or more PDF files in which each...

  17. C2x: A tool for visualisation and input preparation for CASTEP and other electronic structure codes

    NASA Astrophysics Data System (ADS)

    Rutter, M. J.

    2018-04-01

    The c2x code fills two distinct roles. Its first role is in acting as a converter between the binary format .check files from the widely-used CASTEP [1] electronic structure code and various visualisation programs. Its second role is to manipulate and analyse the input and output files from a variety of electronic structure codes, including CASTEP, ONETEP and VASP, as well as the widely-used 'Gaussian cube' file format. Analysis includes symmetry analysis, and manipulation includes arbitrary cell transformations. It continues to be under development, with growing functionality, and is written in a form which would make it easy to extend it to working directly with files from other electronic structure codes. Data which c2x is capable of extracting from CASTEP's binary checkpoint files include charge densities, spin densities, wavefunctions, relaxed atomic positions, forces, the Fermi level, the total energy, and symmetry operations. It can recreate .cell input files from checkpoint files. Volumetric data can be output in formats useable by many common visualisation programs, and c2x will itself calculate integrals, expand data into supercells, and interpolate data via combinations of Fourier and trilinear interpolation. It can extract data along arbitrary lines (such as lines between atoms) as 1D output. C2x is able to convert between several common formats for describing molecules and crystals, including the .cell format of CASTEP. It can construct supercells, reduce cells to their primitive form, and add specified k-point meshes. It uses the spglib library [2] to report symmetry information, which it can add to .cell files. C2x is a command-line utility, so is readily included in scripts. It is available under the GPL and can be obtained from http://www.c2x.org.uk. It is believed to be the only open-source code which can read CASTEP's .check files, so it will have utility in other projects.

  18. File Formats Commonly Used in Mass Spectrometry Proteomics*

    PubMed Central

    Deutsch, Eric W.

    2012-01-01

    The application of mass spectrometry (MS) to the analysis of proteomes has enabled the high-throughput identification and abundance measurement of hundreds to thousands of proteins per experiment. However, the formidable informatics challenge associated with analyzing MS data has required a wide variety of data file formats to encode the complex data types associated with MS workflows. These formats encompass the encoding of input instruction for instruments, output products of the instruments, and several levels of information and results used by and produced by the informatics analysis tools. A brief overview of the most common file formats in use today is presented here, along with a discussion of related topics. PMID:22956731

  19. 75 FR 47624 - Sport Fishing and Boating Partnership Council

    Federal Register 2010, 2011, 2012, 2013, 2014

    2010-08-06

    ... Coordinator in both of the following formats: One hard copy with original signature, and one electronic copy via e-mail (acceptable file format: Adobe Acrobat PDF, WordPerfect, MS Word, MS PowerPoint, or Rich Text files in IBM-PC/Windows 98/2000/XP format). In order to attend this meeting, you must register by...

  20. Performance regression manager for large scale systems

    DOEpatents

    Faraj, Daniel A.

    2017-10-17

    System and computer program product to perform an operation comprising generating, based on a first output generated by a first execution instance of a command, a first output file specifying a value of at least one performance metric, wherein the first output file is formatted according to a predefined format, comparing the value of the at least one performance metric in the first output file to a value of the performance metric in a second output file, the second output file having been generated based on a second output generated by a second execution instance of the command, and outputting for display an indication of a result of the comparison of the value of the at least one performance metric of the first output file to the value of the at least one performance metric of the second output file.
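
    The comparison described in this abstract can be illustrated with a minimal sketch. It is not the patented method: the `name=value` file layout, the function names `parse_metrics` and `compare_metrics`, and the 5% tolerance are all assumptions made here for illustration.

```python
def parse_metrics(text):
    """Parse 'name=value' lines into a dict of floats.

    The 'name=value' layout is a hypothetical stand-in for the
    'predefined format' mentioned in the abstract.
    """
    metrics = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        name, _, value = line.partition("=")
        metrics[name.strip()] = float(value)
    return metrics


def compare_metrics(first, second, tolerance=0.05):
    """Compare metrics shared by two runs.

    Returns {name: (first_value, second_value, label)}, where label is
    'within tolerance' or 'changed'. The tolerance is relative to the
    first value; it is an illustrative choice, not from the source.
    """
    report = {}
    for name in sorted(first.keys() & second.keys()):
        old, new = first[name], second[name]
        if old == 0:
            within = new == 0
        else:
            within = abs(new - old) / abs(old) <= tolerance
        report[name] = (old, new, "within tolerance" if within else "changed")
    return report


if __name__ == "__main__":
    run_a = "latency_ms=100\nthroughput=50\n"
    run_b = "latency_ms=120\nthroughput=51\n"
    # Output an indication of the comparison result for each metric.
    for name, (old, new, label) in compare_metrics(
        parse_metrics(run_a), parse_metrics(run_b)
    ).items():
        print(f"{name}: {old} -> {new} ({label})")
```

    Whether a change counts as a regression depends on the metric (higher latency is worse, higher throughput is better), so the sketch only flags changes and leaves that judgment to the caller.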

  1. Performance regression manager for large scale systems

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Faraj, Daniel A.

    Methods comprising generating, based on a first output generated by a first execution instance of a command, a first output file specifying a value of at least one performance metric, wherein the first output file is formatted according to a predefined format, comparing the value of the at least one performance metric in the first output file to a value of the performance metric in a second output file, the second output file having been generated based on a second output generated by a second execution instance of the command, and outputting for display an indication of a result of the comparison of the value of the at least one performance metric of the first output file to the value of the at least one performance metric of the second output file.

  2. Efficient stereoscopic contents file format on the basis of ISO base media file format

    NASA Astrophysics Data System (ADS)

    Kim, Kyuheon; Lee, Jangwon; Suh, Doug Young; Park, Gwang Hoon

    2009-02-01

    Although 3D content has been widely used for multimedia services, real 3D video content has been adopted only for a limited range of applications, such as specially designed 3D cinemas. This is because of the difficulty of capturing real 3D video content and the limited display devices available on the market. Recently, however, diverse types of display devices for stereoscopic video content have been released. In particular, a mobile phone with a stereoscopic camera has been released, which lets a user, as a consumer, have a more realistic experience without glasses and, as a content creator, take stereoscopic images or record stereoscopic video. However, a user can only store and display the acquired stereoscopic content on his or her own devices because no common file format for such content exists. This limitation prevents users from sharing their content with others, which makes it difficult for the market for stereoscopic content to expand. Therefore, this paper proposes a common file format for stereoscopic content, based on the ISO base media file format, which enables users to store and exchange pure stereoscopic content. This technology is also currently under development as an international MPEG standard called the stereoscopic video application format.

  3. 75 FR 5066 - Commission Information Collection Activities (FERC Form 60,1

    Federal Register 2010, 2011, 2012, 2013, 2014

    2010-02-01

    ... corresponding dockets and collection numbers.) Comments may be filed either electronically or in paper format. Those persons filing electronically do not need to make a paper filing. Documents filed electronically... acknowledgement to the sender's e-mail address upon receipt of comments. For paper filings, the comments should...

  4. Vidjil: A Web Platform for Analysis of High-Throughput Repertoire Sequencing.

    PubMed

    Duez, Marc; Giraud, Mathieu; Herbert, Ryan; Rocher, Tatiana; Salson, Mikaël; Thonier, Florian

    2016-01-01

    The B and T lymphocytes are white blood cells playing a key role in adaptive immunity. A part of their DNA, called the V(D)J recombinations, is specific to each lymphocyte, and enables recognition of specific antigens. Today, with new sequencing techniques, one can get billions of DNA sequences from these regions. With dedicated Repertoire Sequencing (RepSeq) methods, it is now possible to picture populations of lymphocytes, and to monitor more accurately the immune response as well as pathologies such as leukemia. Vidjil is an open-source platform for the interactive analysis of high-throughput sequencing data from lymphocyte recombinations. It contains an algorithm gathering reads into clonotypes according to their V(D)J junctions, a web application made of a sample, experiment and patient database and a visualization for the analysis of clonotypes over time. Vidjil is implemented in C++, Python and JavaScript and licensed under the GPLv3 open-source license. Source code, binaries and a public web server are available at http://www.vidjil.org and at http://bioinfo.lille.inria.fr/vidjil. Using the Vidjil web application consists of four steps: 1. uploading a raw sequence file (typically a FASTQ); 2. running RepSeq analysis software; 3. visualizing the results; 4. annotating the results and saving them for future use. For the end-user, the Vidjil web application needs no specific installation and just requires a connection and a modern web browser. Vidjil is used by labs in hematology or immunology for research and clinical applications.

  5. Vidjil: A Web Platform for Analysis of High-Throughput Repertoire Sequencing

    PubMed Central

    Duez, Marc; Herbert, Ryan; Rocher, Tatiana; Salson, Mikaël; Thonier, Florian

    2016-01-01

    Background The B and T lymphocytes are white blood cells playing a key role in adaptive immunity. A part of their DNA, called the V(D)J recombinations, is specific to each lymphocyte, and enables recognition of specific antigens. Today, with new sequencing techniques, one can get billions of DNA sequences from these regions. With dedicated Repertoire Sequencing (RepSeq) methods, it is now possible to picture populations of lymphocytes, and to monitor more accurately the immune response as well as pathologies such as leukemia. Methods and Results Vidjil is an open-source platform for the interactive analysis of high-throughput sequencing data from lymphocyte recombinations. It contains an algorithm gathering reads into clonotypes according to their V(D)J junctions, a web application made of a sample, experiment and patient database and a visualization for the analysis of clonotypes over time. Vidjil is implemented in C++, Python and JavaScript and licensed under the GPLv3 open-source license. Source code, binaries and a public web server are available at http://www.vidjil.org and at http://bioinfo.lille.inria.fr/vidjil. Using the Vidjil web application consists of four steps: 1. uploading a raw sequence file (typically a FASTQ); 2. running RepSeq analysis software; 3. visualizing the results; 4. annotating the results and saving them for future use. For the end-user, the Vidjil web application needs no specific installation and just requires a connection and a modern web browser. Vidjil is used by labs in hematology or immunology for research and clinical applications. PMID:27835690

  6. Validation of Splicing Events in Transcriptome Sequencing Data

    PubMed Central

    Kaisers, Wolfgang; Ptok, Johannes; Schwender, Holger; Schaal, Heiner

    2017-01-01

    Genomic alignments of sequenced cellular messenger RNA contain gapped alignments which are interpreted as consequence of intron removal. The resulting gap-sites, genomic locations of alignment gaps, are landmarks representing potential splice-sites. As alignment algorithms report gap-sites with a considerable false discovery rate, validations are required. We describe two quality scores, gap quality score (gqs) and weighted gap information score (wgis), developed for validation of putative splicing events: While gqs solely relies on alignment data wgis additionally considers information from the genomic sequence. FASTQ files obtained from 54 human dermal fibroblast samples were aligned against the human genome (GRCh38) using TopHat and STAR aligner. Statistical properties of gap-sites validated by gqs and wgis were evaluated by their sequence similarity to known exon-intron borders. Within the 54 samples, TopHat identifies 1,000,380 and STAR reports 6,487,577 gap-sites. Due to the lack of strand information, however, the percentage of identified GT-AG gap-sites is rather low. While gap-sites from TopHat contain ≈89% GT-AG, gap-sites from STAR only contain ≈42% GT-AG dinucleotide pairs in merged data from 54 fibroblast samples. Validation with gqs yields 156,251 gap-sites from TopHat alignments and 166,294 from STAR alignments. Validation with wgis yields 770,327 gap-sites from TopHat alignments and 1,065,596 from STAR alignments. Both alignment algorithms, TopHat and STAR, report gap-sites with considerable false discovery rate, which can drastically be reduced by validation with gqs and wgis. PMID:28545234
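
    The GT-AG percentages quoted in this abstract can be made concrete with a small sketch. It does not implement gqs or wgis; it only checks whether a gap-site is flanked by the canonical GT...AG dinucleotides. The function names, the 0-based half-open coordinates, and the forward-strand-only check are assumptions made here for illustration.

```python
def is_gt_ag(genome, gap_start, gap_end):
    """Return True if genome[gap_start:gap_end] begins with GT and ends with AG.

    Coordinates are 0-based and half-open (an assumption of this sketch).
    Only the forward strand is checked; an intron on the reverse strand
    appears as CT...AC in forward coordinates and is not counted here.
    """
    gap = genome[gap_start:gap_end].upper()
    return len(gap) >= 4 and gap.startswith("GT") and gap.endswith("AG")


def gt_ag_fraction(genome, gap_sites):
    """Fraction of gap-sites, given as (start, end) pairs, that are GT-AG."""
    if not gap_sites:
        return 0.0
    hits = sum(is_gt_ag(genome, start, end) for start, end in gap_sites)
    return hits / len(gap_sites)


if __name__ == "__main__":
    # Toy sequence: one forward-strand GT...AG gap and one CT...AC gap.
    genome = "CCCC" + "GTAAAAAG" + "TTTT" + "CTAAAAAC" + "GGGG"
    gap_sites = [(4, 12), (16, 24)]
    print(gt_ag_fraction(genome, gap_sites))
```

    Because the second toy gap reads CT...AC, a forward-only check misses it, which is one way missing strand information can lower the apparent GT-AG percentage.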

  7. The ICR96 exon CNV validation series: a resource for orthogonal assessment of exon CNV calling in NGS data.

    PubMed

    Mahamdallie, Shazia; Ruark, Elise; Yost, Shawn; Ramsay, Emma; Uddin, Imran; Wylie, Harriett; Elliott, Anna; Strydom, Ann; Renwick, Anthony; Seal, Sheila; Rahman, Nazneen

    2017-01-01

    Detection of deletions and duplications of whole exons (exon CNVs) is a key requirement of genetic testing. Accurate detection of this variant type has proved very challenging in targeted next-generation sequencing (NGS) data, particularly if only a single exon is involved. Many different NGS exon CNV calling methods have been developed over the last five years. Such methods are usually evaluated using simulated and/or in-house data due to a lack of publicly-available datasets with orthogonally generated results. This hinders tool comparisons, transparency and reproducibility. To provide a community resource for assessment of exon CNV calling methods in targeted NGS data, we here present the ICR96 exon CNV validation series. The dataset includes high-quality sequencing data from a targeted NGS assay (the TruSight Cancer Panel) together with Multiplex Ligation-dependent Probe Amplification (MLPA) results for 96 independent samples. 66 samples contain at least one validated exon CNV and 30 samples have validated negative results for exon CNVs in 26 genes. The dataset includes 46 exon CNVs in BRCA1, BRCA2, TP53, MLH1, MSH2, MSH6, PMS2, EPCAM or PTEN, giving excellent representation of the cancer predisposition genes most frequently tested in clinical practice. Moreover, the validated exon CNVs include 25 single exon CNVs, the most difficult type of exon CNV to detect. The FASTQ files for the ICR96 exon CNV validation series can be accessed through the European Genome-phenome Archive (EGA) under the accession number EGAS00001002428.

  8. The RNASeq-er API-a gateway to systematically updated analysis of public RNA-seq data.

    PubMed

    Petryszak, Robert; Fonseca, Nuno A; Füllgrabe, Anja; Huerta, Laura; Keays, Maria; Tang, Y Amy; Brazma, Alvis

    2017-07-15

    The exponential growth of publicly available RNA-sequencing (RNA-Seq) data poses an increasing challenge to researchers wishing to discover, analyse and store such data, particularly those based in institutions with limited computational resources. EMBL-EBI is in an ideal position to address these challenges and to allow the scientific community easy access to not just raw, but also processed RNA-Seq data. We present a Web service to access the results of a systematically and continually updated standardized alignment as well as gene and exon expression quantification of all public bulk (and in the near future also single-cell) RNA-Seq runs in 264 species in European Nucleotide Archive, using Representational State Transfer. The RNASeq-er API (Application Programming Interface) enables ontology-powered search for and retrieval of CRAM, bigwig and bedGraph files, gene and exon expression quantification matrices (Fragments Per Kilobase Of Exon Per Million Fragments Mapped, Transcripts Per Million, raw counts) as well as sample attributes annotated with ontology terms. To date over 270 00 RNA-Seq runs in nearly 10 000 studies (1PB of raw FASTQ data) in 264 species in ENA have been processed and made available via the API. The RNASeq-er API can be accessed at http://www.ebi.ac.uk/fg/rnaseq/api. The commands used to analyse the data are available in supplementary materials and at https://github.com/nunofonseca/irap/wiki/iRAP-single-library. rnaseq@ebi.ac.uk; rpetry@ebi.ac.uk. Supplementary data are available at Bioinformatics online. © The Author(s) 2017. Published by Oxford University Press.
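
    The quantification units named in this abstract (FPKM and TPM) can be related to raw counts with a short sketch using the standard definitions. It is not the RNASeq-er pipeline: the function names and the toy counts are assumptions, and the total of the supplied counts is used here as the number of mapped fragments.

```python
def fpkm(counts, lengths_bp):
    """Fragments Per Kilobase of exon per Million fragments mapped.

    counts: {gene: raw fragment count}; lengths_bp: {gene: exon length in bp}.
    Assumes the sum of the supplied counts is the total of mapped fragments
    and that this total is greater than zero.
    """
    total = sum(counts.values())
    return {g: c * 1e9 / (lengths_bp[g] * total) for g, c in counts.items()}


def tpm(counts, lengths_bp):
    """Transcripts Per Million: length-normalised rates rescaled to sum to 1e6."""
    rates = {g: c / lengths_bp[g] for g, c in counts.items()}
    scale = sum(rates.values())
    return {g: r / scale * 1e6 for g, r in rates.items()}


if __name__ == "__main__":
    # Hypothetical toy data: gene B is three times longer and has three times the counts.
    counts = {"A": 100, "B": 300}
    lengths_bp = {"A": 1000, "B": 3000}
    print(fpkm(counts, lengths_bp))
    print(tpm(counts, lengths_bp))
```

    In the toy data both genes have the same count per base, so they receive equal FPKM and equal TPM values even though their raw counts differ.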

  9. NASA-IGES Translator and Viewer

    NASA Technical Reports Server (NTRS)

    Chou, Jin J.; Logan, Michael A.

    1995-01-01

    NASA-IGES Translator (NIGEStranslator) is a batch program that translates a general IGES (Initial Graphics Exchange Specification) file to a NASA-IGES-Nurbs-Only (NINO) file. IGES is the most popular geometry exchange standard among Computer Aided Geometric Design (CAD) systems. NINO format is a subset of IGES, implementing the simple and yet the most popular NURBS (Non-Uniform Rational B-Splines) representation. NIGEStranslator converts a complex IGES file to the simpler NINO file to simplify the tasks of CFD grid generation for models in CAD format. The NASA-IGES Viewer (NIGESview) is an Open-Inventor-based, highly interactive viewer/ editor for NINO files. Geometry in the IGES files can be viewed, copied, transformed, deleted, and inquired. Users can use NIGEStranslator to translate IGES files from CAD systems to NINO files. The geometry then can be examined with NIGESview. Extraneous geometries can be interactively removed, and the cleaned model can be written to an IGES file, ready to be used in grid generation.

  10. [Intranet-based integrated information system of radiotherapy-related images and diagnostic reports].

    PubMed

    Nakamura, R; Sasaki, M; Oikawa, H; Harada, S; Tamakawa, Y

    2000-03-01

    To use an intranet technique to develop an information system that simultaneously supports both diagnostic reports and radiotherapy planning images. Using a file server as the gateway, a radiation oncology LAN was connected to an already operative RIS LAN. Dose-distribution images were saved in tagged-image-file format by way of a screen dump to the file server. X-ray simulator images and portal images were saved in encapsulated postscript format in the file server and automatically converted to portable document format. The files on the file server were automatically registered to the Web server by the search engine and were available for searching and browsing using the Web browser. It took less than a minute to register planning images. For clients, searching and browsing the file took less than 3 seconds. Over 150,000 reports and 4,000 images from a six-month period were accessible. Because the intranet technique was used, construction and maintenance were completed without specialist expertise. Prompt access to essential information about radiotherapy has been made possible by this system. It promotes public access to radiotherapy planning that may improve the quality of treatment.

  11. 77 FR 60138 - Trinity Adaptive Management Working Group; Public Teleconference/Web-Based Meeting

    Federal Register 2010, 2011, 2012, 2013, 2014

    2012-10-02

    ... statements must be supplied to Elizabeth Hadley in one of the following formats: One hard copy with original... file formats are Adobe Acrobat PDF, MS Word, PowerPoint, or rich text file). Registered speakers who...

  12. E-submission chronic toxicology study supplemental files

    EPA Pesticide Factsheets

    The formats and instructions in these documents are designed to be used as an example or guide for registrants to format electronic files for submission of animal toxicology data to OPP for review in support of registration and reevaluation of pesticides.

  13. MMTF-An efficient file format for the transmission, visualization, and analysis of macromolecular structures.

    PubMed

    Bradley, Anthony R; Rose, Alexander S; Pavelka, Antonín; Valasatava, Yana; Duarte, Jose M; Prlić, Andreas; Rose, Peter W

    2017-06-01

    Recent advances in experimental techniques have led to a rapid growth in complexity, size, and number of macromolecular structures that are made available through the Protein Data Bank. This creates a challenge for macromolecular visualization and analysis. Macromolecular structure files, such as PDB or PDBx/mmCIF files can be slow to transfer, parse, and hard to incorporate into third-party software tools. Here, we present a new binary and compressed data representation, the MacroMolecular Transmission Format, MMTF, as well as software implementations in several languages that have been developed around it, which address these issues. We describe the new format and its APIs and demonstrate that it is several times faster to parse, and about a quarter of the file size of the current standard format, PDBx/mmCIF. As a consequence of the new data representation, it is now possible to visualize structures with millions of atoms in a web browser, keep the whole PDB archive in memory or parse it within few minutes on average computers, which opens up a new way of thinking how to design and implement efficient algorithms in structural bioinformatics. The PDB archive is available in MMTF file format through web services and data that are updated on a weekly basis.

  14. MMTF—An efficient file format for the transmission, visualization, and analysis of macromolecular structures

    PubMed Central

    Pavelka, Antonín; Valasatava, Yana; Prlić, Andreas

    2017-01-01

    Recent advances in experimental techniques have led to a rapid growth in complexity, size, and number of macromolecular structures that are made available through the Protein Data Bank. This creates a challenge for macromolecular visualization and analysis. Macromolecular structure files, such as PDB or PDBx/mmCIF files can be slow to transfer, parse, and hard to incorporate into third-party software tools. Here, we present a new binary and compressed data representation, the MacroMolecular Transmission Format, MMTF, as well as software implementations in several languages that have been developed around it, which address these issues. We describe the new format and its APIs and demonstrate that it is several times faster to parse, and about a quarter of the file size of the current standard format, PDBx/mmCIF. As a consequence of the new data representation, it is now possible to visualize structures with millions of atoms in a web browser, keep the whole PDB archive in memory or parse it within few minutes on average computers, which opens up a new way of thinking how to design and implement efficient algorithms in structural bioinformatics. The PDB archive is available in MMTF file format through web services and data that are updated on a weekly basis. PMID:28574982

  15. ChemEngine: harvesting 3D chemical structures of supplementary data from PDF files.

    PubMed

    Karthikeyan, Muthukumarasamy; Vyas, Renu

    2016-01-01

    Digital access to chemical journals resulted in a vast array of molecular information that is now available in the supplementary material files in PDF format. However, extracting this molecular information, generally from a PDF document format, is a daunting task. Here we present an approach to harvest 3D molecular data from the supporting information of scientific research articles that are normally available from publisher's resources. In order to demonstrate the feasibility of extracting truly computable molecules from PDF file formats in a fast and efficient manner, we have developed a Java based application, namely ChemEngine. This program recognizes textual patterns from the supplementary data and generates standard molecular structure data (bond matrix, atomic coordinates) that can be subjected to a multitude of computational processes automatically. The methodology has been demonstrated via several case studies on different formats of coordinates data stored in supplementary information files, wherein ChemEngine selectively harvested the atomic coordinates and interpreted them as molecules with high accuracy. The reusability of extracted molecular coordinate data was demonstrated by computing Single Point Energies that were in close agreement with the original computed data provided with the articles. It is envisaged that the methodology will enable large scale conversion of molecular information from supplementary files available in the PDF format into a collection of ready-to-compute molecular data to create an automated workflow for advanced computational processes. Software along with source codes and instructions available at https://sourceforge.net/projects/chemengine/files/?source=navbar.
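
    The pattern-recognition step described in this abstract can be illustrated with a minimal sketch. It is written in Python rather than ChemEngine's Java, and the line pattern (an element symbol followed by three decimal numbers) and the name `harvest_atoms` are assumptions made for illustration, not ChemEngine's actual rules or API.

```python
import re

# Hypothetical pattern: an element symbol followed by three decimal coordinates.
# Real supplementary files use many layouts; this covers only one of them.
ATOM_LINE = re.compile(
    r"^\s*([A-Z][a-z]?)\s+(-?\d+\.\d+)\s+(-?\d+\.\d+)\s+(-?\d+\.\d+)\s*$"
)

def harvest_atoms(text):
    """Return (symbol, x, y, z) tuples for every line that looks like an atom record."""
    atoms = []
    for line in text.splitlines():
        match = ATOM_LINE.match(line)
        if match:
            symbol, x, y, z = match.groups()
            atoms.append((symbol, float(x), float(y), float(z)))
    return atoms

# Text as it might appear after PDF-to-text extraction (invented sample).
sample = """Optimized geometry of water (Table S1)
O   0.000000   0.000000   0.117300
H   0.000000   0.757200  -0.469200
H   0.000000  -0.757200  -0.469200
Total energy: -76.4 hartree
"""
atoms = harvest_atoms(sample)
```

    Lines that do not match the pattern, such as the caption and the energy line, are skipped, which is one simple way to harvest coordinates selectively from surrounding prose.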

  16. Network Configuration Analysis for Formation Flying Satellites

    NASA Technical Reports Server (NTRS)

    Knoblock, Eric J.; Wallett, Thomas M.; Konangi, Vijay K.; Bhasin, Kul B.

    2001-01-01

    The performance of two networks to support autonomous multi-spacecraft formation flying systems is presented. Both systems are comprised of a ten-satellite formation, with one of the satellites designated as the central or 'mother ship.' All data is routed through the mother ship to the terrestrial network. The first system uses a TCP/IP over ATM protocol architecture within the formation, and the second system uses the IEEE 802.11 protocol architecture within the formation. The simulations consist of file transfers using either the File Transfer Protocol (FTP) or the Simple Automatic File Exchange (SAFE) Protocol. The results compare the IP queuing delay, IP queue size and IP processing delay at the mother ship as well as end-to-end delay for both systems. In all cases, using IEEE 802.11 within the formation yields less delay. Also, the throughput exhibited by SAFE is better than FTP.

  17. Format( )MEDIC( )Input

    NASA Astrophysics Data System (ADS)

    Foster, K.

    1994-09-01

    This document is a description of a computer program called Format( )MEDIC( )Input. The purpose of this program is to allow the user to quickly reformat wind velocity data in the Model Evaluation Database (MEDb) into a reasonable 'first cut' set of MEDIC input files (MEDIC.nml, StnLoc.Met, and Observ.Met). The user is cautioned that these resulting input files must be reviewed for correctness and completeness. This program will not format MEDb data into a Problem Station Library or Problem Metdata File. A description of how the program reformats the data is provided, along with a description of the required and optional user input and a description of the resulting output files. A description of the MEDb is not provided here but can be found in the RAS Division Model Evaluation Database Description document.

  18. 77 FR 66830 - LNG Development Company, LLC and Oregon Pipeline Company; Northwest Pipeline GP; Notice of...

    Federal Register 2010, 2011, 2012, 2013, 2014

    2012-11-07

    ... can file your comments electronically using the eFiling feature located on the Commission's Web site ( www.ferc.gov ) under the Documents & Filings link. With eFiling, you can provide comments in a variety of formats by attaching them as a file with your submission. New eFiling users must first create an...

  19. 76 FR 32198 - Science Advisory Board Staff Office Notification of a Joint Public Meeting of the Chartered...

    Federal Register 2010, 2011, 2012, 2013, 2014

    2011-06-03

    ... meeting. Written statements should be supplied to the DFO in the following formats: One hard copy with original signature and one electronic copy via e-mail (acceptable file format: Adobe Acrobat PDF, MS Word, WordPerfect, MS PowerPoint, or Rich Text files in IBM-PC/Windows 98/2000/XP format). Submitters are...

  20. 76 FR 4346 - Science Advisory Board Staff Office; Notification of a Public Meeting of the Science Advisory...

    Federal Register 2010, 2011, 2012, 2013, 2014

    2011-01-25

    ... their consideration. Written statements should be supplied to the DFO in the following formats: one hard copy with original signature, and one electronic copy via e-mail (acceptable file format: Adobe Acrobat PDF, WordPerfect, MS Word, MS PowerPoint, or Rich Text files in IBM-PC/ Windows 98/2000/XP format...

  1. 75 FR 4816 - Science Advisory Board Staff Office; Notification of Two Public Teleconferences of the Chartered...

    Federal Register 2010, 2011, 2012, 2013, 2014

    2010-01-29

    ... statements should be supplied to the DFO in the following formats: one hard copy with original signature, and one electronic copy via e-mail (acceptable file format: Adobe Acrobat PDF, WordPerfect, MS Word, MS PowerPoint, or Rich Text files in IBM-PC/Windows 98/2000/XP format). Submitters are asked to provide...

  2. 75 FR 52940 - Science Advisory Board Staff Office; Notification of a Public Meeting of the Chartered Science...

    Federal Register 2010, 2011, 2012, 2013, 2014

    2010-08-30

    ... should be supplied to the DFO in the following formats: One hard copy with original signature and one electronic copy via e-mail (acceptable file format: Adobe Acrobat PDF, MS Word, WordPerfect, MS PowerPoint, or Rich Text files in IBM-PC/Windows 98/2000/XP format). Submitters are asked to provide electronic...

  3. 75 FR 80048 - Science Advisory Board Staff Office; Notification of an Upcoming Meeting of the Science Advisory...

    Federal Register 2010, 2011, 2012, 2013, 2014

    2010-12-21

    ... be supplied to the DFO in the following formats: One hard copy with original signature, and one electronic copy via e-mail (acceptable file format: Adobe Acrobat PDF, WordPerfect, MS Word, MS PowerPoint, or Rich Text files in IBM-PC/ Windows 98/2000/XP format). Submitters are requested to provide two...

  4. 75 FR 37793 - Science Advisory Board Staff Office; Notification of a Public Meeting of the Science Advisory...

    Federal Register 2010, 2011, 2012, 2013, 2014

    2010-06-30

    ... supplied to the DFO in the following formats: One hard copy with original signature, and one electronic copy via e-mail (acceptable file format: Adobe Acrobat PDF, WordPerfect, MS Word, MS PowerPoint, or Rich Text files in IBM-PC/ Windows 98/2000/XP format). Submitters are requested to provide two versions...

  5. 75 FR 1381 - Science Advisory Board Staff Office; Notification of a Public Teleconference of the Clean Air...

    Federal Register 2010, 2011, 2012, 2013, 2014

    2010-01-11

    ... supplied to the DFO in the following formats: one hard copy with original signature and one electronic copy via e-mail (acceptable file format: Adobe Acrobat PDF, MS Word, WordPerfect, MS PowerPoint, or Rich Text files in IBM-PC/Windows 98/2000/XP format). Submitters are asked to provide versions of each...

  6. 76 FR 16769 - Science Advisory Board Staff Office; Notification of a Public Meeting of the Science Advisory...

    Federal Register 2010, 2011, 2012, 2013, 2014

    2011-03-25

    ... statements should be supplied to the DFO in the following formats: One hard copy with original signature and one electronic copy via e-mail (acceptable file format: Adobe Acrobat PDF, WordPerfect, MS Word, MS PowerPoint, or Rich Text files in IBM-PC/Windows 98/2000/XP format). Submitters are requested to provide...

  7. 75 FR 62386 - Science Advisory Board Staff Office; Notification of Two Public Teleconferences of the Science...

    Federal Register 2010, 2011, 2012, 2013, 2014

    2010-10-08

    .... Written statements should be supplied to the DFO in the following formats: one hard copy with original signature, and one electronic copy via e-mail (acceptable file format: Adobe Acrobat PDF, WordPerfect, MS Word, MS PowerPoint, or Rich Text files in IBM-PC/Windows 98/2000/XP format). Submitters are asked to...

  8. 76 FR 11245 - Science Advisory Board Staff Office; Notification of Two Public Teleconferences of the Science...

    Federal Register 2010, 2011, 2012, 2013, 2014

    2011-03-01

    ... their consideration. Written statements should be supplied to the DFO in the following formats: one hard copy with original signature, and one electronic copy via e-mail (acceptable file format: Adobe Acrobat PDF, WordPerfect, MS Word, MS PowerPoint, or Rich Text files in IBM-PC/ Windows 98/2000/XP format...

  9. Regional seismic lines reprocessed using post-stack processing techniques; National Petroleum Reserve, Alaska

    USGS Publications Warehouse

    Miller, John J.; Agena, W.F.; Lee, M.W.; Zihlman, F.N.; Grow, J.A.; Taylor, D.J.; Killgore, Michele; Oliver, H.L.

    2000-01-01

    This CD-ROM contains stacked, migrated, 2-Dimensional seismic reflection data and associated support information for 22 regional seismic lines (3,470 line-miles) recorded in the National Petroleum Reserve - Alaska (NPRA) from 1974 through 1981. Together, these lines constitute about one-quarter of the seismic data collected as part of the Federal Government's program to evaluate the petroleum potential of the Reserve. The regional lines, which form a grid covering the entire NPRA, were created by combining various individual lines recorded in different years using different recording parameters. These data were reprocessed by the USGS using modern, post-stack processing techniques, to create a data set suitable for interpretation on interactive seismic interpretation computer workstations. Reprocessing was done in support of ongoing petroleum resource studies by the USGS Energy Program. The CD-ROM contains the following files: 1) 22 files containing the digital seismic data in standard, SEG-Y format; 2) 1 file containing navigation data for the 22 lines in standard SEG-P1 format; 3) 22 small scale graphic images of each seismic line in Adobe Acrobat PDF format; 4) a graphic image of the location map, generated from the navigation file, with hyperlinks to the graphic images of the seismic lines; 5) an ASCII text file with cross-reference information for relating the sequential trace numbers on each regional line to the line number and shotpoint number of the original component lines; and 6) an explanation of the processing used to create the final seismic sections (this document). The SEG-Y format seismic files and SEG-P1 format navigation file contain all the information necessary for loading the data onto a seismic interpretation workstation.

  10. 17 CFR 232.14 - Paper filings not accepted without exemption.

    Code of Federal Regulations, 2011 CFR

    2011-04-01

    ... 17 Commodity and Securities Exchanges 2 2011-04-01 2011-04-01 false Paper filings not accepted... COMMISSION REGULATION S-T-GENERAL RULES AND REGULATIONS FOR ELECTRONIC FILINGS General § 232.14 Paper filings not accepted without exemption. The Commission will not accept in paper format any filing required to...

  11. UAEMIAAE

    Atmospheric Science Data Center

    2013-12-19

    UAEMIAAE Aerosol product. ( File version details ) File version  F07_0015  has better ... properties. File version  F08_0016  has improved cloud screening procedure resulting in better aerosol optical depth. ... Coverage:  August - October 2004 File Format:  HDF-EOS Tools:  FTP Access: Data Pool ...

  12. Performance regression manager for large scale systems

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Faraj, Daniel A.

    System and computer program product to perform an operation comprising generating, based on a first output generated by a first execution instance of a command, a first output file specifying a value of at least one performance metric, wherein the first output file is formatted according to a predefined format, comparing the value of the at least one performance metric in the first output file to a value of the performance metric in a second output file, the second output file having been generated based on a second output generated by a second execution instance of the command, and outputting for display an indication of a result of the comparison of the value of the at least one performance metric of the first output file to the value of the at least one performance metric of the second output file.
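
    The comparison described in this record can be sketched as follows. The record does not specify the predefined file format, so the `name=value` layout, the 5% tolerance, and the function names `parse_metrics` and `compare_metric` are all assumptions chosen for illustration.

```python
def parse_metrics(text):
    """Parse 'name=value' lines into a dict of floats (hypothetical predefined format)."""
    metrics = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        name, _, value = line.partition("=")
        metrics[name.strip()] = float(value)
    return metrics

def compare_metric(first, second, name, tolerance=0.05):
    """Classify the relative change of `name` from the second (baseline) run to the first.

    Assumes the baseline value is non-zero.
    """
    baseline, current = second[name], first[name]
    change = (current - baseline) / baseline
    if abs(change) <= tolerance:
        status = "unchanged"
    elif change > 0:
        status = "increased"
    else:
        status = "decreased"
    return status, change

# Output files from two execution instances of the same command (invented values).
first_run = parse_metrics("latency_us=12.0\nbandwidth_mbps=900\n")
second_run = parse_metrics("# baseline\nlatency_us=10.0\nbandwidth_mbps=905\n")

latency_status, latency_change = compare_metric(first_run, second_run, "latency_us")
bandwidth_status, _ = compare_metric(first_run, second_run, "bandwidth_mbps")
print(f"latency_us {latency_status} ({latency_change:+.0%}); bandwidth_mbps {bandwidth_status}")
```

    Whether an increase counts as a regression depends on the metric (higher latency is worse, higher bandwidth is better), so the sketch reports only the direction of the change.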

  13. Visualization of seismic tomography on Google Earth: Improvement of KML generator and its web application to accept the data file in European standard format

    NASA Astrophysics Data System (ADS)

    Yamagishi, Y.; Yanaka, H.; Tsuboi, S.

    2009-12-01

    We have developed a conversion tool for the data of seismic tomography into KML, called KML generator, and made it available on the web site (http://www.jamstec.go.jp/pacific21/google_earth). The KML generator enables us to display vertical and horizontal cross sections of the model on Google Earth in a three-dimensional manner, which would be useful to understand the Earth's interior. The previous generator accepts text files of grid-point data having longitude, latitude, and seismic velocity anomaly. Each data file contains the data for each depth. Metadata, such as bibliographic reference, grid-point interval, and depth, are described in a separate information file. We did not allow users to upload their own tomographic model to the web application, because there is no standard format to represent a tomographic model. Recently a European seismology research project, NERIES (Network of Research Infrastructures for European Seismology), has advocated that the data of seismic tomography should be standardized. They propose a new format based on JSON (JavaScript Object Notation), which is one of the data-interchange formats, as a standard one for the tomography. This format consists of two parts, which are metadata and grid-point data values. The JSON format seems to be powerful to handle and to analyze the tomographic model, because the structure of the format is fully defined by JavaScript objects, thus the elements are directly accessible by a script. In addition, there exist JSON libraries for several programming languages. The International Federation of Digital Seismograph Network (FDSN) adopted this format as an FDSN standard format for seismic tomographic model. There might be a possibility that this format would not only be accepted by European seismologists but also be accepted as the world standard. Therefore, we have improved our KML generator for seismic tomography to also accept data files in JSON format. We have also improved the web application of the generator so that the JSON formatted data file can be uploaded. Users can convert any tomographic model data to KML. The KML obtained through the new generator should provide an arena to compare various tomographic models and other geophysical observations on Google Earth, which may act as a common platform for geoscience browser.
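
    A JSON-to-KML conversion of the kind described here might look like the sketch below. The abstract states only that the JSON format has a metadata part and a grid-point data part; the key names (`metadata`, `data`, `reference`, `depth_km`, `lon`, `lat`, `dv_percent`) are hypothetical, and the sketch emits one placemark per grid point rather than the cross sections produced by the actual generator.

```python
import json
import xml.etree.ElementTree as ET

KML_NS = "http://www.opengis.net/kml/2.2"

def tomography_json_to_kml(json_text):
    """Convert a (hypothetical) tomography JSON document into a KML string."""
    model = json.loads(json_text)
    meta = model["metadata"]
    kml = ET.Element("kml", xmlns=KML_NS)
    doc = ET.SubElement(kml, "Document")
    ET.SubElement(doc, "name").text = meta["reference"]
    for point in model["data"]:
        placemark = ET.SubElement(doc, "Placemark")
        ET.SubElement(placemark, "name").text = f"dV = {point['dv_percent']:+.1f}%"
        ET.SubElement(placemark, "description").text = f"depth {meta['depth_km']} km"
        geometry = ET.SubElement(placemark, "Point")
        # KML lists coordinates as longitude,latitude (altitude is optional).
        ET.SubElement(geometry, "coordinates").text = f"{point['lon']},{point['lat']}"
    return ET.tostring(kml, encoding="unicode")

# Invented two-point model at a single depth.
sample = json.dumps({
    "metadata": {"reference": "Example model", "depth_km": 100},
    "data": [
        {"lon": 140.0, "lat": 35.0, "dv_percent": 1.5},
        {"lon": 141.0, "lat": 35.0, "dv_percent": -0.8},
    ],
})
kml_text = tomography_json_to_kml(sample)
```

    Because the JSON elements are addressed directly by key, supporting a different metadata field only requires changing the lookups, which reflects the accessibility advantage the abstract attributes to the format.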

  14. National Geochemical Database reformatted data from the National Uranium Resource Evaluation (NURE) Hydrogeochemical and Stream Sediment Reconnaissance (HSSR) program

    USGS Publications Warehouse

    Smith, Steven M.

    1997-01-01

    The National Uranium Resource Evaluation (NURE) Hydrogeochemical and Stream Sediment Reconnaissance (HSSR) program produced a large amount of geochemical data. To fully understand how these data were generated, it is recommended that you read the History of NURE HSSR Program for a summary of the entire program. By the time the NURE program had ended, the HSSR data consisted of 894 separate data files stored with 47 different formats. Many files contained duplication of data found in other files. The University of Oklahoma's Information Systems Programs of the Energy Resources Institute (ISP) was contracted by the Department of Energy to enhance the accessibility and usefulness of the NURE HSSR data. ISP created a single standard-format master file to replace the 894 original files. ISP converted 817 of the 894 original files before its funding apparently ran out. The ISP-reformatted NURE data files have been released by the USGS on CD-ROM (Lower 48 States, Hoffman and Buttleman, 1994; Alaska, Hoffman and Buttleman, 1996). A description of each NURE database field, derived from a draft NURE HSSR data format manual (unpubl. commun., Stan Moll, ISP, Oct 7, 1988), was included in a readme file on each CD-ROM. That original manual was incomplete and assumed that the reformatting process had gone to completion. A lot of vital information was not included. Efforts to correct that manual and the NURE data revealed a large number of problems and missing data. As a result of the frustrating process of cleaning and re-cleaning data from the ISP-reformatted NURE files, a new NURE HSSR data format was developed. This work represents a totally new attempt to reformat the original NURE files into 2 consistent database structures; one for water samples and a second for sediment samples, on a quadrangle by quadrangle basis, from the original NURE files. 
    Although this USGS-reformatted NURE HSSR data format is different from that created by the ISP, many of their ideas were incorporated and expanded in this effort. All of the data from each quadrangle are being examined thoroughly in an attempt to eliminate problems, to combine partial or duplicate records, to convert all coding to a common scheme, and to identify problems even if they cannot be solved at this time.

  15. NG6: Integrated next generation sequencing storage and processing environment.

    PubMed

    Mariette, Jérôme; Escudié, Frédéric; Allias, Nicolas; Salin, Gérald; Noirot, Céline; Thomas, Sylvain; Klopp, Christophe

    2012-09-09

    Next generation sequencing platforms are now well established in sequencing centres and some laboratories. Upcoming smaller scale machines such as the 454 junior from Roche or the MiSeq from Illumina will increase the number of laboratories hosting a sequencer. In such a context, it is important to provide these teams with an easily manageable environment to store and process the produced reads. We describe a user-friendly information system able to manage large sets of sequencing data. It includes, on one hand, a workflow environment already containing pipelines adapted to different input formats (sff, fasta, fastq and qseq), different sequencers (Roche 454, Illumina HiSeq) and various analyses (quality control, assembly, alignment, diversity studies, …) and, on the other hand, a secured web site giving access to the results. The connected user will be able to download raw and processed data and browse through the analysis result statistics. The provided workflows can easily be modified or extended and new ones can be added. Ergatis is used as a workflow building, running and monitoring system. The analyses can be run locally or in a cluster environment using Sun Grid Engine. NG6 is a complete information system designed to answer the needs of a sequencing platform. It provides a user-friendly interface to process, store and download high-throughput sequencing data.

  16. SnopViz, an interactive snow profile visualization tool

    NASA Astrophysics Data System (ADS)

    Fierz, Charles; Egger, Thomas; Gerber, Matthias; Bavay, Mathias; Techel, Frank

    2016-04-01

    SnopViz is a visualization tool for both simulation outputs of the snow-cover model SNOWPACK and observed snow profiles. It has been designed to fulfil the needs of operational services (Swiss Avalanche Warning Service, Avalanche Canada) as well as offer the flexibility required to satisfy the specific needs of researchers. This JavaScript application runs on any modern browser and does not require an active Internet connection. The open source code is available for download from models.slf.ch where examples can also be run. Both the SnopViz library and the SnopViz User Interface will become a full replacement of the current research visualization tool SN_GUI for SNOWPACK. The SnopViz library is a stand-alone application that parses the provided input files, for example, a single snow profile (CAAML file format) or multiple snow profiles as output by SNOWPACK (PRO file format). A plugin architecture allows for handling JSON objects (JavaScript Object Notation) as well, and plugins for other file formats may be added easily. The outputs are provided either as vector graphics (SVG) or JSON objects. The SnopViz User Interface (UI) is a browser based stand-alone interface. It runs in every modern browser, including IE, and allows user interaction with the graphs. SVG, the XML based standard for vector graphics, was chosen because of its easy interaction with JS and good software support (Adobe Illustrator, Inkscape) to manipulate graphs outside SnopViz for publication purposes. SnopViz provides new visualization for SNOWPACK timeline output as well as time series input and output. The actual output format for SNOWPACK timelines was retained while time series are read from SMET files, a file format used in conjunction with the open source data handling code MeteoIO. Finally, SnopViz is able to render single snow profiles, either observed or modelled, that are provided as CAAML files. This file format (caaml.org/Schemas/V5.0/Profiles/SnowProfileIACS) is an international standard to exchange snow profile data. It is supported by the International Association of Cryospheric Sciences (IACS) and was developed in collaboration with practitioners (Avalanche Canada).

  17. Effect of reciprocating file motion on microcrack formation in root canals: an SEM study.

    PubMed

    Ashwinkumar, V; Krithikadatta, J; Surendran, S; Velmurugan, N

    2014-07-01

    To compare dentinal microcrack formation whilst using Ni-Ti hand K-files, ProTaper hand and rotary files and the WaveOne reciprocating file. One hundred and fifty mandibular first molars were selected. Thirty teeth were left unprepared and served as controls, and the remaining 120 teeth were divided into four groups. Ni-Ti hand K-files, ProTaper hand files, ProTaper rotary files and WaveOne Primary reciprocating files were used to prepare the mesial canals. Roots were then sectioned 3, 6 and 9 mm from the apex, and the cut surface was observed under scanning electron microscope (SEM) and checked for the presence of dentinal microcracks. The control and Ni-Ti hand K-files groups were not associated with microcracks. In roots prepared with ProTaper hand files, ProTaper rotary files and WaveOne Primary reciprocating files, dentinal microcracks were present. There was a significant difference between control/Ni-Ti hand K-files group and ProTaper hand files/ProTaper rotary files/WaveOne Primary reciprocating file group (P < 0.001) with ProTaper rotary files producing the most microcracks. No significant difference was observed between teeth prepared with ProTaper hand files and WaveOne Primary reciprocating files. ProTaper rotary files were associated with significantly more microcracks than ProTaper hand files and WaveOne Primary reciprocating files. Ni-Ti hand K-files did not produce microcracks at any levels inside the root canals. © 2013 International Endodontic Journal. Published by John Wiley & Sons Ltd.

  18. DICOM to print, 35-mm slides, web, and video projector: tutorial using Adobe Photoshop.

    PubMed

    Gurney, Jud W

    2002-10-01

    Preparing images for publication has dealt with film and the photographic process. With picture archiving and communications systems, many departments will no longer produce film. This will change how images are produced for publication. DICOM, the file format for radiographic images, has to be converted and then prepared for traditional publication, 35-mm slides, the newest techniques of video projection, and the World Wide Web. Tagged image file format is the common format for traditional print publication, whereas joint photographic expert group is the current file format for the World Wide Web. Each medium has specific requirements that can be met with a common image-editing program such as Adobe Photoshop (Adobe Systems, San Jose, CA). High-resolution images are required for print, a process that requires interpolation. However, the Internet requires images with a small file size for rapid transmission. The resolution of each output differs and the image resolution must be optimized to match the output of the publishing medium.

  19. TADPLOT program, version 2.0: User's guide

    NASA Technical Reports Server (NTRS)

    Hammond, Dana P.

    1991-01-01

    The TADPLOT Program, Version 2.0 is described. The TADPLOT program is a software package coordinated by a single, easy-to-use interface, enabling the researcher to access several standard file formats, selectively collect specific subsets of data, and create full-featured publication and viewgraph quality plots. The user-interface was designed to be independent from any file format, yet provide capabilities to accommodate highly specialized data queries. Integrated with an applications software network, data can be assessed, collected, and viewed quickly and easily. Since the commands are data independent, subsequent modifications to the file format will be transparent, while additional file formats can be integrated with minimal impact on the user-interface. The graphical capabilities are independent of the method of data collection; thus, the data specification and subsequent plotting can be modified and upgraded as separate functional components. The graphics kernel selected adheres to the full functional specifications of the CORE standard. Both interface and postprocessing capabilities are fully integrated into TADPLOT.

  20. A New Archive of UKIRT Legacy Data at CADC

    NASA Astrophysics Data System (ADS)

    Bell, G. S.; Currie, M. J.; Redman, R. O.; Purves, M.; Jenness, T.

    2014-05-01

    We describe a new archive of legacy data from the United Kingdom Infrared Telescope (UKIRT) at the Canadian Astronomy Data Centre (CADC) containing all available data from the Cassegrain instruments. The desire was to archive the raw data in as close to the original format as possible, so where the data followed our current convention of having a single data file per observation, it was archived without alteration, except for minor fixes to headers of data in FITS format to allow it to pass fitsverify and be accepted by CADC. Some of the older data comprised multiple integrations in separate files per observation, stored in either Starlink NDF or Figaro DST format. These were placed inside HDS container files, and DST files were rearranged into NDF format. The metadata describing the observations is ingested into the CAOM-2 repository via an intermediate MongoDB header database, which will also be used to guide the ORAC-DR pipeline in generating reduced data products.

  1. Converting CSV Files to RKSML Files

    NASA Technical Reports Server (NTRS)

    Trebi-Ollennu, Ashitey; Liebersbach, Robert

    2009-01-01

    A computer program converts, into a format suitable for processing on Earth, files of downlinked telemetric data pertaining to the operation of the Instrument Deployment Device (IDD), which is a robot arm on either of the Mars Exploration Rovers (MERs). The raw downlinked data files are in comma-separated-value (CSV) format. The present program converts the files into Rover Kinematics State Markup Language (RKSML), which is an Extensible Markup Language (XML) format that facilitates representation of operations of the IDD and enables analysis of the operations by means of the Rover Sequencing Validation Program (RSVP), which is used to build sequences of commanded operations for the MERs. After conversion by means of the present program, the downlinked data can be processed by RSVP, enabling the MER downlink operations team to play back the actual IDD activity represented by the telemetric data against the planned IDD activity. Thus, the present program enhances the diagnosis of anomalies that manifest themselves as differences between actual and planned IDD activities.
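
    The general shape of a CSV-to-XML conversion like the one described here is sketched below. The RKSML schema is not given in this record, so the element and attribute names (`StateHistory`, `Node`, `Knot`, `Time`, `Name`) and the CSV column names are placeholders for illustration and should not be read as the real RKSML vocabulary.

```python
import csv
import io
import xml.etree.ElementTree as ET

def csv_to_state_xml(csv_text):
    """Turn each CSV row into one XML node holding a time stamp and named values."""
    root = ET.Element("StateHistory")  # hypothetical root element, not the real schema
    for row in csv.DictReader(io.StringIO(csv_text)):
        node = ET.SubElement(root, "Node", Time=row["time"])
        for column, value in row.items():
            if column == "time":
                continue  # the time stamp is already stored as an attribute
            knot = ET.SubElement(node, "Knot", Name=column)
            knot.text = value
    return ET.tostring(root, encoding="unicode")

# Invented telemetry: a time column followed by two joint angles.
sample = "time,joint1,joint2\n0.0,0.10,1.57\n1.0,0.12,1.55\n"
xml_text = csv_to_state_xml(sample)
```

    Keeping one XML node per telemetry row preserves the time ordering of the CSV, which is what a playback of actual against planned activity would rely on.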

  2. Fortran Program for X-Ray Photoelectron Spectroscopy Data Reformatting

    NASA Technical Reports Server (NTRS)

    Abel, Phillip B.

    1989-01-01

    A FORTRAN program has been written for use on an IBM PC/XT or AT or compatible microcomputer (personal computer, PC) that converts a column of ASCII-format numbers into a binary-format file suitable for interactive analysis on a Digital Equipment Corporation (DEC) computer running the VGS-5000 Enhanced Data Processing (EDP) software package. The incompatible floating-point number representations of the two computers were compared, and a subroutine was created to correctly store floating-point numbers on the IBM PC, which can be directly read by the DEC computer. Any file transfer protocol having provision for binary data can be used to transmit the resulting file from the PC to the DEC machine. The data file header required by the EDP programs for an x ray photoelectron spectrum is also written to the file. The user is prompted for the relevant experimental parameters, which are then properly coded into the format used internally by all of the VGS-5000 series EDP packages.
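
    The floating-point conversion step can be illustrated with a short Python sketch (the original program is in FORTRAN). The record does not name the DEC number format, so this sketch assumes the 32-bit F_floating layout used on PDP-11 and VAX machines, and assumes the PC side uses IEEE 754 single precision. Under those assumptions the exponent fields differ by 2 (IEEE stores 1.f x 2^(e-127), F_floating stores 0.1f x 2^(e-128)), and the two 16-bit halves are stored with the sign/exponent word first.

```python
import struct


def ieee_to_dec_f(value: float) -> bytes:
    """Encode a number as 4 bytes in DEC F_floating layout (assumed target format)."""
    # Reinterpret the IEEE 754 single-precision encoding as a 32-bit integer.
    bits = struct.unpack("<I", struct.pack("<f", value))[0]
    sign = bits >> 31
    exponent = (bits >> 23) & 0xFF
    fraction = bits & 0x7FFFFF
    if exponent == 0:
        # IEEE zero and denormals are stored as DEC true zero (denormals are flushed).
        return b"\x00\x00\x00\x00"
    if exponent == 0xFF:
        raise ValueError("infinity and NaN have no F_floating representation")
    dec_exponent = exponent + 2  # bias 127 with 1.f  ->  bias 128 with 0.1f
    if dec_exponent > 0xFF:
        raise OverflowError("magnitude too large for F_floating")
    word = (sign << 31) | (dec_exponent << 23) | fraction
    high, low = word >> 16, word & 0xFFFF
    # The word holding sign and exponent comes first; each word is little-endian.
    return struct.pack("<HH", high, low)


print(ieee_to_dec_f(1.0).hex())
```

    Writing a column of ASCII numbers to a binary file would then amount to parsing each line with float() and appending the four bytes returned for it.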

  3. 78 FR 6319 - Notice of Availability of the Report: Recommended Parameters for Solid Flame Models for Land...

    Federal Register 2010, 2011, 2012, 2013, 2014

    2013-01-30

    ... file your comments electronically using the eFiling feature on the Commission's Web site ( www.ferc.gov ) under the link to Documents and Filings. With eFiling, you can provide comments in a variety of formats by attaching them as a file with your submission. New eFiling users must first create an account by...

  4. 77 FR 53885 - Jordan Cove Energy Project LP, Pacific Connector Gas Pipeline LP; Notice of Extension of Comment...

    Federal Register 2010, 2011, 2012, 2013, 2014

    2012-09-04

    ... on a project; (2) You can file your comments electronically using the eFiling feature located on the Commission's Web site ( www.ferc.gov ) under the Documents & Filings link. With eFiling, you can provide comments in a variety of formats by attaching them as a file with your submission. New eFiling users must...

  5. A convertor and user interface to import CAD files into worldtoolkit virtual reality systems

    NASA Technical Reports Server (NTRS)

    Wang, Peter Hor-Ching

    1996-01-01

    Virtual Reality (VR) is a rapidly developing human-to-computer interface technology. VR can be considered as a three-dimensional computer-generated Virtual World (VW) which can sense particular aspects of a user's behavior, allow the user to manipulate the objects interactively, and render the VW in real time accordingly. The user is totally immersed in the virtual world and feels the sense of transforming into that VW. NASA/MSFC Computer Application Virtual Environments (CAVE) has been developing space-related VR applications since 1990. The VR systems in the CAVE lab are based on the VPL RB2 system, which consists of a VPL RB2 control tower, an LX eyephone, an Isotrak Polhemus sensor, two Fastrak Polhemus sensors, a Flock of Birds sensor, and two VPL DG2 DataGloves. A dynamics animator called Body Electric from VPL is used as the control system to interface with all the input/output devices and to provide the network communications as well as the VR programming environment. The RB2 Swivel 3D is used as the modelling program to construct the VWs. A severe limitation of the VPL VR system is the use of RB2 Swivel 3D, which restricts the files to a maximum of 1020 objects and does not have advanced graphics texture mapping. The other limitation is that the VPL VR system is a turn-key system which does not provide the flexibility for the user to add new sensors or a C language interface. Recently, the NASA/MSFC CAVE lab has provided VR systems built on Sense8 WorldToolKit (WTK), which is a C library for creating VR development environments. WTK provides device drivers for most of the sensors and eyephones available on the VR market. WTK accepts several CAD file formats, such as Sense8 Neutral File Format, AutoCAD DXF and 3D Studio file format, Wave Front OBJ file format, VideoScape GEO file format, Intergraph EMS stereolithographics and CATIA Stereolithographics STL file formats. WTK functions are object-oriented in their naming convention, are grouped into classes, and provide an easy C language interface. Using a CAD or modelling program to build a VW for WTK VR applications, we typically construct the stationary universe with all the geometric objects except the dynamic objects, and create each dynamic object in an individual file.

  6. Using GDAL to Convert NetCDF 4 CF 1.6 to GeoTIFF: Interoperability Problems and Solutions for Data Providers and Distributors

    NASA Astrophysics Data System (ADS)

    Haran, T. M.; Brodzik, M. J.; Nordgren, B.; Estilow, T.; Scott, D. J.

    2015-12-01

    An increasing number of new Earth science datasets are being produced by data providers in self-describing, machine-independent file formats including Hierarchical Data Format version 5 (HDF5) and Network Common Data Form version 4 (netCDF-4). Furthermore data providers may be producing netCDF-4 files that follow the conventions for Climate and Forecast metadata version 1.6 (CF 1.6) which, for datasets mapped to a projected raster grid covering all or a portion of the earth, includes the Coordinate Reference System (CRS) used to define how latitude and longitude are mapped to grid coordinates, i.e. columns and rows, and vice versa. One problem that users may encounter is that their preferred visualization and analysis tool may not yet include support for one of these newer formats. Moreover, data distributors such as NASA's NSIDC DAAC may not yet include support for on-the-fly conversion of data files for all data sets produced in a new format to a preferred older distributed format. There do exist open source solutions to this dilemma in the form of software packages that can translate files in one of the new formats to one of the preferred formats. However these software packages require that the file to be translated conform to the specifications of its respective format. Although an online CF-Convention compliance checker is available from cfconventions.org, a recent NSIDC user services incident described here in detail involved an NSIDC-supported data set that passed the (then current) CF Checker Version 2.0.6, but was in fact lacking two variables necessary for conformance. This problem was not detected until GDAL, a software package which relied on the missing variables, was employed by a user in an attempt to translate the data into a different file format, namely GeoTIFF. This incident indicates that testing a candidate data product with one or more software products written to accept the advertised conventions is a practice which improves interoperability. Differences between data file contents and software package expectations are exposed, affording an opportunity to improve conformance of software, data or both. The incident can also serve as a demonstration that data providers, distributors, and users can work together to improve data product quality and interoperability.
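
    One kind of conformance gap that a downstream tool can expose is a variable that is referenced by name but never defined. The record does not say which two variables were missing, so the following Python sketch is only an illustration of the idea: it models a dataset as a plain dictionary of variable names to attribute dictionaries (a hypothetical stand-in for real netCDF metadata) and reports names referenced by the CF "grid_mapping" or "coordinates" attributes that do not exist in the file.

```python
def missing_cf_references(variables: dict) -> list:
    """Return names referenced by 'grid_mapping' or 'coordinates' that are not defined."""
    missing = []
    for name, attrs in variables.items():
        referenced = []
        if "grid_mapping" in attrs:
            # In CF 1.6 this attribute holds the name of one grid-mapping variable.
            referenced.append(attrs["grid_mapping"])
        # 'coordinates' is a blank-separated list of variable names.
        referenced.extend(attrs.get("coordinates", "").split())
        for target in referenced:
            if target not in variables and target not in missing:
                missing.append(target)
    return missing


# Hypothetical dataset: the data variable points at a 'crs' variable that is absent.
dataset = {
    "snow_cover": {"grid_mapping": "crs", "coordinates": "lat lon"},
    "lat": {"units": "degrees_north"},
    "lon": {"units": "degrees_east"},
}
print(missing_cf_references(dataset))
```

    A check of this kind complements, rather than replaces, running the product through the tools that users are expected to apply to it.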

  7. 41 CFR 301-52.3 - Am I required to file a travel claim in a specific format and must the claim be signed?

    Code of Federal Regulations, 2010 CFR

    2010-07-01

    ... that time, you must file your travel claim in the format prescribed by your agency. If the prescribed... travel claim in a specific format and must the claim be signed? 301-52.3 Section 301-52.3 Public Contracts and Property Management Federal Travel Regulation System TEMPORARY DUTY (TDY) TRAVEL ALLOWANCES...

  8. 41 CFR 301-52.3 - Am I required to file a travel claim in a specific format and must the claim be signed?

    Code of Federal Regulations, 2013 CFR

    2013-07-01

    ... that time, you must file your travel claim in the format prescribed by your agency. If the prescribed... travel claim in a specific format and must the claim be signed? 301-52.3 Section 301-52.3 Public Contracts and Property Management Federal Travel Regulation System TEMPORARY DUTY (TDY) TRAVEL ALLOWANCES...

  9. 41 CFR 301-52.3 - Am I required to file a travel claim in a specific format and must the claim be signed?

    Code of Federal Regulations, 2011 CFR

    2011-07-01

    ... that time, you must file your travel claim in the format prescribed by your agency. If the prescribed... travel claim in a specific format and must the claim be signed? 301-52.3 Section 301-52.3 Public Contracts and Property Management Federal Travel Regulation System TEMPORARY DUTY (TDY) TRAVEL ALLOWANCES...

  10. 41 CFR 301-52.3 - Am I required to file a travel claim in a specific format and must the claim be signed?

    Code of Federal Regulations, 2014 CFR

    2014-07-01

    ... that time, you must file your travel claim in the format prescribed by your agency. If the prescribed... travel claim in a specific format and must the claim be signed? 301-52.3 Section 301-52.3 Public Contracts and Property Management Federal Travel Regulation System TEMPORARY DUTY (TDY) TRAVEL ALLOWANCES...

  11. 41 CFR 301-52.3 - Am I required to file a travel claim in a specific format and must the claim be signed?

    Code of Federal Regulations, 2012 CFR

    2012-07-01

    ... that time, you must file your travel claim in the format prescribed by your agency. If the prescribed... travel claim in a specific format and must the claim be signed? 301-52.3 Section 301-52.3 Public Contracts and Property Management Federal Travel Regulation System TEMPORARY DUTY (TDY) TRAVEL ALLOWANCES...

  12. 75 FR 78703 - Commission Information Collection Activities, Proposed Collection; Comment Request; Submitted for...

    Federal Register 2010, 2011, 2012, 2013, 2014

    2010-12-16

    ... to Docket No. IC10-542-001. Comments may be filed either electronically or in paper format. Those persons filing electronically do not need to make a paper filing. Documents filed electronically via the... sender's e-mail address upon receipt of comments. For paper filings, the comments should be submitted to...

  13. Data File Standard for Flow Cytometry, version FCS 3.1.

    PubMed

    Spidlen, Josef; Moore, Wayne; Parks, David; Goldberg, Michael; Bray, Chris; Bierre, Pierre; Gorombey, Peter; Hyun, Bill; Hubbard, Mark; Lange, Simon; Lefebvre, Ray; Leif, Robert; Novo, David; Ostruszka, Leo; Treister, Adam; Wood, James; Murphy, Robert F; Roederer, Mario; Sudar, Damir; Zigon, Robert; Brinkman, Ryan R

    2010-01-01

    The flow cytometry data file standard provides the specifications needed to completely describe flow cytometry data sets within the confines of the file containing the experimental data. In 1984, the first Flow Cytometry Standard format for data files was adopted as FCS 1.0. This standard was modified in 1990 as FCS 2.0 and again in 1997 as FCS 3.0. We report here on the next generation flow cytometry standard data file format. FCS 3.1 is a minor revision based on suggested improvements from the community. The unchanged goal of the standard is to provide a uniform file format that allows files created by one type of acquisition hardware and software to be analyzed by any other type. The FCS 3.1 standard retains the basic FCS file structure and most features of previous versions of the standard. Changes included in FCS 3.1 address potential ambiguities in the previous versions and provide a more robust standard. The major changes include simplified support for international characters and improved support for storing compensation. The major additions are support for preferred display scale, a standardized way of capturing the sample volume, information about originality of the data file, and support for plate and well identification in high throughput, plate based experiments. Please see the normative version of the FCS 3.1 specification in Supporting Information for this manuscript (or at http://www.isac-net.org/ in the Current standards section) for a complete list of changes.
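
    As an illustration of the basic FCS file structure mentioned above, the following Python sketch reads the fixed HEADER segment at the start of a file. The layout used here is an assumption taken from the FCS 3.x family as commonly documented (a 6-character version string, 4 spaces, then six right-justified 8-character ASCII byte offsets for the TEXT, DATA and ANALYSIS segments); the normative specification should be consulted before relying on it, and the function name is hypothetical.

```python
def parse_fcs_header(header: bytes) -> dict:
    """Parse the fixed 58-byte HEADER segment of an FCS file (assumed layout)."""
    if len(header) < 58:
        raise ValueError("HEADER segment must be at least 58 bytes")
    version = header[0:6].decode("ascii")
    if not version.startswith("FCS"):
        raise ValueError("not an FCS file")
    names = ["text_start", "text_end", "data_start", "data_end",
             "analysis_start", "analysis_end"]
    offsets = {}
    for i, name in enumerate(names):
        # Each offset is an 8-character, space-padded ASCII integer.
        field = header[10 + 8 * i: 18 + 8 * i].decode("ascii").strip()
        offsets[name] = int(field) if field else 0
    return {"version": version, **offsets}


# Synthetic header built for the example; the offsets are made up.
fields = (58, 1023, 1024, 9999, 0, 0)
sample = b"FCS3.1" + b"    " + b"".join(f"{n:>8d}".encode("ascii") for n in fields)
print(parse_fcs_header(sample))
```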

  14. Data File Standard for Flow Cytometry, Version FCS 3.1

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Spidlen, Josef; Moore, Wayne; Parks, David

    2009-11-10

    The flow cytometry data file standard provides the specifications needed to completely describe flow cytometry data sets within the confines of the file containing the experimental data. In 1984, the first Flow Cytometry Standard format for data files was adopted as FCS 1.0. This standard was modified in 1990 as FCS 2.0 and again in 1997 as FCS 3.0. We report here on the next generation flow cytometry standard data file format. FCS 3.1 is a minor revision based on suggested improvements from the community. The unchanged goal of the standard is to provide a uniform file format that allows files created by one type of acquisition hardware and software to be analyzed by any other type. The FCS 3.1 standard retains the basic FCS file structure and most features of previous versions of the standard. Changes included in FCS 3.1 address potential ambiguities in the previous versions and provide a more robust standard. The major changes include simplified support for international characters and improved support for storing compensation. The major additions are support for preferred display scale, a standardized way of capturing the sample volume, information about originality of the data file, and support for plate and well identification in high throughput, plate based experiments. Please see the normative version of the FCS 3.1 specification in Supporting Information for this manuscript (or at http://www.isac-net.org/ in the Current standards section) for a complete list of changes.

  15. Adding Data Management Services to Parallel File Systems

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Brandt, Scott

    2015-03-04

    The objective of this project, called DAMASC for “Data Management in Scientific Computing”, is to coalesce data management with parallel file system management to present a declarative interface to scientists for managing, querying, and analyzing extremely large data sets efficiently and predictably. Managing extremely large data sets is a key challenge of exascale computing. The overhead, energy, and cost of moving massive volumes of data demand designs where computation is close to storage. In current architectures, compute/analysis clusters access data in a physically separate parallel file system and largely leave it to the scientist to reduce data movement. Over the past decades the high-end computing community has adopted middleware with multiple layers of abstractions and specialized file formats such as NetCDF-4 and HDF5. These abstractions provide a limited set of high-level data processing functions, but have inherent functionality and performance limitations: middleware that provides access to the highly structured contents of scientific data files stored in the (unstructured) file systems can only optimize to the extent that file system interfaces permit; the highly structured formats of these files often impede native file system performance optimizations. We are developing Damasc, an enhanced high-performance file system with native rich data management services. Damasc will enable efficient queries and updates over files stored in their native byte-stream format while retaining the inherent performance of file system data storage via declarative queries and updates over views of underlying files. Damasc has four key benefits for the development of data-intensive scientific code: (1) applications can use important data-management services, such as declarative queries, views, and provenance tracking, that are currently available only within database systems; (2) the use of these services becomes easier, as they are provided within a familiar file-based ecosystem; (3) common optimizations, e.g., indexing and caching, are readily supported across several file formats, avoiding effort duplication; and (4) performance improves significantly, as data processing is integrated more tightly with data storage. Our key contributions are: SciHadoop, which explores changes to MapReduce assumptions by taking advantage of semantics of structured data while preserving MapReduce’s failure and resource management; DataMods, which extends common abstractions of parallel file systems so they become programmable such that they can be extended to natively support a variety of data models and can be hooked into emerging distributed runtimes such as Stanford’s Legion; and Miso, which combines Hadoop and relational data warehousing to minimize time to insight, taking into account the overhead of ingesting data into data warehousing.

  16. Particle Pollution

    MedlinePlus

    ... of running) so you don't breathe as hard. Avoid busy roads and highways where PM is usually worse because of emissions from cars and trucks. For more tools to help you learn about air quality, visit Tracking Air Quality . Top of Page File Formats Help: How do I view different file formats ( ...

  17. Deep PDF parsing to extract features for detecting embedded malware.

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Munson, Miles Arthur; Cross, Jesse S.

    2011-09-01

    The number of PDF files with embedded malicious code has risen significantly in the past few years. This is due to the portability of the file format, the ways Adobe Reader recovers from corrupt PDF files, the addition of many multimedia and scripting extensions to the file format, and many format properties the malware author may use to disguise the presence of malware. Current research focuses on executable, MS Office, and HTML formats. In this paper, several features and properties of PDF Files are identified. Features are extracted using an instrumented open source PDF viewer. The feature descriptions of benign and malicious PDFs can be used to construct a machine learning model for detecting possible malware in future PDF files. The detection rate of PDF malware by current antivirus software is very low. A PDF file is easy to edit and manipulate because it is a text format, providing a low barrier to malware authors. Analyzing PDF files for malware is nonetheless difficult because of (a) the complexity of the formatting language, (b) the parsing idiosyncrasies in Adobe Reader, and (c) undocumented correction techniques employed in Adobe Reader. In May 2011, Esparza demonstrated that PDF malware could be hidden from 42 of 43 antivirus packages by combining multiple obfuscation techniques [4]. One reason current antivirus software fails is the ease of varying byte sequences in PDF malware, thereby rendering conventional signature-based virus detection useless. The compression and encryption functions produce sequences of bytes that are each functions of multiple input bytes. As a result, padding the malware payload with some whitespace before compression/encryption can change many of the bytes in the final payload. In this study we analyzed a corpus of 2591 benign and 87 malicious PDF files. While this corpus is admittedly small, it allowed us to test a system for collecting indicators of embedded PDF malware. We will call these indicators features throughout the rest of this report. The features are extracted using an instrumented PDF viewer, and are the inputs to a prediction model that scores the likelihood of a PDF file containing malware. The prediction model is constructed from a sample of labeled data by a machine learning algorithm (specifically, decision tree ensemble learning). Preliminary experiments show that the model is able to detect half of the PDF malware in the corpus with zero false alarms. We conclude the report with suggestions for extending this work to detect a greater variety of PDF malware.

  18. Report of the IAU Commission 4 Working Group on Standardizing Access to Ephemerides and File Format Specification

    DTIC Science & Technology

    2014-12-01

    format for the orientation of a body. It further recommends supporting data be stored in a text PCK. These formats are used by the SPICE system...INTRODUCTION These file formats were developed for and are used by the SPICE system, developed by the Navigation and Ancillary Information Facility (NAIF...of NASA’s Jet Propulsion Laboratory (JPL). Most users will want to use either the SPICE libraries or CALCEPH, developed by the Institut de mécanique

  19. Personalization of structural PDB files.

    PubMed

    Woźniak, Tomasz; Adamiak, Ryszard W

    2013-01-01

    The PDB format is the one most commonly used by programs to define the three-dimensional structure of biomolecules. However, the programs often use different versions of the format. Thus far, no comprehensive solution for unifying the PDB formats has been developed. Here we present an open-source, Python-based tool called PDBinout for processing and converting various versions of the PDB file format for biostructural applications. Moreover, PDBinout allows users to create their own PDB versions. PDBinout is freely available under the LGPL licence at http://pdbinout.ibch.poznan.pl.
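
    Part of the reason format versions matter is that PDB coordinate records are fixed-column text, so a field shifted by one character is read incorrectly. The Python sketch below parses one ATOM/HETATM record using the column positions of the widely used wwPDB version 3 layout; it is an illustration of the fixed-column idea only and does not reflect PDBinout's own API or methods. The function name and the sample atom are made up for the example.

```python
def parse_atom_record(line: str) -> dict:
    """Parse one fixed-column ATOM/HETATM record (wwPDB v3 column positions)."""
    record = line[0:6].strip()
    if record not in ("ATOM", "HETATM"):
        raise ValueError("not a coordinate record")
    return {
        "record": record,
        "serial": int(line[6:11]),       # columns 7-11
        "name": line[12:16].strip(),     # columns 13-16
        "res_name": line[17:20].strip(), # columns 18-20
        "chain": line[21].strip(),       # column 22
        "res_seq": int(line[22:26]),     # columns 23-26
        "x": float(line[30:38]),         # columns 31-38
        "y": float(line[38:46]),         # columns 39-46
        "z": float(line[46:54]),         # columns 47-54
    }


# Build a sample record with a format string so the columns line up exactly.
sample = "ATOM  {:>5d} {:<4s} {:>3s} {:1s}{:>4d}    {:8.3f}{:8.3f}{:8.3f}".format(
    1, " N", "MET", "A", 1, 27.340, 24.430, 2.614)
print(parse_atom_record(sample))
```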

  20. 14 CFR 221.30 - Passenger fares and charges.

    Code of Federal Regulations, 2010 CFR

    2010-01-01

    ... PROCEEDINGS) ECONOMIC REGULATIONS TARIFFS Manner of Filing Tariffs § 221.30 Passenger fares and charges. (a... necessary to carry out the purposes of this part, the applicant carrier to file fare tariffs in a paper format. Such waivers shall only be considered where electronic filing, compared to paper filing, is...

  1. GEWEX-RFA Data File Format and File Naming Convention

    Atmospheric Science Data Center

    2016-05-20

    ... documentation, will be stored for each data product. Each time data is added to, removed from, or modified in the file set for a product, ... including 29 days in leap-year Februaries. Time series files containing 15-minute data should start at the top of an hour to ...

  2. TOLNet Data Format for Lidar Ozone Profile & Surface Observations

    NASA Astrophysics Data System (ADS)

    Chen, G.; Aknan, A. A.; Newchurch, M.; Leblanc, T.

    2015-12-01

    The Tropospheric Ozone Lidar Network (TOLNet) is an interagency initiative started by NASA, NOAA, and EPA in 2011. TOLNet currently has six Lidars and one ozonesonde station. TOLNet provides high-resolution spatio-temporal measurements of tropospheric (surface to tropopause) ozone and aerosol vertical profiles to address fundamental air-quality science questions. The TOLNet data format was developed by TOLNet members as a community standard for reporting ozone profile observations. The development of this new format was primarily based on the existing NDACC (Network for the Detection of Atmospheric Composition Change) format and ICARTT (International Consortium for Atmospheric Research on Transport and Transformation) format. The main goal is to present the Lidar observations in self-describing and easy-to-use data files. The TOLNet format is an ASCII format containing a general file header, individual profile headers, and the profile data. The last two components repeat for all profiles recorded in the file. The TOLNet format is both human and machine readable as it adopts standard metadata entries and fixed variable names. In addition, software has been developed to check for format compliance. To be presented is a detailed description of the TOLNet format protocol and scanning software.

  3. 46 CFR 535.701 - General requirements.

    Code of Federal Regulations, 2010 CFR

    2010-10-01

    ..., Washington, DC 20573-0001. A copy of the Monitoring Report form in Microsoft Word and Excel format may be... Monitoring Reports in the Commission's prescribed electronic format, either on diskette or CD-ROM. (e)(1) The... filed by this subpart may be filed by direct electronic transmission in lieu of hard copy. Detailed...

  4. Standard Electronic Format Specification for Tank Characterization Data Loader Version 3.5

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    ADAMS, M.R.

    2001-01-31

    The purpose of this document is to describe the standard electronic format for data files that will be sent for entry into the Tank Characterization Database (TCD). There are 2 different file types needed for each data load: (1) Analytical Results and (2) Sample Descriptions.

  5. 47 CFR 1.913 - Application and notification forms; electronic and manual filing.

    Code of Federal Regulations, 2011 CFR

    2011-10-01

    ... notifications whenever possible. The files, other than the ASCII table of contents, should be in Adobe Acrobat... possible. The attachment should be uploaded via ULS in Adobe Acrobat Portable Document Format (PDF... the table of contents, should be in Adobe Acrobat Portable Document Format (PDF) whenever possible...

  6. 9 CFR 124.30 - Filing, format, and content of petitions.

    Code of Federal Regulations, 2010 CFR

    2010-01-01

    ... RESTORATION Due Diligence Petitions § 124.30 Filing, format, and content of petitions. (a) Any interested... diligence in seeking APHIS approval of the product during the regulatory review period. (b) The petition... subpart. (c) The petition must allege that the applicant failed to act with due diligence sometime during...

  7. Viewing Files — EDRN Public Portal

    Cancer.gov

    In addition to standard HTML Web pages, our web site contains other file formats. You may need additional software or browser plug-ins to view some of the information available on our site. This document lists each format, along with links to the corresponding freely available plug-ins or viewers.

  8. Painless File Extraction: The A(rc)--Z(oo) of Internet Archive Formats.

    ERIC Educational Resources Information Center

    Simmonds, Curtis

    1993-01-01

    Discusses extraction programs needed to postprocess software downloaded from the Internet that has been archived and compressed for the purposes of storage and file transfer. Archiving formats for DOS, Macintosh, and UNIX operating systems are described; and cross-platform compression utilities are explained. (LRW)

  9. SiLK: A Tool Suite for Unsampled Network Flow Analysis at Scale

    DTIC Science & Technology

    2014-06-01

    file format,” [Accessed: Feb 9, 2014]. [Online]. Available: https://tools.netsa.cert.org/silk/faq.html#file-formats [12] “2012 data breach investigations...report (DBIR),” Verizon, Tech. Rep., 2012. [Online]. Available: http://www.verizonenterprise.com/DBIR/2012/ [13] “2013 data breach investigations

  10. Incidence of apical crack formation and propagation during removal of root canal filling materials with different engine driven nickel-titanium instruments.

    PubMed

    Özyürek, Taha; Tek, Vildan; Yılmaz, Koray; Uslu, Gülşah

    2017-11-01

    To determine the incidence of crack formation and propagation in apical root dentin after retreatment procedures performed using ProTaper Universal Retreatment (PTR), Mtwo-R, ProTaper Next (PTN), and Twisted File Adaptive (TFA) systems. The study consisted of 120 extracted mandibular premolars. One millimeter from the apex of each tooth was ground perpendicular to the long axis of the tooth, and the apical surface was polished. Twenty teeth served as the negative control group. One hundred teeth were prepared, obturated, and then divided into 5 retreatment groups. The retreatment procedures were performed using the following files: PTR, Mtwo-R, PTN, TFA, and hand files. After filling material removal, apical enlargement was done using apical size 0.50 mm ProTaper Universal (PTU), Mtwo, PTN, TFA, and hand files. Digital images of the apical root surfaces were recorded before preparation, after preparation, after obturation, after filling removal, and after apical enlargement using a stereomicroscope. The images were then inspected for the presence of new apical cracks and crack propagation. Data were analyzed with χ² tests using SPSS 21.0 software. New cracks and crack propagation occurred in all the experimental groups during the retreatment process. Nickel-titanium rotary file systems caused significantly more apical crack formation and propagation than the hand files. The PTU system caused significantly more apical cracks than the other groups after the apical enlargement stage. This study showed that retreatment procedures and apical enlargement after the use of retreatment files can cause crack formation and propagation in apical dentin.

  11. Enhanced Modeling of First-Order Plant Equations of Motion for Aeroelastic and Aeroservoelastic Applications

    NASA Technical Reports Server (NTRS)

    Pototzky, Anthony S.

    2010-01-01

    A methodology is described for generating first-order plant equations of motion for aeroelastic and aeroservoelastic applications. The description begins with the process of generating data files representing specialized mode-shapes, such as rigid-body and control surface modes, using both PATRAN and NASTRAN analysis. NASTRAN executes the 146 solution sequence using numerous Direct Matrix Abstraction Program (DMAP) calls to import the mode-shape files and to perform the aeroelastic response analysis. The aeroelastic response analysis calculates and extracts structural frequencies, generalized masses, frequency-dependent generalized aerodynamic force (GAF) coefficients, sensor deflections and load coefficients data as text-formatted data files. The data files are then re-sequenced and re-formatted using a custom written FORTRAN program. The text-formatted data files are stored and coefficients for s-plane equations are fitted to the frequency-dependent GAF coefficients using two Interactions of Structures, Aerodynamics and Controls (ISAC) programs. With tabular files from stored data created by ISAC, MATLAB generates the first-order aeroservoelastic plant equations of motion. These equations include control-surface actuator, turbulence, sensor and load modeling. Altitude varying root-locus plot and PSD plot results for a model of the F-18 aircraft are presented to demonstrate the capability.

  12. Extract and visualize geolocation from any text file

    NASA Astrophysics Data System (ADS)

    Boustani, M.

    2015-12-01

    There are a variety of text file formats, such as PDF and HTML, that contain words about locations (countries, cities, regions, and more). GeoParser was developed as a sub-project under DARPA Memex to help find geolocation information in crawled website data. It is a web application that uses Apache Tika to extract locations from any text file format and visualize the geolocations on a map. https://github.com/MBoustani/GeoParser https://github.com/chrismattmann/tika-python http://www.darpa.mil/program/memex

  13. MT3DMS: A Modular Three-Dimensional Multispecies Transport Model for Simulation of Advection, Dispersion, and Chemical Reactions of Contaminants in Groundwater Systems; Documentation and User’s Guide

    DTIC Science & Technology

    1999-12-01

    addition, the data files saved in the POINT format can include an optional header which is compatible with Amtec Engineering’s 2-D and 3-D visualization...34.DAT" file so that the file can be used directly by Amtec Engineering’s 2-D and 3-D visualization package Tecplot©. The ARRAY and POINT formats are

  14. Can ASCII data files be standardized for Earth Science?

    NASA Astrophysics Data System (ADS)

    Evans, K. D.; Chen, G.; Wilson, A.; Law, E.; Olding, S. W.; Krotkov, N. A.; Conover, H.

    2015-12-01

    NASA's Earth Science Data Systems Working Groups (ESDSWG) was created over 10 years ago. The role of the ESDSWG is to make recommendations relevant to NASA's Earth science data systems from user experiences. Each group works independently, focusing on a unique topic. Participation in ESDSWG groups comes from a variety of NASA-funded science and technology projects such as MEaSUREs, NASA information technology experts, affiliated contractor staff, and other interested community members from academia and industry. Recommendations from the ESDSWG groups will enhance NASA's efforts to develop long term data products. Each year, the ESDSWG has a face-to-face meeting to discuss recommendations and future efforts. Last year's (2014) ASCII for Science Data Working Group (ASCII WG) completed its goals and made recommendations on a minimum set of information that is needed to make ASCII files at least human readable and usable for the foreseeable future. The 2014 ASCII WG created a table of ASCII files and their components as a means for understanding what kind of ASCII formats exist and what components they have in common. Using this table and adding information from other ASCII file formats, we will discuss the advantages and disadvantages of a standardized format. For instance, Space Geodesy scientists have been using the same RINEX/SINEX ASCII format for decades. Astronomers mostly archive their data in the FITS format. Yet Earth scientists seem to have a slew of ASCII formats, such as ICARTT, netCDF (an ASCII dump) and the IceBridge ASCII format. The 2015 Working Group is focusing on promoting extendibility and machine readability of ASCII data. Questions have been posed, including: Can we have a standardized ASCII file format? Can it be machine-readable and simultaneously human-readable? We will present a summary of the currently used ASCII formats in terms of advantages and shortcomings, as well as potential improvements.

  15. As-built design specification for PARCLS

    NASA Technical Reports Server (NTRS)

    Tompkins, M. A. (Principal Investigator)

    1981-01-01

    The PARCLS program, part of the CLASFYG package, reads a parameter file created by the CLASFYG program and a pure pixel ground truth file in order to create a classification file of three separate crop categories in universal format.

  16. Analytic Patch Configuration (APC) gateway version 1.0 user's guide

    NASA Technical Reports Server (NTRS)

    Bingel, Bradford D.

    1990-01-01

    The Analytic Patch Configuration (APC) is an interactive software tool which translates aircraft configuration geometry files from one format into another. This initial release of the APC Gateway accommodates six formats: the four accepted APC formats (89f, 89fd, 89u, and 89ud), the PATRAN 2.x phase 1 neutral file format, and the Integrated Aerodynamic Analysis System (IAAS) General Geometry (GG) format. Written in ANSI FORTRAN 77 and completely self-contained, the APC Gateway is very portable and has already been installed on CDC/NOS, VAX/VMS, SUN, SGI/IRIS, CONVEX, and CRAY hosts.

  17. Integration of DICOM and openEHR standards

    NASA Astrophysics Data System (ADS)

    Wang, Ying; Yao, Zhihong; Liu, Lei

    2011-03-01

    The standard format for medical imaging storage and transmission is DICOM. openEHR is an open standard specification in health informatics that describes the management, storage, retrieval and exchange of health data in electronic health records. Considering that the integration of DICOM and openEHR is beneficial to information sharing, on the basis of the XML-based DICOM format, we developed a method of creating a DICOM Imaging Archetype in openEHR to enable the integration of DICOM and openEHR. Each DICOM file contains abundant imaging information. However, because reading a DICOM file involves looking up the DICOM Data Dictionary, the readability of a DICOM file has been limited. openEHR has adopted a two-level modeling method, dividing clinical information into a lower level, the information model, and an upper level, archetypes and templates. But one critical challenge posed to the development of openEHR is the information sharing problem, especially in imaging information sharing. For example, some important imaging information cannot be displayed in an openEHR file. In this paper, to enhance the readability of a DICOM file and the semantic interoperability of an openEHR file, we developed a method of mapping a DICOM file to an openEHR file by adopting the form of archetype defined in openEHR. Because an archetype has a tree structure, after mapping a DICOM file to an openEHR file, the converted information is structured in conformance with the openEHR format. This method enables the integration of DICOM and openEHR and data exchange without losing imaging information between the two standards.

  18. 78 FR 13933 - Railroad Cost of Capital-2012

    Federal Register 2010, 2011, 2012, 2013, 2014

    2013-03-01

    ... by May 31, 2013. ADDRESSES: Comments may be submitted either via the Board's e-filing system or in the traditional paper format. Any person using e-filing should comply with the instructions at the E-FILING link on the Board's Web site, at http://www.stb.dot.gov . Any person submitting a filing in the...

  19. 76 FR 10430 - Railroad Cost of Capital-2010

    Federal Register 2010, 2011, 2012, 2013, 2014

    2011-02-24

    ... by June 8, 2011. ADDRESSES: Comments may be submitted either via the Board's e-filing system or in the traditional paper format. Any person using e-filing should comply with the instructions at the E-FILING link on the Board's Web site, at http://www.stb.dot.gov . Any person submitting a filing in the...

  20. 75 FR 16894 - Railroad Cost of Capital-2009

    Federal Register 2010, 2011, 2012, 2013, 2014

    2010-04-02

    ... 15, 2010. ADDRESSES: Comments may be submitted either via the Board's e-filing system or in the traditional paper format. Any person using e-filing should comply with the instructions at the E-FILING link on the Board's Web site, at http://www.stb.dot.gov . Any person submitting a filing in the...

  1. 5 CFR 1201.14 - Electronic filing procedures.

    Code of Federal Regulations, 2014 CFR

    2014-01-01

    ...-Appeal Online, in which case service is governed by paragraph (j) of this section, or by non-electronic... (PDF), and image files (files created by scanning). A list of formats allowed can be found at e-Appeal... representatives of the appeals in which they were filed. (j) Service of electronic pleadings and MSPB documents...

  2. 5 CFR 1201.14 - Electronic filing procedures.

    Code of Federal Regulations, 2013 CFR

    2013-01-01

    ...-Appeal Online, in which case service is governed by paragraph (j) of this section, or by non-electronic... (PDF), and image files (files created by scanning). A list of formats allowed can be found at e-Appeal... representatives of the appeals in which they were filed. (j) Service of electronic pleadings and MSPB documents...

  3. 5 CFR 1201.14 - Electronic filing procedures.

    Code of Federal Regulations, 2011 CFR

    2011-01-01

    ...-Appeal Online, in which case service is governed by paragraph (j) of this section, or by non-electronic... (PDF), and image files (files created by scanning). A list of formats allowed can be found at e-Appeal... representatives of the appeals in which they were filed. (j) Service of electronic pleadings and MSPB documents...

  4. 5 CFR 1201.14 - Electronic filing procedures.

    Code of Federal Regulations, 2012 CFR

    2012-01-01

    ...-Appeal Online, in which case service is governed by paragraph (j) of this section, or by non-electronic... (PDF), and image files (files created by scanning). A list of formats allowed can be found at e-Appeal... representatives of the appeals in which they were filed. (j) Service of electronic pleadings and MSPB documents...

  5. Converting Inhouse Subject Card Files to Electronic Keyword Files.

    ERIC Educational Resources Information Center

    Culmer, Carita M.

    The library at Phoenix College developed the Controversial Issues Files (CIF), a "home made" card file containing references pertinent to specific ongoing assignments. Although the CIF had proven itself to be an excellent resource tool for beginning researchers, it was cumbersome to maintain in the card format, and was limited to very…

  6. Networks for Autonomous Formation Flying Satellite Systems

    NASA Technical Reports Server (NTRS)

    Knoblock, Eric J.; Konangi, Vijay K.; Wallett, Thomas M.; Bhasin, Kul B.

    2001-01-01

    The performance of three communications networks to support autonomous multi-spacecraft formation flying systems is presented. All systems are comprised of a ten-satellite formation arranged in a star topology, with one of the satellites designated as the central or "mother ship." All data is routed through the mother ship to the terrestrial network. The first system uses a TCP/IP over ATM protocol architecture within the formation, the second system uses the IEEE 802.11 protocol architecture within the formation, and the last system uses both of the previous architectures with a constellation of geosynchronous satellites serving as an intermediate point-of-contact between the formation and the terrestrial network. The simulations consist of file transfers using either the File Transfer Protocol (FTP) or the Simple Automatic File Exchange (SAFE) Protocol. The results compare the IP queuing delay and IP processing delay at the mother ship, as well as application-level round-trip time, for both systems. In all cases, using IEEE 802.11 within the formation yields less delay. Also, the throughput exhibited by SAFE is better than that of FTP.

  7. 37 CFR 1.615 - Format of papers filed in a supplemental examination proceeding.

    Code of Federal Regulations, 2013 CFR

    2013-07-01

    ... 37 Patents, Trademarks, and Copyrights 1 2013-07-01 2013-07-01 false Format of papers filed in a supplemental examination proceeding. 1.615 Section 1.615 Patents, Trademarks, and Copyrights UNITED STATES PATENT AND TRADEMARK OFFICE, DEPARTMENT OF COMMERCE GENERAL RULES OF PRACTICE IN PATENT CASES...

  8. 75 FR 14386 - Interpretation of Transmission Planning Reliability Standard

    Federal Register 2010, 2011, 2012, 2013, 2014

    2010-03-25

    ... created electronically using word processing software should be filed in native applications or print-to.... FERC, 564 F.3d 1342 (DC Cir. 2009). \\6\\ Mandatory Reliability Standards for the Bulk-Power System... print-to-PDF format and not in a scanned format. Commenters filing electronically do not need to make a...

  9. PROPOSED STANDARD TO GREATLY EXPAND PUBLIC ACCESS AND EXPLORATION OF TOXICITY DATA: EVALUATION OF STRUCTURE DATA FILE FORMAT

    EPA Science Inventory



    PROPOSED STANDARD TO GREATLY EXPAND PUBLIC ACCESS AND EXPLORATION OF TOXICITY DATA: EVALUATION OF STRUCTURE DATA FILE FORMAT

    The ability to assess the potential toxicity of environmental, pharmaceutical, or industrial chemicals based on chemical structure in...

  10. 37 CFR 1.615 - Format of papers filed in a supplemental examination proceeding.

    Code of Federal Regulations, 2014 CFR

    2014-07-01

    ... 37 Patents, Trademarks, and Copyrights 1 2014-07-01 2014-07-01 false Format of papers filed in a supplemental examination proceeding. 1.615 Section 1.615 Patents, Trademarks, and Copyrights UNITED STATES PATENT AND TRADEMARK OFFICE, DEPARTMENT OF COMMERCE GENERAL RULES OF PRACTICE IN PATENT CASES...

  11. VizieR Online Data Catalog: Metal enrichment in semi-analytical model (Cousin+, 2016)

    NASA Astrophysics Data System (ADS)

    Cousin, M.; Buat, V.; Boissier, S.; Bethermin, M.; Roehlly, Y. Genois M.

    2016-04-01

    The repository contains outputs from the different models:
    - m1: Classical (only hot gas) isotropic accretion scenario + Standard Schmidt-Kennicutt law
    - m2: Bimodal accretion (cold streams) + Standard Schmidt-Kennicutt law
    - m3: Classical (only hot gas) isotropic accretion scenario + ad-hoc non-star forming gas reservoir
    - m4: Bimodal accretion (cold streams) + ad-hoc non-star forming gas reservoir
    For each of these models, data are saved in an eGalICS_m*.fits file. All these FITS-formatted files are compatible with the TOPCAT software available at: http://www.star.bris.ac.uk/~mbt/topcat/ We also provide, for each Initial Mass Function available, a set of two FITS-formatted files associated with the chemodynamical library presented in the paper. For these two files, data are available for all metallicity bins used.
    - masslossrates_IMF.fits: The instantaneous total ejecta rate associated with an SSP for the six different main-ISM elements.
    - SNratesIMF.fits: The total SN rate (SNII+SNIa [nb/Gyr]) associated with an SSP; individual contributions of SNII and SNIa are also given.
    These files are available for four different IMFs: Salpeter+55 (1955ApJ...121..161S), Chabrier+03 (2003PASP..115..763C), Kroupa+93 (2001MNRAS.322..231K) and Scalo+98 (1998ASPC..142..201S). Both ejecta rates and SN rates are computed for the complete list of stellar ages provided in the BC03 spectra library. They are saved in FITS-formatted files and structured with different extensions corresponding to the different initial stellar metallicity bins. We finally provide the median star formation history, the median gas accretion history and the metal enrichment histories associated with our MW-sisters sample: MWsistershistories.dat If you use data associated with the eGalICS semi-analytic model, please cite the following paper: Cousin et al., 2015A&A...575A..33C, "Toward a new modelling of gas flows in a semi-analytical model of galaxy formation and evolution" (3 data files).

  12. IVS Working Group 4: VLBI Data Structures

    NASA Astrophysics Data System (ADS)

    Gipson, J.

    2012-12-01

    I present an overview of the "openDB format" for storing, archiving, and processing VLBI data. In this scheme, most VLBI data is stored in NetCDF files. NetCDF has the advantage that there are interfaces to most common computer languages including Fortran, Fortran-90, C, C++, Perl, etc, and the most common operating systems including Linux, Windows, and Mac. The data files for a particular session are organized by special ASCII "wrapper" files which contain pointers to the data files. This allows great flexibility in the processing and analysis of VLBI data. For example it allows you to easily change subsets of the data used in the analysis such as troposphere modeling, ionospheric calibration, editing, and ambiguity resolution. It also allows for extending the types of data used, e.g., source maps. I present a roadmap to transition to this new format. The new format can already be used by VieVS and by the global mode of solve. There are plans in work for other software packages to be able to use the new format.

  13. CONNJUR spectrum translator: an open source application for reformatting NMR spectral data.

    PubMed

    Nowling, Ronald J; Vyas, Jay; Weatherby, Gerard; Fenwick, Matthew W; Ellis, Heidi J C; Gryk, Michael R

    2011-05-01

    NMR spectroscopists are hindered by the lack of standardization for spectral data among the file formats for various NMR data processing tools. This lack of standardization is cumbersome as researchers must perform their own file conversion in order to switch between processing tools and also restricts the combination of tools employed if no conversion option is available. The CONNJUR Spectrum Translator introduces a new, extensible architecture for spectrum translation and introduces two key algorithmic improvements. The first is translation of NMR spectral data (time and frequency domain) to a single in-memory data model to allow addition of new file formats with two converter modules, a reader and a writer, instead of writing a separate converter to each existing format. Secondly, the use of layout descriptors allows a single fid data translation engine to be used for all formats. For the end user, sophisticated metadata readers allow conversion of the majority of files with minimum user configuration. The open source code is freely available at http://connjur.sourceforge.net for inspection and extension.
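
    The reader/writer design described in this abstract can be sketched as follows. This is only an illustration of the general idea (one reader and one writer per format, all passing through a shared in-memory model), not CONNJUR's actual code: the `Spectrum` class, the two toy formats ("comma" and "lines") and every function name are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Spectrum:
    """Hypothetical shared in-memory model that every format maps to and from."""
    domain: str            # "time" or "frequency"
    points: List[float]


# Toy format 1 (hypothetical): values separated by commas.
def read_comma(text: str) -> Spectrum:
    return Spectrum("time", [float(v) for v in text.split(",")])


def write_comma(spectrum: Spectrum) -> str:
    return ",".join(str(v) for v in spectrum.points)


# Toy format 2 (hypothetical): one value per line.
def read_lines(text: str) -> Spectrum:
    return Spectrum("time", [float(v) for v in text.splitlines()])


def write_lines(spectrum: Spectrum) -> str:
    return "\n".join(str(v) for v in spectrum.points)


# Adding a format means registering one reader and one writer,
# not one converter per existing format.
READERS: Dict[str, Callable[[str], Spectrum]] = {"comma": read_comma, "lines": read_lines}
WRITERS: Dict[str, Callable[[Spectrum], str]] = {"comma": write_comma, "lines": write_lines}


def translate(text: str, source: str, target: str) -> str:
    """Convert between any two registered formats via the shared model."""
    return WRITERS[target](READERS[source](text))
```

    With N formats, this layout needs 2N modules, whereas direct pairwise conversion needs on the order of N squared converters.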

  14. Segy-change: The swiss army knife for the SEG-Y files

    NASA Astrophysics Data System (ADS)

    Stanghellini, Giuseppe; Carrara, Gabriela

    Data collected during active and passive seismic surveys can be stored in many different, more or less standard, formats. One of the most popular is the SEG-Y format, developed since 1975 to store single-line seismic digital data on tapes, and now evolved to store them on hard disks and other media as well. Unfortunately, files that are claimed to be recorded in the SEG-Y format sometimes cannot be processed using available free or industrial packages. Aiming to solve this impasse, we present segy-change, a pre-processing software program to view, analyze, change and fix errors present in SEG-Y data files. It is written in C, can also be used as a software library, and is compatible with most operating systems. Segy-change allows the user to display and optionally change the values inside all parts of a SEG-Y file: the file header, the trace headers and the data blocks. In addition, it allows a quality check on the data by plotting the traces. We provide instructions and examples on how to use the software.
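
    As a rough illustration of inspecting a SEG-Y file header, the sketch below reads three fields from the binary file header and flags implausible values. It is not taken from segy-change. It assumes the SEG-Y rev 1 layout as commonly documented (3200-byte textual header, 400-byte big-endian binary header, with sample interval, samples per trace and format code at file bytes 3217-3218, 3221-3222 and 3225-3226), and the function names and the set of accepted format codes are choices made for this example.

```python
import struct

TEXT_HEADER_BYTES = 3200      # textual file header (assumed rev 1 layout)
BINARY_HEADER_BYTES = 400     # binary file header that follows it
REV1_FORMAT_CODES = {1, 2, 3, 4, 5, 8}   # assumption: data sample codes defined in rev 1


def read_binary_header(data: bytes) -> dict:
    """Extract three big-endian 16-bit fields from the binary file header."""
    header = data[TEXT_HEADER_BYTES:TEXT_HEADER_BYTES + BINARY_HEADER_BYTES]
    if len(header) < BINARY_HEADER_BYTES:
        raise ValueError("input too short to contain a binary file header")
    (interval_us,) = struct.unpack_from(">H", header, 16)   # file bytes 3217-3218
    (samples,) = struct.unpack_from(">H", header, 20)       # file bytes 3221-3222
    (format_code,) = struct.unpack_from(">H", header, 24)   # file bytes 3225-3226
    return {
        "sample_interval_us": interval_us,
        "samples_per_trace": samples,
        "format_code": format_code,
    }


def find_problems(fields: dict) -> list:
    """Return human-readable descriptions of suspicious header values."""
    problems = []
    if fields["sample_interval_us"] == 0:
        problems.append("sample interval is zero")
    if fields["samples_per_trace"] == 0:
        problems.append("samples per trace is zero")
    if fields["format_code"] not in REV1_FORMAT_CODES:
        problems.append("unknown data sample format code")
    return problems


def build_minimal_file(interval_us: int, samples: int, format_code: int) -> bytes:
    """Build a header-only byte string for trying out the functions above."""
    header = bytearray(BINARY_HEADER_BYTES)
    struct.pack_into(">H", header, 16, interval_us)
    struct.pack_into(">H", header, 20, samples)
    struct.pack_into(">H", header, 24, format_code)
    return b" " * TEXT_HEADER_BYTES + bytes(header)
```

    For example, `find_problems(read_binary_header(build_minimal_file(4000, 0, 7)))` reports two problems: a zero sample count and an unrecognized format code.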

  15. VizieR Online Data Catalog: Opacities from the Opacity Project (Seaton+, 1995)

    NASA Astrophysics Data System (ADS)

    Seaton, M. J.; Yan, Y.; Mihalas, D.; Pradhan, A. K.

    1997-08-01

    1. CODES. 1.1 Code rop.for: This code reads opacity files written in standard OP format. Its main purpose is to provide documentation on the contents of the files. This code, like the other codes provided, prompts for the name of the file (or files) to be read. The file names read in response to the prompt may have up to 128 characters. 1.2 Code opfit.for: This code reads opacity files in standard OP format, and provides for interpolation of opacities to any required values of temperature and mass-density. The method used is described in OPF. The code prompts for the name of a file giving all required control parameters. As an example, the file opfit.dat is provided (users will need to change directory names and file names). The use of opfit.for is illustrated using opfit.dat. Most users will probably want to adapt opfit.for for use as a subroutine in other codes. Timings for DEC 7000 ALPHA: 0.3 sec for data read and initialisations; then 0.0007 sec for each temperature-density point. Users who like OPAL formats should note that opfit.for has a facility to produce files of OP data in OPAL-type formats. 1.3 Code ixz.for: This code provides for interpolations to any required values of X and Z. See IXZ. It prompts for the name of a file giving all required control parameters. An example of such a file is provided, ixz.dat (the user will need to change directory and file names). The output files have names s92INT.'nnn'. The user specifies the first value of nnn, and the number of files to be produced.
    2. DATA FILES. 2.1 Data files for solar metal-mix: Data for solar metal-mix s92 as defined in SYMP. These files are from version 2 runs of December 1994 (see IXZ for details on Version 2). There are 213 files with names s92.'nnn', 'nnn'=201 to 413. Each file occupies 83762 bytes. The file s92.version2 gives values of X (hydrogen mass-fraction) and Z (metals mass-fraction) for each value of 'nnn'. The user can get s92.version2, select the values of 'nnn' required, then get the required files s92.'nnn'. The user can see the file in ftp, displayed on the screen, by typing "get s92.version2 -". The files s92.'nnn' can be used with opfit.for to obtain opacities for any required value of temperature and mass density. Files for other metal-mixtures will be added in due course. Send requests to mjs@star.ucl.ac.uk. 2.2 Files for interpolation in X and Z: The data files have names s92xz.'mmm', where 'mmm'=001 to 096. They differ from the standard OP files (such as s92.'nnn', see section 2.1 above) in that they contain information giving derivatives of opacities with respect to X and Z. Each file s92xz.'mmm' occupies 148241 bytes. The interpolations to any required values of X and Z are made using ixz.for. Timings: on DEC 7000 ALPHA, 2.16 sec for each new-mixture file. For interpolations to some specified values of X and Z, one requires just 4 files s92xz.'mmm'. Most users will not require the complete set of files s92xz.'mmm'. The file s92xz.index includes a table (starting on line 3) giving values, for each 'mmm' file, of x,y,z (abundances by number-fractions) and X,Y,Z (abundances by mass-fractions). Users are advised to get the file s92.index, and select values of 'mmm' for files required, then get those files. The files produced by ixz.for are in standard OP format and can be used with opfit.for to obtain opacities for any required values of temperature and mass density.
    3. RECOMMENDED PROCEDURE FOR USE OF OPACITY FILES. (1) Get the file s92.version2. (2) If the values of X and Z you require are available in the files s92.'nnn' then get those files. (3) If not, get the file s92xz.index. (4) Select from s92xz.index the values of 'mmm' which cover the range of X and Z in which you are interested. Get those files and use ixz.for to generate files for your exact required values of X and Z. (5) Note that the exact abundance mixtures used are specified in each file (see rop.for). Also each run of opfit.for produces a table of abundances. (6) If you want a metal-mix different from that of s92, contact mjs@star.ucl.ac.uk.
    4. FUTURE DEVELOPMENTS. (1) Data for the calculation of radiative forces are provided as the CDS catalog (added August 1997). (2) Facilities will be added later which will enable the user to make calculations giving files for any required mixtures. (9 data files).

  16. Web Standard: PDF - When to Use, Document Metadata, PDF Sections

    EPA Pesticide Factsheets

    PDF files provide some benefits when used appropriately. PDF files should not be used for short documents ( 5 pages) unless retaining the format for printing is important. PDFs should have internal file metadata and meet section 508 standards.

  17. VirVarSeq: a low-frequency virus variant detection pipeline for Illumina sequencing using adaptive base-calling accuracy filtering.

    PubMed

    Verbist, Bie M P; Thys, Kim; Reumers, Joke; Wetzels, Yves; Van der Borght, Koen; Talloen, Willem; Aerssens, Jeroen; Clement, Lieven; Thas, Olivier

    2015-01-01

    In virology, massively parallel sequencing (MPS) opens many opportunities for studying viral quasi-species, e.g. in HIV-1- and HCV-infected patients. This is essential for understanding pathways to resistance, which can substantially improve treatment. Although MPS platforms allow in-depth characterization of sequence variation, their measurements still involve substantial technical noise. For Illumina sequencing, single base substitutions are the main error source and impede powerful assessment of low-frequency mutations. Fortunately, base calls are complemented with quality scores (Qs) that are useful for differentiating errors from the real low-frequency mutations. A variant calling tool, Q-cpileup, is proposed, which exploits the Qs of nucleotides in a filtering strategy to increase specificity. The tool is embedded in an open-source pipeline, VirVarSeq, which allows variant calling starting from fastq files. Using both plasmid mixtures and clinical samples, we show that Q-cpileup is able to reduce the number of false-positive findings. The filtering strategy is adaptive and provides an optimized threshold for individual samples in each sequencing run. Additionally, linkage information is kept between single-nucleotide polymorphisms as variants are called at the codon level. This enables virologists to have an immediate biological interpretation of the reported variants with respect to their antiviral drug responses. A comparison with existing SNP caller tools reveals that calling variants at the codon level with Q-cpileup results in an outstanding sensitivity while maintaining a good specificity for variants with frequencies down to 0.5%. VirVarSeq is available, together with a user's guide and test data, at sourceforge: http://sourceforge.net/projects/virtools/?source=directory. © The Author 2014. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
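
    To make the role of quality scores concrete, the following sketch decodes the quality string of a FASTQ record and masks base calls that fall below a cutoff. It is not Q-cpileup's method: the abstract describes an adaptive, per-sample threshold, whereas this example assumes a fixed cutoff, assumes the common Phred+33 encoding, and uses function names invented for illustration.

```python
def phred33_scores(quality: str) -> list:
    """Decode a FASTQ quality string, assuming Phred+33 (ASCII code minus 33)."""
    return [ord(symbol) - 33 for symbol in quality]


def mask_low_quality(sequence: str, quality: str, cutoff: int = 20) -> str:
    """Replace every base whose quality score is below `cutoff` with 'N'.

    The fixed cutoff is an assumption made for this sketch.
    """
    if len(sequence) != len(quality):
        raise ValueError("sequence and quality strings must have equal length")
    scores = phred33_scores(quality)
    return "".join(
        base if score >= cutoff else "N"
        for base, score in zip(sequence, scores)
    )
```

    Here `mask_low_quality("ACGT", "I#5I")` keeps the first, third and fourth bases (scores 40, 20 and 40) and masks the second (score 2).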

  18. Simultaneous human platelet antigen genotyping and detection of novel single nucleotide polymorphisms by targeted next-generation sequencing.

    PubMed

    Davey, Sue; Navarrete, Cristina; Brown, Colin

    2017-06-01

    Twenty-nine human platelet antigen systems have been described to date, but the majority of current genotyping methods are restricted to the identification of those most commonly associated with alloantibody production in a clinical context. This can result in a protracted investigation if causative human platelet antigens are rare or novel. A targeted next-generation sequencing approach was designed to detect all known human platelet antigens with the additional capability of identifying novel mutations in the encoding genes. A targeted enrichment, high-sensitivity HaloPlex assay was designed to sequence all exons and flanking regions of the six genes known to encode human platelet antigens. Indexed DNA libraries were prepared from 47 previously human platelet antigen-genotyped samples and subsequently combined into one of three pools for sequencing on an Illumina MiSeq platform. The generated FASTQ files were aligned and scrutinized for each human platelet antigen polymorphism using SureCall data analysis software. Forty-six samples were successfully genotyped for human platelet antigens 1 through 29bw, with an average per base coverage depth of 1144. Concordance with historical human platelet antigen genotypes was 100%. A putative novel mutation in Exon 10 of the integrin β-3 (ITGB3) gene from an unsolved case of fetal neonatal alloimmune thrombocytopenia was also detected. A next-generation sequencing-based method that can accurately define all known human platelet antigen polymorphisms was developed. With the ability to sequence up to 96 samples simultaneously, our HaloPlex design could be used for high-throughput human platelet antigen genotyping. This method is also applicable for investigating fetal neonatal alloimmune thrombocytopenia when rare or novel human platelet antigens are suspected. © 2017 AABB.

  19. Guide to GFS History File Change on May 1, 2007

    Science.gov Websites

    On May 1, 2007 12Z, the GFS had a major change. The change caused the internal binary GFS history file to change formats. The file is still in spectral space but now pressure is calculated in a different way. Sometime in the future, the GFS history file may be

  20. FGGE/ERBZ tape specification and shipping letter description

    NASA Technical Reports Server (NTRS)

    Han, D.; Lo, H.

    1983-01-01

    The FGGE/ERBZ tape contains 5 parameters which are extracted and reformatted from the Nimbus-7 ERB Zonal Means Tape. There are three types of files on a FGGE/ERBZ tape: a tape header file, and data files. Physical characteristics, gross format, and file specifications are given. A sample tape check/document printout (shipping letter) is included.

  1. NCEP BUFR File Structure

    Science.gov Websites

    . These tables may be defined within a separate ASCII text file (see Description and Format of BUFR Tables time, the BUFR tables are usually read from an external ASCII text file (although it is also possible reports. Click here to view the ASCII text file (called /nwprod/fix/bufrtab.002 on the NCEP CCS machines

  2. 75 FR 45609 - Commission Information Collection Activities (FERC-542); Comment Request; Extension

    Federal Register 2010, 2011, 2012, 2013, 2014

    2010-08-03

    ... electronically (eFiled) or in paper format, and should refer to Docket No. IC10-542-000. Documents must be.... Commenters making an eFiling should not make a paper filing. Commenters that are not able to file electronically must send an original and two (2) paper copies of their comments to: Federal Energy Regulatory...

  3. Global Paleoclimatic Data for 6000 Yr B.P. (1985) (NDP-011)

    DOE Data Explorer

    Webb, III, T. [Department of Geological Sciences, Brown University, Providence, Rhode Island (USA)

    2012-01-01

    To determine regional and global climatic variations during the past 6000 years, pollen, lake level, and marine plankton data from 797 stations were compiled to form a global data set. Radiocarbon dating and dated tephras were used to determine the ages of the specimens. The data available for the pollen data are site number, site name, latitude, longitude, elevation, and percentages of various taxa. For lake-level data, the data are site number, site name, latitude, longitude, and lake-level status. And for marine plankton, the data are site number, site name, latitude, longitude, water depth, date, dating control code, depth of sample, interpolated age of sample, estimated winter and summer sea-surface temperatures, and percentages of various taxa. The data are in 55 files: 5 files for each of 9 geographic regions and 10 supplemental files. The files for each region include (1) a FORMAT file describing the format and contents of the data for that region, (2) an INDEX file containing descriptive information about each site and its data, (3) a DATA file containing the data and available climatic estimates, (4) a PUBINDEX file indexing the bibliographic references associated with each site, and (5) a REFERENCE file containing the bibliographic references. The files range in size from 2 to 66 kB.

  4. PATSTAGS - PATRAN-STAGSC-1 TRANSLATOR

    NASA Technical Reports Server (NTRS)

    Otte, N. E.

    1994-01-01

    PATSTAGS translates PATRAN finite model data into STAGS (Structural Analysis of General Shells) input records to be used for engineering analysis. The program reads data from a PATRAN neutral file and writes STAGS input records into a STAGS input file and a UPRESS data file. It is able to support translations of nodal constraints, nodal, element, force and pressure data. PATSTAGS uses three files: the PATRAN neutral file to be translated, a STAGS input file and a STAGS pressure data file. The user provides the names for the neutral file and the desired names of the STAGS files to be created. The pressure data file contains the element live pressure data used in the STAGS subroutine UPRESS. PATSTAGS is written in FORTRAN 77 for DEC VAX series computers running VMS. The main memory requirement for execution is approximately 790K of virtual memory. Output blocks can be modified to output the data in any format desired, allowing the program to be used to translate model data to analysis codes other than STAGSC-1 (HQN-10967). This program is available in DEC VAX BACKUP format on a 9-track magnetic tape or TK50 tape cartridge. Documentation is included in the price of the program. PATSTAGS was developed in 1990. DEC, VAX, TK50 and VMS are trademarks of Digital Equipment Corporation.

  5. The development of method for continuous improvement of master file of the nursing practice terminology.

    PubMed

    Tsuru, Satoko; Okamine, Eiko; Takada, Aya; Watanabe, Chitose; Uchiyama, Makiko; Dannoue, Hideo; Aoyagi, Hisae; Endo, Akira

    2009-01-01

    The Nursing Action Master and Nursing Observation Master were released from 2002 to 2008. Two file formats, Excel and CSV, are prepared for maintaining them. The following were decided as the basic rules of maintenance: new addition, revision, deletion, management numbering, and a coding rule. The master was developed based on these rules. We perform quality assurance for the masters using these rules.

  6. Chapter 6. Tabular data and graphical images in support of the U.S. Geological Survey National Oil and Gas Assessment-East Texas basin and Louisiana-Mississippi salt basins provinces, Jurassic Smackover interior salt basins total petroleum system (504902), Travis Peak and Hosston formations.

    USGS Publications Warehouse

    2006-01-01

    This chapter describes data used in support of the process being applied by the U.S. Geological Survey (USGS) National Oil and Gas Assessment (NOGA) project. Digital tabular data used in this report and archival data that permit the user to perform further analyses are available elsewhere on the CD-ROM. Computers and software may import the data without transcription from the Portable Document Format files (.pdf files) of the text by the reader. Because of the number and variety of platforms and software available, graphical images are provided as .pdf files and tabular data are provided in a raw form as tab-delimited text files (.tab files).

  7. 75 FR 71625 - System Restoration Reliability Standards

    Federal Register 2010, 2011, 2012, 2013, 2014

    2010-11-24

    ... processing software should be filed in native applications or print-to-PDF format, and not in a scanned... (2006), aff'd sub nom. Alcoa, Inc. v. FERC, 564 F.3d 1342 (D.C. Cir. 2009). 6. On March 16, 2007, the... electronically using word processing software should be filed in native applications or print-to-PDF format, and...

  8. 75 FR 81152 - Interpretation of Protection System Reliability Standard

    Federal Register 2010, 2011, 2012, 2013, 2014

    2010-12-27

    ... created electronically using word processing software should be filed in native applications or print-to... reh'g & compliance, 117 FERC ¶ 61,126 (2006), aff'd sub nom. Alcoa, Inc. v. FERC, 564 F.3d 1342 (DC... print-to-PDF format and not in a scanned format, at http://www.ferc.gov/docs-filing/efiling.asp . Mail...

  9. 78 FR 4766 - Adoption of Updated EDGAR Filer Manual

    Federal Register 2010, 2011, 2012, 2013, 2014

    2013-01-23

    ... primarily to introduce the new EDGARLink Online submission type IRANNOTICE; and support PDF as an official... Portable Document Format (PDF) as an official filing format. EDGAR will continue to accept ASCII and HTML...) and 101 (17 CFR 232.101) of Regulation S-T and the EDGAR Filer Manual relating to the use of PDF files...

  10. 75 FR 80296 - Extension of Filing Accommodation for Static Pool Information in Filings With Respect to Asset...

    Federal Register 2010, 2011, 2012, 2013, 2014

    2010-12-22

    ... Systems in 1993 for document exchange. PDF captures formatting information from a variety of desktop publishing applications, making it possible to send formatted documents and have them appear on the recipient... Administrative Procedure Act generally requires that an agency publish an adopted rule in the Federal Register 30...

  11. COMBINE archive and OMEX format: one file to share all information to reproduce a modeling project.

    PubMed

    Bergmann, Frank T; Adams, Richard; Moodie, Stuart; Cooper, Jonathan; Glont, Mihai; Golebiewski, Martin; Hucka, Michael; Laibe, Camille; Miller, Andrew K; Nickerson, David P; Olivier, Brett G; Rodriguez, Nicolas; Sauro, Herbert M; Scharm, Martin; Soiland-Reyes, Stian; Waltemath, Dagmar; Yvon, Florent; Le Novère, Nicolas

    2014-12-14

    With the ever increasing use of computational models in the biosciences, the need to share models and reproduce the results of published studies efficiently and easily is becoming more important. To this end, various standards have been proposed that can be used to describe models, simulations, data or other essential information in a consistent fashion. These constitute various separate components required to reproduce a given published scientific result. We describe the Open Modeling EXchange format (OMEX). Together with the use of other standard formats from the Computational Modeling in Biology Network (COMBINE), OMEX is the basis of the COMBINE Archive, a single file that supports the exchange of all the information necessary for a modeling and simulation experiment in biology. An OMEX file is a ZIP container that includes a manifest file, listing the content of the archive, an optional metadata file adding information about the archive and its content, and the files describing the model. The content of a COMBINE Archive consists of files encoded in COMBINE standards whenever possible, but may include additional files defined by an Internet Media Type. Several tools that support the COMBINE Archive are available, either as independent libraries or embedded in modeling software. The COMBINE Archive facilitates the reproduction of modeling and simulation experiments in biology by embedding all the relevant information in one file. Having all the information stored and exchanged at once also helps in building activity logs and audit trails. We anticipate that the COMBINE Archive will become a significant help for modellers, as the domain moves to larger, more complex experiments such as multi-scale models of organs, digital organisms, and bioengineering.
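
    The container idea described above (a ZIP file whose manifest lists the other entries) can be sketched with the Python standard library. This is only an illustration under stated assumptions: the manifest below is a simplified, namespace-free stand-in rather than the actual OMEX manifest schema, and the file names and the helpers `build_archive` and `list_contents` are hypothetical.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# Simplified stand-in manifest (assumption: NOT the real OMEX schema).
MANIFEST = """<?xml version="1.0" encoding="UTF-8"?>
<manifest>
  <content location="./manifest.xml" format="manifest"/>
  <content location="./model.xml" format="model"/>
  <content location="./metadata.rdf" format="metadata"/>
</manifest>
"""


def build_archive():
    """Create a small ZIP container in memory and return its bytes."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        archive.writestr("manifest.xml", MANIFEST)
        archive.writestr("model.xml", "<model/>")       # placeholder model file
        archive.writestr("metadata.rdf", "<rdf/>")      # placeholder metadata file
    return buffer.getvalue()


def list_contents(archive_bytes):
    """Return (location, format, present_in_zip) for each manifest entry."""
    with zipfile.ZipFile(io.BytesIO(archive_bytes)) as archive:
        root = ET.fromstring(archive.read("manifest.xml"))
        names = set(archive.namelist())
    entries = []
    for item in root.iter("content"):
        location = item.get("location")
        # Manifest locations are written relative to the archive root ("./name").
        member = location[2:] if location.startswith("./") else location
        entries.append((location, item.get("format"), member in names))
    return entries


if __name__ == "__main__":
    for entry in list_contents(build_archive()):
        print(entry)
```

    Checking every manifest entry against the ZIP member list is one simple way a tool might detect an archive whose manifest and contents have drifted apart.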

  12. NMReDATA, a standard to report the NMR assignment and parameters of organic compounds.

    PubMed

    Pupier, Marion; Nuzillard, Jean-Marc; Wist, Julien; Schlörer, Nils E; Kuhn, Stefan; Erdelyi, Mate; Steinbeck, Christoph; Williams, Antony J; Butts, Craig; Claridge, Tim D W; Mikhova, Bozhana; Robien, Wolfgang; Dashti, Hesam; Eghbalnia, Hamid R; Farès, Christophe; Adam, Christian; Kessler, Pavel; Moriaud, Fabrice; Elyashberg, Mikhail; Argyropoulos, Dimitris; Pérez, Manuel; Giraudeau, Patrick; Gil, Roberto R; Trevorrow, Paul; Jeannerat, Damien

    2018-04-14

    Even though NMR has found countless applications in the field of small molecule characterization, there is no standard file format available for the NMR data relevant to structure characterization of small molecules. A new format is therefore introduced to associate the NMR parameters extracted from 1D and 2D spectra of organic compounds to the proposed chemical structure. These NMR parameters, which we shall call NMReDATA (for nuclear magnetic resonance extracted data), include chemical shift values, signal integrals, intensities, multiplicities, scalar coupling constants, lists of 2D correlations, relaxation times, and diffusion rates. The file format is an extension of the existing Structure Data Format, which is compatible with the commonly used MOL format. The association of an NMReDATA file with the raw and spectral data from which it originates constitutes an NMR record. This format is easily readable by humans and computers and provides a simple and efficient way for disseminating results of structural chemistry investigations, allowing automatic verification of published results, and for assisting the constitution of highly needed open-source structural databases. Copyright © 2018 John Wiley & Sons, Ltd.

  13. Photon-HDF5: Open Data Format and Computational Tools for Timestamp-based Single-Molecule Experiments

    PubMed Central

    Ingargiola, Antonino; Laurence, Ted; Boutelle, Robert; Weiss, Shimon; Michalet, Xavier

    2017-01-01

    Archival of experimental data in public databases has increasingly become a requirement for most funding agencies and journals. These data-sharing policies have the potential to maximize data reuse, and to enable confirmatory as well as novel studies. However, the lack of standard data formats can severely hinder data reuse. In photon-counting-based single-molecule fluorescence experiments, data is stored in a variety of vendor-specific or even setup-specific (custom) file formats, making data interchange prohibitively laborious, unless the same hardware-software combination is used. Moreover, the number of available techniques and setup configurations makes it difficult to find a common standard. To address this problem, we developed Photon-HDF5 (www.photon-hdf5.org), an open data format for timestamp-based single-molecule fluorescence experiments. Building on the solid foundation of HDF5, Photon-HDF5 provides a platform- and language-independent, easy-to-use file format that is self-describing and supports rich metadata. Photon-HDF5 supports different types of measurements by separating raw data (e.g. photon-timestamps, detectors, etc.) from measurement metadata. This approach allows representing several measurement types and setup configurations within the same core structure and makes it possible to extend the format in a backward-compatible way. Complementing the format specifications, we provide open-source software to create and convert Photon-HDF5 files, together with code examples in multiple languages showing how to read Photon-HDF5 files. Photon-HDF5 allows sharing data in a format suitable for long-term archival, avoiding the effort to document custom binary formats and increasing interoperability with different analysis software. We encourage participation of the single-molecule community to extend interoperability and to help define future versions of Photon-HDF5. PMID:28649160

  14. Photon-HDF5: Open Data Format and Computational Tools for Timestamp-based Single-Molecule Experiments.

    PubMed

    Ingargiola, Antonino; Laurence, Ted; Boutelle, Robert; Weiss, Shimon; Michalet, Xavier

    2016-02-13

    Archival of experimental data in public databases has increasingly become a requirement for most funding agencies and journals. These data-sharing policies have the potential to maximize data reuse, and to enable confirmatory as well as novel studies. However, the lack of standard data formats can severely hinder data reuse. In photon-counting-based single-molecule fluorescence experiments, data is stored in a variety of vendor-specific or even setup-specific (custom) file formats, making data interchange prohibitively laborious, unless the same hardware-software combination is used. Moreover, the number of available techniques and setup configurations makes it difficult to find a common standard. To address this problem, we developed Photon-HDF5 (www.photon-hdf5.org), an open data format for timestamp-based single-molecule fluorescence experiments. Building on the solid foundation of HDF5, Photon-HDF5 provides a platform- and language-independent, easy-to-use file format that is self-describing and supports rich metadata. Photon-HDF5 supports different types of measurements by separating raw data (e.g. photon-timestamps, detectors, etc.) from measurement metadata. This approach allows representing several measurement types and setup configurations within the same core structure and makes it possible to extend the format in a backward-compatible way. Complementing the format specifications, we provide open-source software to create and convert Photon-HDF5 files, together with code examples in multiple languages showing how to read Photon-HDF5 files. Photon-HDF5 allows sharing data in a format suitable for long-term archival, avoiding the effort to document custom binary formats and increasing interoperability with different analysis software. We encourage participation of the single-molecule community to extend interoperability and to help define future versions of Photon-HDF5.

  15. Photon-HDF5: open data format and computational tools for timestamp-based single-molecule experiments

    NASA Astrophysics Data System (ADS)

    Ingargiola, Antonino; Laurence, Ted; Boutelle, Robert; Weiss, Shimon; Michalet, Xavier

    2016-02-01

    Archival of experimental data in public databases has increasingly become a requirement for most funding agencies and journals. These data-sharing policies have the potential to maximize data reuse, and to enable confirmatory as well as novel studies. However, the lack of standard data formats can severely hinder data reuse. In photon-counting-based single-molecule fluorescence experiments, data is stored in a variety of vendor-specific or even setup-specific (custom) file formats, making data interchange prohibitively laborious, unless the same hardware-software combination is used. Moreover, the number of available techniques and setup configurations makes it difficult to find a common standard. To address this problem, we developed Photon-HDF5 (www.photon-hdf5.org), an open data format for timestamp-based single-molecule fluorescence experiments. Building on the solid foundation of HDF5, Photon-HDF5 provides a platform- and language-independent, easy-to-use file format that is self-describing and supports rich metadata. Photon-HDF5 supports different types of measurements by separating raw data (e.g. photon-timestamps, detectors, etc.) from measurement metadata. This approach allows representing several measurement types and setup configurations within the same core structure and makes it possible to extend the format in a backward-compatible way. Complementing the format specifications, we provide open-source software to create and convert Photon-HDF5 files, together with code examples in multiple languages showing how to read Photon-HDF5 files. Photon-HDF5 allows sharing data in a format suitable for long-term archival, avoiding the effort to document custom binary formats and increasing interoperability with different analysis software. We encourage participation of the single-molecule community to extend interoperability and to help define future versions of Photon-HDF5.

  16. Petroleum system modeling of the western Canada sedimentary basin - isopach grid files

    USGS Publications Warehouse

    Higley, Debra K.; Henry, Mitchell E.; Roberts, Laura N.R.

    2005-01-01

    This publication contains zmap-format grid files of isopach intervals that represent strata associated with Devonian to Holocene petroleum systems of the Western Canada Sedimentary Basin (WCSB) of Alberta, British Columbia, and Saskatchewan, Canada. Also included is one grid file that represents elevations relative to sea level of the top of the Lower Cretaceous Mannville Group. Vertical and lateral scales are in meters. The age range represented by the stratigraphic intervals comprising the grid files is 373 million years ago (Ma) to present day. File names, age ranges, formation intervals, and primary petroleum system elements are listed in table 1. Metadata associated with this publication includes information on the study area and the zmap-format files. The digital files listed in table 1 were compiled as part of the Petroleum Processes Research Project being conducted by the Central Energy Resources Team of the U.S. Geological Survey, which focuses on modeling petroleum generation, migration, and accumulation through time for petroleum systems of the WCSB. Primary purposes of the WCSB study are to (1) construct the 1-D/2-D/3-D petroleum system models of the WCSB (actual boundaries of the study area are documented within the metadata; excluded are northern Alberta and eastern Saskatchewan, but fringing areas of the United States are included); (2) publish results of the research and the grid files generated for use in the 3-D model of the WCSB; and (3) evaluate the use of petroleum system modeling in assessing undiscovered oil and gas resources for geologic provinces across the world.

  17. SEDIMENT DATA - ST. PAUL WATERWAY - TACOMA, WA - 1996 MONITORING DATA

    EPA Science Inventory

    Benthic Infauna Monitoring Data Files are Excel-format spreadsheet files which contain data presented in the St. Paul Waterway Area Remedial Action and Habitat Restoration Project, 1996 Monitoring Report. The files can be viewed directly or readily downloaded and read into most ...

  18. Active Management of Integrated Geothermal-CO2 Storage Reservoirs in Sedimentary Formations

    DOE Data Explorer

    Buscheck, Thomas A.

    2012-01-01

    Active Management of Integrated Geothermal–CO2 Storage Reservoirs in Sedimentary Formations: An Approach to Improve Energy Recovery and Mitigate Risk: FY1 Final Report. The purpose of phase 1 is to determine the feasibility of integrating geologic CO2 storage (GCS) with geothermal energy production. Phase 1 includes reservoir analyses to determine injector/producer well schemes that balance the generation of economically useful flow rates at the producers with the need to manage reservoir overpressure and so reduce the associated risks, such as induced seismicity and CO2 leakage to overlying aquifers. This submittal contains input and output files of the reservoir model analyses. A reservoir-model "index-html" file was sent in a previous submittal to organize the reservoir-model input and output files according to sections of the FY1 Final Report to which they pertain. The recipient should save the file Reservoir-models-inputs-outputs-index.html in the same directory as the Section2.1.*.tar.gz files.

  19. Active Management of Integrated Geothermal-CO2 Storage Reservoirs in Sedimentary Formations

    DOE Data Explorer

    Buscheck, Thomas A.

    2000-01-01

    Active Management of Integrated Geothermal–CO2 Storage Reservoirs in Sedimentary Formations: An Approach to Improve Energy Recovery and Mitigate Risk: FY1 Final Report. The purpose of phase 1 is to determine the feasibility of integrating geologic CO2 storage (GCS) with geothermal energy production. Phase 1 includes reservoir analyses to determine injector/producer well schemes that balance the generation of economically useful flow rates at the producers with the need to manage reservoir overpressure and so reduce the associated risks, such as induced seismicity and CO2 leakage to overlying aquifers. This submittal contains input and output files of the reservoir model analyses. A reservoir-model "index-html" file was sent in a previous submittal to organize the reservoir-model input and output files according to sections of the FY1 Final Report to which they pertain. The recipient should save the file Reservoir-models-inputs-outputs-index.html in the same directory as the Section2.1.*.tar.gz files.

  20. Developing a radiology-based teaching approach for gross anatomy in the digital era.

    PubMed

    Marker, David R; Bansal, Anshuman K; Juluru, Krishna; Magid, Donna

    2010-08-01

    The purpose of this study was to assess the implementation of a digital anatomy lecture series based largely on annotated, radiographic images and the utility of the Radiological Society of North America-developed Medical Imaging Resource Center (MIRC) for providing an online educational resource. A series of digital teaching images were collected and organized to correspond to lecture and dissection topics. MIRC was used to provide the images in a Web-based educational format for incorporation into anatomy lectures and as a review resource. A survey assessed the impressions of the medical students regarding this educational format. MIRC teaching files were successfully used in our teaching approach. The lectures were interactive with questions to and from the medical student audience regarding the labeled images used in the presentation. Eighty-five of 120 students completed the survey. The majority of students (87%) indicated that the MIRC teaching files were "somewhat useful" to "very useful" when incorporated into the lecture. The students who used the MIRC files were most likely to access the material from home (82%) on an occasional basis (76%). With regard to areas for improvement, 63% of the students reported that they would have benefited from more teaching files, and only 9% of the students indicated that the online files were not user friendly. The combination of electronic radiology resources available in lecture format and on the Internet can provide multiple opportunities for medical students to learn and revisit first-year anatomy. MIRC provides a user-friendly format for presenting radiology education files for medical students. 2010 AUR. Published by Elsevier Inc. All rights reserved.

  1. Parser Combinators: a Practical Application for Generating Parsers for NMR Data

    PubMed Central

    Fenwick, Matthew; Weatherby, Gerard; Ellis, Heidi JC; Gryk, Michael R.

    2013-01-01

    Nuclear Magnetic Resonance (NMR) spectroscopy is a technique for acquiring protein data at atomic resolution and determining the three-dimensional structure of large protein molecules. A typical structure determination process results in the deposition of large data sets to the BMRB (Bio-Magnetic Resonance Data Bank). This data is stored and shared in a file format called NMR-Star. This format is syntactically and semantically complex, making it challenging to parse. Nevertheless, parsing these files is crucial to applying the vast amounts of biological information stored in NMR-Star files, allowing researchers to harness the results of previous studies to direct and validate future work. One powerful approach for parsing files is to apply a Backus-Naur Form (BNF) grammar, which is a high-level model of a file format. Translation of the grammatical model to an executable parser may be automatically accomplished. This paper will show how we applied a model BNF grammar of the NMR-Star format to create a free, open-source parser, using a method that originated in the functional programming world known as “parser combinators”. This paper demonstrates the effectiveness of a principled approach to file specification and parsing. This paper also builds upon our previous work [1], in that 1) it applies concepts from Functional Programming (which is relevant even though the implementation language, Java, is more mainstream than Functional Programming), and 2) all work and accomplishments from this project will be made available under standard open source licenses to provide the community with the opportunity to learn from our techniques and methods. PMID:24352525
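
    To make the parser-combinator idea concrete, here is a minimal sketch in Python. It is not the authors' Java implementation and does not parse real NMR-Star; the toy "_tag value" grammar and the names `regex`, `seq`, `many`, `fmap`, and `parse_pairs` are assumptions chosen for illustration. Each grammar rule becomes a small function, and larger parsers are built by combining smaller ones.

```python
import re

# A parser is a function (text, pos) -> (value, new_pos) on success, or None on failure.


def regex(pattern):
    """Primitive parser: match a regular expression at the current position."""
    compiled = re.compile(pattern)

    def parse(text, pos):
        m = compiled.match(text, pos)
        return (m.group(0), m.end()) if m else None

    return parse


def seq(*parsers):
    """Combinator: run parsers in order; succeed only if every one succeeds."""
    def parse(text, pos):
        values = []
        for p in parsers:
            result = p(text, pos)
            if result is None:
                return None
            value, pos = result
            values.append(value)
        return values, pos

    return parse


def many(parser):
    """Combinator: apply a parser zero or more times, collecting the results."""
    def parse(text, pos):
        values = []
        while True:
            result = parser(text, pos)
            # Stop on failure, or if no input was consumed (avoids looping forever).
            if result is None or result[1] == pos:
                return values, pos
            value, pos = result
            values.append(value)

    return parse


def fmap(func, parser):
    """Combinator: transform the value produced by a successful parse."""
    def parse(text, pos):
        result = parser(text, pos)
        if result is None:
            return None
        value, new_pos = result
        return func(value), new_pos

    return parse


# Toy grammar (hypothetical, much simpler than NMR-Star):
#   pair  ::= ws tag ws value
#   pairs ::= pair*
ws = regex(r"\s*")
tag = regex(r"_[A-Za-z_.]+")
value = regex(r"[^\s_][^\s]*")
pair = fmap(lambda parts: (parts[1], parts[3]), seq(ws, tag, ws, value))
pairs = fmap(dict, many(pair))


def parse_pairs(text):
    """Parse as many leading tag/value pairs as possible into a dict."""
    result = pairs(text, 0)
    return result[0] if result else {}


if __name__ == "__main__":
    print(parse_pairs("_Entry.ID 15000\n_Entry.Title example"))
```

    The grammar comment and the parser definitions line up almost one to one, which is the property that makes this style attractive when a format already has a BNF description.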

  2. VizieR Online Data Catalog: Infrared Arcturus Atlas (Hinkle+ 1995)

    NASA Astrophysics Data System (ADS)

    Hinkle, K.; Wallace, L.; Livingston, W.

    1996-01-01

    The atlas is contained in 310 spectral files, a list of line identifications, plus a file containing a list of the files and unobserved spectral regions. The spectral file names are in the form 'abnnnnn' where 'nnnnn' denotes the spectral region, e.g. file 'ab4300' contains spectra for the 4300-4325 cm-1 range. The atomic and molecular line identifications are in files 'appendix.a' and 'appendix.b', and repeated with a uniform format in file 'lines'. The file 'appendix.c' is a book-keeping device used to correlate the plot pages and spectral files with frequency. See the author-supplied description in 'readme.dat' for more information. (311 data files).

  3. Strategies for Sharing Seismic Data Among Multiple Computer Platforms

    NASA Astrophysics Data System (ADS)

    Baker, L. M.; Fletcher, J. B.

    2001-12-01

    Seismic waveform data is readily available from a variety of sources, but it often comes in a distinct, instrument-specific data format. For example, data may be from portable seismographs, such as those made by Refraction Technology or Kinemetrics, from permanent seismograph arrays, such as the USGS Parkfield Dense Array, from public data centers, such as the IRIS Data Center, or from personal communication with other researchers through e-mail or ftp. A computer must be selected to import the data - usually whichever is the most suitable for reading the originating format. However, the computer best suited for a specific analysis may not be the same. When copies of the data are then made for analysis, a proliferation of copies of the same data results, in possibly incompatible, computer-specific formats. In addition, if an error is detected and corrected in one copy, or some other change is made, all the other copies must be updated to preserve their validity. Keeping track of what data is available, where it is located, and which copy is authoritative requires an effort that is easy to neglect. We solve this problem by importing waveform data to a shared network file server that is accessible to all our computers on our campus LAN. We use a Network Appliance file server running Sun's Network File System (NFS) software. Using an NFS client software package on each analysis computer, waveform data can then be read by our MatLab or Fortran applications without first copying the data. Since there is a single copy of the waveform data in a single location, the NFS file system hierarchy provides an implicit complete waveform data catalog and the single copy is inherently authoritative. 
    Another part of our solution is to convert the original data into a blocked-binary format (known historically as USGS DR100 or VFBB format) that is interpreted by MatLab or Fortran library routines available on each computer so that the idiosyncrasies of each machine are not visible to the user. Commercial software packages, such as MatLab, also have the ability to share data in their own formats across multiple computer platforms. Our Fortran applications can create plot files in Adobe PostScript, Illustrator, and Portable Document Format (PDF) formats. Vendor support for reading these files is readily available on multiple computer platforms. We will illustrate by example our strategies for sharing seismic data among our multiple computer platforms, and we will discuss our positive and negative experiences. We will include our solutions for handling the different byte ordering, floating-point formats, and text file "end-of-line" conventions on the various computer platforms we use (6 different operating systems on 5 processor architectures).
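
    The byte-ordering and end-of-line issues mentioned above can be illustrated with a short Python sketch. It does not reproduce the USGS DR100/VFBB layout or the authors' MatLab and Fortran routines; the record layout (a 32-bit count followed by 32-bit floats) and the helper names are assumptions. The point is only that stating the byte order explicitly makes a binary file read the same way on any machine.

```python
import struct


def pack_samples(samples, byte_order="<"):
    """Encode a count plus 32-bit floats with an explicit byte order.

    byte_order is "<" for little-endian or ">" for big-endian.
    """
    return struct.pack(f"{byte_order}I{len(samples)}f", len(samples), *samples)


def unpack_samples(blob, byte_order="<"):
    """Decode a record written by pack_samples with the same byte order."""
    (count,) = struct.unpack_from(f"{byte_order}I", blob, 0)
    # The sample values start right after the 4-byte count field.
    return list(struct.unpack_from(f"{byte_order}{count}f", blob, 4))


def normalize_newlines(text):
    """Map Windows (CRLF) and classic Mac (CR) line endings to LF."""
    return text.replace("\r\n", "\n").replace("\r", "\n")


if __name__ == "__main__":
    samples = [1.0, -2.5, 0.25]
    big = pack_samples(samples, ">")
    print(unpack_samples(big, ">"))
    print(repr(normalize_newlines("station\r\nchannel\rrate\n")))
```

    Reading the big-endian record with the wrong byte order would not raise an error for the float payload by itself; it would silently yield wrong numbers, which is why recording the byte order alongside the data matters.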

  4. Geologic map of the Valjean Hills 7.5' quadrangle, San Bernardino County, California

    USGS Publications Warehouse

    Calzia, J.P.; Troxel, Bennie W.; digital database by Raumann, Christian G.

    2003-01-01

    FGDC-compliant metadata for the ARC/INFO coverages. The Correlation of Map Units and Description of Map Units is in the editorial format of USGS Geologic Investigations Series (I-series) maps but has not been edited to comply with I-map standards. Within the geologic map data package, map units are identified by standard geologic map criteria such as formation-name, age, and lithology. Even though this is an Open-File Report and includes the standard USGS Open-File disclaimer, the report closely adheres to the stratigraphic nomenclature of the U.S. Geological Survey. Descriptions of units can be obtained by viewing or plotting the .pdf file (3 above) or plotting the postscript file (2 above).

  5. PDB Editor: a user-friendly Java-based Protein Data Bank file editor with a GUI.

    PubMed

    Lee, Jonas; Kim, Sung Hou

    2009-04-01

    The Protein Data Bank file format is the format most widely used by protein crystallographers and biologists to disseminate and manipulate protein structures. Despite this, there are few user-friendly software packages available to efficiently edit and extract raw information from PDB files. This limitation often leads to many protein crystallographers wasting significant time manually editing PDB files. PDB Editor, built with the Java Swing GUI toolkit, allows the user to selectively search, select, extract and edit information in parallel. Furthermore, the program is a stand-alone Java application, which frees users from the hassles associated with platform/operating system-dependent installation and usage. PDB Editor can be downloaded from http://sourceforge.net/projects/pdbeditorjl/.

  6. CSAM: Compressed SAM format.

    PubMed

    Cánovas, Rodrigo; Moffat, Alistair; Turpin, Andrew

    2016-12-15

    Next generation sequencing machines produce vast amounts of genomic data. For the data to be useful, it is essential that it can be stored and manipulated efficiently. This work responds to the combined challenge of compressing genomic data, while providing fast access to regions of interest, without necessitating decompression of whole files. We describe CSAM (Compressed SAM format), a compression approach offering lossless and lossy compression for SAM files. The structures and techniques proposed are suitable for representing SAM files, as well as supporting fast access to the compressed information. They generate more compact lossless representations than BAM, which is currently the preferred lossless compressed SAM-equivalent format; and are self-contained, that is, they do not depend on any external resources to compress or decompress SAM files. An implementation is available at https://github.com/rcanovas/libCSAM Contact: canovas-ba@lirmm.fr Supplementary Information: Supplementary data is available at Bioinformatics online. © The Author 2016. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.

  7. Software for Automated Reading of STEP Files by I-DEAS(trademark)

    NASA Technical Reports Server (NTRS)

    Pinedo, John

    2003-01-01

    A program called "readstep" enables the I-DEAS(tm) computer-aided-design (CAD) software to automatically read Standard for the Exchange of Product Model Data (STEP) files. (The STEP format is one of several used to transfer data between dissimilar CAD programs.) Prior to the development of "readstep," it was necessary to read STEP files into I-DEAS(tm) one at a time in a slow process that required repeated intervention by the user. In operation, "readstep" prompts the user for the location of the desired STEP files and the names of the I-DEAS(tm) project and model file, then generates an I-DEAS(tm) program file called "readstep.prg" and two Unix shell programs called "runner" and "controller." The program "runner" runs I-DEAS(tm) sessions that execute readstep.prg, while "controller" controls the execution of "runner" and edits readstep.prg if necessary. The user sets "runner" and "controller" into execution simultaneously, and then no further intervention by the user is required. When "runner" has finished, the user should see only parts from successfully read STEP files present in the model file. STEP files that could not be read successfully (e.g., because of format errors) should be regenerated before attempting to read them again.

  8. Shuttle Data Center File-Processing Tool in Java

    NASA Technical Reports Server (NTRS)

    Barry, Matthew R.; Miller, Walter H.

    2006-01-01

    A Java-language computer program has been written to facilitate mining of data in files in the Shuttle Data Center (SDC) archives. This program can be executed on a variety of workstations or via Web-browser programs. This program is partly similar to prior C-language programs used for the same purpose, while differing from those programs in that it exploits the platform-neutrality of Java in implementing several features that are important for analysis of large sets of time-series data. The program supports regular expression queries of SDC archive files, reads the files, interleaves the time-stamped samples according to a chosen output format, then transforms the results into that format. A user can choose among a variety of output file formats that are useful for diverse purposes, including plotting, Markov modeling, multivariate density estimation, and wavelet multiresolution analysis, as well as for playback of data in support of simulation and testing.
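
    The two steps named in the abstract, selecting data by regular expression and interleaving time-stamped samples, can be sketched as follows. This is a hedged Python illustration, not the Java program itself: the in-memory `archive` dictionary, the channel names, and the helpers `select_channels` and `interleave` are hypothetical stand-ins for SDC archive files.

```python
import heapq
import re


def select_channels(archive, pattern):
    """Return the channels whose names match a regular expression."""
    compiled = re.compile(pattern)
    return {name: samples for name, samples in archive.items() if compiled.search(name)}


def interleave(channels):
    """Merge per-channel (time, value) lists into one time-ordered stream.

    Each channel's samples must already be sorted by time. The result is a
    list of (time, channel_name, value) tuples.
    """
    streams = [
        [(t, name, v) for t, v in samples]
        for name, samples in sorted(channels.items())
    ]
    # heapq.merge lazily merges already-sorted inputs without re-sorting them.
    return list(heapq.merge(*streams))


if __name__ == "__main__":
    archive = {
        "ENG1_TEMP": [(0.0, 300.0), (2.0, 310.0)],
        "ENG2_TEMP": [(1.0, 305.0)],
        "CABIN_PRESS": [(0.5, 14.7)],
    }
    for row in interleave(select_channels(archive, r"^ENG\d_TEMP$")):
        print(row)
```

    A merge of sorted streams keeps memory use proportional to the number of channels rather than the number of samples when the inputs are iterators, which is one reason it suits large time-series sets.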

  9. Users' Manual and Installation Guide for the EverVIEW Slice and Dice Tool (Version 1.0 Beta)

    USGS Publications Warehouse

    Roszell, Dustin; Conzelmann, Craig; Chimmula, Sumani; Chandrasekaran, Anuradha; Hunnicut, Christina

    2009-01-01

    Network Common Data Form (NetCDF) is a self-describing, machine-independent file format for storing array-oriented scientific data. Over the past few years, there has been a growing movement within the community of natural resource managers in The Everglades, Fla., to use NetCDF as the standard data container for datasets based on multidimensional arrays. As a consequence, a need arose for additional tools to view and manipulate NetCDF datasets, specifically to create subsets of large NetCDF files. To address this need, we created the EverVIEW Slice and Dice Tool to allow users to create subsets of grid-based NetCDF files. The major functions of this tool are (1) to subset NetCDF files both spatially and temporally; (2) to view the NetCDF data in table form; and (3) to export filtered data to a comma-separated value file format.
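
    The three functions listed above (subset in space and time, view as a table, export to comma-separated values) can be sketched without any NetCDF library. The example below is an assumption-laden illustration, not the EverVIEW tool: it treats the grid as nested Python lists indexed as grid[time][lat][lon], and the function name `subset_to_csv` and its parameters are hypothetical.

```python
import csv
import io


def subset_to_csv(times, lats, lons, grid, t_range, lat_range, lon_range):
    """Filter a time/lat/lon grid by inclusive ranges and return CSV text.

    grid[ti][yi][xi] holds the value at times[ti], lats[yi], lons[xi].
    Each *_range argument is an inclusive (low, high) pair.
    """
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["time", "lat", "lon", "value"])
    for ti, t in enumerate(times):
        if not t_range[0] <= t <= t_range[1]:
            continue  # temporal subset
        for yi, lat in enumerate(lats):
            if not lat_range[0] <= lat <= lat_range[1]:
                continue  # spatial subset (latitude)
            for xi, lon in enumerate(lons):
                if lon_range[0] <= lon <= lon_range[1]:
                    writer.writerow([t, lat, lon, grid[ti][yi][xi]])
    return out.getvalue()


if __name__ == "__main__":
    times = [0, 1]
    lats = [25.0, 26.0]
    lons = [-81.0, -80.0]
    grid = [[[1, 2], [3, 4]], [[5, 6], [7, 8]]]
    print(subset_to_csv(times, lats, lons, grid, (1, 1), (26.0, 26.0), (-81.0, -80.0)))
```

    With a real NetCDF file, the coordinate lists and the grid would come from the file's dimension and data variables instead of literals.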

  10. HDF4 Maps: For Now and For the Future

    NASA Astrophysics Data System (ADS)

    Plutchak, J.; Aydt, R.; Folk, M. J.

    2013-12-01

    Data formats and access tools necessarily change as technology improves to address emerging requirements with new capabilities. This on-going process inevitably leaves behind significant data collections in legacy formats that are difficult to support and sustain. NASA ESDIS and The HDF Group currently face this problem with large and growing archives of data in HDF4, an older version of the HDF format. Indefinitely guaranteeing the ability to read these data with multi-platform libraries in many languages is very difficult. As an alternative, HDF and NASA worked together to create maps of the files that contain metadata and information about data types, locations, and sizes of data objects in the files. These maps are written in XML and have successfully been used to access and understand data in HDF4 files without the HDF libraries. While originally developed to support sustainable access to these data, these maps can also be used to provide access to HDF4 metadata, facilitate user understanding of files prior to download, and validate the files for compliance with particular conventions. These capabilities are now available as a service for HDF4 archives and users.
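
    The idea of reading data through an XML map instead of a format library can be sketched as below. The XML layout here is invented for illustration and is not the actual HDF4 map schema; the attribute names (`offset`, `count`, `type`, `byteorder`) and the helpers `build_example` and `read_dataset` are assumptions. The map records where an object's bytes live and how to interpret them, so a reader needs only basic file and byte-decoding facilities.

```python
import struct
import xml.etree.ElementTree as ET

# Hypothetical type names mapped to struct format codes.
TYPE_CODES = {"float64": "d", "float32": "f", "int32": "i"}


def build_example():
    """Return (binary_blob, xml_map) for a tiny made-up file."""
    values = [1.5, 2.5, -4.0]
    header = b"HDR!" * 4                      # 16 bytes of unrelated content
    payload = struct.pack(">3d", *values)     # three big-endian 64-bit floats
    xml_map = (
        '<map><dataset name="temperature" offset="16" count="3" '
        'type="float64" byteorder="big"/></map>'
    )
    return header + payload, xml_map


def read_dataset(blob, xml_map, name):
    """Locate a dataset via the map and decode it straight from the bytes."""
    root = ET.fromstring(xml_map)
    for node in root.iter("dataset"):
        if node.get("name") == name:
            order = ">" if node.get("byteorder") == "big" else "<"
            code = TYPE_CODES[node.get("type")]
            count = int(node.get("count"))
            offset = int(node.get("offset"))
            return list(struct.unpack_from(f"{order}{count}{code}", blob, offset))
    raise KeyError(name)


if __name__ == "__main__":
    blob, xml_map = build_example()
    print(read_dataset(blob, xml_map, "temperature"))
```

    Because the map is plain text, it can also be inspected before download to see what a file contains, which matches the secondary uses the abstract mentions.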

  11. 75 FR 57327 - GNP Rly, Inc.-Acquisition and Operation Exemption-Redmond Spur and Woodinville Subdivision

    Federal Register 2010, 2011, 2012, 2013, 2014

    2010-09-20

    ...). Those NITUs permitted railbanking/interim trail use negotiations under the Trails Act, 16 U.S.C. 1247(d... November 19, 2010. ADDRESSES: Comments may be submitted either via the Board's e-filing format or in the traditional paper format. Any person using e-filing should attach a document and otherwise comply with the...

  12. FastStats: Obstetrical Procedures

    MedlinePlus

    ... Publications and Information Products Surveys and Data Collection Systems Washington Group on Disability Statistics Where to Write for Vital Records File Formats Help: How do I view different file ...

  13. FastStats: Prostate Disease

    MedlinePlus

    ... Publications and Information Products Surveys and Data Collection Systems Washington Group on Disability Statistics Where to Write for Vital Records File Formats Help: How do I view different file ...

  14. HepML, an XML-based format for describing simulated data in high energy physics

    NASA Astrophysics Data System (ADS)

    Belov, S.; Dudko, L.; Kekelidze, D.; Sherstnev, A.

    2010-10-01

    In this paper we describe a HepML format and a corresponding C++ library developed for keeping complete description of parton level events in a unified and flexible form. HepML tags contain enough information to understand what kind of physics the simulated events describe and how the events have been prepared. A HepML block can be included into event files in the LHEF format. The structure of the HepML block is described by means of several XML Schemas. The Schemas define necessary information for the HepML block and how this information should be located within the block. The library libhepml is a C++ library intended for parsing and serialization of HepML tags, and representing the HepML block in computer memory. The library is an API for external software. For example, Matrix Element Monte Carlo event generators can use the library for preparing and writing a header of an LHEF file in the form of HepML tags. In turn, Showering and Hadronization event generators can parse the HepML header and get the information in the form of C++ classes. libhepml can be used in C++, C, and Fortran programs. All necessary parts of HepML have been prepared and we present the project to the HEP community. Program summary. Program title: libhepml. Catalogue identifier: AEGL_v1_0. Program summary URL: http://cpc.cs.qub.ac.uk/summaries/AEGL_v1_0.html. Program obtainable from: CPC Program Library, Queen's University, Belfast, N. Ireland. Licensing provisions: GNU GPLv3. No. of lines in distributed program, including test data, etc.: 138 866. No. of bytes in distributed program, including test data, etc.: 613 122. Distribution format: tar.gz. Programming language: C++, C. Computer: PCs and workstations. Operating system: Scientific Linux CERN 4/5, Ubuntu 9.10. RAM: 1 073 741 824 bytes (1 Gb). Classification: 6.2, 11.1, 11.2. External routines: Xerces XML library (http://xerces.apache.org/xerces-c/), Expat XML Parser (http://expat.sourceforge.net/). Nature of problem: Monte Carlo simulation in high energy physics is divided into several stages. Various programs exist for these stages. In this article we are interested in interfacing different Monte Carlo event generators via data files, in particular, Matrix Element (ME) generators and Showering and Hadronization (SH) generators. There is a widely accepted format for data files for such interfaces - Les Houches Event Format (LHEF). Although information kept in an LHEF file is enough for proper working of SH generators, it is insufficient for understanding how events in the LHEF file have been prepared and which physical model has been applied. In this paper we propose an extension of the format for keeping additional information available in generators. We propose to add a new information block, marked up with XML tags, to the LHEF file. This block describes events in the file in more detail. In particular, it stores information about a physical model, kinematical cuts, generator, etc. This helps to make LHEF files self-documented. Certainly, HepML can be applied in more general context, not in LHEF files only. Solution method: In order to overcome drawbacks of the original LHEF accord we propose to add a new information block of HepML tags. HepML is an XML-based markup language. We designed several XML Schemas for all tags in the language. Any HepML document should follow rules of the Schemas. The language is equipped with a library for operation with HepML tags and documents. This C++ library, called libhepml, consists of classes for HepML objects, which represent a HepML document in computer memory, parsing classes, serialization classes, and some auxiliary classes. Restrictions: The software is adapted for solving the problems described in the article. There are no additional restrictions. Running time: Tests have been done on a computer with Intel(R) Core(TM)2 Solo, 1.4 GHz. Parsing of a HepML file: 6 ms (size of the HepML files is 12.5 Kb); writing of a HepML block to file: 14 ms (file size 12.5 Kb); merging of two HepML blocks and writing to file: 18 ms (file size 25.0 Kb).

  15. BOREAS Forest Cover Data Layers over the SSA-MSA in Raster Format

    NASA Technical Reports Server (NTRS)

    Nickeson, Jaime; Gruszka, F; Hall, F.

    2000-01-01

    This data set, originally provided as vector polygons with attributes, has been processed by BORIS staff to provide raster files that can be used for modeling or for comparison purposes. The original data were received as ARC/INFO coverages or as export files from SERM. The data include information on forest parameters for the BOREAS SSA-MSA. Most of the data used for this product were acquired by BORIS in 1993; the maps were produced from aerial photography taken as recently as 1988. The data are stored in binary, image format files.

  16. Development of Software to Model AXAF-I Image Quality

    NASA Technical Reports Server (NTRS)

    Geary, Joseph; Hawkins, Lamar; Ahmad, Anees; Gong, Qian

    1997-01-01

    This report describes work conducted on Delivery Order 181 from October 1996 through June 1997. During this period software was written to: compute axial PSD's from RDOS AXAF-I mirror surface maps; plot axial surface errors and compute PSD's from HDOS "Big 8" axial scans; plot PSD's from FITS format PSD files; plot band-limited RMS vs axial and azimuthal position for multiple PSD files; combine and organize PSD's from multiple mirror surface measurements formatted as input to GRAZTRACE; modify GRAZTRACE to read FITS formatted PSD files; evaluate AXAF-I test results; improve and expand the capabilities of the GT x-ray mirror analysis package. During this period work began on a more user-friendly manual for the GT program, and improvements were made to the on-line help manual.

  17. UNICON: A Powerful and Easy-to-Use Compound Library Converter.

    PubMed

    Sommer, Kai; Friedrich, Nils-Ole; Bietz, Stefan; Hilbig, Matthias; Inhester, Therese; Rarey, Matthias

    2016-06-27

    The accurate handling of different chemical file formats and the consistent conversion between them play important roles for calculations in complex cheminformatics workflows. Working with different cheminformatic tools often makes the conversion between file formats a mandatory step. Such a conversion might become a difficult task in cases where the information content substantially differs. This paper describes UNICON, an easy-to-use software tool for this task. The functionality of UNICON ranges from file conversion between standard formats SDF, MOL2, SMILES, PDB, and PDBx/mmCIF via the generation of 2D structure coordinates and 3D structures to the enumeration of tautomeric forms, protonation states, and conformer ensembles. For this purpose, UNICON bundles the key elements of the previously described NAOMI library in a single, easy-to-use command line tool.

  18. The Open Microscopy Environment: open image informatics for the biological sciences

    NASA Astrophysics Data System (ADS)

    Blackburn, Colin; Allan, Chris; Besson, Sébastien; Burel, Jean-Marie; Carroll, Mark; Ferguson, Richard K.; Flynn, Helen; Gault, David; Gillen, Kenneth; Leigh, Roger; Leo, Simone; Li, Simon; Lindner, Dominik; Linkert, Melissa; Moore, Josh; Moore, William J.; Ramalingam, Balaji; Rozbicki, Emil; Rustici, Gabriella; Tarkowska, Aleksandra; Walczysko, Petr; Williams, Eleanor; Swedlow, Jason R.

    2016-07-01

    Despite significant advances in biological imaging and analysis, major informatics challenges remain unsolved: file formats are proprietary, storage and analysis facilities are lacking, as are standards for sharing image data and results. While the open FITS file format is ubiquitous in astronomy, astronomical imaging shares many challenges with biological imaging, including the need to share large image sets using secure, cross-platform APIs, and the need for scalable applications for processing and visualization. The Open Microscopy Environment (OME) is an open-source software framework developed to address these challenges. OME tools include: an open data model for multidimensional imaging (OME Data Model); an open file format (OME-TIFF) and library (Bio-Formats) enabling free access to images (5D+) written in more than 145 formats from many imaging domains, including FITS; and a data management server (OMERO). The Java-based OMERO client-server platform comprises an image metadata store, an image repository, visualization and analysis by remote access, allowing sharing and publishing of image data. OMERO provides a means to manage the data through a multi-platform API. OMERO's model-based architecture has enabled its extension into a range of imaging domains, including light and electron microscopy, high content screening, digital pathology and recently into applications using non-image data from clinical and genomic studies. This is made possible using the Bio-Formats library. The current release includes a single mechanism for accessing image data of all types, regardless of original file format, via Java, C/C++ and Python and a variety of applications and environments (e.g. ImageJ, Matlab and R).

  19. Java Library for Input and Output of Image Data and Metadata

    NASA Technical Reports Server (NTRS)

    Deen, Robert; Levoe, Steven

    2003-01-01

    A Java-language library supports input and output (I/O) of image data and metadata (label data) in the format of the Video Image Communication and Retrieval (VICAR) image-processing software and in several similar formats, including a subset of the Planetary Data System (PDS) image file format. The library does the following: It provides a low-level, direct-access layer, enabling an application subprogram to read and write specific image files, lines, or pixels, and manipulate metadata directly. Two coding/decoding subprograms ("codecs" for short) based on the Java Advanced Imaging (JAI) software provide access to VICAR and PDS images in a file-format-independent manner. The VICAR and PDS codecs enable any program that conforms to the specification of the JAI codec to use VICAR or PDS images automatically, without specific knowledge of the VICAR or PDS format. The library also includes Image I/O plugin subprograms for VICAR and PDS formats. Application programs that conform to the Image I/O specification of Java version 1.4 can utilize any image format for which such a plug-in subprogram exists, without specific knowledge of the format itself. Like the aforementioned codecs, the VICAR and PDS Image I/O plug-in subprograms support reading and writing of metadata.

  20. Cambio : a file format translation and analysis application for the nuclear response emergency community.

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Lasche, George P.

    2009-10-01

    Cambio is an application intended to automatically read and display any spectrum file of any format in the world that the nuclear emergency response community might encounter. Cambio also provides an analysis capability suitable for HPGe spectra when detector response and scattering environment are not well known. Why is Cambio needed: (1) Cambio solves the following problem - With over 50 types of formats from instruments used in the field and new format variations appearing frequently, it is impractical for every responder to have current versions of the manufacturer's software from every instrument used in the field; (2) Cambio converts field spectra to any one of several common formats that are used for analysis, saving valuable time in an emergency situation; (3) Cambio provides basic tools for comparing spectra, calibrating spectra, and isotope identification with analysis suited especially for HPGe spectra; and (4) Cambio has a batch processing capability to automatically translate a large number of archival spectral files of any format to one of several common formats, such as the IAEA SPE or the DHS N42. Currently over 540 analysts and members of the nuclear emergency response community worldwide are on the distribution list for updates to Cambio. Cambio users come from all levels of government, university, and commercial partners around the world that support efforts to counter terrorist nuclear activities. Cambio is Unclassified Unlimited Release (UUR) and distributed by internet downloads with email notifications whenever a new build of Cambio provides for new formats, bug fixes, or new or improved capabilities. Cambio is also provided as a DLL to the Karlsruhe Institute for Transuranium Elements so that Cambio's automatic file-reading capability can be included at the Nucleonica web site.

  1. 75 FR 41093 - FM Table of Allotments, Maupin, Oregon

    Federal Register 2010, 2011, 2012, 2013, 2014

    2010-07-15

    .... SUMMARY: The Audio Division grants the Petition for Reconsideration filed on behalf of Maupin Broadcasting... materials in accessible formats for people with disabilities (Braille, large print, electronic files, audio.... John A. Karousos, Assistant Chief, Audio Division, Media Bureau. [FR Doc. 2010-17226 Filed 7-14-10; 8...

  2. Quantitative Microbial Risk Assessment Tutorial: Publishing a Microbial Density Time Series as a Txt File

    EPA Science Inventory

    A SARA Timeseries Utility supports analysis and management of time-varying environmental data including listing, graphing, computing statistics, computing meteorological data and saving in a WDM or text file. File formats supported include WDM, HSPF Binary (.hbn), USGS RDB, and T...

  3. IDG - INTERACTIVE DIF GENERATOR

    NASA Technical Reports Server (NTRS)

    Preheim, L. E.

    1994-01-01

    The Interactive DIF Generator (IDG) utility is a tool used to generate and manipulate Directory Interchange Format files (DIF). Its purpose as a specialized text editor is to create and update DIF files which can be sent to NASA's Master Directory, also referred to as the International Global Change Directory at Goddard. Many government and university data systems use the Master Directory to advertise the availability of research data. The IDG interface consists of a set of four windows: (1) the IDG main window; (2) a text editing window; (3) a text formatting and validation window; and (4) a file viewing window. The IDG main window starts up the other windows and contains a list of valid keywords. The keywords are loaded from a user-designated file and selected keywords can be copied into any active editing window. Once activated, the editing window designates the file to be edited. Upon switching from the editing window to the formatting and validation window, the user has options for making simple changes to one or more files such as inserting tabs, aligning fields, and indenting groups. The viewing window is a scrollable read-only window that allows fast viewing of any text file. IDG is an interactive tool and requires a mouse or a trackball to operate. IDG uses the X Window System to build and manage its interactive forms, and also uses the Motif widget set and runs under Sun UNIX. IDG is written in C-language for Sun computers running SunOS. This package requires the X Window System, Version 11 Revision 4, with OSF/Motif 1.1. IDG requires 1.8Mb of hard disk space. The standard distribution medium for IDG is a .25 inch streaming magnetic tape cartridge in UNIX tar format. It is also available on a 3.5 inch diskette in UNIX tar format. The program was developed in 1991 and is a copyrighted work with all copyright vested in NASA. SunOS is a trademark of Sun Microsystems, Inc. X Window System is a trademark of Massachusetts Institute of Technology. OSF/Motif is a trademark of the Open Software Foundation, Inc. UNIX is a trademark of Bell Laboratories.

  4. 75 FR 35700 - Revisions to Forms, Statements, and Reporting Requirements for Natural Gas Pipelines

    Federal Register 2010, 2011, 2012, 2013, 2014

    2010-06-23

    ... filed in native applications or print-to-PDF format and not in a scanned format. Mail/Hand Delivery... also propose to revise page 520 accordingly. \\1\\ American Gas Association v. FERC, 593 F.3d 14 (D.C....\\14\\ \\14\\ 593 F.3d at 21. 8. Following the court's remand, AGA filed a motion requesting that the...

  5. Cambio

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Johnson, William

    2015-10-19

    Cambio opens data files from common gamma radiation detectors, displays a visual representation of it, and allows the user to edit the meta-data, as well as convert the data to a different file format.

  6. Preliminary surficial geologic map database of the Amboy 30 x 60 minute quadrangle, California

    USGS Publications Warehouse

    Bedford, David R.; Miller, David M.; Phelps, Geoffrey A.

    2006-01-01

    The surficial geologic map database of the Amboy 30x60 minute quadrangle presents characteristics of surficial materials for an area approximately 5,000 km² in the eastern Mojave Desert of California. This map consists of new surficial mapping conducted between 2000 and 2005, as well as compilations of previous surficial mapping. Surficial geology units are mapped and described based on depositional process and age categories that reflect the mode of deposition, pedogenic effects occurring post-deposition, and, where appropriate, the lithologic nature of the material. The physical properties recorded in the database focus on those that drive hydrologic, biologic, and physical processes such as particle size distribution (PSD) and bulk density. This version of the database is distributed with point data representing locations of samples for both laboratory determined physical properties and semi-quantitative field-based information. Future publications will include the field and laboratory data as well as maps of distributed physical properties across the landscape tied to physical process models where appropriate. The database is distributed in three parts: documentation, spatial map-based data, and printable map graphics of the database. Documentation includes this file, which provides a discussion of the surficial geology and describes the format and content of the map data, a database 'readme' file, which describes the database contents, and FGDC metadata for the spatial map information. Spatial data are distributed as Arc/Info coverage in ESRI interchange (e00) format, or as tabular data in the form of DBF3-file (.DBF) file formats. Map graphics files are distributed as Postscript and Adobe Portable Document Format (PDF) files, and are appropriate for representing a view of the spatial database at the mapped scale.

  7. Rosetta: Ensuring the Preservation and Usability of ASCII-based Data into the Future

    NASA Astrophysics Data System (ADS)

    Ramamurthy, M. K.; Arms, S. C.

    2015-12-01

    Field data obtained from dataloggers often take the form of comma separated value (CSV) ASCII text files. While ASCII based data formats have positive aspects, such as the ease of accessing the data from disk and the wide variety of tools available for data analysis, there are some drawbacks, especially when viewing the situation through the lens of data interoperability and stewardship. The Unidata data translation tool, Rosetta, is a web-based service that provides an easy, wizard-based interface for data collectors to transform their datalogger generated ASCII output into Climate and Forecast (CF) compliant netCDF files following the CF-1.6 discrete sampling geometries. These files are complete with metadata describing what data are contained in the file, the instruments used to collect the data, and other critical information that otherwise may be lost in one of many README files. The choice of the machine readable netCDF data format and data model, coupled with the CF conventions, ensures long-term preservation and interoperability, and that future users will have enough information to responsibly use the data. However, with the understanding that the observational community appreciates the ease of use of ASCII files, methods for transforming the netCDF back into a CSV or spreadsheet format are also built-in. One benefit of translating ASCII data into a machine readable format that follows open community-driven standards is that they are instantly able to take advantage of data services provided by the many open-source data server tools, such as the THREDDS Data Server (TDS). While Rosetta is currently a stand-alone service, this talk will also highlight efforts to couple Rosetta with the TDS, thus allowing self-publishing of thoroughly documented datasets by the data producers themselves.

  8. imzML: Imaging Mass Spectrometry Markup Language: A common data format for mass spectrometry imaging.

    PubMed

    Römpp, Andreas; Schramm, Thorsten; Hester, Alfons; Klinkert, Ivo; Both, Jean-Pierre; Heeren, Ron M A; Stöckli, Markus; Spengler, Bernhard

    2011-01-01

    Imaging mass spectrometry is the method of scanning a sample of interest and generating an "image" of the intensity distribution of a specific analyte. The data sets consist of a large number of mass spectra which are usually acquired with identical settings. Existing data formats are not sufficient to describe an MS imaging experiment completely. The data format imzML was developed to allow the flexible and efficient exchange of MS imaging data between different instruments and data analysis software. For this purpose, the MS imaging data is divided into two separate files. The mass spectral data is stored in a binary file to ensure efficient storage. All metadata (e.g., instrumental parameters, sample details) are stored in an XML file which is based on the standard data format mzML developed by HUPO-PSI. The original mzML controlled vocabulary was extended to include specific parameters of imaging mass spectrometry (such as x/y position and spatial resolution). The two files (XML and binary) are connected by offset values in the XML file and are unambiguously linked by a universally unique identifier. The resulting datasets are comparable in size to the raw data and the separate metadata file allows flexible handling of large datasets. Several imaging MS software tools already support imzML. This allows choosing from a (growing) number of processing tools. One is no longer limited to proprietary software, but is able to use the processing software which is best suited for a specific question or application. On the other hand, measurements from different instruments can be compared within one software application using identical settings for data processing. All necessary information for evaluating and implementing imzML can be found at http://www.imzML.org .
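
    The two-file design (metadata holding offsets, a binary file holding the spectra, and a shared identifier tying them together) can be sketched in a few lines. This is an illustration of the principle rather than a conforming imzML reader: it assumes the binary data begin with a 16-byte UUID and store little-endian 32-bit floats, and the (offset, count) index stands in for the values that would be kept in the XML file. The function names are hypothetical.

```python
import struct
import uuid


def build_binary(file_id, spectra):
    """Pack spectra as little-endian float32 after a 16-byte UUID.

    Returns (blob, index) where index holds one (byte offset, value count)
    pair per spectrum, i.e. what the metadata file would record.
    """
    blob = bytearray(file_id.bytes)
    index = []
    for values in spectra:
        index.append((len(blob), len(values)))
        blob += struct.pack(f"<{len(values)}f", *values)
    return bytes(blob), index


def read_spectrum(blob, expected_id, offset, count):
    """Check that the binary data match the metadata's UUID, then decode."""
    if uuid.UUID(bytes=blob[:16]) != expected_id:
        raise ValueError("binary file does not belong to this metadata file")
    return list(struct.unpack_from(f"<{count}f", blob, offset))


file_id = uuid.UUID("12345678-1234-5678-1234-567812345678")
blob, index = build_binary(file_id, [[1.5, 2.25], [10.0, 0.5, 4.0]])
print(index)
print(read_spectrum(blob, file_id, *index[1]))
```

    Checking the identifier before decoding guards against pairing a metadata file with the wrong binary file, which offsets alone cannot detect.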

  9. Genotype harmonizer: automatic strand alignment and format conversion for genotype data integration.

    PubMed

    Deelen, Patrick; Bonder, Marc Jan; van der Velde, K Joeri; Westra, Harm-Jan; Winder, Erwin; Hendriksen, Dennis; Franke, Lude; Swertz, Morris A

    2014-12-11

    To gain statistical power or to allow fine mapping, researchers typically want to pool data before meta-analyses or genotype imputation. However, the necessary harmonization of genetic datasets is currently error-prone because of many different file formats and lack of clarity about which genomic strand is used as reference. Genotype Harmonizer (GH) is a command-line tool to harmonize genetic datasets by automatically solving issues concerning genomic strand and file format. GH solves the unknown strand issue by aligning ambiguous A/T and G/C SNPs to a specified reference, using linkage disequilibrium patterns without prior knowledge of the used strands. GH supports many common GWAS/NGS genotype formats including PLINK, binary PLINK, VCF, SHAPEIT2 & Oxford GEN. GH is implemented in Java and a large part of the functionality can also be used as Java 'Genotype-IO' API. All software is open source under license LGPLv3 and available from http://www.molgenis.org/systemsgenetics. GH can be used to harmonize genetic datasets across different file formats and can be easily integrated as a step in routine meta-analysis and imputation pipelines.
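
    Why A/T and G/C SNPs are singled out as ambiguous can be shown with a short sketch: for those pairs the complement of the allele pair is the same pair, so comparing alleles cannot reveal the strand. The code below is not Genotype Harmonizer's implementation and performs no linkage disequilibrium analysis; the function names are hypothetical.

```python
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def is_strand_ambiguous(a1, a2):
    """True for A/T and G/C SNPs, whose complement is the same allele pair."""
    return COMPLEMENT[a1] == a2


def align_to_reference(alleles, ref_alleles):
    """Return the alleles on the reference strand.

    Returns None when the SNP is strand-ambiguous, since the alleles alone
    cannot settle the strand and extra evidence (such as LD patterns) is needed.
    """
    if is_strand_ambiguous(*alleles):
        return None
    if set(alleles) == set(ref_alleles):
        return tuple(alleles)
    flipped = tuple(COMPLEMENT[a] for a in alleles)
    if set(flipped) == set(ref_alleles):
        return flipped
    raise ValueError("alleles do not match the reference on either strand")


print(align_to_reference(("A", "G"), ("T", "C")))  # flipped to the reference strand
print(align_to_reference(("A", "T"), ("A", "T")))  # ambiguous
```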

  10. BOREAS Elevation Contours over the NSA and SSA in ARC/INFO Generate Format

    NASA Technical Reports Server (NTRS)

    Knapp, David; Nickeson, Jaime; Hall, Forrest G. (Editor)

    2000-01-01

    This data set was prepared by BORIS Staff by reformatting the original data into the ARC/INFO Generate format. The original data were received in SIF at a scale of 1:50,000. BORIS staff could not find a format document or commercial software for reading SIF; the BOREAS HYD-08 team provided some C source code that could read some of the SIF files. The data cover the BOREAS NSA and SSA. The original data were compiled from information available in the 1970s and 1980s. The data are available in ARC/INFO Generate format files.

  11. Pre-Launch Algorithm and Data Format for the Level 1 Calibration Products for the EOS AM-1 Moderate Resolution Imaging Spectroradiometer (MODIS)

    NASA Technical Reports Server (NTRS)

    Guenther, Bruce W.; Godden, Gerald D.; Xiong, Xiao-Xiong; Knight, Edward J.; Qiu, Shi-Yue; Montgomery, Harry; Hopkins, M. M.; Khayat, Mohammad G.; Hao, Zhi-Dong; Smith, David E. (Technical Monitor)

    2000-01-01

    The Moderate Resolution Imaging Spectroradiometer (MODIS) radiometric calibration product is described for the thermal emissive and the reflective solar bands. Specific sensor design characteristics are identified to assist in understanding how the calibration algorithm software product is designed. The reflected solar band software products of radiance and reflectance factor both are described. The product file format is summarized and the MODIS Characterization Support Team (MCST) Homepage location for the current file format is provided.

  12. VizieR Online Data Catalog: Sgr B2(N) and Sgr B2(M) IRAM 30m line survey (Belloche+, 2013)

    NASA Astrophysics Data System (ADS)

    Belloche, A.; Mueller, H. S. P.; Menten, K. M.; Schilke, P.; Comito, C.

    2013-08-01

    The list of line identifications corresponding to the blue labels in Figs. 2 to 7 where the labels are often too crowded to be easily readable are available in ASCII format. The lists are split into six files, three for Sgr B2(N) and three for Sgr B2(M). For each source, there is one file per atmospheric window (3, 2, and 1mm). Each file is ordered by increasing frequency. The observed and synthetic spectra of Sgr B2(N) and Sgr B2(M) between 80 and 116GHz are available both in ASCII and FITS formats. The synthetic spectra were resampled to the same frequency channels as the observed spectra. The blanking value is -1000K for the ASCII files. There is one ASCII file per source. There are two FITS files per source, one for the observed spectrum and one for the synthetic spectrum. The intensities are in main-beam temperature scale in K. The blanking value is 42.75234K for the observed spectrum of SgrB2(N) and 53.96533K for the observed spectrum of SgrB2(M). (9 data files).
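
    A reader of the ASCII spectra has to treat the stated blanking value of -1000 K as missing data instead of a real intensity. The sketch below shows one way to do that; the two-column "frequency intensity" layout, the `#` comment convention, and the sample numbers are assumptions made for illustration, not the catalogue's documented file layout.

```python
BLANK_K = -1000.0  # blanking value stated for the ASCII files


def read_spectrum(text, blank=BLANK_K):
    """Parse 'frequency intensity' lines; blanked channels become None."""
    channels = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip empty lines and (assumed) comment lines
        freq, temp = (float(tok) for tok in line.split()[:2])
        channels.append((freq, None if temp == blank else temp))
    return channels


# Made-up sample in the assumed layout: frequency in MHz, T_mb in K.
sample = """# freq_MHz  Tmb_K
80000.0  0.12
80000.3  -1000
80000.6  0.35
"""
for channel in read_spectrum(sample):
    print(channel)
```

    The FITS files use different blanking values (42.75234 K and 53.96533 K for the two observed spectra), so the `blank` argument would need to be set per file.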

  13. OpenMSI: A High-Performance Web-Based Platform for Mass Spectrometry Imaging

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Rubel, Oliver; Greiner, Annette; Cholia, Shreyas

    Mass spectrometry imaging (MSI) enables researchers to probe endogenous molecules directly within the architecture of the biological matrix. Unfortunately, efficient access, management, and analysis of the data generated by MSI approaches remain major challenges to this rapidly developing field. Despite the availability of numerous dedicated file formats and software packages, it is a widely held viewpoint that the biggest challenge is simply opening, sharing, and analyzing a file without loss of information. Here we present OpenMSI, a software framework and platform that addresses these challenges via an advanced, high-performance, extensible file format and Web API for remote data access (http://openmsi.nersc.gov). The OpenMSI file format supports storage of raw MSI data, metadata, and derived analyses in a single, self-describing format based on HDF5 and is supported by a large range of analysis software (e.g., Matlab and R) and programming languages (e.g., C++, Fortran, and Python). Careful optimization of the storage layout of MSI data sets using chunking, compression, and data replication accelerates common, selective data access operations while minimizing data storage requirements and is a critical enabler of rapid data I/O. The OpenMSI file format has been shown to provide >2000-fold improvement for image access operations, enabling spectrum and image retrieval in less than 0.3 s across the Internet even for 50 GB MSI data sets. To make remote high-performance compute resources accessible for analysis and to facilitate data sharing and collaboration, we describe an easy-to-use yet powerful Web API, enabling fast and convenient access to MSI data, metadata, and derived analysis results stored remotely to facilitate high-performance data analysis and enable implementation of Web based data sharing, visualization, and analysis.

  14. Use of Schema on Read in Earth Science Data Archives

    NASA Technical Reports Server (NTRS)

    Hegde, Mahabaleshwara; Smit, Christine; Pilone, Paul; Petrenko, Maksym; Pham, Long

    2017-01-01

    Traditionally, NASA Earth Science data archives have file-based storage using proprietary data file formats, such as HDF and HDF-EOS, which are optimized to support fast and efficient storage of spaceborne and model data as they are generated. The use of file-based storage essentially imposes an indexing strategy based on data dimensions. In most cases, NASA Earth Science data uses time as the primary index, leading to poor performance in accessing data in spatial dimensions. For example, producing a time series for a single spatial grid cell involves accessing a large number of data files. With exponential growth in data volume due to the ever-increasing spatial and temporal resolution of the data, using file-based archives poses significant performance and cost barriers to data discovery and access. Storing and disseminating data in proprietary data formats imposes an additional access barrier for users outside the mainstream research community. At the NASA Goddard Earth Sciences Data Information Services Center (GES DISC), we have evaluated applying the schema-on-read principle to data access and distribution. We used Apache Parquet to store geospatial data, and have exposed data through Amazon Web Services (AWS) Athena, AWS Simple Storage Service (S3), and Apache Spark. Using the schema-on-read approach allows customization of indexing spatially or temporally to suit the data access pattern. The storage of data in open formats such as Apache Parquet has widespread support in popular programming languages. A wide range of solutions for handling big data lowers the access barrier for all users. This presentation will discuss formats used for data storage, frameworks with support for schema-on-read used for data access, and common use cases covering data usage patterns seen in a geospatial data archive.
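
    The indexing problem described above (a time-indexed archive forces a single-cell time series to touch every file) can be illustrated with plain Python containers. The sketch uses no Parquet, Athena, or Spark; the dictionary of grids stands in for one file per time step, and the function names and sample values are hypothetical.

```python
from collections import defaultdict

# One "file" per time step, each holding a full 2x2 grid (time-indexed archive).
files = {
    "2017-01-01": [[1.0, 2.0], [3.0, 4.0]],
    "2017-01-02": [[1.5, 2.5], [3.5, 4.5]],
    "2017-01-03": [[2.0, 3.0], [4.0, 5.0]],
}


def series_from_files(files, row, col):
    """Time series for one cell; returns (series, number of files read)."""
    series = [(t, grid[row][col]) for t, grid in sorted(files.items())]
    return series, len(files)


def reindex_by_cell(files):
    """Reorganize the same values so that each cell's series is stored together."""
    by_cell = defaultdict(list)
    for t, grid in sorted(files.items()):
        for r, row in enumerate(grid):
            for c, value in enumerate(row):
                by_cell[(r, c)].append((t, value))
    return by_cell


series, files_read = series_from_files(files, 1, 0)
print(files_read, series)            # every file is touched for one cell
print(reindex_by_cell(files)[(1, 0)])  # same series from a single lookup
```

    Choosing the index to match the dominant access pattern is the point being made; the same data can be laid out by time, by cell, or both.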

  15. A malware detection scheme based on mining format information.

    PubMed

    Bai, Jinrong; Wang, Junfeng; Zou, Guozhong

    2014-01-01

    Malware has become one of the most serious threats to computer information system and the current malware detection technology still has very significant limitations. In this paper, we proposed a malware detection approach by mining format information of PE (portable executable) files. Based on in-depth analysis of the static format information of the PE files, we extracted 197 features from format information of PE files and applied feature selection methods to reduce the dimensionality of the features and achieve acceptable high performance. When the selected features were trained using classification algorithms, the results of our experiments indicate that the accuracy of the top classification algorithm is 99.1% and the value of the AUC is 0.998. We designed three experiments to evaluate the performance of our detection scheme and the ability of detecting unknown and new malware. Although the experimental results of identifying new malware are not perfect, our method is still able to identify 97.6% of new malware with 1.3% false positive rates.

  16. A Malware Detection Scheme Based on Mining Format Information

    PubMed Central

    Bai, Jinrong; Wang, Junfeng; Zou, Guozhong

    2014-01-01

    Malware has become one of the most serious threats to computer information system and the current malware detection technology still has very significant limitations. In this paper, we proposed a malware detection approach by mining format information of PE (portable executable) files. Based on in-depth analysis of the static format information of the PE files, we extracted 197 features from format information of PE files and applied feature selection methods to reduce the dimensionality of the features and achieve acceptable high performance. When the selected features were trained using classification algorithms, the results of our experiments indicate that the accuracy of the top classification algorithm is 99.1% and the value of the AUC is 0.998. We designed three experiments to evaluate the performance of our detection scheme and the ability of detecting unknown and new malware. Although the experimental results of identifying new malware are not perfect, our method is still able to identify 97.6% of new malware with 1.3% false positive rates. PMID:24991639

  17. Revised Subsurface Stratigraphic Framework of the Fort Union and Wasatch Formations, Powder River Basin, Wyoming and Montana

    USGS Publications Warehouse

    Flores, Romeo M.; Spear, Brianne D.; Purchase, Peter A.; Gallagher, Craig M.

    2010-01-01

    Described in this report is an updated subsurface stratigraphic framework of the Paleocene Fort Union Formation and Eocene Wasatch Formation in the Powder River Basin (PRB) in Wyoming and Montana. This framework is graphically presented in 17 intersecting west-east and north-south cross sections across the basin. Also included are: (1) the dataset and all associated digital files and (2) digital files for all figures and table 1 suitable for large-format printing. The purpose of this U.S. Geological Survey (USGS) Open-File Report is to provide rapid dissemination and accessibility of the stratigraphic cross sections and related digital data to USGS customers, especially the U.S. Bureau of Land Management (BLM), to facilitate their modeling of the hydrostratigraphy of the PRB. This report contains a brief summary of the coal-bed correlations and database, and is part of a larger ongoing study that will be available in the near future.

  18. Development of an e-VLBI Data Transport Software Suite with VDIF

    NASA Technical Reports Server (NTRS)

    Sekido, Mamoru; Takefuji, Kazuhiro; Kimura, Moritaka; Hobiger, Thomas; Kokado, Kensuke; Nozawa, Kentarou; Kurihara, Shinobu; Shinno, Takuya; Takahashi, Fujinobu

    2010-01-01

    We have developed a software library (KVTP-lib) for VLBI data transmission over the network with the VDIF (VLBI Data Interchange Format), which is the newly proposed standard VLBI data format designed for electronic data transfer over the network. The software package keeps the application layer (VDIF frame) and the transmission layer separate, so that each layer can be developed efficiently. The real-time VLBI data transmission tool sudp-send is an application tool based on the KVTP-lib library. sudp-send captures the VLBI data stream from the VSI-H interface with the K5/VSI PC-board and writes the data to file in standard Linux file format or transmits it to the network using the simple-UDP (SUDP) protocol. Another tool, sudp-recv, receives the data stream from the network and writes the data to file in a specific VLBI format (K5/VSSP, VDIF, or Mark 5B). This software system has been implemented on the Wettzell Tsukuba baseline; evaluation before operational employment is under way.

  19. 75 FR 19339 - FM Table of Allotments, Amboy, California

    Federal Register 2010, 2011, 2012, 2013, 2014

    2010-04-14

    .... SUMMARY: The Audio Division seeks comments on a petition filed by Sunnylands Broadcasting, LLC, proposing... disabilities (Braille, large print, electronic files, audio format), send an e-mail to [email protected] or call... Chief, Audio Division, Media Bureau. [FR Doc. 2010-8449 Filed 4-13-10; 8:45 am] BILLING CODE 6712-01-S ...

  20. 14 CFR 221.121 - How to prepare and file applications for Special Tariff Permission.

    Code of Federal Regulations, 2010 CFR

    2010-01-01

    ..., DEPARTMENT OF TRANSPORTATION (AVIATION PROCEEDINGS) ECONOMIC REGULATIONS TARIFFS Special Tariff Permission To... notice shall conform to the requirements of § 221.212 if filed electronically. (b) Number of paper copies and place of filing. For paper format applications, the original and one copy of each such application...

  1. Biological Investigations of Adaptive Networks: Neuronal Control of Conditioned Responses

    DTIC Science & Technology

    1989-07-01

    The program also controls A/D sampling of voltage trace from NMR transducer and disk files for NMR, neural spikes, and synchronization. * HSAD . Basic...format which ANALYZE (by John Desmond) can read. e FIG.HIRES Reads C-64 HSAD files and EVENT NMR files and generates oscilloscope-like figures showing

  2. 77 FR 6625 - Railroad Cost of Capital-2011

    Federal Register 2010, 2011, 2012, 2013, 2014

    2012-02-08

    ... railroads are due by May 9, 2012. ADDRESSES: Comments may be submitted either via the Board's e-filing system or in the traditional paper format. Any person using e-filing should comply with the instructions at the E-FILING link on the Board's Web site, at http://www.stb.dot.gov . Any person submitting a...

  3. Students' Attitudes to and Usage of Academic Feedback Provided via Audio Files

    ERIC Educational Resources Information Center

    Merry, Stephen; Orsmond, Paul

    2008-01-01

    This study explores students' attitudes to the provision of formative feedback on academic work using audio files together with the ways in which students implement such feedback within their learning. Fifteen students received audio file feedback on written work and were subsequently interviewed regarding their utilisation of that feedback within…

  4. New Powder Diffraction File (PDF-4) in relational database format: advantages and data-mining capabilities.

    PubMed

    Kabekkodu, Soorya N; Faber, John; Fawcett, Tim

    2002-06-01

    The International Centre for Diffraction Data (ICDD) is responding to the changing needs in powder diffraction and materials analysis by developing the Powder Diffraction File (PDF) in a very flexible relational database (RDB) format. The PDF now contains 136,895 powder diffraction patterns. In this paper, an attempt is made to give an overview of the PDF-4, search/match methods and the advantages of having the PDF-4 in RDB format. Some case studies have been carried out to search for crystallization trends, properties, frequencies of space groups and prototype structures. These studies give a good understanding of the basic structural aspects of classes of compounds present in the database. The present paper also reports data-mining techniques and demonstrates the power of a relational database over the traditional (flat-file) database structures.

  5. Is HDF5 a Good Format to Replace UVFITS?

    NASA Astrophysics Data System (ADS)

    Price, D. C.; Barsdell, B. R.; Greenhill, L. J.

    2015-09-01

    The FITS (Flexible Image Transport System) data format was developed in the late 1970s for storage and exchange of astronomy-related image data. Since then, it has become a standard file format not only for images, but also for radio interferometer data (e.g. UVFITS, FITS-IDI). But is FITS the right format for next-generation telescopes to adopt? The newer Hierarchical Data Format (HDF5) file format offers considerable advantages over FITS, but has yet to gain widespread adoption within radio astronomy. One of the major holdbacks is that HDF5 is not well supported by data reduction software packages. Here, we present a comparison of FITS, HDF5, and the MeasurementSet (MS) format for storage of interferometric data. In addition, we present a tool for converting between formats. We show that the underlying data model of FITS can be ported to HDF5, a first step toward achieving wider HDF5 support.

  6. BOREAS TE-20 Soils Data Over the NSA-MSA and Tower Sites in Raster Format

    NASA Technical Reports Server (NTRS)

    Hall, Forrest G. (Editor); Veldhuis, Hugo; Knapp, David; Veldhuis, Hugo

    2000-01-01

    The BOREAS TE-20 team collected several data sets for use in developing and testing models of forest ecosystem dynamics. This data set was gridded from vector layers of soil maps that were received from Dr. Hugo Veldhuis, who did the original mapping in the field during 1994. The vector layers were gridded into raster files that cover the NSA-MSA and tower sites. The data are stored in binary, image format files. The data files are available on a CD-ROM (see document number 20010000884), or from the Oak Ridge National Laboratory (ORNL) Distributed Active Archive Center (DAAC).

  7. Covariance Data File Formats for Whisper-1.0 & Whisper-1.1

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Brown, Forrest B.; Rising, Michael Evan

    2017-01-09

    Whisper is a statistical analysis package developed in 2014 to support nuclear criticality safety (NCS) validation. It uses the sensitivity profile data for an application as computed by MCNP6 along with covariance files for the nuclear data to determine a baseline upper-subcritical-limit (USL) for the application. Whisper version 1.0 was first developed and used at LANL in 2014. During 2015-2016, Whisper was updated to version 1.1 and is to be included with the upcoming release of MCNP6.2. This report describes the file formats used for the covariance data in both Whisper-1.0 and Whisper-1.1.

  8. Tool for Merging Proposals Into DSN Schedules

    NASA Technical Reports Server (NTRS)

    Khanampornpan, Teerapat; Kwok, John; Call, Jared

    2008-01-01

    A Practical Extraction and Reporting Language (Perl) script called merge7da has been developed to facilitate determination, by a project scheduler in NASA's Deep Space Network, of whether a proposal for use of the DSN could create a conflict with the current DSN schedule. Prior to the development of merge7da, there was no way to quickly identify potential schedule conflicts: it was necessary to submit a proposal and wait a day or two for a response from a DSN scheduling facility. By using merge7da to detect and eliminate potential schedule conflicts before submitting a proposal, a project scheduler saves time and gains assurance that the proposal will probably be accepted. merge7da accepts two input files, one of which contains the current DSN schedule and is in a DSN-standard format called '7da'. The other input file contains the proposal and is in another DSN-standard format called 'C1/C2'. merge7da processes the two input files to produce a merged 7da-format output file that represents the DSN schedule as it would be if the proposal were to be adopted. This 7da output file can be loaded into various DSN scheduling software tools now in use.
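    The conflict check and merge described in the record above can be illustrated with a small sketch. It is not the merge7da logic and does not parse the '7da' or 'C1/C2' formats, whose layouts the abstract does not give. The `Activity` fields (`resource`, `start`, `end`) and both function names are hypothetical, chosen only to show interval-overlap detection between a current schedule and a proposal.

```python
from typing import List, NamedTuple, Tuple


class Activity(NamedTuple):
    resource: str  # hypothetical: identifier of the shared resource
    start: int     # hypothetical: start time in minutes
    end: int       # hypothetical: end time in minutes (exclusive)


def find_conflicts(schedule: List[Activity],
                   proposal: List[Activity]) -> List[Tuple[Activity, Activity]]:
    """Return (proposed, scheduled) pairs that overlap on the same resource."""
    conflicts = []
    for p in proposal:
        for s in schedule:
            # Two half-open intervals overlap when each starts before the other ends.
            if p.resource == s.resource and p.start < s.end and s.start < p.end:
                conflicts.append((p, s))
    return conflicts


def merge(schedule: List[Activity], proposal: List[Activity]) -> List[Activity]:
    """Return the schedule as it would look if the proposal were adopted."""
    return sorted(schedule + proposal, key=lambda a: (a.start, a.resource))


schedule = [Activity("A", 0, 60), Activity("B", 30, 90)]
proposal = [Activity("A", 50, 70), Activity("B", 90, 120)]
for proposed, scheduled in find_conflicts(schedule, proposal):
    print("conflict:", proposed, "overlaps", scheduled)
```

    Treating intervals as half-open means an activity that starts exactly when another ends is not reported as a conflict; a real scheduler may need setup and teardown margins instead.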

  9. 78 FR 6173 - Diana Del Grosso, Ray Smith, Joseph Hatch, Cheryl Hatch, Kathleen Kelley, Andrew Wilklund, and...

    Federal Register 2010, 2011, 2012, 2013, 2014

    2013-01-29

    ... submissions by the parties may be submitted via the Board's e-filing format or in the traditional paper format. Any person using e-filing should attach a document and otherwise comply with the instructions at the E... proceeding under 49 U.S.C. 721 and 5 U.S.C. 554(e). Petitioners request that the Board declare that specific...

  10. Occupational Survey Report. Visual Information, AFSC 3V0X1

    DTIC Science & Technology

    2000-04-01

    of the career ladder include: Scan artwork using flatbed scanners Convert graphic file formats Design layouts Letter certificates using laser...Design layouts Scan artwork using flatbed scanners Produce artwork using mouse or digitizing tablets Design and produce imagery for web pages Produce...DAFSC 3V031 PERSONNEL TASKS A0034 Scan artwork using flatbed scanners C0065 Design layouts A0004 Convert graphic file formats A0006 Create

  11. LipidMiner: A Software for Automated Identification and Quantification of Lipids from Multiple Liquid Chromatography-Mass Spectrometry Data Files

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Meng, Da; Zhang, Qibin; Gao, Xiaoli

    2014-04-30

    We have developed a tool for automated, high-throughput analysis of LC-MS/MS data files, which greatly simplifies LC-MS based lipidomics analysis. Our results showed that LipidMiner is accurate and comprehensive in identification and quantification of lipid molecular species. In addition, the workflow implemented in LipidMiner is not limited to identification and quantification of lipids. If a suitable metabolite library is implemented in the library matching module, LipidMiner could be reconfigured as a tool for general metabolomics data analysis. It is of note that LipidMiner currently is limited to singly charged ions, although it is adequate for the purpose of lipidomics since lipids are rarely multiply charged,[14] even for the polyphosphoinositides. LipidMiner also only processes file formats generated from mass spectrometers from Thermo, i.e. the .RAW format. In the future, we are planning to accommodate file formats generated by mass spectrometers from other predominant instrument vendors to make this tool more universal.

  12. Transforming Dermatologic Imaging for the Digital Era: Metadata and Standards.

    PubMed

    Caffery, Liam J; Clunie, David; Curiel-Lewandrowski, Clara; Malvehy, Josep; Soyer, H Peter; Halpern, Allan C

    2018-01-17

    Imaging is increasingly being used in dermatology for documentation, diagnosis, and management of cutaneous disease. The lack of standards for dermatologic imaging is an impediment to clinical uptake. Standardization can occur in image acquisition, terminology, interoperability, and metadata. This paper presents the International Skin Imaging Collaboration position on standardization of metadata for dermatologic imaging. Metadata is essential to ensure that dermatologic images are properly managed and interpreted. There are two standards-based approaches to recording and storing metadata in dermatologic imaging. The first uses standard consumer image file formats, and the second is the file format and metadata model developed for the Digital Imaging and Communication in Medicine (DICOM) standard. DICOM would appear to provide an advantage over using consumer image file formats for metadata as it includes all the patient, study, and technical metadata necessary to use images clinically, whereas consumer image file formats only include technical metadata and need to be used in conjunction with another actor (for example, an electronic medical record) to supply the patient and study metadata. The use of DICOM may have some ancillary benefits in dermatologic imaging including leveraging DICOM network and workflow services, interoperability of images and metadata, leveraging existing enterprise imaging infrastructure, greater patient safety, and better compliance with legislative requirements for image retention.

  13. MSiReader: an open-source interface to view and analyze high resolving power MS imaging files on Matlab platform.

    PubMed

    Robichaud, Guillaume; Garrard, Kenneth P; Barry, Jeremy A; Muddiman, David C

    2013-05-01

    During the past decade, the field of mass spectrometry imaging (MSI) has greatly evolved, to a point where it has now been fully integrated by most vendors as an optional or dedicated platform that can be purchased with their instruments. However, the technology is not mature and multiple research groups in both academia and industry are still very actively studying the fundamentals of imaging techniques, adapting the technology to new ionization sources, and developing new applications. As a result, there are important varieties of data file formats used to store mass spectrometry imaging data and, concurrent to the development of MSI, collaborative efforts have been undertaken to introduce common imaging data file formats. However, few free software packages to read and analyze files of these different formats are readily available. We introduce here MSiReader, a free open source application to read and analyze high resolution MSI data from the most common MSI data formats. The application is built on the Matlab platform (Mathworks, Natick, MA, USA) and includes a large selection of data analysis tools and features. People who are unfamiliar with the Matlab language will have little difficulty navigating the user-friendly interface, and users with Matlab programming experience can adapt and customize MSiReader for their own needs.

  14. MSiReader: An Open-Source Interface to View and Analyze High Resolving Power MS Imaging Files on Matlab Platform

    NASA Astrophysics Data System (ADS)

    Robichaud, Guillaume; Garrard, Kenneth P.; Barry, Jeremy A.; Muddiman, David C.

    2013-05-01

    During the past decade, the field of mass spectrometry imaging (MSI) has greatly evolved, to a point where it has now been fully integrated by most vendors as an optional or dedicated platform that can be purchased with their instruments. However, the technology is not mature and multiple research groups in both academia and industry are still very actively studying the fundamentals of imaging techniques, adapting the technology to new ionization sources, and developing new applications. As a result, there are important varieties of data file formats used to store mass spectrometry imaging data and, concurrent to the development of MSI, collaborative efforts have been undertaken to introduce common imaging data file formats. However, few free software packages to read and analyze files of these different formats are readily available. We introduce here MSiReader, a free open source application to read and analyze high resolution MSI data from the most common MSI data formats. The application is built on the Matlab platform (Mathworks, Natick, MA, USA) and includes a large selection of data analysis tools and features. People who are unfamiliar with the Matlab language will have little difficulty navigating the user-friendly interface, and users with Matlab programming experience can adapt and customize MSiReader for their own needs.

  15. ArrayInitiative - a tool that simplifies creating custom Affymetrix CDFs

    PubMed Central

    2011-01-01

    Background Probes on a microarray represent a frozen view of a genome and are quickly outdated when new sequencing studies extend our knowledge, resulting in significant measurement error when analyzing any microarray experiment. There are several bioinformatics approaches to improve probe assignments, but without in-house programming expertise, standardizing these custom array specifications as a usable file (e.g. as Affymetrix CDFs) is difficult, owing mostly to the complexity of the specification file format. However, without correctly standardized files there is a significant barrier for testing competing analysis approaches since this file is one of the required inputs for many commonly used algorithms. The need to test combinations of probe assignments and analysis algorithms led us to develop ArrayInitiative, a tool for creating and managing custom array specifications. Results ArrayInitiative is a standalone, cross-platform, rich client desktop application for creating correctly formatted, custom versions of manufacturer-provided (default) array specifications, requiring only minimal knowledge of the array specification rules and file formats. Users can import default array specifications, import probe sequences for a default array specification, design and import a custom array specification, export any array specification to multiple output formats, export the probe sequences for any array specification and browse high-level information about the microarray, such as version and number of probes. The initial release of ArrayInitiative supports the Affymetrix 3' IVT expression arrays we currently analyze, but as an open source application, we hope that others will contribute modules for other platforms. Conclusions ArrayInitiative allows researchers to create new array specifications, in a standard format, based upon their own requirements. This makes it easier to test competing design and analysis strategies that depend on probe definitions. Since the custom array specifications are easily exported to the manufacturer's standard format, researchers can analyze these customized microarray experiments using established software tools, such as those available in Bioconductor. PMID:21548938

  16. Toolsets for Airborne Data (TAD): Improving Machine Readability for ICARTT Data Files

    NASA Astrophysics Data System (ADS)

    Northup, E. A.; Early, A. B.; Beach, A. L., III; Kusterer, J.; Quam, B.; Wang, D.; Chen, G.

    2015-12-01

    NASA has conducted airborne tropospheric chemistry studies for about three decades. These field campaigns have generated a great wealth of observations, including a wide range of trace gas and aerosol properties. The ASDC Toolsets for Airborne Data (TAD) is designed to meet the user community's needs for manipulating aircraft data for scientific research on issues relevant to climate change and air quality. TAD makes use of aircraft data stored in the International Consortium for Atmospheric Research on Transport and Transformation (ICARTT) file format. ICARTT has been the NASA standard since 2010, and is widely used by NOAA, NSF, and international partners (DLR, FAAM). Its level of acceptance is due in part to it being generally self-describing for researchers, i.e., it provides necessary data descriptions for proper research use. Despite this, there are a number of issues with the current ICARTT format, especially concerning machine readability. In order to overcome these issues, the TAD team has developed an "idealized" file format. This format is ASCII and is sufficiently machine readable to sustain the TAD system; however, it is not fully compatible with the current ICARTT format. The process of mapping ICARTT metadata to the idealized format, the format specifics, and the actual conversion process will be discussed. The goal of this presentation is to demonstrate an example of how to improve the machine readability of ASCII data format protocols.

  17. SW New Mexico Oil Well Formation Tops

    DOE Data Explorer

    Shari Kelley

    2015-10-21

    Rock formation top picks from oil wells from southwestern New Mexico from scout cards and other sources. There are differing formation tops interpretations for some wells, so for those wells duplicate formation top data are presented in this file.

  18. Methods and apparatus for capture and storage of semantic information with sub-files in a parallel computing system

    DOEpatents

    Faibish, Sorin; Bent, John M; Tzelnic, Percy; Grider, Gary; Torres, Aaron

    2015-02-03

    Techniques are provided for storing files in a parallel computing system using sub-files with semantically meaningful boundaries. A method is provided for storing at least one file generated by a distributed application in a parallel computing system. The file comprises one or more of a complete file and a plurality of sub-files. The method comprises the steps of obtaining a user specification of semantic information related to the file; providing the semantic information as a data structure description to a data formatting library write function; and storing the semantic information related to the file with one or more of the sub-files in one or more storage nodes of the parallel computing system. The semantic information provides a description of data in the file. The sub-files can be replicated based on semantically meaningful boundaries.

  19. Standard interface files and procedures for reactor physics codes, version III

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Carmichael, B.M.

    Standards and procedures for promoting the exchange of reactor physics codes are updated to Version-III status. Standards covering program structure, interface files, file handling subroutines, and card input format are included. The implementation status of the standards in codes and the extension of the standards to new code areas are summarized. (15 references) (auth)

  20. 75 FR 19338 - FM TABLE OF ALLOTMENTS, Milford, Utah

    Federal Register 2010, 2011, 2012, 2013, 2014

    2010-04-14

    .... SUMMARY: The Audio Division seeks comments on a petition filed by Canyon Media Group, LLC, authorized..., large print, electronic files, audio format), send an e-mail to [email protected] or call the Consumer... Chief, Audio Division, Media Bureau. [FR Doc. 2010-8448 Filed 4-13-10; 8:45 am] BILLING CODE 6712-01-S ...

  1. Snake River Plain Geothermal Play Fairway Analysis - Phase 1 KMZ files

    DOE Data Explorer

    John Shervais

    2015-10-10

    This dataset contains raw data files in KMZ format (the Google Earth georeferenced format). These files include volcanic vent locations and age, the distribution of fine-grained lacustrine sediments (which act as both a seal and an insulating layer for hydrothermal fluids), and post-Miocene faults compiled from the Idaho Geological Survey, the USGS Quaternary Fault database, and unpublished mapping. It also contains the Composite Common Risk Segment Map created during Phase 1 studies, as well as a file with locations of select deep wells used to interrogate the subsurface.

  2. Plan for DoD Wide Demonstrations of a DoD Improved Interactive Electronic Technical Manual (IETM) Architecture

    DTIC Science & Technology

    1998-07-01

    all the MS Word files into FrameMaker + SGML format and use the FrameMaker application to SGML tag all of the data in accordance with the Army TM...Document Type Definitions (DTDs) in MIL-STD- 2361. The edited SGML tagged files are saved as PDF files for delivery to the field. The FrameMaker ...as TIFF files and being imported into FrameMaker prior to saving the TMs as PDF files. Since the hardware to be used by the AN/PPS-5 technician is

  3. Cytoscape file of chemical networks

    EPA Pesticide Factsheets

    The maximum connectivity scores of pairwise chemical conditions summarized from Cmap results in a file with Cytoscape format (http://www.cytoscape.org/). The figures in the publication were generated from this file. The Cytoscape file is formed from importing the eight text files therein. This dataset is associated with the following publication: Wang, R., A. Biales, N. Garcia-Reyero, E. Perkins, D. Villeneuve, G. Ankley, and D. Bencic. Fish Connectivity Mapping: Linking Chemical Stressors by Their MOA-Driven Transcriptomic Profiles. BMC Genomics. BioMed Central Ltd, London, UK, 17(84): 1-20, (2016).

  4. Index files for Belle II - very small skim containers

    NASA Astrophysics Data System (ADS)

    Sevior, Martin; Bloomfield, Tristan; Kuhr, Thomas; Ueda, I.; Miyake, H.; Hara, T.

    2017-10-01

    The Belle II experiment[1] employs the root file format[2] for recording data and is investigating the use of “index-files” to reduce the size of data skims. These files contain pointers to the location of interesting events within the total Belle II data set and reduce the size of data skims by 2 orders of magnitude. We implement this scheme on the Belle II grid by recording the parent file metadata and the event location within the parent file. While the scheme works, it is substantially slower than a normal sequential read of standard skim files using default root file parameters. We investigate the performance of the scheme by adjusting the “splitLevel” and “autoflushsize” parameters of the root files in the parent data files.
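    The index-file idea in the record above (store pointers to interesting events rather than copies of them) can be sketched in a few lines. This is an illustration only and does not use ROOT or the Belle II software. Parent files are modeled as in-memory lists, and `IndexEntry`, `build_index`, and `read_skim` are hypothetical names.

```python
from typing import Callable, Dict, List, NamedTuple


class IndexEntry(NamedTuple):
    parent_file: str   # name of the parent data file holding the event
    event_number: int  # position of the event within that parent file


def build_index(parent_files: Dict[str, List[int]],
                is_interesting: Callable[[int], bool]) -> List[IndexEntry]:
    """Record where each selected event lives instead of copying the event."""
    index = []
    for name, events in parent_files.items():
        for position, event in enumerate(events):
            if is_interesting(event):
                index.append(IndexEntry(name, position))
    return index


def read_skim(parent_files: Dict[str, List[int]],
              index: List[IndexEntry]) -> List[int]:
    """Resolve each pointer back to the event stored in its parent file."""
    return [parent_files[entry.parent_file][entry.event_number] for entry in index]


# Stand-in data: each "event" is just an integer here.
parents = {"parent_a": [1, 50, 3], "parent_b": [70, 2]}
index = build_index(parents, lambda event: event > 10)
print(index)
print(read_skim(parents, index))
```

    The index holds only a file name and a position per selected event, which is why such a skim can be far smaller than a copy of the events. Each lookup is a random access into a parent file, which is consistent with the slower reads the abstract reports.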

  5. 18 CFR 270.304 - Tight formation gas.

    Code of Federal Regulations, 2011 CFR

    2011-04-01

    ... determination that natural gas is tight formation gas must file with the jurisdictional agency an application... formation; (d) A complete copy of the well log, including the log heading identifying the designated tight...

  6. Proposal for a Standard Format for Neurophysiology Data Recording and Exchange.

    PubMed

    Stead, Matt; Halford, Jonathan J

    2016-10-01

    The lack of interoperability between information networks is a significant source of cost in health care. Standardized data formats decrease health care cost, improve quality of care, and facilitate biomedical research. There is no common standard digital format for storing clinical neurophysiologic data. This review proposes a new standard file format for neurophysiology data (the bulk of which is video-electroencephalographic data), entitled the Multiscale Electrophysiology Format, version 3 (MEF3), which is designed to address many of the shortcomings of existing formats. MEF3 provides functionality that addresses many of the limitations of current formats. The proposed improvements include (1) hierarchical file structure with improved organization; (2) greater extensibility for big data applications requiring a large number of channels, signal types, and parallel processing; (3) efficient and flexible lossy or lossless data compression; (4) industry standard multilayered data encryption and time obfuscation that permits sharing of human data without the need for deidentification procedures; (5) resistance to file corruption; (6) facilitation of online and offline review and analysis; and (7) provision of full open source documentation. At this time, there is no other neurophysiology format that supports all of these features. MEF3 is currently gaining industry and academic community support. The authors propose the use of the MEF3 as a standard format for neurophysiology recording and data exchange. Collaboration between industry, professional organizations, research communities, and independent standards organizations is needed to move the project forward.

  7. HDFITS: Porting the FITS data model to HDF5

    NASA Astrophysics Data System (ADS)

    Price, D. C.; Barsdell, B. R.; Greenhill, L. J.

    2015-09-01

    The FITS (Flexible Image Transport System) data format has been the de facto data format for astronomy-related data products since its inception in the late 1970s. While the FITS file format is widely supported, it lacks many of the features of more modern data serialization, such as the Hierarchical Data Format (HDF5). The HDF5 file format offers considerable advantages over FITS, such as improved I/O speed and compression, but has yet to gain widespread adoption within astronomy. One of the major holdbacks is that HDF5 is not well supported by data reduction software packages and image viewers. Here, we present a comparison of FITS and HDF5 as a format for storage of astronomy datasets. We show that the underlying data model of FITS can be ported to HDF5 in a straightforward manner, and that by doing so the advantages of the HDF5 file format can be leveraged immediately. In addition, we present a software tool, fits2hdf, for converting between FITS and a new 'HDFITS' format, where data are stored in HDF5 in a FITS-like manner. We show that HDFITS allows faster reading of data (up to 100x of FITS in some use cases), and improved compression (higher compression ratios and higher throughput). Finally, we show that by only changing the import lines in Python-based FITS utilities, HDFITS formatted data can be presented transparently as an in-memory FITS equivalent.

  8. What is meant by Format Version? Product Version? Collection?

    Atmospheric Science Data Center

    2017-10-12

    The format Version is used to distinguish between software deliveries to ASDC that result in a product format change. The format version is given in the MISR data file name using the designator _Fnn_ where nn is the version number. ...
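
    As a small illustration of the _Fnn_ designator described above, the following Python sketch extracts the format version from a file name. The file name is invented for the example and is not taken from ASDC documentation; only the _Fnn_ pattern comes from the record, and treating nn as exactly two digits is an assumption.

```python
import re

# Matches the _Fnn_ designator; "nn" is assumed to be exactly two digits.
FORMAT_VERSION = re.compile(r"_F(\d{2})_")


def format_version(filename):
    """Return the format version as an int, or None if no designator is found."""
    match = FORMAT_VERSION.search(filename)
    return int(match.group(1)) if match else None


# Hypothetical file name, for illustration only.
example = "MISR_EXAMPLE_PRODUCT_F03_0024.hdf"
print(format_version(example))
```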

  9. ListingAnalyst: A program for analyzing the main output file from MODFLOW

    USGS Publications Warehouse

    Winston, Richard B.; Paulinski, Scott

    2014-01-01

    ListingAnalyst is a Windows® program for viewing the main output file from MODFLOW-2005, MODFLOW-NWT, or MODFLOW-LGR. It organizes and displays large files quickly without using excessive memory. The sections and subsections of the file are displayed in a tree-view control, which allows the user to navigate quickly to desired locations in the files. ListingAnalyst gathers error and warning messages scattered throughout the main output file and displays them all together in an error and a warning tab. A grid view displays tables in a readable format and allows the user to copy the table into a spreadsheet. The user can also search the file for terms of interest.

  10. Vector Topographic Map Data over the BOREAS NSA and SSA in SIF Format

    NASA Technical Reports Server (NTRS)

    Knapp, David; Nickeson, Jaime; Hall, Forrest G. (Editor)

    2000-01-01

    This data set contains vector contours and other features of individual topographic map sheets from the National Topographic Series (NTS). The map sheet files were received in Standard Interchange Format (SIF) and cover the BOReal Ecosystem-Atmosphere Study (BOREAS) Northern Study Area (NSA) and Southern Study Area (SSA) at scales of 1:50,000 and 1:250,000. The individual files are stored in compressed Unix tar archives.

  11. Workflow opportunities using JPEG 2000

    NASA Astrophysics Data System (ADS)

    Foshee, Scott

    2002-11-01

    JPEG 2000 is a new image compression standard from ISO/IEC JTC1 SC29 WG1, the Joint Photographic Experts Group (JPEG) committee. Better thought of as a sibling to JPEG rather than a descendant, the JPEG 2000 standard offers wavelet-based compression as well as companion file formats and related standardized technology. This paper examines the JPEG 2000 standard for features in four specific areas (compression, file formats, client-server, and conformance/compliance) that enable image workflows.

  12. GIF Animation of Mode Shapes and Other Data on the Internet

    NASA Technical Reports Server (NTRS)

    Pappa, Richard S.

    1998-01-01

    The World Wide Web abounds with animated cartoons and advertisements competing for our attention. Most of these figures are animated Graphics Interchange Format (GIF) files. These files contain a series of ordinary GIF images plus control information, and they provide an exceptionally simple, effective way to animate on the Internet. To date, however, this format has rarely been used for technical data, although there is no inherent reason not to do so. This paper describes a procedure for creating high-resolution animated GIFs of mode shapes and other types of structural dynamics data with readily available software. The paper shows three example applications using recent modal test data and video footage of a high-speed sled run. A fairly detailed summary of the GIF file format is provided in the appendix. All of the animations discussed in the paper are posted on the Internet available through the following address: http://sdb-www.larc.nasa.gov/.

  13. Preparing PNNL Reports with LaTeX

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Waichler, Scott R.

    2005-06-01

    LaTeX is a mature document preparation system that is the standard in many scientific and academic workplaces. It has been used extensively by scattered individuals and research groups within PNNL for years, but until now there have been no centralized or lab-focused resources to help authors and editors. PNNL authors and editors can produce correctly formatted PNNL or PNWD reports using the LaTeX document preparation system and the available template files. Please visit the PNNL-LaTeX Project (http://stidev.pnl.gov/resources/latex/, inside the PNNL firewall) for additional information and files. In LaTeX, document content is maintained separately from document structure for the most part. This means that the author can easily produce the same content in different formats and, more importantly, can focus on the content and write it in a plain text file that doesn't go awry, is easily transferable, and won't become obsolete due to software changes. LaTeX produces the finest print quality output; its typesetting is noticeably better than that of MS Word. This is particularly true for mathematics, tables, and other types of special text. Other benefits of LaTeX: easy handling of large numbers of figures and tables; automatic and error-free captioning, citation, cross-referencing, hyperlinking, and indexing; excellent published and online documentation; free or low-cost distributions for Windows/Linux/Unix/Mac OS X. This document serves two purposes: (1) it provides instructions to produce reports formatted to PNNL requirements using LaTeX, and (2) the document itself is in the form of a PNNL report, providing examples of many solved formatting challenges. Authors can use this document or its skeleton version (with formatting examples removed) as the starting point for their own reports. The pnnreport.cls class file and pnnl.bst bibliography style file contain the required formatting specifications for reports to the Department of Energy. Options are also provided for formatting PNWD (non-1830) reports. This documentation and the referenced files are meant to provide a complete package of PNNL particulars for authors and editors who wish to prepare technical reports using LaTeX. The example material in this document was borrowed from real reports and edited for demonstration purposes. The subject matter content of the example material is not relevant here and generally does not make literal sense in the context of this document. Brackets ''[]'' are used to denote large blocks of example text. The PDF file for this report contains hyperlinks to facilitate navigation. Hyperlinks are provided for all cross-referenced material, including section headings, figures, tables, and references. Not all hyperlinks are colored but will be obvious when you move your mouse over them.

  14. Publications - PIR 2002-3 | Alaska Division of Geological & Geophysical

    Science.gov Websites

    ): Philip Smith Mountains Bibliographic Reference Stevens, D.S.P., 2014, Engineering-geologic map of the Digital Geospatial Data Philip Smith Mountains: Engineering-geologic map Data File Format File Size Info

  15. Need of a consistent and convenient nucleus identification in ENDF files for the automatic construction of the depletion chains

    NASA Astrophysics Data System (ADS)

    Mosca, Pietro; Mounier, Claude

    2016-03-01

    The automatic construction of evolution chains recently implemented in the GALILEE system is based on the analysis of several ENDF files: the multigroup production cross sections present in the GENDF files processed by NJOY from the ENDF evaluation, the decay file and the fission product yields (FPY) file. In this context, this paper highlights the importance of the nucleus identification to properly interconnect the data mentioned above. The first part of the paper describes the present status of the nucleus identification among the several ENDF files focusing, in particular, on the use of the excited state number and of the isomeric state number. The second part reviews the problems encountered during the automatic construction of the depletion chains using recent ENDF data. The processing of the JEFF-3.1.1, ENDF/B-VII.0 (decay and FPY) and the JEFF-3.2 (production cross section) files reveals cases where the nucleus identifiers do not comply with the ENDF-6 format, and sometimes inconsistencies among the various ENDF files. In addition, the analysis of EAF-2003 and EAF-2010 shows some incoherence between the ZA product identifier and the reaction identifier MT for the reactions (n, pα) and (n, 2np). As a main result of this work, our suggestion is to change the ENDF format by systematically using the isomeric state number to identify the nuclei. This proposal is already compliant with a large amount of ENDF data that are not in agreement with the present ENDF format. This choice is the most convenient because, ultimately, it allows one to give human-readable names to the nuclei of the depletion chains.

  16. BOREAS RSS-14 Level-1a GOES-8 Visible, IR and Water Vapor Images

    NASA Technical Reports Server (NTRS)

    Hall, Forrest G. (Editor); Newcomer, Jeffrey A.; Faysash, David; Cooper, Harry J.; Smith, Eric A.

    2000-01-01

    The BOREAS RSS-14 team collected and processed several GOES-7 and GOES-8 image data sets that covered the BOREAS study region. The level-1a GOES-8 images were created by BORIS personnel from the level-1 images delivered by FSU personnel. The data cover 14-Jul-1995 to 21-Sep-1995 and 12-Feb-1996 to 03-Oct-1996. The data start out as three bands with 8-bit pixel values and end up as five bands with 10-bit pixel values. No major problems with the data have been identified. The differences between the level-1 and level-1a GOES-8 data are the formatting and packaging of the data. The images missing from the temporal series of level-1 GOES-8 images were zero-filled by BORIS staff to create files consistent in size and format. In addition, BORIS staff packaged all the images of a given type from a given day into a single file, removed the header information from the individual level-1 files, and placed it into a single descriptive ASCII header file. The data are contained in binary image format files. Due to the large size of the images, the level-1a GOES-8 data are not contained on the BOREAS CD-ROM set. An inventory listing file is supplied on the CD-ROM to inform users of what data were collected. The level-1a GOES-8 image data are available from the Earth Observing System Data and Information System (EOSDIS) Oak Ridge National Laboratory (ORNL) Distributed Active Archive Center (DAAC). See sections 15 and 16 for more information. The data files are available on a CD-ROM (see document number 20010000884).

  17. Main image file tape description

    USGS Publications Warehouse

    Warriner, Howard W.

    1980-01-01

    This Main Image File Tape document defines the data content and file structure of the Main Image File Tape (MIFT) produced by the EROS Data Center (EDC). This document also defines an INQUIRY tape, which is just a subset of the MIFT. The format of the INQUIRY tape is identical to the MIFT except for two records; therefore, with the exception of these two records (described elsewhere in this document), every remark made about the MIFT is true for the INQUIRY tape.

  18. lcps: Light curve pre-selection

    NASA Astrophysics Data System (ADS)

    Schlecker, Martin

    2018-05-01

    lcps searches for transit-like features (i.e., dips) in photometric data. Its main purpose is to restrict large sets of light curves to a number of files that show interesting behavior, such as drops in flux. While lcps is adaptable to any format of time series, its I/O module is designed specifically for photometry of the Kepler spacecraft. It extracts the pre-conditioned PDCSAP data from light curves files created by the standard Kepler pipeline. It can also handle csv-formatted ascii files. lcps uses a sliding window technique to compare a section of flux time series with its surroundings. A dip is detected if the flux within the window is lower than a threshold fraction of the surrounding fluxes.
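
    The sliding-window comparison described above can be sketched in a few lines of Python. This is not the lcps implementation: the use of the median, the window length, the choice of one window-width of flux on each side as the "surroundings", and the function name find_dips are all assumptions made for illustration.

```python
from statistics import median


def find_dips(flux, window=5, threshold=0.98):
    """Return start indices of windows whose median flux is lower than
    threshold * median of the flux on either side of the window."""
    dips = []
    for start in range(window, len(flux) - 2 * window + 1):
        inside = flux[start:start + window]
        before = flux[start - window:start]
        after = flux[start + window:start + 2 * window]
        if median(inside) < threshold * median(before + after):
            dips.append(start)
    return dips


# Synthetic light curve: flat flux of 1.0 with a 5-sample dip to 0.95.
flux = [1.0] * 20 + [0.95] * 5 + [1.0] * 20
print(find_dips(flux))
```

    Several overlapping windows flag the same dip, so a real pre-selection step would likely merge adjacent detections before reporting a file as interesting.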

  19. Chemopreventive Agent Development | Division of Cancer Prevention

    Cancer.gov


  20. 17 CFR 16.06 - Errors or omissions.

    Code of Federal Regulations, 2010 CFR

    2010-04-01

    ..., reporting markets shall file corrections to errors or omissions in data previously filed with the Commission pursuant to §§ 16.00 and 16.01 in the format and using the coding structure and electronic data submission...

  1. Publications - PR 121 | Alaska Division of Geological & Geophysical Surveys

    Science.gov Websites

    : Download below or please see our publication sales page for more information. Quadrangle(s): Philip Smith Philip Smith Mountains: Surficial Geology Data File Format File Size Info Download psm-surficial-geo

  2. Publications - RI 2001-1C | Alaska Division of Geological & Geophysical

    Science.gov Websites

    map of the Chulitna region, southcentral Alaska, scale 1:63,360 (7.5 M) Digital Geospatial Data Digital Geospatial Data Chulitna region surficial geology Data File Format File Size Info Download

  3. Publications - RDF 2015-17 | Alaska Division of Geological & Geophysical

    Science.gov Websites

    /10.14509/29519 Publication Products Report Report Information rdf2015_017.pdf (347.0 K) Digital Geospatial Data Digital Geospatial Data Tonsina geochemistry: DGGS samples Data File Format File Size Info

  4. VizieR Online Data Catalog: Horizontal temperature at Venus upper atmosphere (Peralta+, 2016)

    NASA Astrophysics Data System (ADS)

    Peralta, J.; Lopez-Valverde, M. A.; Gilli, G.; Piccialli, A.

    2015-11-01

    The dayside atmospheric temperatures in the UMLT of Venus (displayed in Figure 7A of this article) are listed as a CSV data file. These values consist of averages in bins of 5° in latitude and 0.25 hours in local time from dayside temperatures covering five years of data (from 2006/05/14 to 2011/06/05). These temperatures were inferred from the CO2 NLTE nadir spectra measured by the instrument VIRTIS-H onboard Venus Express (see article for full description of the procedure), and are representative of the atmospheric region between 10^-2 and 10^-5 mb. Along with the temperatures, we also provide the corresponding error and the number of temperatures averaged in each bin. The format of the CSV file reasonably agrees with the expected format of the data files to be provided in the future version of the Venus International Reference Atmosphere (VIRA). (1 data file).
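
    The binning described above (5° in latitude, 0.25 hours in local time, with a count per bin) can be sketched as follows. The sample values are made up, the function name bin_temperatures is hypothetical, and the catalog's error column is not reproduced because the record does not say how it was computed.

```python
import math
from collections import defaultdict

LAT_BIN = 5.0   # bin width in degrees of latitude, from the record
LT_BIN = 0.25   # bin width in hours of local time, from the record


def bin_temperatures(samples):
    """Group (latitude, local_time, temperature) samples into bins and
    return {(lat_edge, lt_edge): (mean_temperature, count)}, where each
    key holds the lower edges of the bin."""
    groups = defaultdict(list)
    for lat, local_time, temp in samples:
        key = (math.floor(lat / LAT_BIN) * LAT_BIN,
               math.floor(local_time / LT_BIN) * LT_BIN)
        groups[key].append(temp)
    return {key: (sum(v) / len(v), len(v)) for key, v in groups.items()}


# Made-up samples, for illustration only.
samples = [(12.0, 10.1, 200.0), (14.9, 10.2, 210.0), (-3.0, 13.6, 190.0)]
for key, (mean, count) in sorted(bin_temperatures(samples).items()):
    print(key, mean, count)
```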

  5. Profex: a graphical user interface for the Rietveld refinement program BGMN.

    PubMed

    Doebelin, Nicola; Kleeberg, Reinhard

    2015-10-01

    Profex is a graphical user interface for the Rietveld refinement program BGMN. Its interface focuses on preserving BGMN's powerful and flexible scripting features by giving direct access to BGMN input files. Very efficient workflows for single or batch refinements are achieved by managing refinement control files and structure files, by providing dialogues and shortcuts for many operations, by performing operations in the background, and by providing import filters for CIF and XML crystal structure files. Refinement results can be easily exported for further processing. State-of-the-art graphical export of diffraction patterns to pixel and vector graphics formats allows the creation of publication-quality graphs with minimum effort. Profex reads and converts a variety of proprietary raw data formats and is thus largely instrument independent. Profex and BGMN are available under an open-source license for Windows, Linux and OS X operating systems.

  6. Profex: a graphical user interface for the Rietveld refinement program BGMN

    PubMed Central

    Doebelin, Nicola; Kleeberg, Reinhard

    2015-01-01

    Profex is a graphical user interface for the Rietveld refinement program BGMN. Its interface focuses on preserving BGMN’s powerful and flexible scripting features by giving direct access to BGMN input files. Very efficient workflows for single or batch refinements are achieved by managing refinement control files and structure files, by providing dialogues and shortcuts for many operations, by performing operations in the background, and by providing import filters for CIF and XML crystal structure files. Refinement results can be easily exported for further processing. State-of-the-art graphical export of diffraction patterns to pixel and vector graphics formats allows the creation of publication-quality graphs with minimum effort. Profex reads and converts a variety of proprietary raw data formats and is thus largely instrument independent. Profex and BGMN are available under an open-source license for Windows, Linux and OS X operating systems. PMID:26500466

  7. Desktop document delivery using portable document format (PDF) files and the Web.

    PubMed Central

    Shipman, J P; Gembala, W L; Reeder, J M; Zick, B A; Rainwater, M J

    1998-01-01

    Desktop access to electronic full-text literature was rated one of the most desirable services in a client survey conducted by the University of Washington Libraries. The University of Washington Health Sciences Libraries (UW HSL) conducted a ten-month pilot test from August 1996 to May 1997 to determine the feasibility of delivering electronic journal articles via the Internet to remote faculty. Articles were scanned into Adobe Acrobat Portable Document Format (PDF) files and delivered to individuals using Multipurpose Internet Mail Extensions (MIME) standard e-mail attachments and the Web. Participants retrieved scanned articles and used the Adobe Acrobat Reader software to view and print files. The pilot test required a special programming effort to automate the client notification and file deletion processes. Test participants were satisfied with the pilot test despite some technical difficulties. Desktop delivery is now offered as a routine delivery method from the UW HSL. PMID:9681165

  8. Desktop document delivery using portable document format (PDF) files and the Web.

    PubMed

    Shipman, J P; Gembala, W L; Reeder, J M; Zick, B A; Rainwater, M J

    1998-07-01

    Desktop access to electronic full-text literature was rated one of the most desirable services in a client survey conducted by the University of Washington Libraries. The University of Washington Health Sciences Libraries (UW HSL) conducted a ten-month pilot test from August 1996 to May 1997 to determine the feasibility of delivering electronic journal articles via the Internet to remote faculty. Articles were scanned into Adobe Acrobat Portable Document Format (PDF) files and delivered to individuals using Multipurpose Internet Mail Extensions (MIME) standard e-mail attachments and the Web. Participants retrieved scanned articles and used the Adobe Acrobat Reader software to view and print files. The pilot test required a special programming effort to automate the client notification and file deletion processes. Test participants were satisfied with the pilot test despite some technical difficulties. Desktop delivery is now offered as a routine delivery method from the UW HSL.

  9. "AFacet": a geometry based format and visualizer to support SAR and multisensor signature generation

    NASA Astrophysics Data System (ADS)

    Rosencrantz, Stephen; Nehrbass, John; Zelnio, Ed; Sudkamp, Beth

    2018-04-01

    When simulating multisensor signature data (including SAR, LIDAR, EO, IR, etc.), geometry data are required that accurately represent the target. Most vehicular targets can, in real life, exist in many possible configurations. Examples of these configurations might include a rotated turret, an open door, a missing roof rack, or a seat made of metal or wood. Previously we have used the Modelman (.mmp) format and tool to represent and manipulate our articulable models. Unfortunately Modelman is now an unsupported tool and an undocumented binary format. Some work has been done to reverse engineer a reader in Matlab so that the format could continue to be useful. This work was tedious and resulted in an incomplete conversion. In addition, the resulting articulable models could not be altered and re-saved in the Modelman format. The AFacet (.afacet) articulable facet file format is a replacement for the binary Modelman (.mmp) file format. There is a one-time, straightforward path for conversion from Modelman to the AFacet format. It is a simple ASCII, comma-separated, self-documenting format that is easily readable (and in many cases usefully editable) by a human with any text editor, preventing future obsolescence. In addition, because the format is simple, it is relatively easy for even the most novice programmer to create a program to read and write AFacet files in any language without any special libraries. This paper presents the AFacet format, as well as a suite of tools for creating, articulating, manipulating, viewing, and converting the 370+ (when this paper was written) models that have been converted to the AFacet format.

  10. mzDB: A File Format Using Multiple Indexing Strategies for the Efficient Analysis of Large LC-MS/MS and SWATH-MS Data Sets

    PubMed Central

    Bouyssié, David; Dubois, Marc; Nasso, Sara; Gonzalez de Peredo, Anne; Burlet-Schiltz, Odile; Aebersold, Ruedi; Monsarrat, Bernard

    2015-01-01

    The analysis and management of MS data, especially those generated by data independent MS acquisition, exemplified by SWATH-MS, pose significant challenges for proteomics bioinformatics. The large size and vast amount of information inherent to these data sets need to be properly structured to enable an efficient and straightforward extraction of the signals used to identify specific target peptides. Standard XML based formats are not well suited to large MS data files, for example, those generated by SWATH-MS, and compromise high-throughput data processing and storing. We developed mzDB, an efficient file format for large MS data sets. It relies on the SQLite software library and consists of a standardized and portable server-less single-file database. An optimized 3D indexing approach is adopted, where the LC-MS coordinates (retention time and m/z), along with the precursor m/z for SWATH-MS data, are used to query the database for data extraction. In comparison with XML formats, mzDB saves ∼25% of storage space and improves access times by a factor of twofold up to even 2000-fold, depending on the particular data access. Similarly, mzDB shows also slightly to significantly lower access times in comparison with other formats like mz5. Both C++ and Java implementations, converting raw or XML formats to mzDB and providing access methods, will be released under permissive license. mzDB can be easily accessed by the SQLite C library and its drivers for all major languages, and browsed with existing dedicated GUIs. The mzDB described here can boost existing mass spectrometry data analysis pipelines, offering unprecedented performance in terms of efficiency, portability, compactness, and flexibility. PMID:25505153
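
    To make the idea of querying a single-file SQLite database by LC-MS coordinates concrete, here is a minimal Python sketch using the standard sqlite3 module. The table name, column names, and plain composite index are hypothetical and chosen only for illustration; they are not the mzDB schema or its indexing strategy.

```python
import sqlite3


def build_db(peaks):
    """Create an in-memory database holding (rt, mz, intensity) rows.
    The 'peak' table and its index are illustrative assumptions."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE peak (rt REAL, mz REAL, intensity REAL)")
    conn.execute("CREATE INDEX peak_rt_mz ON peak (rt, mz)")
    conn.executemany("INSERT INTO peak VALUES (?, ?, ?)", peaks)
    conn.commit()
    return conn


def extract(conn, rt_min, rt_max, mz_min, mz_max):
    """Return all peaks inside a retention-time / m/z rectangle."""
    query = (
        "SELECT rt, mz, intensity FROM peak "
        "WHERE rt BETWEEN ? AND ? AND mz BETWEEN ? AND ? "
        "ORDER BY rt, mz"
    )
    return conn.execute(query, (rt_min, rt_max, mz_min, mz_max)).fetchall()


# Made-up peaks, for illustration only.
conn = build_db([
    (10.0, 500.2, 10000.0),
    (10.5, 500.3, 20000.0),
    (30.0, 500.25, 5000.0),
    (10.2, 800.0, 1000.0),
])
print(extract(conn, 9.0, 11.0, 500.0, 501.0))
```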

  11. ArrayBridge: Interweaving declarative array processing with high-performance computing

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Xing, Haoyuan; Floratos, Sofoklis; Blanas, Spyros

    Scientists are increasingly turning to datacenter-scale computers to produce and analyze massive arrays. Despite decades of database research that extols the virtues of declarative query processing, scientists still write, debug and parallelize imperative HPC kernels even for the most mundane queries. This impedance mismatch has been partly attributed to the cumbersome data loading process; in response, the database community has proposed in situ mechanisms to access data in scientific file formats. Scientists, however, desire more than a passive access method that reads arrays from files. This paper describes ArrayBridge, a bi-directional array view mechanism for scientific file formats, that aims to make declarative array manipulations interoperable with imperative file-centric analyses. Our prototype implementation of ArrayBridge uses HDF5 as the underlying array storage library and seamlessly integrates into the SciDB open-source array database system. In addition to fast querying over external array objects, ArrayBridge produces arrays in the HDF5 file format just as easily as it can read from it. ArrayBridge also supports time travel queries from imperative kernels through the unmodified HDF5 API, and automatically deduplicates between array versions for space efficiency. Our extensive performance evaluation in NERSC, a large-scale scientific computing facility, shows that ArrayBridge exhibits statistically indistinguishable performance and I/O scalability to the native SciDB storage engine.

  12. User's guide to HYPOINVERSE-2000, a Fortran program to solve for earthquake locations and magnitudes

    USGS Publications Warehouse

    Klein, Fred W.

    2002-01-01

    Hypoinverse is a computer program that processes files of seismic station data for an earthquake (like P-wave arrival times and seismogram amplitudes and durations) into earthquake locations and magnitudes. It is one of a long line of similar USGS programs including HYPOLAYR (Eaton, 1969), HYPO71 (Lee and Lahr, 1972), and HYPOELLIPSE (Lahr, 1980). If you are new to Hypoinverse, you may want to start by glancing at the section “SOME SIMPLE COMMAND SEQUENCES” to get a feel of some simpler sessions. This document is essentially an advanced user’s guide, and reading it sequentially will probably plow the reader into more detail than he/she needs. Every user must have a crust model, station list and phase data input files, and glancing at these sections is a good place to begin. The program has many options because it has grown over the years to meet the needs of one of the largest seismic networks in the world, but small networks with just a few stations do use the program and can ignore most of the options and commands. History and availability. Hypoinverse was originally written for the Eclipse minicomputer in 1978 (Klein, 1978). A revised version for VAX and Pro-350 computers (Klein, 1985) was later expanded to include multiple crustal models and other capabilities (Klein, 1989). This current report documents the expanded Y2000 version and it supersedes the earlier documents. It serves as a detailed user's guide to the current version running on unix and VAX-alpha computers, and to the version supplied with the Earthworm earthquake digitizing system. Fortran-77 source code (Sun and VAX compatible) and copies of this documentation are available via anonymous ftp from computers in Menlo Park. At present, the computer is swave.wr.usgs.gov and the directory is /ftp/pub/outgoing/klein/hyp2000. If you are running Hypoinverse on one of the Menlo Park EHZ or NCSN unix computers, the executable currently is ~klein/hyp2000/hyp2000. New features. 
The Y2000 version of Hypoinverse includes all of the previous capabilities, but adds Y2000 formats to those defined earlier. In most cases, the new formats add 2 digits to the year field to accommodate the century. Other fields are sometimes rearranged or expanded to accommodate a better field order. The Y2000 formats are invoked with the “200” command. When the Y2000 flag is turned on, all files are read and written in the new format and there is no mixing of format types in a single run. Some formats without a date field, like station files, have not changed. A separate program called 2000CONV has been written to convert old formats to new. Other new features, like expanded station names, calculating amplitude magnitudes from a variety of digital seismometers, station history files, interactive earthquake processing, and locations from CUSP (Caltech USGS Seismic Processing) binary files have been added. General features. Hypoinverse will locate any number of events in an input file, which can be in one of several different formats. Any or all of printout, summary or archive output may be produced. Hypoinverse is driven by user commands. The various commands define input and output files, set adjustable parameters, and solve for locations of a file of earthquake data using the parameters and files currently set. It is both interactive and "batch" in that commands may be executed either from the keyboard or from a file. You execute the commands in a file by typing @filename at the Hypoinverse prompt. Users may either supply parameters on the command line, or omit them and are prompted interactively. The current parameter values are displayed and may be taken as defaults by pressing just the RETURN key after the prompt. This makes the program very easy to use, providing you can remember the names of the commands. 
Combining commands with and without their required parameters into a command file permits a variety of customized procedures such as automatic input of crustal model and station data, but prompting for a different phase file each time. All commands are 3 letters long and most require one or more parameters or file names. If they appear on a line with a command, character strings such as filenames must be enclosed in apostrophes (single quotes). Appendix 1 gives this and other free-format rules for supplying parameters, which are parsed in Fortran. When several parameters are required following a command, any of them may be omitted by replacing them with null fields (see appendix 1). A null field leaves that parameter unchanged from its current or default value. When you start HYPOINVERSE, default values are in effect for all parameters except file names. Hypoinverse is a complicated program with many features and options. Many of these "advanced" or seldom used features are documented here, but are more detailed than a typical user needs to read about when first starting with the program. I have put some of this material in smaller type so that a first time user can concentrate on the more important information.

  13. Interactive Visualization Systems and Data Integration Methods for Supporting Discovery in Collections of Scientific Information

    DTIC Science & Technology

    2011-05-01

    iTunes illustrate the difference between the centralized approach of digital library systems and the distributed approach of container file formats...metadata in a container file format. Apple’s iTunes uses a centralized metadata approach and allows users to maintain song metadata in a single...one iTunes library to another the metadata must be copied separately or reentered in the new library. This demonstrates the utility of storing metadata

  14. Enhanced Historical Land-Use and Land-Cover Data Sets of the U.S. Geological Survey

    USGS Publications Warehouse

    Price, Curtis V.; Nakagaki, Naomi; Hitt, Kerie J.; Clawges, Rick M.

    2007-01-01

    Historical land-use and land-cover data, available from the U.S. Geological Survey (USGS) for the conterminous United States and Hawaii, have been enhanced for use in geographic information systems (GIS) applications. The original digital data sets were created by the USGS in the late 1970s and early 1980s and were later converted by USGS and the U.S. Environmental Protection Agency (USEPA) to a geographic information system (GIS) format in the early 1990s. These data have been available on USEPA's Web site since the early 1990s and have been used for many national applications, despite minor coding and topological errors. During the 1990s, a group of USGS researchers made modifications to the data set for use in the National Water-Quality Assessment Program. These edited files have been further modified to create a more accurate, topologically clean, and seamless national data set. Several different methods, including custom editing software and several batch processes, were applied to create this enhanced version of the national data set. The data sets are included in this report in the commonly used shapefile and Tagged Image File Format (TIFF) formats. In addition, this report includes two polygon data sets (in shapefile format) representing (1) land-use and land-cover source documentation extracted from the previously published USGS data files, and (2) the extent of each polygon data file.

  15. Format requirements of thermal neutron scattering data in a nuclear data format to succeed the ENDF format

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Brown, D.

    2014-03-31

    In November 2012, the Working Party on Evaluation Cooperation Subgroup 38 (WPEC-SG38) began with the task of developing a nuclear data format and supporting infrastructure to replace the now nearly 50-year-old ENDF format. The first step in this process is to develop requirements for the new format and infrastructure. In this talk, I will review the status of ENDF's Thermal Scattering Law (TSL) formats as well as support for this data in the GND format (from which the new format is expected to evolve). Finally, I hope to begin a dialog with members of the thermal neutron scattering community so that their data needs can be accurately and easily accommodated by the new format and tools, as captured by the requirements document. During this discussion, we must keep in mind that the new tools and format must support what is in existing data files, support new things we want to put in data files, and be flexible enough for us to adapt it to future unanticipated challenges.

  16. Digital data for preliminary geologic map of the Mount Hood 30- by 60-minute quadrangle, northern Cascade Range, Oregon

    USGS Publications Warehouse

    Ma, Lina; Sherrod, David R.; Scott, William E.

    2014-01-01

    This geodatabase contains information derived from legacy mapping that was published in 1995 as U.S. Geological Survey Open-File Report 95-219. The main component of this publication is a geologic map database prepared using geographic information system (GIS) applications. Included are pdf files to view or print the map sheet, the accompanying pamphlet from Open-File Report 95-219, and links to the original publication, which is available as scanned files in pdf format.

  17. cljam: a library for handling DNA sequence alignment/map (SAM) with parallel processing.

    PubMed

    Takeuchi, Toshiki; Yamada, Atsuo; Aoki, Takashi; Nishimura, Kunihiro

    2016-01-01

    Next-generation sequencing can determine DNA bases and the results of sequence alignments are generally stored in files in the Sequence Alignment/Map (SAM) format and the compressed binary version (BAM) of it. SAMtools is a typical tool for dealing with files in the SAM/BAM format. SAMtools has various functions, including detection of variants, visualization of alignments, indexing, extraction of parts of the data and loci, and conversion of file formats. It is written in C and can execute fast. However, SAMtools requires an additional implementation to be used in parallel with, for example, OpenMP (Open Multi-Processing) libraries. For the accumulation of next-generation sequencing data, a simple parallelization program, which can support cloud and PC cluster environments, is required. We have developed cljam using the Clojure programming language, which simplifies parallel programming, to handle SAM/BAM data. Cljam can run in a Java runtime environment (e.g., Windows, Linux, Mac OS X) with Clojure. Cljam can process and analyze SAM/BAM files in parallel and at high speed. The execution time with cljam is almost the same as with SAMtools. The cljam code is written in Clojure and has fewer lines than other similar tools.

  18. WhopGenome: high-speed access to whole-genome variation and sequence data in R.

    PubMed

    Wittelsbürger, Ulrich; Pfeifer, Bastian; Lercher, Martin J

    2015-02-01

    The statistical programming language R has become a de facto standard for the analysis of many types of biological data, and is well suited for the rapid development of new algorithms. However, variant call data from population-scale resequencing projects are typically too large to be read and processed efficiently with R's built-in I/O capabilities. WhopGenome can efficiently read whole-genome variation data stored in the widely used variant call format (VCF) file format into several R data types. VCF files can be accessed either on local hard drives or on remote servers. WhopGenome can associate variants with annotations such as those available from the UCSC genome browser, and can accelerate the reading process by filtering loci according to user-defined criteria. WhopGenome can also read other Tabix-indexed files and create indices to allow fast selective access to FASTA-formatted sequence files. The WhopGenome R package is available on CRAN at http://cran.r-project.org/web/packages/WhopGenome/. A Bioconductor package has been submitted. © The Author 2014. Published by Oxford University Press. All rights reserved.

  19. Data management in large-scale collaborative toxicity studies: how to file experimental data for automated statistical analysis.

    PubMed

    Stanzel, Sven; Weimer, Marc; Kopp-Schneider, Annette

    2013-06-01

    High-throughput screening approaches are carried out for the toxicity assessment of a large number of chemical compounds. In such large-scale in vitro toxicity studies several hundred or thousand concentration-response experiments are conducted. The automated evaluation of concentration-response data using statistical analysis scripts saves time and yields more consistent results in comparison to data analysis performed by the use of menu-driven statistical software. Automated statistical analysis requires that concentration-response data are available in a standardised data format across all compounds. To obtain consistent data formats, a standardised data management workflow must be established, including guidelines for data storage, data handling and data extraction. In this paper two procedures for data management within large-scale toxicological projects are proposed. Both procedures are based on Microsoft Excel files as the researcher's primary data format and use a computer programme to automate the handling of data files. The first procedure assumes that data collection has not yet started whereas the second procedure can be used when data files already exist. Successful implementation of the two approaches into the European project ACuteTox is illustrated. Copyright © 2012 Elsevier Ltd. All rights reserved.

  20. Embedding and Publishing Interactive, 3-Dimensional, Scientific Figures in Portable Document Format (PDF) Files

    PubMed Central

    Barnes, David G.; Vidiassov, Michail; Ruthensteiner, Bernhard; Fluke, Christopher J.; Quayle, Michelle R.; McHenry, Colin R.

    2013-01-01

    With the latest release of the S2PLOT graphics library, embedding interactive, 3-dimensional (3-d) scientific figures in Adobe Portable Document Format (PDF) files is simple, and can be accomplished without commercial software. In this paper, we motivate the need for embedding 3-d figures in scholarly articles. We explain how 3-d figures can be created using the S2PLOT graphics library, exported to Product Representation Compact (PRC) format, and included as fully interactive, 3-d figures in PDF files using the movie15 LaTeX package. We present new examples of 3-d PDF figures, explain how they have been made, validate them, and comment on their advantages over traditional, static 2-dimensional (2-d) figures. With the judicious use of 3-d rather than 2-d figures, scientists can now publish, share and archive more useful, flexible and faithful representations of their study outcomes. The article you are reading does not have embedded 3-d figures. The full paper, with embedded 3-d figures, is recommended and is available as a supplementary download from PLoS ONE (File S2). PMID:24086243

  1. Embedding and publishing interactive, 3-dimensional, scientific figures in Portable Document Format (PDF) files.

    PubMed

    Barnes, David G; Vidiassov, Michail; Ruthensteiner, Bernhard; Fluke, Christopher J; Quayle, Michelle R; McHenry, Colin R

    2013-01-01

    With the latest release of the S2PLOT graphics library, embedding interactive, 3-dimensional (3-d) scientific figures in Adobe Portable Document Format (PDF) files is simple, and can be accomplished without commercial software. In this paper, we motivate the need for embedding 3-d figures in scholarly articles. We explain how 3-d figures can be created using the S2PLOT graphics library, exported to Product Representation Compact (PRC) format, and included as fully interactive, 3-d figures in PDF files using the movie15 LaTeX package. We present new examples of 3-d PDF figures, explain how they have been made, validate them, and comment on their advantages over traditional, static 2-dimensional (2-d) figures. With the judicious use of 3-d rather than 2-d figures, scientists can now publish, share and archive more useful, flexible and faithful representations of their study outcomes. The article you are reading does not have embedded 3-d figures. The full paper, with embedded 3-d figures, is recommended and is available as a supplementary download from PLoS ONE (File S2).

  2. The geochemical landscape of northwestern Wisconsin and adjacent parts of northern Michigan and Minnesota (geochemical data files)

    USGS Publications Warehouse

    Cannon, William F.; Woodruff, Laurel G.

    2003-01-01

    This data set consists of nine files of geochemical information on various types of surficial deposits in northwestern Wisconsin and immediately adjacent parts of Michigan and Minnesota. The files are presented in two formats: as dbase files in dbaseIV form and Microsoft Excel form. The data present multi-element chemical analyses of soils, stream sediments, and lake sediments. Latitude and longitude values are provided in each file so that the dbf files can be readily imported to GIS applications. Metadata files are provided in outline form, question and answer form and text form. The metadata includes information on procedures for sample collection, sample preparation, and chemical analyses including sensitivity and precision.

  3. ProMC: Input-output data format for HEP applications using varint encoding

    NASA Astrophysics Data System (ADS)

    Chekanov, S. V.; May, E.; Strand, K.; Van Gemmeren, P.

    2014-10-01

    A new data format for Monte Carlo (MC) events, or any structural data, including experimental data, is discussed. The format is designed to store data in a compact binary form using variable-size integer encoding as implemented in Google's Protocol Buffers package. This approach is implemented in the PROMC library, which produces smaller file sizes for MC records compared to the existing input-output libraries used in high-energy physics (HEP). Other important features of the proposed format are a separation of abstract data layouts from concrete programming implementations, self-description and random access. Data stored in PROMC files can be written, read and manipulated in a number of programming languages, such as C++, JAVA, FORTRAN and PYTHON.
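    The variable-size integer ("varint") encoding mentioned in this abstract can be illustrated with a minimal sketch. It follows the base-128 scheme used by Protocol Buffers for unsigned integers: seven payload bits per byte, least-significant group first, with the high bit set on every byte except the last. The function names `encode_varint` and `decode_varint` are hypothetical and are not taken from the PROMC library.

```python
def encode_varint(value: int) -> bytes:
    """Encode a non-negative integer as a base-128 varint (illustrative only)."""
    if value < 0:
        raise ValueError("this sketch handles unsigned values only")
    out = bytearray()
    while True:
        low7 = value & 0x7F          # take the lowest seven bits
        value >>= 7
        if value:
            out.append(low7 | 0x80)  # more bytes follow: set the continuation bit
        else:
            out.append(low7)         # last byte: continuation bit left clear
            return bytes(out)


def decode_varint(data: bytes, offset: int = 0) -> tuple[int, int]:
    """Decode one varint starting at offset; return (value, next_offset)."""
    result = 0
    shift = 0
    while True:
        byte = data[offset]
        offset += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:          # continuation bit clear: this was the last byte
            return result, offset
        shift += 7
```

    Small values occupy a single byte while larger ones grow as needed, which is why records dominated by small integers tend to shrink under this kind of encoding.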

  4. CD Recorders.

    ERIC Educational Resources Information Center

    Falk, Howard

    1998-01-01

    Discussion of CD (compact disc) recorders describes recording applications, including storing large graphic files, creating audio CDs, and storing material downloaded from the Internet; backing up files; lifespan; CD recording formats; continuous recording; recording software; recorder media; vulnerability of CDs; basic computer requirements; and…

  5. 14 CFR 221.31 - Rules and regulations governing passenger fares and services.

    Code of Federal Regulations, 2010 CFR

    2010-01-01

    ... TRANSPORTATION (AVIATION PROCEEDINGS) ECONOMIC REGULATIONS TARIFFS Manner of Filing Tariffs § 221.31 Rules and... (b) of this section may be filed in a paper format, subject to the requirements of this part and...

  6. 78 FR 77155 - Grant Program To Assess, Evaluate, and Promote Development of Tribal Energy and Mineral Resources

    Federal Register 2010, 2011, 2012, 2013, 2014

    2013-12-20

    ... through DEMD's in-house databases; Well log interpretation, including correlation of formation tops.... Files must have descriptive file names to help DEMD quickly locate specific components of the proposal...

  7. Prostate and Urologic Cancer | Division of Cancer Prevention

    Cancer.gov


  8. Publications - RDF 2007-1 | Alaska Division of Geological & Geophysical

    Science.gov Websites

    ://doi.org/10.14509/15759 Publication Products Report Report Information rdf2007_001.pdf (443.0 K) Digital Geospatial Data Digital Geospatial Data Fairbanks Mining District Geochemical Data Data File Format File Size

  9. Publications - RDF 2011-4 v. 2 | Alaska Division of Geological &

    Science.gov Websites

    ://doi.org/10.14509/23002 Publication Products Report Report Information rdf2011_004.pdf (519.0 K) Digital Geospatial Data Digital Geospatial Data Moran Geochemistry Data File Format File Size Info Download moran

  10. Publications - RI 2001-1D | Alaska Division of Geological & Geophysical

    Science.gov Websites

    -geologic map of the Chulitna region, southcentral Alaska, scale 1:63,360 (16.0 M) Digital Geospatial Data Digital Geospatial Data Chulitna region engineering geology Data File Format File Size Info Download

  11. The mzTab data exchange format: communicating mass-spectrometry-based proteomics and metabolomics experimental results to a wider audience.

    PubMed

    Griss, Johannes; Jones, Andrew R; Sachsenberg, Timo; Walzer, Mathias; Gatto, Laurent; Hartler, Jürgen; Thallinger, Gerhard G; Salek, Reza M; Steinbeck, Christoph; Neuhauser, Nadin; Cox, Jürgen; Neumann, Steffen; Fan, Jun; Reisinger, Florian; Xu, Qing-Wei; Del Toro, Noemi; Pérez-Riverol, Yasset; Ghali, Fawaz; Bandeira, Nuno; Xenarios, Ioannis; Kohlbacher, Oliver; Vizcaíno, Juan Antonio; Hermjakob, Henning

    2014-10-01

    The HUPO Proteomics Standards Initiative has developed several standardized data formats to facilitate data sharing in mass spectrometry (MS)-based proteomics. These allow researchers to report their complete results in a unified way. However, at present, there is no format to describe the final qualitative and quantitative results for proteomics and metabolomics experiments in a simple tabular format. Many downstream analysis use cases are only concerned with the final results of an experiment and require an easily accessible format, compatible with tools such as Microsoft Excel or R. We developed the mzTab file format for MS-based proteomics and metabolomics results to meet this need. mzTab is intended as a lightweight supplement to the existing standard XML-based file formats (mzML, mzIdentML, mzQuantML), providing a comprehensive summary, similar in concept to the supplemental material of a scientific publication. mzTab files can contain protein, peptide, and small molecule identifications together with experimental metadata and basic quantitative information. The format is not intended to store the complete experimental evidence but provides mechanisms to report results at different levels of detail. These range from a simple summary of the final results to a representation of the results including the experimental design. This format is ideally suited to make MS-based proteomics and metabolomics results available to a wider biological community outside the field of MS. Several software tools for proteomics and metabolomics have already adapted the format as an output format. The comprehensive mzTab specification document and extensive additional documentation can be found online. © 2014 by The American Society for Biochemistry and Molecular Biology, Inc.

  12. The mzTab Data Exchange Format: Communicating Mass-spectrometry-based Proteomics and Metabolomics Experimental Results to a Wider Audience*

    PubMed Central

    Griss, Johannes; Jones, Andrew R.; Sachsenberg, Timo; Walzer, Mathias; Gatto, Laurent; Hartler, Jürgen; Thallinger, Gerhard G.; Salek, Reza M.; Steinbeck, Christoph; Neuhauser, Nadin; Cox, Jürgen; Neumann, Steffen; Fan, Jun; Reisinger, Florian; Xu, Qing-Wei; del Toro, Noemi; Pérez-Riverol, Yasset; Ghali, Fawaz; Bandeira, Nuno; Xenarios, Ioannis; Kohlbacher, Oliver; Vizcaíno, Juan Antonio; Hermjakob, Henning

    2014-01-01

    The HUPO Proteomics Standards Initiative has developed several standardized data formats to facilitate data sharing in mass spectrometry (MS)-based proteomics. These allow researchers to report their complete results in a unified way. However, at present, there is no format to describe the final qualitative and quantitative results for proteomics and metabolomics experiments in a simple tabular format. Many downstream analysis use cases are only concerned with the final results of an experiment and require an easily accessible format, compatible with tools such as Microsoft Excel or R. We developed the mzTab file format for MS-based proteomics and metabolomics results to meet this need. mzTab is intended as a lightweight supplement to the existing standard XML-based file formats (mzML, mzIdentML, mzQuantML), providing a comprehensive summary, similar in concept to the supplemental material of a scientific publication. mzTab files can contain protein, peptide, and small molecule identifications together with experimental metadata and basic quantitative information. The format is not intended to store the complete experimental evidence but provides mechanisms to report results at different levels of detail. These range from a simple summary of the final results to a representation of the results including the experimental design. This format is ideally suited to make MS-based proteomics and metabolomics results available to a wider biological community outside the field of MS. Several software tools for proteomics and metabolomics have already adapted the format as an output format. The comprehensive mzTab specification document and extensive additional documentation can be found online. PMID:24980485

  13. Transcriptome Analysis of Nine Tissues to Discover Genes Involved in the Biosynthesis of Active Ingredients in Sophora flavescens.

    PubMed

    Han, Rongchun; Takahashi, Hiroki; Nakamura, Michimi; Bunsupa, Somnuk; Yoshimoto, Naoko; Yamamoto, Hirobumi; Suzuki, Hideyuki; Shibata, Daisuke; Yamazaki, Mami; Saito, Kazuki

    2015-01-01

    Sophora flavescens AITON (kurara) has long been used to treat various diseases. Although several research findings revealed the biosynthetic pathways of its characteristic chemical components as represented by matrine, insufficient analysis of transcriptome data hampered in-depth analysis of the underlying putative genes responsible for the biosynthesis of pharmaceutical chemical components. In this study, more than 200 million fastq format reads were generated by Illumina's next-generation sequencing approach using nine types of tissue from S. flavescens, followed by CLC de novo assembly, ultimately yielding 83,325 contigs in total. By mapping the reads back to the contigs, reads per kilobase of the transcript per million mapped reads values were calculated to demonstrate gene expression levels, and overrepresented gene ontology terms were evaluated using Fisher's exact test. In search of the putative genes relevant to essential metabolic pathways, all 1350 unique enzyme commission numbers were used to map pathways against the Kyoto Encyclopedia of Genes and Genomes. By analyzing expression patterns, we proposed some candidate genes involved in the biosynthesis of isoflavonoids and quinolizidine alkaloids. Adopting RNA-Seq analysis, we obtained substantially credible contigs for downstream work. The preferential expression of the gene for putative lysine/ornithine decarboxylase committed in the initial step of matrine biosynthesis in leaves and stems was confirmed in semi-quantitative polymerase chain reaction (PCR) analysis. The findings in this report may serve as a stepping-stone for further research into this promising medicinal plant.
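    The "reads per kilobase of the transcript per million mapped reads" (RPKM) value mentioned in this abstract is commonly computed as the read count for a contig, scaled by the contig length in kilobases and by the total number of mapped reads in millions. The sketch below assumes that standard definition; the function name `rpkm` and its parameter names are hypothetical, and the study's own software may differ in detail.

```python
def rpkm(reads_on_contig: int, contig_length_bp: int, total_mapped_reads: int) -> float:
    """Reads per kilobase of transcript per million mapped reads (standard definition)."""
    if contig_length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("contig length and total mapped reads must be positive")
    # 1e9 = 1e3 (bases per kilobase) * 1e6 (reads per million)
    return reads_on_contig * 1e9 / (contig_length_bp * total_mapped_reads)
```

    For example, under this definition a 2,000 bp contig with 500 mapped reads in a library of 10 million mapped reads has an RPKM of 25.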

  14. FEQinput—An editor for the full equations (FEQ) hydraulic modeling system

    USGS Publications Warehouse

    Ancalle, David S.; Ancalle, Pablo J.; Domanski, Marian M.

    2017-10-30

    Introduction: The Full Equations Model (FEQ) is a computer program that solves the full, dynamic equations of motion for one-dimensional unsteady hydraulic flow in open channels and through control structures. As a result, hydrologists have used FEQ to design and operate flood-control structures, delineate inundation maps, and analyze peak-flow impacts. To aid in fighting floods, hydrologists are using the software to develop a system that uses flood-plain models to simulate real-time streamflow. Input files for FEQ are composed of text files that contain large amounts of parameters, data, and instructions that are written in a format exclusive to FEQ. Although documentation exists that can aid in the creation and editing of these input files, new users face a steep learning curve in order to understand the specific format and language of the files. FEQinput provides a set of tools to help a new user overcome the steep learning curve associated with creating and modifying input files for the FEQ hydraulic model and the related utility tool, Full Equations Utilities (FEQUTL).

  15. The Mark 3 data base handler

    NASA Technical Reports Server (NTRS)

    Ryan, J. W.; Ma, C.; Schupler, B. R.

    1980-01-01

    A data base handler which would act to tie Mark 3 system programs together is discussed. The data base handler is written in FORTRAN and is implemented on the Hewlett-Packard 21MX and the IBM 360/91. The system design objectives were to (1) provide for an easily specified method of data interchange among programs, (2) provide for a high level of data integrity, (3) accommodate changing requirements, (4) promote program accountability, (5) provide a single source of program constants, and (6) provide a central point for data archiving. The system consists of two distinct parts: a set of files existing on disk packs and tapes; and a set of utility subroutines which allow users to access the information in these files. Users never directly read or write the files and need not know the details of how the data are formatted in the files. To the users, the storage medium is format free. A user does need to know something about the sequencing of his data in the files but nothing about data in which he has no interest.

  16. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Dolan, Daniel H.; Ao, Tommy

    The Sandia Data Archive (SDA) format is a specific implementation of the HDF5 (Hierarchical Data Format version 5) standard. The format was developed for storing data in a universally accessible manner. SDA files may contain one or more data records, each associated with a distinct text label. Primitive records provide basic data storage, while compound records support more elaborate grouping. External records allow text/binary files to be carried inside an archive and later recovered. This report documents version 1.0 of the SDA standard. The information provided here is sufficient for reading from and writing to an archive. Although the format was originally designed for use in MATLAB, broader use is encouraged.

  17. Do you also have problems with the file format syndrome?

    PubMed

    De Cuyper, B; Nyssen, E; Christophe, Y; Cornelis, J

    1991-11-01

    In a biomedical data processing environment, an essential requirement is the ability to integrate a large class of standard modules for the acquisition, processing and display of the (image) data. Our approach to the management and manipulation of the different data formats is based on the specification of a common standard for the representation of data formats, called 'data nature descriptions' to emphasise that this representation not only specifies the structure but also the contents of data objects (files). The idea behind this concept is to associate each hardware and software component that produces or uses medical data with a description of the data objects manipulated by that component. In our approach a special software module (a format convertor generator) takes care of the appropriate data format conversions, required when two or more components of the system exchange data.

  18. Use of Schema on Read in Earth Science Data Archives

    NASA Astrophysics Data System (ADS)

    Petrenko, M.; Hegde, M.; Smit, C.; Pilone, P.; Pham, L.

    2017-12-01

    Traditionally, NASA Earth Science data archives have file-based storage using proprietary data file formats, such as HDF and HDF-EOS, which are optimized to support fast and efficient storage of spaceborne and model data as they are generated. The use of file-based storage essentially imposes an indexing strategy based on data dimensions. In most cases, NASA Earth Science data uses time as the primary index, leading to poor performance in accessing data in spatial dimensions. For example, producing a time series for a single spatial grid cell involves accessing a large number of data files. With exponential growth in data volume due to the ever-increasing spatial and temporal resolution of the data, using file-based archives poses significant performance and cost barriers to data discovery and access. Storing and disseminating data in proprietary data formats imposes an additional access barrier for users outside the mainstream research community. At the NASA Goddard Earth Sciences Data Information Services Center (GES DISC), we have evaluated applying the "schema-on-read" principle to data access and distribution. We used Apache Parquet to store geospatial data, and have exposed data through Amazon Web Services (AWS) Athena, AWS Simple Storage Service (S3), and Apache Spark. Using the "schema-on-read" approach allows customization of indexing—spatial or temporal—to suit the data access pattern. The storage of data in open formats such as Apache Parquet has widespread support in popular programming languages. A wide range of solutions for handling big data lowers the access barrier for all users. This presentation will discuss formats used for data storage, frameworks with support for "schema-on-read" used for data access, and common use cases covering data usage patterns seen in a geospatial data archive.

  19. 17 CFR 240.13d-2 - Filing of amendments to Schedules 13D or 13G.

    Code of Federal Regulations, 2013 CFR

    2013-04-01

    ...) The first electronic amendment to a paper format Schedule 13D (§ 240.13d-101 of this chapter) or... 17 Commodity and Securities Exchanges 3 2013-04-01 2013-04-01 false Filing of amendments to... Under the Securities Exchange Act of 1934 Regulation 13d-G § 240.13d-2 Filing of amendments to Schedules...

  20. 17 CFR 240.13d-2 - Filing of amendments to Schedules 13D or 13G.

    Code of Federal Regulations, 2014 CFR

    2014-04-01

    ...) The first electronic amendment to a paper format Schedule 13D (§ 240.13d-101 of this chapter) or... 17 Commodity and Securities Exchanges 4 2014-04-01 2014-04-01 false Filing of amendments to... Under the Securities Exchange Act of 1934 Regulation 13d-G § 240.13d-2 Filing of amendments to Schedules...

  1. Analysis, modeling, and simulation (AMS) testbed development and evaluation to support dynamic mobility applications (DMA) and active transportation and demand management (ATDM) programs — evaluation report for ATDM program. [supporting datasets - Pasadena Testbed

    DOT National Transportation Integrated Search

    2017-07-26

    This zip file contains POSTDATA.ATT (.ATT); Print to File (.PRN); Portable Document Format (.PDF); and document (.DOCX) files of data to support FHWA-JPO-16-385, Analysis, modeling, and simulation (AMS) testbed development and evaluation to support d...

  2. Geographic Information for Analysis of Highway Runoff-Quality Data on a National or Regional Scale in the Conterminous United States

    USGS Publications Warehouse

    Smieszek, Tomas W.; Granato, Gregory E.

    2000-01-01

    Spatial data are important for interpretation of water-quality information on a regional or national scale. Geographic information systems (GIS) facilitate interpretation and integration of spatial data. The geographic information and data compiled for the conterminous United States during the National Highway Runoff Water-Quality Data and Methodology Synthesis project is described in this document, which also includes information on the structure, file types, and the geographic information in the data files. This 'geodata' directory contains two subdirectories, labeled 'gisdata' and 'gisimage.' The 'gisdata' directory contains ArcInfo coverages, ArcInfo export files, shapefiles (used in ArcView), Spatial Data Transfer Standard Topological Vector Profile format files, and meta files in subdirectories organized by file type. The 'gisimage' directory contains the GIS data in common image-file formats. The spatial geodata includes two rain-zone region maps and a map of national ecosystems originally published by the U.S. Environmental Protection Agency; regional estimates of mean annual streamflow, and water hardness published by the Federal Highway Administration; and mean monthly temperature, mean annual precipitation, and mean monthly snowfall modified from data published by the National Climatic Data Center and made available to the public by the Oregon Climate Service at Oregon State University. These GIS files were compiled for qualitative spatial analysis of available data on a national and(or) regional scale and therefore should be considered as qualitative representations, not precise geographic location information.

  3. Electronic hand-drafting and picture management system.

    PubMed

    Yang, Tsung-Han; Ku, Cheng-Yuan; Yen, David C; Hsieh, Wen-Huai

    2012-08-01

    The Department of Health of Executive Yuan in Taiwan (R.O.C.) is implementing a five-stage project entitled Electronic Medical Record (EMR) converting all health records from written to electronic form. Traditionally, physicians record patients' symptoms, related examinations, and suggested treatments on paper medical records. Currently when implementing the EMR, all text files and image files in the Hospital Information System (HIS) and Picture Archiving and Communication Systems (PACS) are kept separate. The current medical system environment is unable to combine text files, hand-drafted files, and photographs in the same system, so it is difficult to support physicians with the recording of medical data. Furthermore, in surgical and other related departments, physicians need immediate access to medical records in order to understand the details of a patient's condition. In order to address these problems, the Department of Health has implemented an EMR project, with the primary goal of building an electronic hand-drafting and picture management system (HDP system) that can be used by medical personnel to record medical information in a convenient way. This system can simultaneously edit text files, hand-drafted files, and image files and then integrate these data into Portable Document Format (PDF) files. In addition, the output is designed to fit a variety of formats in order to meet various laws and regulations. By combining the HDP system with HIS and PACS, the applicability can be enhanced to fit various scenarios and can assist the medical industry in moving into the final phase of EMR.

  4. Organic geochemistry data of Alaska

    USGS Publications Warehouse

    compiled by Threlkeld, Charles N.; Obuch, Raymond C.; Gunther, G.L.

    2000-01-01

    In order to archive the results of various petroleum geochemical analyses of the Alaska resource assessment, the USGS developed an Alaskan Organic Geochemical Data Base (AOGDB) in 1978 to house the data generated from USGS and subcontracted laboratories. Prior to the AOGDB, the accumulated data resided in a flat data file entitled 'PGS' that was maintained by Petroleum Information Corporation with technical input from the USGS. The information herein is a breakout of the master flat file format into a relational data base table format (akdata).

  5. As-built design specification for the CLASFYG program

    NASA Technical Reports Server (NTRS)

    Horton, C. L. (Principal Investigator)

    1981-01-01

    This program produces a file with a Universal-formatted header and data records in a nonstandard format. Trajectory coefficients are calculated from 5 to 8 acquisitions of radiance values in the training field corresponding to an agricultural product. These coefficients are then used to calculate a time of emergence and corresponding trajectory coefficients for each pixel in the test field. The time of emergence, two of the coefficients, and the sigma value for each pixel are written to the file.

  6. SSE Global Data

    Atmospheric Science Data Center

    2018-04-12

    SSE Global Data Text files of monthly averaged data for the entire ... Version:  V6 Location:  Global Spatial Coverage:  (90N, 90S)(180W,180E) ... File Format:  ASCII Order Data:  SSE Global Data: Order Data SCAR-B Block:  ...

  7. Easy Online Access to Helpful Internet Guides.

    ERIC Educational Resources Information Center

    Tuss, Joan

    1993-01-01

    Lists recommended guides to the Internet that are available electronically. Basic commands needed to use anonymous ftp (file transfer protocol) are explained. An annotation and command formats to access, scan, retrieve, and exit each file are included for 11 titles. (EAM)

  8. Publications - RI 94-25 | Alaska Division of Geological & Geophysical

    Science.gov Websites

    -materials map of the Anchorage C-7 NW Quadrangle, Alaska, scale 1:25,000 (1.4 M) Digital Geospatial Data Digital Geospatial Data Anchorage C-7 NW Derivative materials Data File Format File Size Info Download

  9. Publications - RI 94-26 | Alaska Division of Geological & Geophysical

    Science.gov Websites

    -materials map of the Anchorage C-8 NE Quadrangle, Alaska, scale 1:25,000 (3.8 M) Digital Geospatial Data Digital Geospatial Data Anchorage C-8 NE Derivative materials Data File Format File Size Info Download

  10. Publications - RI 94-27 | Alaska Division of Geological & Geophysical

    Science.gov Websites

    -materials map of the Anchorage C-8 NW Quadrangle, Alaska, scale 1:25,000 (676.0 M) Digital Geospatial Data Digital Geospatial Data Anchorage C-8 NW Derivative materials Data File Format File Size Info Download

  11. Publications - RI 94-24 | Alaska Division of Geological & Geophysical

    Science.gov Websites

    -materials map of the Anchorage C-7 NE Quadrangle, Alaska, scale 1:25,000 (2.4 M) Digital Geospatial Data Digital Geospatial Data Anchorage C-7 NE Derivative materials Data File Format File Size Info Download

  12. Astronomical Instrumentation System Markup Language

    NASA Astrophysics Data System (ADS)

    Goldbaum, Jesse M.

    2016-05-01

    The Astronomical Instrumentation System Markup Language (AISML) is an Extensible Markup Language (XML) based file format for maintaining and exchanging information about astronomical instrumentation. The factors behind the need for an AISML are first discussed followed by the reasons why XML was chosen as the format. Next it's shown how XML also provides the framework for a more precise definition of an astronomical instrument and how these instruments can be combined to form an Astronomical Instrumentation System (AIS). AISML files for several instruments as well as one for a sample AIS are provided. The files demonstrate how AISML can be utilized for various tasks from web page generation and programming interface to instrument maintenance and quality management. The advantages of widespread adoption of AISML are discussed.

  13. An extended BET format for La RC shuttle experiments: Definition and development

    NASA Technical Reports Server (NTRS)

    Findlay, J. T.; Kelly, G. M.; Henry, M. W.

    1981-01-01

    A program for shuttle post-flight data reduction is discussed. An extended Best Estimate Trajectory (BET) file was developed. The extended format results in some subtle changes to the header record. The major change is the addition of twenty-six words to each data record. These words include atmospheric related parameters, body axis rate and acceleration data, computed aerodynamic coefficients, and angular accelerations. These parameters were added to facilitate post-flight aerodynamic coefficient determinations as well as shuttle entry air data sensor analyses. Software (NEWBET) was developed to generate the extended BET file utilizing the previously defined ENTREE BET, a dynamic data file which may be either derived inertial measurement unit data or aerodynamic coefficient instrument package data, and some atmospheric information.

  14. Digital atlas of Oklahoma

    USGS Publications Warehouse

    Rea, A.H.; Becker, C.J.

    1997-01-01

    This compact disc contains 25 digital map data sets covering the State of Oklahoma that may be of interest to the general public, private industry, schools, and government agencies. Fourteen data sets are statewide. These data sets include: administrative boundaries; 104th U.S. Congressional district boundaries; county boundaries; latitudinal lines; longitudinal lines; geographic names; indexes of U.S. Geological Survey 1:100,000, and 1:250,000-scale topographic quadrangles; a shaded-relief image; Oklahoma State House of Representatives district boundaries; Oklahoma State Senate district boundaries; locations of U.S. Geological Survey stream gages; watershed boundaries and hydrologic cataloging unit numbers; and locations of weather stations. Eleven data sets are divided by county and are located in 77 county subdirectories. These data sets include: census block group boundaries with selected demographic data; city and major highways text; geographic names; land surface elevation contours; elevation points; an index of U.S. Geological Survey 1:24,000-scale topographic quadrangles; roads, streets and address ranges; highway text; school district boundaries; streams, river and lakes; and the public land survey system. All data sets are provided in a readily accessible format. Most data sets are provided in Digital Line Graph (DLG) format. The attributes for many of the DLG files are stored in related dBASE(R)-format files and may be joined to the data set polygon attribute or arc attribute tables using dBASE(R)-compatible software. (Any use of trade names in this publication is for descriptive purposes only and does not imply endorsement by the U.S. Government.) Point attribute tables are provided in dBASE(R) format only, and include the X and Y map coordinates of each point. Annotation (text plotted in map coordinates) is provided in AutoCAD Drawing Exchange format (DXF) files. The shaded-relief image is provided in TIFF format. All data sets except the shaded-relief image also are provided in ARC/INFO export-file format.

  15. Filtering NetCDF Files by Using the EverVIEW Slice and Dice Tool

    USGS Publications Warehouse

    Conzelmann, Craig; Romañach, Stephanie S.

    2010-01-01

    Network Common Data Form (NetCDF) is a self-describing, machine-independent file format for storing array-oriented scientific data. It was created to provide a common interface between applications and real-time meteorological and other scientific data. Over the past few years, there has been a growing movement within the community of natural resource managers in The Everglades, Fla., to use NetCDF as the standard data container for datasets based on multidimensional arrays. As a consequence, a need surfaced for additional tools to view and manipulate NetCDF datasets, specifically to filter the files by creating subsets of large NetCDF files. The U.S. Geological Survey (USGS) and the Joint Ecosystem Modeling (JEM) group are working to address these needs with applications like the EverVIEW Slice and Dice Tool, which allows users to filter grid-based NetCDF files, thus targeting those data most important to them. The major functions of this tool are as follows: (1) to create subsets of NetCDF files temporally, spatially, and by data value; (2) to view the NetCDF data in table form; and (3) to export the filtered data to a comma-separated value (CSV) file format. The USGS and JEM will continue to work with scientists and natural resource managers across The Everglades to solve complex restoration problems through technological advances.
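
    The three functions listed in this abstract (subset by time, space, and data value; view as a table; export to CSV) can be illustrated with a minimal sketch. This is not the EverVIEW tool or its API: the tuple layout, the names slice_and_dice and to_csv, and the sample values are all assumptions made for illustration, and plain in-memory tuples stand in for a variable read from a NetCDF file.

```python
import csv
import io

# Hypothetical grid cells: (time_step, x, y, value). A real NetCDF variable
# would be read with a NetCDF library; plain tuples stand in for it here.
CELLS = [
    (0, 0, 0, 1.5), (0, 1, 0, 2.5), (0, 0, 1, 0.2),
    (1, 0, 0, 3.0), (1, 1, 0, 0.1), (1, 1, 1, 4.2),
]


def slice_and_dice(cells, t_range, x_range, y_range, min_value):
    """Keep cells inside the inclusive time/space ranges whose value >= min_value."""
    (t0, t1), (x0, x1), (y0, y1) = t_range, x_range, y_range
    return [
        (t, x, y, v)
        for (t, x, y, v) in cells
        if t0 <= t <= t1 and x0 <= x <= x1 and y0 <= y <= y1 and v >= min_value
    ]


def to_csv(rows):
    """Render the filtered rows as CSV text with a header line."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(["time", "x", "y", "value"])
    writer.writerows(rows)
    return buf.getvalue()


# Temporal subset: steps 0-1; spatial subset: x in 0-1, y == 0; value filter: >= 1.0
subset = slice_and_dice(CELLS, (0, 1), (0, 1), (0, 0), 1.0)
print(to_csv(subset))
```

    Keeping the filter and the export as separate functions mirrors the abstract's split between creating a subset and writing it out, so the same subset could also be shown as a table before export.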

  16. morph

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Goodall, John; Iannacone, Mike; Athalye, Anish

    2013-08-01

    Morph is a framework and domain-specific language (DSL) that helps parse and transform structured documents. It currently supports several file formats including XML, JSON, and CSV, and custom formats are usable as well.

  17. ISMRM Raw Data Format: A Proposed Standard for MRI Raw Datasets

    PubMed Central

    Inati, Souheil J.; Naegele, Joseph D.; Zwart, Nicholas R.; Roopchansingh, Vinai; Lizak, Martin J.; Hansen, David C.; Liu, Chia-Ying; Atkinson, David; Kellman, Peter; Kozerke, Sebastian; Xue, Hui; Campbell-Washburn, Adrienne E.; Sørensen, Thomas S.; Hansen, Michael S.

    2015-01-01

    Purpose This work proposes the ISMRM Raw Data (ISMRMRD) format as a common MR raw data format, which promotes algorithm and data sharing. Methods A file format consisting of a flexible header and tagged frames of k-space data was designed. Application Programming Interfaces were implemented in C/C++, MATLAB, and Python. Converters for Bruker, General Electric, Philips, and Siemens proprietary file formats were implemented in C++. Raw data were collected using MRI scanners from four vendors, converted to ISMRMRD format, and reconstructed using software implemented in three programming languages (C++, MATLAB, Python). Results Images were obtained by reconstructing the raw data from all vendors. The source code, raw data, and images comprising this work are shared online, serving as an example of an image reconstruction project following a paradigm of reproducible research. Conclusion The proposed raw data format solves a practical problem for the MRI community. It may serve as a foundation for reproducible research and collaborations. The ISMRMRD format is a completely open and community-driven format, and the scientific community is invited (including commercial vendors) to participate either as users or developers. PMID:26822475

  18. MINC 2.0: A Flexible Format for Multi-Modal Images.

    PubMed

    Vincent, Robert D; Neelin, Peter; Khalili-Mahani, Najmeh; Janke, Andrew L; Fonov, Vladimir S; Robbins, Steven M; Baghdadi, Leila; Lerch, Jason; Sled, John G; Adalat, Reza; MacDonald, David; Zijdenbos, Alex P; Collins, D Louis; Evans, Alan C

    2016-01-01

    It is often useful that an imaging data format can afford rich metadata, be flexible, scale to very large file sizes, support multi-modal data, and have strong inbuilt mechanisms for data provenance. Beginning in 1992, MINC was developed as a system for flexible, self-documenting representation of neuroscientific imaging data with arbitrary orientation and dimensionality. The MINC system incorporates three broad components: a file format specification, a programming library, and a growing set of tools. In the early 2000's the MINC developers created MINC 2.0, which added support for 64-bit file sizes, internal compression, and a number of other modern features. Because of its extensible design, it has been easy to incorporate details of provenance in the header metadata, including an explicit processing history, unique identifiers, and vendor-specific scanner settings. This makes MINC ideal for use in large scale imaging studies and databases. It also makes it easy to adapt to new scanning sequences and modalities.

  19. Simplified generation of biomedical 3D surface model data for embedding into 3D portable document format (PDF) files for publication and education.

    PubMed

    Newe, Axel; Ganslandt, Thomas

    2013-01-01

    The usefulness of the 3D Portable Document Format (PDF) for clinical, educational, and research purposes has recently been shown. However, the lack of a simple tool for converting biomedical data into the model data in the necessary Universal 3D (U3D) file format is a drawback for the broad acceptance of this new technology. A new module for the image processing and rapid prototyping framework MeVisLab does not only provide a platform-independent possibility to create surface meshes out of biomedical/DICOM and other data and to export them into U3D--it also lets the user add meta data to these meshes to predefine colors and names that can be processed by a PDF authoring software while generating 3D PDF files. Furthermore, the source code of the respective module is available and well documented so that it can easily be modified for own purposes.

  20. Caryoscope: An Open Source Java application for viewing microarray data in a genomic context

    PubMed Central

    Awad, Ihab AB; Rees, Christian A; Hernandez-Boussard, Tina; Ball, Catherine A; Sherlock, Gavin

    2004-01-01

    Background Microarray-based comparative genome hybridization experiments generate data that can be mapped onto the genome. These data are interpreted more easily when represented graphically in a genomic context. Results We have developed Caryoscope, which is an open source Java application for visualizing microarray data from array comparative genome hybridization experiments in a genomic context. Caryoscope can read General Feature Format files (GFF files), as well as comma- and tab-delimited files, that define the genomic positions of the microarray reporters for which data are obtained. The microarray data can be browsed using an interactive, zoomable interface, which helps users identify regions of chromosomal deletion or amplification. The graphical representation of the data can be exported in a number of graphic formats, including publication-quality formats such as PostScript. Conclusion Caryoscope is a useful tool that can aid in the visualization, exploration and interpretation of microarray data in a genomic context. PMID:15488149

  1. Informatics in radiology (infoRAD): Vendor-neutral case input into a server-based digital teaching file system.

    PubMed

    Kamauu, Aaron W C; DuVall, Scott L; Robison, Reid J; Liimatta, Andrew P; Wiggins, Richard H; Avrin, David E

    2006-01-01

    Although digital teaching files are important to radiology education, there are no current satisfactory solutions for export of Digital Imaging and Communications in Medicine (DICOM) images from picture archiving and communication systems (PACS) in desktop publishing format. A vendor-neutral digital teaching file, the Radiology Interesting Case Server (RadICS), offers an efficient tool for harvesting interesting cases from PACS without requiring modifications of the PACS configurations. Radiologists push imaging studies from PACS to RadICS via the standard DICOM Send process, and the RadICS server automatically converts the DICOM images into the Joint Photographic Experts Group format, a common desktop publishing format. They can then select key images and create an interesting case series at the PACS workstation. RadICS was tested successfully against multiple unmodified commercial PACS. Using RadICS, radiologists are able to harvest and author interesting cases at the point of clinical interpretation with minimal disruption in clinical work flow. RSNA, 2006

  2. ROSAT implementation of a proposed multi-mission x ray data format

    NASA Technical Reports Server (NTRS)

    Corcoran, M.; Pence, W.; White, R.; Conroy, M.

    1992-01-01

    Until recently little effort has been made to ensure that data from X-ray telescopes are delivered in a format that reflects the common characteristics that most X-ray datasets share. Instrument-specific data-product design hampers the comparison of X-ray measurements made by different detectors and should be avoided whenever possible. The ROSAT project and the High Energy Astrophysics Science Archive Research Center (HEASARC) have defined a set of X-ray data products ('rationalized files') for ROSAT data that can be used for distribution and archiving of data from other X-ray missions. This set of 'rationalized files' has been defined to isolate instrument-independent and instrument-specific quantities using standard FITS constructs to ensure portability. We discuss the usage of the 'rationalized files' by ROSAT for data distribution and archiving, with particular emphasis on discrimination between instrument-independent and instrument-specific quantities, and discuss application of this format to data from other X-ray missions.

  3. ORPC RivGen controller performance raw data - Igiugig 2015

    DOE Data Explorer

    McEntee, Jarlath

    2015-12-18

    Contains raw data for operations of Ocean Renewable Power Company (ORPC) RivGen Power System in Igiugig 2015 in Matlab data file format. Two data files capture the data and timestamps for data, including power in, voltage, rotation rate, and velocity.

  4. Putting "Reference" in the Publications Reference File.

    ERIC Educational Resources Information Center

    Zink, Steven D.

    1980-01-01

    Argues for more widespread utilization of the U.S. Government Printing Office's Publications Reference File, a reference tool in microfiche format used to answer questions about current U.S. government documents and their availability. Ways to accomplish this task are suggested. (Author/JD)

  5. XML Files

    MedlinePlus

    MedlinePlus produces XML data sets that you are welcome to download and use. If you have questions about the MedlinePlus XML files, please contact us. For additional sources of MedlinePlus data in XML format, visit our Web service page, ...

  6. Merged analog and photon counting profiles used as input for other RLPROF VAPs

    DOE Data Explorer

    Newsom, Rob

    2014-10-03

    The rlprof_merge VAP "merges" the photon counting and analog signals appropriately for each channel, creating an output data file that is very similar to the original raw data file format that the Raman lidar initially had.

  7. Merged analog and photon counting profiles used as input for other RLPROF VAPs

    DOE Data Explorer

    Newsom, Rob

    1998-03-01

    The rlprof_merge VAP "merges" the photon counting and analog signals appropriately for each channel, creating an output data file that is very similar to the original raw data file format that the Raman lidar initially had.

  8. MISR Level 3 Radiance Versioning

    Atmospheric Science Data Center

    2016-11-04

    ... ESDT Product File Name Prefix Current Quality Designations MIL3DRD, MIL3MRD, MIL3QRD, and MIL3YRD ... Data Product Specification Rev K  (PDF). Update to work with new format of the input PGE 1 files.   F02_0007 ...

  9. An Overview of ARL’s Multimodal Signatures Database and Web Interface

    DTIC Science & Technology

    2007-12-01

    ActiveX components, which hindered distribution due to license agreements and run-time license software to use such components. g. Proprietary...Overview The database consists of multimodal signature data files in the HDF5 format. Generally, each signature file contains all the ancillary...only contains information in the database, Web interface, and signature files that is releasable to the public. The Web interface consists of static

  10. Web servlet-assisted, dial-in flow cytometry data analysis.

    PubMed

    Battye, F

    2001-02-01

    The obvious benefits of centralized data storage notwithstanding, the size of modern flow cytometry data files discourages their transmission over commonly used telephone modem connections. The proposed solution is to install at the central location a web servlet that can extract compact data arrays, of a form dependent on the requested display type, from the stored files and transmit them to a remote client computer program for display. A client program and a web servlet, both written in the Java programming language, were designed to communicate over standard network connections. The client program creates familiar numerical and graphical display types and allows the creation of gates from combinations of user-defined regions. Data compression techniques further reduce transmission times for data arrays that are already much smaller than the data file itself. For typical data files, network transmission times were reduced more than 700-fold for extraction of one-dimensional (1-D) histograms, between 18 and 120-fold for 2-D histograms, and 6-fold for color-coded dot plots. Numerous display formats are possible without further access to the data file. This scheme enables telephone modem access to centrally stored data without restricting flexibility of display format or preventing comparisons with locally stored files. Copyright 2001 Wiley-Liss, Inc.
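
    The core idea of this abstract, sending a compact array derived from the data file instead of the file itself, can be sketched as follows. This is an illustration only: the function name histogram_1d, the 256-channel resolution, the simulated events, and the use of JSON as a stand-in wire format are assumptions, not details of the servlet described above, and the size ratio printed here says nothing about the reductions reported in the abstract.

```python
import json
import random


def histogram_1d(values, channels, max_value):
    """Bin raw event values into a fixed number of channels (1-D histogram)."""
    counts = [0] * channels
    for v in values:
        # Clamp so that v == max_value falls in the last channel.
        idx = min(int(v * channels / max_value), channels - 1)
        counts[idx] += 1
    return counts


# Simulated single-parameter list-mode data (hypothetical values).
random.seed(0)
events = [random.uniform(0, 1024) for _ in range(100_000)]

# The server would compute this once and send only the counts to the client.
hist = histogram_1d(events, channels=256, max_value=1024)

raw_bytes = len(json.dumps(events))
hist_bytes = len(json.dumps(hist))
print(f"raw: {raw_bytes} bytes, histogram: {hist_bytes} bytes")
```

    The histogram size depends only on the number of channels, not on the number of events, which is why the saving grows with file size.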

  11. proBAMconvert: A Conversion Tool for proBAM/proBed.

    PubMed

    Olexiouk, Volodimir; Menschaert, Gerben

    2017-07-07

    The introduction of new standard formats, proBAM and proBed, improves the integration of genomics and proteomics information, thus aiding proteogenomics applications. These novel formats enable peptide spectrum matches (PSM) to be stored, inspected, and analyzed within the context of the genome. However, an easy-to-use and transparent tool to convert mass spectrometry identification files to these new formats is indispensable. proBAMconvert enables the conversion of common identification file formats (mzIdentML, mzTab, and pepXML) to proBAM/proBed using an intuitive interface. Furthermore, proBAMconvert enables information to be output both at the PSM and peptide levels and has a command line interface next to the graphical user interface. Detailed documentation and a completely worked-out tutorial are available at http://probam.biobix.be.

  12. Cardio-PACs: a new opportunity

    NASA Astrophysics Data System (ADS)

    Heupler, Frederick A., Jr.; Thomas, James D.; Blume, Hartwig R.; Cecil, Robert A.; Heisler, Mary

    2000-05-01

    It is now possible to replace film-based image management in the cardiac catheterization laboratory with a Cardiology Picture Archiving and Communication System (Cardio-PACS) based on digital imaging technology. The first step in the conversion process is installation of a digital image acquisition system that is capable of generating high-quality DICOM-compatible images. The next three steps, which are the subject of this presentation, involve image display, distribution, and storage. Clinical requirements and associated cost considerations for these three steps are listed below: Image display: (1) Image quality equal to film, with DICOM format, lossless compression, image processing, desktop PC-based with color monitor, and physician-friendly imaging software; (2) Performance specifications include: acquire 30 frames/sec; replay 15 frames/sec; access to file server 5 seconds, and to archive 5 minutes; (3) Compatibility of image file, transmission, and processing formats; (4) Image manipulation: brightness, contrast, gray scale, zoom, biplane display, and quantification; (5) User-friendly control of image review. Image distribution: (1) Standard IP-based network between cardiac catheterization laboratories, file server, long-term archive, review stations, and remote sites; (2) Non-proprietary formats; (3) Bidirectional distribution. Image storage: (1) CD-ROM vs disk vs tape; (2) Verification of data integrity; (3) User-designated storage capacity for catheterization laboratory, file server, long-term archive. Costs: (1) Image acquisition equipment, file server, long-term archive; (2) Network infrastructure; (3) Review stations and software; (4) Maintenance and administration; (5) Future upgrades and expansion; (6) Personnel.

  13. Possible costs associated with investigating and mitigating geologic hazards in rural areas of western San Mateo County, California with a section on using the USGS website to determine the cost of developing property for residences in rural parts of San Mateo County, California

    USGS Publications Warehouse

    Brabb, Earl E.; Roberts, Sebastian; Cotton, William R.; Kropp, Alan L.; Wright, Robert H.; Zinn, Erik N.; Digital database by Roberts, Sebastian; Mills, Suzanne K.; Barnes, Jason B.; Marsolek, Joanna E.

    2000-01-01

    This publication consists of a digital map database on a geohazards web site, http://kaibab.wr.usgs.gov/geohazweb/intro.htm, this text, and 43 digital map images available for downloading at this site. The report is stored as several digital files, in ARC export (uncompressed) format for the database, and Postscript and PDF formats for the map images. Several of the source data layers for the images have already been released in other publications by the USGS and are available for downloading on the Internet. These source layers are not included in this digital database, but rather a reference is given for the web site where the data can be found in digital format. The exported ARC coverages and grids lie in UTM zone 10 projection. The pamphlet, which only describes the content and character of the digital map database, is included as Postscript, PDF, and ASCII text files and is also available on paper as USGS Open-File Report 00-127. The full versatility of the spatial database is realized by importing the ARC export files into ARC/INFO or an equivalent GIS. Other GIS packages, including MapInfo and ARCVIEW, can also use the ARC export files. The Postscript map image can be used for viewing or plotting in computer systems with sufficient capacity, and the considerably smaller PDF image files can be viewed or plotted in full or in part from Adobe ACROBAT software running on Macintosh, PC, or UNIX platforms.

  14. CAPRICE positively regulates stomatal formation in the Arabidopsis hypocotyl

    PubMed Central

    2008-01-01

    In the Arabidopsis hypocotyl, stomata develop only from a set of epidermal cell files. Previous studies have identified several negative regulators of stomata formation. Such regulators also trigger non-hair cell fate in the root. Here, it is shown that TOO MANY MOUTHS (TMM) positively regulates CAPRICE (CPC) expression in differentiating stomaless-forming cell files, and that the CPC protein might move to the nucleus of neighbouring stoma-forming cells, where it promotes stomata formation in a redundant manner with TRIPTYCHON (TRY). Unexpectedly, the CPC protein was also localized in the nucleus and peripheral cytoplasm of hypocotyl fully differentiated epidermal cells, suggesting that CPC plays an additional role to those related to stomata formation. These results identify CPC and TRY as positive regulators of stomata formation in the embryonic stem, which increases the similarity between the genetic control of root hair and stoma cell fate determination. PMID:19513241

  15. Preliminary geologic map of the Elsinore 7.5' Quadrangle, Riverside County, California

    USGS Publications Warehouse

    Morton, Douglas M.; Weber, F. Harold; Digital preparation: Alvarez, Rachel M.; Burns, Diane

    2003-01-01

    Open-File Report 03-281 contains a digital geologic map database of the Elsinore 7.5’ quadrangle, Riverside County, California that includes: 1. ARC/INFO (Environmental Systems Research Institute, http://www.esri.com) version 7.2.1 coverages of the various elements of the geologic map. 2. A Postscript file to plot the geologic map on a topographic base, and containing a Correlation of Map Units diagram (CMU), a Description of Map Units (DMU), and an index map. 3. Portable Document Format (.pdf) files of: a. This Readme; includes in Appendix I, data contained in els_met.txt b. The same graphic as plotted in 2 above. Test plots have not produced precise 1:24,000-scale map sheets. Adobe Acrobat page size setting influences map scale. The Correlation of Map Units and Description of Map Units is in the editorial format of USGS Geologic Investigations Series (I-series) maps but has not been edited to comply with I-map standards. Within the geologic map data package, map units are identified by standard geologic map criteria such as formation-name, age, and lithology. Where known, grain size is indicated on the map by a subscripted letter or letters following the unit symbols as follows: lg, large boulders; b, boulder; g, gravel; a, arenaceous; s, silt; c, clay; e.g. Qyfa is a predominantly young alluvial fan deposit that is arenaceous. Multiple letters are used for more specific identification or for mixed units, e.g., Qfysa is a silty sand. In some cases, mixed units are indicated by a compound symbol; e.g., Qyf2sc. Even though this is an Open-File Report and includes the standard USGS Open-File disclaimer, the report closely adheres to the stratigraphic nomenclature of the U.S. Geological Survey. Descriptions of units can be obtained by viewing or plotting the .pdf file (3b above) or plotting the postscript file (2 above).

  16. What software tools can I use to view ERBE HDF data products?

    Atmospheric Science Data Center

    2014-12-08

    Visualize ERBE data with view_hdf: view_hdf a visualization and analysis tool for accessing data stored in Hierarchical Data Format (HDF) and HDF-EOS. ... Start HDFView Select File Select Open Select the file to be viewed ERBE: Data Access ...

  17. Highway Safety Information System guidebook for the California state data files. Volume I : SAS file formats

    DOT National Transportation Integrated Search

    1996-06-01

    This manual has been developed to provide information and guidance to engineering staffs involved with project development and design of highways. It identifies those standards, specifications, guides, and references approved for use in carrying out the ...

  18. Highway Safety Information System guidebook for the Maine state data files. Volume 1 : SAS file formats

    DOT National Transportation Integrated Search

    2012-05-05

    As part of the Federal Highway Administration (FHWA) Traffic Analysis Toolbox (Volume XIII), this guide was designed to help corridor stakeholders implement the ICM AMS methodology successfully and effectively. It provides a step-by-step approach to ...

  19. FTP: Full-Text Publishing?

    ERIC Educational Resources Information Center

    Jul, Erik

    1992-01-01

    Describes the use of file transfer protocol (FTP) on the INTERNET computer network and considers its use as an electronic publishing system. The differing electronic formats of text files are discussed; the preparation and access of documents are described; and problems are addressed, including a lack of consistency. (LRW)

  20. LOGISTIC MANAGEMENT INFORMATION SYSTEM - MANUAL DATA STORAGE AND RETRIEVAL SYSTEM.

    DTIC Science & Technology

    Logistics Management Information System. The procedures are applicable to manual storage and retrieval of all data used in the Logistics Management ... Information System (LMIS) and include the following: (1) Action Officer data source file. (2) Action Officer presentation format file. (3) LMI Coordination

  1. TIM Version 3.0 beta Technical Description and User Guide - Appendix B - Example input file for TIMv3.0

    EPA Pesticide Factsheets

    Terrestrial Investigation Model, TIM, has several appendices to its user guide. This is the appendix that includes an example input file in its preserved format. Both parameters and comments defining them are included.

  2. Portable system to luminaries characterization

    NASA Astrophysics Data System (ADS)

    Tecpoyotl-Torres, M.; Vera-Dimas, J. G.; Koshevaya, S.; Escobedo-Alatorre, J.; Cisneros-Villalobos, L.; Sanchez-Mondragon, J.

    2014-09-01

    For designers of illumination sources, it is important to know the illumination distribution of their products. They can use several viewers of IES files (a standard file format defined by the Illuminating Engineering Society). These files are necessary not only to know the distribution of illumination, but also to plan the construction of buildings by means of specialized software, such as Autodesk Revit. In this paper, a complete portable system for luminaries' characterization is given. The components of the system are: an irradiance profile meter, which can generate photometry of luminaries of small sizes, covering indoor illumination requirements, and of luminaries for general areas. One of the meter's attributes is given by the color sensor implemented, which allows knowing the color temperature of the luminary under analysis. The graphical user interface (GUI) has several characteristics: it can control the meter, acquire the data obtained by the sensor, and graph them in 2D in Cartesian and polar formats or in 3D in Cartesian format. The graph can be exported to png, jpg, or bmp formats, if necessary. These remarkable characteristics differentiate this GUI. This proposal can be considered a viable option for enterprises of illumination design and manufacturing, due to the relatively low investment level and considering the complete illumination characterization provided.

  3. jmzReader: A Java parser library to process and visualize multiple text and XML-based mass spectrometry data formats.

    PubMed

    Griss, Johannes; Reisinger, Florian; Hermjakob, Henning; Vizcaíno, Juan Antonio

    2012-03-01

    We here present the jmzReader library: a collection of Java application programming interfaces (APIs) to parse the most commonly used peak list and XML-based mass spectrometry (MS) data formats: DTA, MS2, MGF, PKL, mzXML, mzData, and mzML (based on the already existing API jmzML). The library is optimized to be used in conjunction with mzIdentML, the recently released standard data format for reporting protein and peptide identifications, developed by the HUPO proteomics standards initiative (PSI). mzIdentML files do not contain spectra data but contain references to different kinds of external MS data files. As a key functionality, all parsers implement a common interface that supports the various methods used by mzIdentML to reference external spectra. Thus, when developing software for mzIdentML, programmers no longer have to support multiple MS data file formats but only this one interface. The library (which includes a viewer) is open source and, together with detailed documentation, can be downloaded from http://code.google.com/p/jmzreader/. © 2012 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

  4. NLEdit: A generic graphical user interface for Fortran programs

    NASA Technical Reports Server (NTRS)

    Curlett, Brian P.

    1994-01-01

    NLEdit is a generic graphical user interface for the preprocessing of Fortran namelist input files. The interface consists of a menu system, a message window, a help system, and data entry forms. A form is generated for each namelist. The form has an input field for each namelist variable along with a one-line description of that variable. Detailed help information, default values, and minimum and maximum allowable values can all be displayed via menu picks. Inputs are processed through a scientific calculator program that allows complex equations to be used instead of simple numeric inputs. A custom user interface is generated simply by entering information about the namelist input variables into an ASCII file. There is no need to learn a new graphics system or programming language. NLEdit can be used as a stand-alone program or as part of a larger graphical user interface. Although NLEdit is intended for files using namelist format, it can be easily modified to handle other file formats.

  5. A Survey of Complex Object Technologies for Digital Libraries

    NASA Technical Reports Server (NTRS)

    Nelson, Michael L.; Argue, Brad; Efron, Miles; Denn, Sheila; Pattuelli, Maria Cristina

    2001-01-01

    Many early web-based digital libraries (DLs) had implicit assumptions reflected in their architecture that the unit of focus in the DL (frequently "reports" or "e-prints") would only be manifested in a single, or at most a few, common file formats such as PDF or PostScript. DLs have now matured to the point where their contents are commonly no longer simple files. Complex objects in DLs have emerged in response to various requirements, including: simple aggregation of formats and supporting files, bundling additional information to aid digital preservation, creating opaque digital objects for e-commerce applications, and the incorporation of dynamic services with the traditional data files. We examine a representative (but not necessarily exhaustive) number of current and recent historical web-based complex object technologies and projects that are applicable to DLs: Aurora, Buckets, ComMentor, Cryptolopes, Digibox, Document Management Alliance, FEDORA, Kahn-Wilensky Framework Digital Objects, Metadata Encoding & Transmission Standard, Multivalent Documents, Open eBooks, VERS Encapsulated Objects, and the Warwick Framework.

  6. MAGIC: Model and Graphic Information Converter

    NASA Technical Reports Server (NTRS)

    Herbert, W. C.

    2009-01-01

    MAGIC is a software tool capable of converting highly detailed 3D models from an open, standard format, VRML 2.0/97, into the proprietary DTS file format used by the Torque Game Engine from GarageGames. MAGIC is used to convert 3D simulations from authoritative sources into the data needed to run the simulations in NASA's Distributed Observer Network. The Distributed Observer Network (DON) is a simulation presentation tool built by NASA to facilitate the simulation sharing requirements of the Data Presentation and Visualization effort within the Constellation Program. DON is built on top of the Torque Game Engine (TGE) and has chosen TGE's Dynamix Three Space (DTS) file format to represent 3D objects within simulations.

  7. SWIFT MODELLER: a Java based GUI for molecular modeling.

    PubMed

    Mathur, Abhinav; Shankaracharya; Vidyarthi, Ambarish S

    2011-10-01

    MODELLER is command line argument based software which requires tedious formatting of inputs and writing of Python scripts which most people are not comfortable with. Also the visualization of output becomes cumbersome due to verbose files. This makes the whole software protocol very complex and requires extensive study of MODELLER manuals and tutorials. Here we describe SWIFT MODELLER, a GUI that automates formatting, scripting and data extraction processes and present it in an interactive way making MODELLER much easier to use than before. The screens in SWIFT MODELLER are designed keeping homology modeling in mind and their flow is a depiction of its steps. It eliminates the formatting of inputs, scripting processes and analysis of verbose output files through automation and makes pasting of the target sequence as the only prerequisite. Jmol (3D structure visualization tool) has been integrated into the GUI which opens and demonstrates the protein data bank files created by the MODELLER software. All files required and created by the software are saved in a folder named after the work instance's date and time of execution. SWIFT MODELLER lowers the skill level required for the software through automation of many of the steps in the original software protocol, thus saving an enormous amount of time per instance and making MODELLER very easy to work with.

  8. Informatics in radiology (infoRAD): free DICOM image viewing and processing software for the Macintosh computer: what's available and what it can do for you.

    PubMed

    Escott, Edward J; Rubinstein, David

    2004-01-01

    It is often necessary for radiologists to use digital images in presentations and conferences. Most imaging modalities produce images in the Digital Imaging and Communications in Medicine (DICOM) format. The image files tend to be large and thus cannot be directly imported into most presentation software, such as Microsoft PowerPoint; the large files also consume storage space. There are many free programs that allow viewing and processing of these files on a personal computer, including conversion to more common file formats such as the Joint Photographic Experts Group (JPEG) format. Free DICOM image viewing and processing software for computers running on the Microsoft Windows operating system has already been evaluated. However, many people use the Macintosh (Apple Computer) platform, and a number of programs are available for these users. The World Wide Web was searched for free DICOM image viewing or processing software that was designed for the Macintosh platform or is written in Java and is therefore platform independent. The features of these programs and their usability were evaluated. There are many free programs for the Macintosh platform that enable viewing and processing of DICOM images. (c) RSNA, 2004.

  9. Geologic map of the San Bernardino North 7.5' quadrangle, San Bernardino County, California

    USGS Publications Warehouse

    Miller, F.K.; Matti, J.C.

    2001-01-01

    3. Portable Document Format (.pdf) files of: a. This Readme; includes an Appendix, containing data found in sbnorth_met.txt . b. The Description of Map Units identical to that found on the plot of the PostScript file. c. The same graphic as plotted in 2 above. (Test plots from this .pdf do not produce 1:24,000-scale maps. Use Adobe Acrobat pagesize setting to control map scale.) The Correlation of Map Units and Description of Map Units is in the editorial format of USGS Miscellaneous Investigations Series (I-series) maps. Within the geologic map data package, map units are identified by standard geologic map criteria such as formation-name, age, and lithology. Even though this is an author-prepared report, every attempt has been made to closely adhere to the stratigraphic nomenclature of the U. S. Geological Survey. Descriptions of units can be obtained by viewing or plotting the .pdf file (3b above) or plotting the postscript file (2 above). If roads in some areas, especially forest roads that parallel topographic contours, do not show well on plots of the geologic map, we recommend use of the USGS San Bernardino North 7.5’ topographic quadrangle in conjunction with the geologic map.

  10. HDFT Webtool

    EPA Pesticide Factsheets

    Because HSPF requires extensive input data, its Data-Formatting Tool (HDFT) allows users to format that data and import it to a WDM file. HDFT aids urban watershed modeling applications that use sub-hourly temporal resolutions.

  11. Geologic map and digital database of the Romoland 7.5' quadrangle, Riverside County, California

    USGS Publications Warehouse

    Morton, Douglas M.; Digital preparation by Bovard, Kelly R.; Morton, Gregory

    2003-01-01

    Portable Document Format (.pdf) files of: This Readme; includes in Appendix I, data contained in rom_met.txt The same graphic as plotted in 2 above. Test plots have not produced precise 1:24,000-scale map sheets. Adobe Acrobat page size setting influences map scale. The Correlation of Map Units and Description of Map Units is in the editorial format of USGS Geologic Investigations Series (I-series) maps but has not been edited to comply with I-map standards. Within the geologic map data package, map units are identified by standard geologic map criteria such as formation-name, age, and lithology. Where known, grain size is indicated on the map by a subscripted letter or letters following the unit symbols as follows: lg, large boulders; b, boulder; g, gravel; a, arenaceous; s, silt; c, clay; e.g. Qyfa is a predominantly young alluvial fan deposit that is arenaceous. Multiple letters are used for more specific identification or for mixed units, e.g., Qfysa is a silty sand. In some cases, mixed units are indicated by a compound symbol; e.g., Qyf2sc. Even though this is an Open-File Report and includes the standard USGS Open-File disclaimer, the report closely adheres to the stratigraphic nomenclature of the U.S. Geological Survey. Descriptions of units can be obtained by viewing or plotting the .pdf file (3b above) or plotting the postscript file (2 above). This Readme file describes the digital data, such as types and general contents of files making up the database, and includes information on how to extract and plot the map and accompanying graphic file. Metadata information can be accessed at http://geo-nsdi.er.usgs.gov/metadata/open-file/03-102 and is included in Appendix I of this Readme.

  12. Improving transmission efficiency of large sequence alignment/map (SAM) files.

    PubMed

    Sakib, Muhammad Nazmus; Tang, Jijun; Zheng, W Jim; Huang, Chin-Tser

    2011-01-01

    Research in bioinformatics primarily involves collection and analysis of a large volume of genomic data. Naturally, it demands efficient storage and transfer of this huge amount of data. In recent years, some research has been done to find efficient compression algorithms to reduce the size of various sequencing data. One way to improve the transmission time of large files is to apply a maximum lossless compression on them. In this paper, we present SAMZIP, a specialized encoding scheme, for sequence alignment data in SAM (Sequence Alignment/Map) format, which improves the compression ratio of existing compression tools available. In order to achieve this, we exploit the prior knowledge of the file format and specifications. Our experimental results show that our encoding scheme improves compression ratio, thereby reducing overall transmission time significantly.
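
    The idea of exploiting prior knowledge of the SAM layout can be illustrated with a minimal sketch. This is not the SAMZIP encoding, whose details the abstract does not give. It only assumes that SAM alignment lines are tab-separated with eleven mandatory fields, and it groups each field into its own stream before handing it to a general-purpose compressor (zlib here). The names `encode_columns` and `decode_columns` are hypothetical, and header lines (those starting with `@`) are assumed to be handled separately.

```python
import zlib

# QNAME, FLAG, RNAME, POS, MAPQ, CIGAR, RNEXT, PNEXT, TLEN, SEQ, QUAL
MANDATORY_FIELDS = 11


def encode_columns(sam_lines):
    """Group each SAM field into its own stream and compress the streams separately."""
    rows = []
    for line in sam_lines:
        # Split off the mandatory fields; any optional tags stay together in one last cell.
        cells = line.rstrip("\n").split("\t", MANDATORY_FIELDS)
        # Pad so every row has the mandatory fields plus one (possibly empty) tag cell.
        cells += [""] * (MANDATORY_FIELDS + 1 - len(cells))
        rows.append(cells)
    # zip(*rows) transposes records into columns of similar-looking values.
    return [zlib.compress("\n".join(col).encode("utf-8")) for col in zip(*rows)]


def decode_columns(blobs):
    """Invert encode_columns and return the original alignment lines."""
    columns = [zlib.decompress(blob).decode("utf-8").split("\n") for blob in blobs]
    lines = []
    for cells in zip(*columns):
        cells = list(cells)
        if cells[-1] == "":
            cells.pop()  # the record had no optional tags
        lines.append("\t".join(cells))
    return lines
```

    Whether per-field streams actually compress better than the whole file depends on the data and on the compressor, so the sketch makes no claim about ratios; it only shows a lossless round trip.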

  13. cyvcf2: fast, flexible variant analysis with Python.

    PubMed

    Pedersen, Brent S; Quinlan, Aaron R

    2017-06-15

    Variant call format (VCF) files document the genetic variation observed after DNA sequencing, alignment and variant calling of a sample cohort. Given the complexity of the VCF format as well as the diverse variant annotations and genotype metadata, there is a need for fast, flexible methods enabling intuitive analysis of the variant data within VCF and BCF files. We introduce cyvcf2, a Python library and software package for fast parsing and querying of VCF and BCF files and illustrate its speed, simplicity and utility. bpederse@gmail.com or aaronquinlan@gmail.com. cyvcf2 is available from https://github.com/brentp/cyvcf2 under the MIT license and from common python package managers. Detailed documentation is available at http://brentp.github.io/cyvcf2/. © The Author 2017. Published by Oxford University Press.

  14. TM digital image products for applications. [computer compatible tapes]

    NASA Technical Reports Server (NTRS)

    Barker, J. L.; Gunther, F. J.; Abrams, R. B.; Ball, D.

    1984-01-01

    The image characteristics of digital data generated by LANDSAT 4 thematic mapper (TM) are discussed. Digital data from the TM resides in tape files at various stages of image processing. Within each image data file, the image lines are blocked by a factor of either 5 for a computer compatible tape CCT-BT, or 4 for a CCT-AT and CCT-PT; in each format, the image file has a different format. Nominal geometric corrections which provide proper geodetic relationships between different parts of the image are available only for the CCT-PT. It is concluded that detector 3 of band 5 on the TM does not respond; this channel of data needs replacement. The empty bin phenomenon in CCT-AT images results from integer truncations of mixed-mode arithmetic operations.

  15. Final Report: The DNA Files: Unraveling the mysteries of genetics, January 1, 1998-March 31, 1999

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Scott, Bari

    1999-05-01

    The DNA Files is an award-winning radio documentary series on genetics created by SoundVision Productions. The DNA Files was hosted by John Hockenberry and was presented in documentary and discussion format. The programs covered a range of topics from prenatal and predictive gene testing, gene therapy, and commercialization of genetic information to new evolutionary genetic evidence, transgenic vegetables and use of DNA in forensics.

  16. Utilizing HDF4 File Content Maps for the Cloud

    NASA Technical Reports Server (NTRS)

    Lee, Hyokyung Joe

    2016-01-01

    We demonstrate in a prototype study that HDF4 file content maps can be used to efficiently organize data in a cloud object storage system to facilitate cloud computing. This approach can be extended to any binary data format and to any existing big data analytics solution powered by cloud computing, because the HDF4 file content map project started as a long-term preservation effort for NASA data that does not require HDF4 APIs to access the data.

  17. Divergence Measures Tool:An Introduction with Brief Tutorial

    DTIC Science & Technology

    2014-03-01

    in detecting differences across a wide range of Arabic-language text files (they varied by genre, domain, spelling variation, size, etc.), our...other. 2 These measures have been put to many uses in natural language processing (NLP). In the evaluation of machine translation (MT...files uploaded into the tool must be .txt files in ASCII or UTF-8 format. • This tool has been tested on English and Arabic script**, but should

  18. snpTree--a web-server to identify and construct SNP trees from whole genome sequence data.

    PubMed

    Leekitcharoenphon, Pimlapas; Kaas, Rolf S; Thomsen, Martin Christen Frølund; Friis, Carsten; Rasmussen, Simon; Aarestrup, Frank M

    2012-01-01

    The advances in and decreasing cost of whole genome sequencing (WGS) will soon make this technology available for routine infectious disease epidemiology. In epidemiological studies, outbreak isolates have very little diversity and require extensive genomic analysis to differentiate and classify isolates. One of the successfully and broadly used methods is analysis of single nucleotide polymorphisms (SNPs). Currently, there are different tools and methods to identify SNPs including various options and cut-off values. Furthermore, all current methods require bioinformatic skills. Thus, we lack a standard and simple automatic tool to determine SNPs and construct a phylogenetic tree from WGS data. Here we introduce snpTree, a server for online-automatic SNPs analysis. This tool is composed of different SNPs analysis suites, perl and python scripts. snpTree can identify SNPs and construct phylogenetic trees from WGS as well as from assembled genomes or contigs. WGS data in fastq format are aligned to reference genomes by BWA while contigs in fasta format are processed by Nucmer. SNPs are concatenated based on position on reference genome and a tree is constructed from concatenated SNPs using FastTree and a perl script. The online server was implemented by HTML, Java and python script. The server was evaluated using four published bacterial WGS data sets (V. cholerae, S. aureus CC398, S. Typhimurium and M. tuberculosis). The evaluation results for the first three cases were consistent and concordant for both raw reads and assembled genomes. In the latter case the original publication involved extensive filtering of SNPs, which could not be repeated using snpTree. The snpTree server is an easy to use option for rapid standardised and automatic SNP analysis in epidemiological studies also for users with limited bioinformatic experience. The web server is freely accessible at http://www.cbs.dtu.dk/services/snpTree-1.0/.
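
    The step of concatenating SNPs by reference position can be sketched as follows. This is an illustration under stated assumptions, not the snpTree scripts: the per-sample calls are assumed to be already available as a mapping from 1-based reference position to the called base, samples without a call at a position are assumed to carry the reference base, and the names `concatenate_snps` and `to_fasta` are hypothetical.

```python
def concatenate_snps(reference, calls):
    """Build one SNP string per sample, ordered by position on the reference.

    reference: the reference sequence as a string.
    calls: {sample_name: {position (1-based): called base}}.
    """
    # Every position where at least one sample has a call becomes an alignment column.
    positions = sorted({pos for snps in calls.values() for pos in snps})
    alignment = {}
    for sample, snps in calls.items():
        # Assumption: no call at a position means the sample matches the reference.
        alignment[sample] = "".join(
            snps.get(pos, reference[pos - 1]) for pos in positions
        )
    return positions, alignment


def to_fasta(alignment):
    """Format the concatenated SNPs as FASTA text, a common input for tree builders."""
    return "".join(f">{name}\n{seq}\n" for name, seq in alignment.items())
```

    The resulting FASTA text is the kind of multiple alignment a tree-building program would take as input; filtering of low-quality calls is left out of the sketch.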

  19. Master Metadata Repository and Metadata-Management System

    NASA Technical Reports Server (NTRS)

    Armstrong, Edward; Reed, Nate; Zhang, Wen

    2007-01-01

    A master metadata repository (MMR) software system manages the storage and searching of metadata pertaining to data from national and international satellite sources of the Global Ocean Data Assimilation Experiment (GODAE) High Resolution Sea Surface Temperature Pilot Project [GHRSSTPP]. These sources produce a total of hundreds of data files daily, each file classified as one of more than ten data products representing global sea-surface temperatures. The MMR is a relational database wherein the metadata are divided into granule-level records [denoted file records (FRs)] for individual satellite files and collection-level records [denoted data set descriptions (DSDs)] that describe metadata common to all the files from a specific data product. FRs and DSDs adhere to the NASA Directory Interchange Format (DIF). The FRs and DSDs are contained in separate subdatabases linked by a common field. The MMR is configured in MySQL database software with custom Practical Extraction and Reporting Language (PERL) programs to validate and ingest the metadata records. The database contents are converted into the Federal Geographic Data Committee (FGDC) standard format by use of the Extensible Markup Language (XML). A Web interface enables users to search for availability of data from all sources.

  20. Author fees for online publication

    NASA Astrophysics Data System (ADS)

    Like the journals themselves, AGU publication fees have been restructured to accommodate the new online, publish-as-ready approach. The new fee structure is based on authors' providing electronic files of their text and art in acceptable formats (Word, WordPerfect, and LaTeX for text, and .eps or .tif for digital art). However, if you are unable to supply electronic files, you can opt for a higher-charge, full-service route in which AGU will create electronic files from hard copy. All authors for AGU journals are expected to support the journal archive through fees based on number as well as size of article files. The revenue from these fees is set aside for the "Perpetual Care Trust Fund," which will support the migration of the journal archive to new formats or media as technology changes. For several journals, excess length fees remain in place to encourage submission of concisely written articles. During this first transition year, most author fees are based on the number of print page equivalents (pdf) in an article; in the future, however, charges are expected to be associated with file size. The specific fees for each journal are posted on AGU's Web site under Publications-Tools for Authors.

  1. ascii2gdocs

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Nightingale, Trever

    2011-11-30

    Enables UNIX and Mac OS X command line users to put (individually or in batch mode) local ascii files into Google Documents, where the ascii is converted to Google Document format using formatting the user can specify.

  2. GenIce: Hydrogen-Disordered Ice Generator.

    PubMed

    Matsumoto, Masakazu; Yagasaki, Takuma; Tanaka, Hideki

    2018-01-05

    GenIce is an efficient and user-friendly tool to generate hydrogen-disordered ice structures. It makes ice and clathrate hydrate structures in various file formats. More than 100 kinds of structures are preset. Users can install their own crystal structures, guest molecules, and file formats as plugins. The algorithm certifies that the generated structures are completely randomized hydrogen-disordered networks obeying the ice rule with zero net polarization. © 2017 The Authors. Journal of Computational Chemistry Published by Wiley Periodicals, Inc.

  3. Behavioral Studies Following Ionizing Radiation Exposures: A Data Base.

    DTIC Science & Technology

    1981-08-01

    48 APPENDIX B. PERFORMANCE DATA FILE FORMAT 63 Tasks 63 Cued 63 Uncued 63 Mixed 64 Data File Format 64 Record 1 Variables 64 Record 2 Through Record N ...Variables 65 Record N + 1 65 Last Four Records 66 APPENDIX C. CROSS-REFERENCE TABLES 67 Subject Search Items 68 Dose Search Items 70 APPENDIX D. TASKS...storage. N EWSPP/SCAT R Because the PDP-8 is a 12-bit machine and the PDP-11’s are 16-bit machines, direct transmission of data collected by the SCAT

  4. U.S. Geological Survey National Computer Technology Meeting (7th): Program and Abstracts, Held in New Orleans, Louisiana, April 10-15, 1994

    DTIC Science & Technology

    1994-01-01

    Magnolia Room 8:00 pm - 10:00 pm FrameMaker Techniques - Moderator, Terry A. Reinitz, USGS, WRD, Reston, Va. Wednesday, April 13, 1994 7:30 am...Maker Interchange Format (MIF) strings, to an MIF file. The MIF file is imported into a blank FrameMaker template, creating a word-processor-formatted...draft to camera-ready stages using Data General workstations and software packages that include FrameMaker, CorelDRAW, USGS-G2, Statit, and

  5. Documentation for the machine-readable version of the Survey of the Astrographic Catalogue From 1 to 31 Degrees of Northern Declination (Fresneau 1983)

    NASA Technical Reports Server (NTRS)

    Warren, W. H., Jr.

    1983-01-01

    A description of the machine readable catalog, including detailed format and tape file characteristics, is given. The machine file is a computation of mean values for position and magnitude at a mean epoch of observation for each unique star in the Oxford, Paris, Bordeaux, Toulouse and Northern Hemisphere Algiers zone. The format was changed to effect more efficient data searching by position and additional duplicate entries were removed. The final catalog contains data for 997311 stars.

  6. 76 FR 68194 - Submission for OMB Review; Comment Request

    Federal Register 2010, 2011, 2012, 2013, 2014

    2011-11-03

    ... Layouts for HPBS Work Measures. OMB No.: 0970-0230. Description: There is no longer a High Performance... information on States' performance. The Transmission File Layouts form provides the format that States will... are not requesting any changes to the Transmission File Layouts form. Respondents: Respondents may...

  7. As-built design specification for segment map (Sgmap) program

    NASA Technical Reports Server (NTRS)

    Tompkins, M. A. (Principal Investigator)

    1981-01-01

    The segment map program (SGMAP), which is part of the CLASFYT package, is described in detail. This program is designed to output symbolic maps or numerical dumps from LANDSAT cluster/classification files or aircraft ground truth/processed ground truth files which are in 'universal' format.

  8. 77 FR 65188 - Western Area Power Administration; Notice of Filing

    Federal Register 2010, 2011, 2012, 2013, 2014

    2012-10-25

    ... DEPARTMENT OF ENERGY Federal Energy Regulatory Commission [Docket No. EF11-4-002] Western Area Power Administration; Notice of Filing Take notice that on September 12, 2012, Western Area Power Administration submitted revisions to its Open Access Transmission Tariff to correct formatting and technical...

  9. 76 FR 39757 - Filing Procedures

    Federal Register 2010, 2011, 2012, 2013, 2014

    2011-07-06

    ... an optical character recognition process, such a document may contain recognition errors. CAUTION... network speed e-filing of these documents may be difficult. Pursuant to section II(C) above, the Secretary... optical scan format or a typed ``electronic signature,'' e.g., ``/s/Jane Doe.'' (3) In the case of a...

  10. Highway Safety Information System guidebook for the Utah state data files. Volume 1 : SAS file formats

    DOT National Transportation Integrated Search

    1996-06-01

    This volume expands on the presentations in the main manual by presenting further discussions and examples. Contents: Appendix A: The Costs of Travel Surveys; Appendix B: Census Data for Travel Surveys; Appendix C: An Example of the Systems Capabilit...

  11. TOPPE: A framework for rapid prototyping of MR pulse sequences.

    PubMed

    Nielsen, Jon-Fredrik; Noll, Douglas C

    2018-06-01

    To introduce a framework for rapid prototyping of MR pulse sequences. We propose a simple file format, called "TOPPE", for specifying all details of an MR imaging experiment, such as gradient and radiofrequency waveforms and the complete scan loop. In addition, we provide a TOPPE file "interpreter" for GE scanners, which is a binary executable that loads TOPPE files and executes the sequence on the scanner. We also provide MATLAB scripts for reading and writing TOPPE files and previewing the sequence prior to hardware execution. With this setup, the task of the pulse sequence programmer is reduced to creating TOPPE files, eliminating the need for hardware-specific programming. No sequence-specific compilation is necessary; the interpreter only needs to be compiled once (for every scanner software upgrade). We demonstrate TOPPE in three different applications: k-space mapping, non-Cartesian PRESTO whole-brain dynamic imaging, and myelin mapping in the brain using inhomogeneous magnetization transfer. We successfully implemented and executed the three example sequences. By simply changing the various TOPPE sequence files, a single binary executable (interpreter) was used to execute several different sequences. The TOPPE file format is a complete specification of an MR imaging experiment, based on arbitrary sequences of a (typically small) number of unique modules. Along with the GE interpreter, TOPPE comprises a modular and flexible platform for rapid prototyping of new pulse sequences. Magn Reson Med 79:3128-3134, 2018. © 2017 International Society for Magnetic Resonance in Medicine.

  12. Forensic Analysis of Compromised Computers

    NASA Technical Reports Server (NTRS)

    Wolfe, Thomas

    2004-01-01

    Directory Tree Analysis File Generator is a Practical Extraction and Reporting Language (PERL) script that simplifies and automates the collection of information for forensic analysis of compromised computer systems. During such an analysis, it is sometimes necessary to collect and analyze information about files on a specific directory tree. Directory Tree Analysis File Generator collects information of this type (except information about directories) and writes it to a text file. In particular, the script asks the user for the root of the directory tree to be processed, the name of the output file, and the number of subtree levels to process. The script then processes the directory tree and puts out the aforementioned text file. The format of the text file is designed to enable the submission of the file as input to a spreadsheet program, wherein the forensic analysis is performed. The analysis usually consists of sorting files and examination of such characteristics of files as ownership, time of creation, and time of most recent access, all of which characteristics are among the data included in the text file.
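
    The behaviour described above can be sketched in a few lines. The original tool is a PERL script whose source is not shown here, so the following Python version is only an approximation under stated assumptions: the function name `write_tree_report`, the column names, and the tab-delimited layout are hypothetical, and the three inputs are passed as arguments rather than asked for interactively.

```python
import csv
import os
import time


def _stamp(seconds):
    """Format a POSIX timestamp as local date and time."""
    return time.strftime("%Y-%m-%d %H:%M:%S", time.localtime(seconds))


def write_tree_report(root, out_path, max_levels):
    """Write one tab-delimited row per file found under root.

    max_levels is the number of subtree levels below root to process;
    0 means only the files directly inside root.
    """
    root = os.path.abspath(root)
    base_depth = root.rstrip(os.sep).count(os.sep)
    with open(out_path, "w", newline="") as handle:
        writer = csv.writer(handle, delimiter="\t")
        writer.writerow(["path", "owner_uid", "size_bytes", "ctime", "atime"])
        for dirpath, dirnames, filenames in os.walk(root):
            depth = dirpath.rstrip(os.sep).count(os.sep) - base_depth
            if depth >= max_levels:
                dirnames[:] = []  # prune: do not descend below the requested level
            for name in sorted(filenames):  # files only; directories get no row
                path = os.path.join(dirpath, name)
                info = os.lstat(path)
                # Note: on Unix st_ctime is the last metadata change, not creation.
                writer.writerow([path, info.st_uid, info.st_size,
                                 _stamp(info.st_ctime), _stamp(info.st_atime)])
```

    A tab-delimited file is chosen here because spreadsheet programs import it directly, which matches the stated purpose of the output; writing the report outside the tree being examined keeps it from listing itself.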

  13. An EXCEL macro for importing log ASCII standard (LAS) files into EXCEL worksheets

    NASA Astrophysics Data System (ADS)

    Özkaya, Sait Ismail

    1996-02-01

    An EXCEL 5.0 macro is presented for converting a LAS text file into an EXCEL worksheet. Although EXCEL has commands for importing text files and parsing text lines, LAS files must be decoded line-by-line because three different delimiters are used to separate fields of differing length. The macro is intended to eliminate manual decoding of LAS version 2.0. LAS is a floppy disk format for storage and transfer of log data as text files. LAS was proposed by the Canadian Well Logging Society. The present EXCEL macro decodes different sections of a LAS file, separates, and places the fields into different columns of an EXCEL worksheet. To import a LAS file into EXCEL without errors, the file must not contain any unrecognized symbols, and the data section must be the last section. The program does not check for the presence of mandatory sections or fields as required by LAS rules. Once a file is incorporated into EXCEL, mandatory sections and fields may be inspected visually.
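
    The line-by-line decoding described above can be sketched outside EXCEL. The following Python code is not the published macro; it is a minimal illustration that assumes the usual LAS 2.0 header layout, in which the mnemonic ends at the first dot, the unit ends at the first space after that dot, and the description follows the last colon. The function names are hypothetical, and, like the macro, the sketch does not check for mandatory sections.

```python
def parse_header_line(line):
    """Split 'MNEM.UNIT  VALUE : DESCRIPTION' using the three LAS delimiters."""
    mnemonic, _, rest = line.partition(".")       # first dot ends the mnemonic
    unit, _, rest = rest.partition(" ")           # first space after the dot ends the unit
    value, _, description = rest.rpartition(":")  # last colon starts the description
    return {
        "mnemonic": mnemonic.strip(),
        "unit": unit.strip(),
        "value": value.strip(),
        "description": description.strip(),
    }


def parse_las(text):
    """Return {section letter: rows}; the ~A rows are lists of floats."""
    sections = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):  # skip blank lines and comments
            continue
        if line.startswith("~"):              # a tilde opens a new section
            current = line[1:2].upper()
            sections[current] = []
        elif current == "A":                  # data section: whitespace-separated numbers
            sections[current].append([float(tok) for tok in line.split()])
        elif current == "O":                  # "other" section: free text
            sections[current].append(line)
        elif current is not None:             # V, W, C, P: delimited header lines
            sections[current].append(parse_header_line(line))
    return sections
```

    Each parsed header field maps naturally onto one worksheet column, and each row of the `A` section onto one worksheet row.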

  14. Cloud Optimized Image Format and Compression

    NASA Astrophysics Data System (ADS)

    Becker, P.; Plesea, L.; Maurer, T.

    2015-04-01

    Cloud-based image storage and processing requires reevaluation of formats and processing methods. For the true value of the massive volumes of earth observation data to be realized, the image data needs to be accessible from the cloud. Traditional file formats such as TIF and NITF were developed in the heyday of the desktop and assumed fast, low-latency file access. Other formats such as JPEG2000 provide for streaming protocols for pixel data, but still require a server to have file access. These concepts no longer truly hold in cloud-based elastic storage and computation environments. This paper will provide details of a newly evolving image storage format (MRF) and compression that is optimized for cloud environments. Although the cost of storage continues to fall for large data volumes, there is still significant value in compression. For imagery data to be used in analysis and exploit the extended dynamic range of the new sensors, lossless or controlled lossy compression is of high value. Compression decreases the data volumes stored and reduces the data transferred, but the reduced data size must be balanced with the CPU required to decompress. The paper also outlines a new compression algorithm (LERC) for imagery and elevation data that optimizes this balance. Advantages of the compression include its simple-to-implement algorithm that enables it to be efficiently accessed using JavaScript. Combining this new cloud-based image storage format and compression will help resolve some of the challenges of big image data on the internet.

  15. Preliminary Geologic Map of the Topanga 7.5' Quadrangle, Southern California: A Digital Database

    USGS Publications Warehouse

    Yerkes, R.F.; Campbell, R.H.

    1995-01-01

    INTRODUCTION This Open-File report is a digital geologic map database. This pamphlet serves to introduce and describe the digital data. There is no paper map included in the Open-File report. This digital map database is compiled from previously published sources combined with some new mapping and modifications in nomenclature. The geologic map database delineates map units that are identified by general age and lithology following the stratigraphic nomenclature of the U. S. Geological Survey. For detailed descriptions of the units, their stratigraphic relations and sources of geologic mapping consult Yerkes and Campbell (1994). More specific information about the units may be available in the original sources. The content and character of the database and methods of obtaining it are described herein. The geologic map database itself, consisting of three ARC coverages and one base layer, can be obtained over the Internet or by magnetic tape copy as described below. The processes of extracting the geologic map database from the tar file, and importing the ARC export coverages (procedure described herein), will result in the creation of an ARC workspace (directory) called 'topnga.' The database was compiled using ARC/INFO version 7.0.3, a commercial Geographic Information System (Environmental Systems Research Institute, Redlands, California), with version 3.0 of the menu interface ALACARTE (Fitzgibbon and Wentworth, 1991, Fitzgibbon, 1991, Wentworth and Fitzgibbon, 1991). It is stored in uncompressed ARC export format (ARC/INFO version 7.x) in a compressed UNIX tar (tape archive) file. The tar file was compressed with gzip, and may be uncompressed with gzip, which is available free of charge via the Internet from the gzip Home Page (http://w3.teaser.fr/~jlgailly/gzip). A tar utility is required to extract the database from the tar file. 
This utility is included in most UNIX systems, and can be obtained free of charge via the Internet from Internet Literacy's Common Internet File Formats Webpage (http://www.matisse.net/files/formats.html). ARC/INFO export files (files with the .e00 extension) can be converted into ARC/INFO coverages in ARC/INFO (see below) and can be read by some other Geographic Information Systems, such as MapInfo via ArcLink and ESRI's ArcView (version 1.0 for Windows 3.1 to 3.11 is available for free from ESRI's web site: http://www.esri.com). 1. Different base layer - The original digital database included separates clipped out of the Los Angeles 1:100,000 sheet. This release includes a vectorized scan of a scale-stable negative of the Topanga 7.5 minute quadrangle. 2. Map projection - The files in the original release were in polyconic projection. The projection used in this release is state plane, which allows for the tiling of adjacent quadrangles. 3. File compression - The files in the original release were compressed with UNIX compression. The files in this release are compressed with gzip.
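
    The extraction step described in this record (a gzip-compressed tar file holding ARC export files with the .e00 extension) can also be done without separate gzip and tar utilities. The sketch below uses Python's standard tarfile module; the function name and any archive or directory names passed to it are hypothetical, and converting the .e00 files into coverages still requires ARC/INFO or a compatible GIS.

```python
import os
import shutil
import tarfile


def extract_export_files(archive_path, dest_dir):
    """Copy every .e00 member of a gzip-compressed tar archive into dest_dir."""
    os.makedirs(dest_dir, exist_ok=True)
    written = []
    # Mode "r:gz" decompresses the gzip layer and reads the tar layer in one pass.
    with tarfile.open(archive_path, "r:gz") as tar:
        for member in tar:
            if not member.isfile() or not member.name.lower().endswith(".e00"):
                continue
            # Use only the base name so a member path cannot escape dest_dir.
            target = os.path.join(dest_dir, os.path.basename(member.name))
            with tar.extractfile(member) as src, open(target, "wb") as dst:
                shutil.copyfileobj(src, dst)
            written.append(target)
    return written
```

    Writing each member through extractfile, instead of calling extractall, keeps the example independent of the extraction-filter behavior that differs between Python versions.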

  16. 47 CFR 1.913 - Application and notification forms; electronic and manual filing.

    Code of Federal Regulations, 2010 CFR

    2010-10-01

    ... Portable Document Format (PDF) whenever possible. (2) Any associated documents submitted with an... possible. The attachment should be uploaded via ULS in Adobe Acrobat Portable Document Format (PDF... the table of contents, should be in Adobe Acrobat Portable Document Format (PDF) whenever possible...

  17. Collaborative Sharing of Multidimensional Space-time Data Using HydroShare

    NASA Astrophysics Data System (ADS)

    Gan, T.; Tarboton, D. G.; Horsburgh, J. S.; Dash, P. K.; Idaszak, R.; Yi, H.; Blanton, B.

    2015-12-01

    HydroShare is a collaborative environment being developed for sharing hydrological data and models. It includes the capability to upload data in many formats as resources that can be shared. The HydroShare data model for resources uses a specific format for the representation of each type of data and specifies metadata common to all resource types as well as metadata unique to specific resource types. The Network Common Data Form (NetCDF) was chosen as the format for multidimensional space-time data in HydroShare. NetCDF is widely used in hydrological and other geoscience modeling because it contains self-describing metadata and supports the creation of array-oriented datasets that may include three spatial dimensions, a time dimension and other user-defined dimensions. For example, NetCDF may be used to represent precipitation or surface air temperature fields that have two dimensions in space and one dimension in time. This presentation will illustrate how NetCDF files are used in HydroShare. When a NetCDF file is loaded into HydroShare, header information is extracted using the "ncdump" utility. Python functions developed for the Django web framework, on which HydroShare is based, extract science metadata present in the NetCDF file, saving the user from having to enter it. Where the file follows the Climate Forecast (CF) convention and the Attribute Convention for Dataset Discovery (ACDD) standards, metadata is thus automatically populated. Users also have the ability to add metadata to the resource that may not have been present in the original NetCDF file. HydroShare's metadata editing functionality then writes this science metadata back into the NetCDF file to maintain consistency between the science metadata in HydroShare and the metadata in the NetCDF file. This further helps researchers easily add metadata information following the CF and ACDD conventions. 
Additional data inspection and subsetting functions were developed, taking advantage of Python and command line libraries for working with NetCDF files. We describe the design and implementation of these features and illustrate how NetCDF files from a modeling application may be curated in HydroShare and thus enhance reproducibility of the associated research. We also discuss future development planned for multidimensional space-time data in HydroShare.
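
    The header-extraction step mentioned in this record can be illustrated with a small sketch. It is not HydroShare's code: it assumes the ncdump utility is on the PATH, uses its header-only option (-h), and pulls global attributes such as title, summary, and Conventions out of the printed CDL text with a regular expression. The function names and the sample header are hypothetical, and attribute values that ncdump wraps over several lines are not handled.

```python
import re
import subprocess

# In CDL, global attributes appear as lines of the form  :name = value ;
_GLOBAL_ATTR = re.compile(r'^\s*:(\w+)\s*=\s*(.*?)\s*;\s*$')


def read_header(path):
    """Return the CDL header printed by `ncdump -h` (needs ncdump installed)."""
    result = subprocess.run(["ncdump", "-h", path],
                            capture_output=True, text=True, check=True)
    return result.stdout


def global_attributes(cdl_text):
    """Collect single-line global attributes from CDL header text into a dict."""
    attrs = {}
    for line in cdl_text.splitlines():
        match = _GLOBAL_ATTR.match(line)
        if not match:
            continue  # dimensions, variables, and variable attributes are skipped
        name, value = match.groups()
        if len(value) >= 2 and value[0] == '"' and value[-1] == '"':
            value = value[1:-1]  # drop the quotes around string values
        attrs[name] = value
    return attrs


# Hypothetical header text, shaped like ncdump -h output.
SAMPLE_CDL = """\
netcdf example {
dimensions:
    time = UNLIMITED ; // (12 currently)
    y = 3 ;
    x = 4 ;
variables:
    float precipitation(time, y, x) ;
        precipitation:units = "mm" ;

// global attributes:
        :Conventions = "CF-1.6" ;
        :title = "Monthly precipitation" ;
        :summary = "Hypothetical example dataset" ;
}
"""

attrs = global_attributes(SAMPLE_CDL)
```

    For a real file, global_attributes(read_header("model_output.nc")) (file name hypothetical) would produce the same kind of dictionary, which could then prefill a metadata form so the user does not have to retype it.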

  18. Wave data processing toolbox manual

    USGS Publications Warehouse

    Sullivan, Charlene M.; Warner, John C.; Martini, Marinna A.; Lightsom, Frances S.; Voulgaris, George; Work, Paul

    2006-01-01

    Researchers routinely deploy oceanographic equipment in estuaries, coastal nearshore environments, and shelf settings. These deployments usually include tripod-mounted instruments to measure a suite of physical parameters such as currents, waves, and pressure. Instruments such as the RD Instruments Acoustic Doppler Current Profiler (ADCP(tm)), the Sontek Argonaut, and the Nortek Aquadopp(tm) Profiler (AP) can measure these parameters. The data from these instruments must be processed using proprietary software unique to each instrument to convert measurements to real physical values. These processed files are then available for dissemination and scientific evaluation. For example, the proprietary processing program used to process data from the RD Instruments ADCP for wave information is called WavesMon. Depending on the length of the deployment, WavesMon will typically produce thousands of processed data files. These files are difficult to archive, and further analysis of the data becomes cumbersome. More importantly, these files alone do not include sufficient information pertinent to that deployment (metadata), which could hinder future scientific interpretation. This open-file report describes a toolbox developed to compile, archive, and disseminate the processed wave measurement data from an RD Instruments ADCP, a Sontek Argonaut, or a Nortek AP. This toolbox will be referred to as the Wave Data Processing Toolbox. The Wave Data Processing Toolbox aggregates the processed files output from the proprietary software into two NetCDF files: one file contains the statistics of the burst data and the other file contains the raw burst data (additional details described below). One important advantage of this toolbox is that it converts the data into NetCDF format. Data in NetCDF format is easy to disseminate, is portable to any computer platform, and is viewable with public-domain freely-available software. 
Another important advantage is that a metadata structure is embedded with the data to document pertinent information regarding the deployment and the parameters used to process the data. Using this format ensures that the relevant information about how the data was collected and converted to physical units is maintained with the actual data. EPIC-standard variable names have been utilized where appropriate. These standards, developed by the NOAA Pacific Marine Environmental Laboratory (PMEL) (http://www.pmel.noaa.gov/epic/), provide a universal vernacular allowing researchers to share data without translation.

  19. NEMAR plotting computer program

    NASA Technical Reports Server (NTRS)

    Myler, T. R.

    1981-01-01

    A FORTRAN-coded computer program which generates CalComp plots of trajectory parameters is examined. The trajectory parameters are calculated and placed on a data file by the Near Earth Mission Analysis Routine computer program. The plot program accesses the data file and generates the plots as defined by inputs to the plot program. Program theory, user instructions, output definitions, subroutine descriptions and detailed FORTRAN coding information are included. Although this plot program utilizes a random access data file, a data file of the same type and formatted with 102 numbers per record could be generated by any computer program and used by this plot program.
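
    The closing remark of this record, that any program could produce a compatible random-access file with 102 numbers per record, can be illustrated with a short Python sketch. The record does not state the number encoding, so 4-byte little-endian floats with no record markers are assumed here; the function names are hypothetical, and a real NEMAR file may use a different layout.

```python
import struct

NUMBERS_PER_RECORD = 102
# Assumed encoding: 102 little-endian 4-byte floats per record, no record markers.
RECORD = struct.Struct("<%df" % NUMBERS_PER_RECORD)


def write_record(fh, index, values):
    """Write one fixed-length record at position index (0-based)."""
    if len(values) != NUMBERS_PER_RECORD:
        raise ValueError("a record must hold exactly 102 numbers")
    fh.seek(index * RECORD.size)  # fixed-length records make the offset a simple product
    fh.write(RECORD.pack(*values))


def read_record(fh, index):
    """Read the record at position index without scanning the earlier ones."""
    fh.seek(index * RECORD.size)
    raw = fh.read(RECORD.size)
    if len(raw) != RECORD.size:
        raise IndexError("record %d is past the end of the file" % index)
    return RECORD.unpack(raw)
```

    Because every record has the same size, a plotting program can jump directly to any trajectory point. The file object must be opened in binary mode, for example open("nemar.dat", "r+b") with a hypothetical file name.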

  20. 43 CFR 46.415 - Environmental impact statement content, alternatives, circulation and filing requirements.

    Code of Federal Regulations, 2010 CFR

    2010-10-01

    ... 43 Public Lands: Interior 1 2010-10-01 2010-10-01 false Environmental impact statement content... Impact Statements § 46.415 Environmental impact statement content, alternatives, circulation and filing requirements. The Responsible Official may use any environmental impact statement format and design as long as...
