Sample records for accurate mass database

  1. Broad screening of illicit ingredients in cosmetics using ultra-high-performance liquid chromatography-hybrid quadrupole-Orbitrap mass spectrometry with customized accurate-mass database and mass spectral library.

    PubMed

    Meng, Xianshuang; Bai, Hua; Guo, Teng; Niu, Zengyuan; Ma, Qiang

    2017-12-15

    Comprehensive identification and quantitation of 100 multi-class regulated ingredients in cosmetics was achieved using ultra-high-performance liquid chromatography (UHPLC) coupled with hybrid quadrupole-Orbitrap high-resolution mass spectrometry (Q-Orbitrap HRMS). A simple, efficient, and inexpensive sample pretreatment protocol was developed using ultrasound-assisted extraction (UAE), followed by dispersive solid-phase extraction (dSPE). The cosmetic samples were analyzed by UHPLC-Q-Orbitrap HRMS under synchronous full-scan MS and data-dependent MS/MS (full-scan MS1/dd-MS2) acquisition mode. The mass resolution was set to 70,000 FWHM (full width at half maximum) for full-scan MS1 and 17,500 FWHM for the dd-MS2 stage, with experimentally measured mass deviations of less than 2 ppm (parts per million) for quasi-molecular ions and 5 ppm for characteristic fragment ions for each individual analyte. An accurate-mass database and a mass spectral library were built in house for searching the 100 target compounds. Broad screening was conducted by comparing the experimentally measured exact mass of precursor and fragment ions, retention time, isotopic pattern, and ionic ratio with the accurate-mass database and by matching the acquired MS/MS spectra against the mass spectral library. The developed methodology was evaluated and validated in terms of limits of detection (LODs), limits of quantitation (LOQs), linearity, stability, accuracy, and matrix effect. The UHPLC-Q-Orbitrap HRMS approach was applied for the analysis of 100 target illicit ingredients in 123 genuine cosmetic samples, and exhibited great potential for high-throughput, sensitive, and reliable screening of multi-class illicit compounds in cosmetics. Copyright © 2017 Elsevier B.V. All rights reserved.
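
    The screening step described in this abstract (comparing measured m/z and retention time against an accurate-mass database) can be illustrated with a minimal Python sketch. The 2 ppm tolerance comes from the abstract; the class name DbEntry, the function names, the 0.2 min retention-time window, and the example values are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DbEntry:
    name: str      # hypothetical compound label
    mz: float      # theoretical m/z of the quasi-molecular ion
    rt_min: float  # expected retention time in minutes


def ppm_error(measured_mz: float, theoretical_mz: float) -> float:
    """Signed mass deviation in parts per million."""
    return (measured_mz - theoretical_mz) / theoretical_mz * 1e6


def screen(measured_mz, measured_rt, database, ppm_tol=2.0, rt_tol=0.2):
    """Return database entries whose m/z and retention time both match.

    ppm_tol follows the 2 ppm figure quoted for quasi-molecular ions;
    rt_tol is an assumed window, not a value from the abstract.
    """
    return [
        entry
        for entry in database
        if abs(ppm_error(measured_mz, entry.mz)) <= ppm_tol
        and abs(measured_rt - entry.rt_min) <= rt_tol
    ]


if __name__ == "__main__":
    # Illustrative entry only; not a real compound record.
    db = [DbEntry("compound_A", 300.0000, 5.00)]
    print(screen(300.0003, 5.05, db))
```

    A real workflow would also score isotopic pattern, ion ratios, and MS/MS library matches, which this sketch leaves out.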

  2. Comprehensive identification and structural characterization of target components from Gelsemium elegans by high-performance liquid chromatography coupled with quadrupole time-of-flight mass spectrometry based on accurate mass databases combined with MS/MS spectra.

    PubMed

    Liu, Yan-Chun; Xiao, Sa; Yang, Kun; Ling, Li; Sun, Zhi-Liang; Liu, Zhao-Ying

    2017-06-01

    This study reports an applicable analytical strategy for the comprehensive identification and structural characterization of target components from Gelsemium elegans using high-performance liquid chromatography quadrupole time-of-flight mass spectrometry (LC-QqTOF MS), based on accurate mass databases combined with MS/MS spectra. The databases created included accurate masses and elemental compositions of 204 components from Gelsemium and their structural data. The accurate MS and MS/MS spectra were acquired through data-dependent auto MS/MS mode, followed by extraction of the potential compounds from the LC-QqTOF MS raw data of the sample. The extracted compounds were then matched against the databases to search for targeted components in the sample. The structures of detected components were tentatively characterized by manually interpreting the accurate MS/MS spectra for the first time. A total of 57 components were successfully detected and structurally characterized from the crude extracts of G. elegans, although the method failed to differentiate some isomers. This analytical strategy is generic and efficient, avoids isolation and purification procedures, enables a comprehensive structure characterization of target components of Gelsemium and would be widely applicable for complicated mixtures that are derived from Gelsemium preparations. Copyright © 2017 John Wiley & Sons, Ltd.

  3. Identification of "Known Unknowns" Utilizing Accurate Mass Data and ChemSpider

    NASA Astrophysics Data System (ADS)

    Little, James L.; Williams, Antony J.; Pshenichnov, Alexey; Tkachenko, Valery

    2012-01-01

    In many cases, an unknown to an investigator is actually known in the chemical literature, a reference database, or an internet resource. We refer to these types of compounds as "known unknowns." ChemSpider is a very valuable internet database of known compounds useful in the identification of these types of compounds in commercial, environmental, forensic, and natural product samples. The database contains over 26 million entries from hundreds of data sources and is provided as a free resource to the community. Accurate-mass mass spectrometry data are used to query the database by either elemental composition or monoisotopic mass. Searching by elemental composition is the preferred approach. However, it is often difficult to determine a unique elemental composition for compounds with molecular weights greater than 600 Da. In these cases, searching by the monoisotopic mass is advantageous. In either case, the search results are refined by sorting by the number of references associated with each compound in descending order. This raises the most useful candidates to the top of the list for further evaluation. These approaches were shown to be successful in identifying "known unknowns" noted in our laboratory and for compounds of interest to others.
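
    The ranking idea in this abstract (filter candidates by monoisotopic mass, then sort by reference count in descending order) can be sketched in a few lines of Python. This operates on a local list and does not call ChemSpider; the Candidate type, the rank_candidates function, the 5 ppm default, and all example values are assumptions made for illustration.

```python
from typing import NamedTuple


class Candidate(NamedTuple):
    name: str                 # hypothetical compound label
    monoisotopic_mass: float  # in Da
    n_references: int         # number of associated references


def rank_candidates(candidates, query_mass, tol_ppm=5.0):
    """Keep candidates within tol_ppm of query_mass, most-referenced first."""
    window = query_mass * tol_ppm / 1e6
    hits = [
        c for c in candidates
        if abs(c.monoisotopic_mass - query_mass) <= window
    ]
    return sorted(hits, key=lambda c: c.n_references, reverse=True)


if __name__ == "__main__":
    # Invented records, used only to show the ordering.
    pool = [
        Candidate("candidate_A", 200.0000, 3),
        Candidate("candidate_B", 200.0005, 120),
        Candidate("candidate_C", 200.0100, 999),
    ]
    for hit in rank_candidates(pool, 200.0002):
        print(hit.name, hit.n_references)
```

    Sorting by reference count is a heuristic: it favors well-documented compounds, so a rarely cited but correct candidate can still sit lower in the list.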

  4. Screening of 439 Pesticide Residues in Fruits and Vegetables by Gas Chromatography-Quadrupole-Time-of-Flight Mass Spectrometry Based on TOF Accurate Mass Database and Q-TOF Spectrum Library.

    PubMed

    Li, Jian-Xun; Li, Xiao-Ying; Chang, Qiao-Ying; Li, Yan; Jin, Ling-He; Pang, Guo-Fang; Fan, Chun-Lin

    2018-05-03

    Because of its unique characteristics of accurate mass full-spectrum acquisition, high resolution, and fast acquisition rates, GC-quadrupole-time-of-flight MS (GC-Q-TOF/MS) has become a powerful tool for pesticide residue analysis. In this study, a TOF accurate mass database and Q-TOF spectrum library of 439 pesticides were established, and the parameters of the TOF database were optimized. Through solid-phase extraction (SPE), whereby pesticides are extracted from fruit and vegetable substrates by using 40 mL 1% acetic acid in acetonitrile (v/v), purified by the Carbon/NH₂ SPE cartridge, and finally detected by GC-Q-TOF/MS, the rapid analysis of 439 pesticides in fruits and vegetables can be achieved. The methodology verification results show that more than 70 and 91% of pesticides, spiked in fruits and vegetables with concentrations of 10 and 100 μg/kg, respectively, saw recoveries that conform to the European Commission's criterion of between 70 and 120% with RSD ≤20%. Eighty-one percent of pesticides have screening detection limits lower than 10 μg/kg, which makes this a reliable analysis technology for the monitoring of pesticide residues in fruits and vegetables. This technology was further validated for its characteristics of high precision, high speed, and high throughput through successful detection of 9817 samples during 2013-2015.

  5. Rapid Screening of Bovine Milk Oligosaccharides in a Whey Permeate Product and Domestic Animal Milks by Accurate Mass Database and Tandem Mass Spectral Library.

    PubMed

    Lee, Hyeyoung; Cuthbertson, Daniel J; Otter, Don E; Barile, Daniela

    2016-08-17

    A bovine milk oligosaccharide (BMO) library, prepared from cow colostrum, with 34 structures was generated and used to rapidly screen oligosaccharides in domestic animal milks and a whey permeate powder. The novel library was entered into a custom Personal Compound Database and Library (PCDL) and included accurate mass, retention time, and tandem mass spectra. Oligosaccharides in minute-sized samples were separated using nanoliquid chromatography (nanoLC) coupled to a high resolution and sensitive quadrupole-Time of Flight (Q-ToF) MS system. Using the PCDL, 18 oligosaccharides were found in a BMO-enriched product obtained from whey permeate processing. The usefulness of the analytical system and BMO library was further validated using milks from domestic sheep and buffaloes. Through BMO PCDL searching, 15 and 13 oligosaccharides in the BMO library were assigned in sheep and buffalo milks, respectively, thus demonstrating significant overlap between oligosaccharides in bovine (cow and buffalo) and ovine (sheep) milks. This method was shown to be an efficient, reliable, and rapid tool to identify oligosaccharide structures using automated spectral matching.

  6. Rapid Screening of Bovine Milk Oligosaccharides in a Whey Permeate Product and Domestic Animal Milks by Accurate Mass Database and Tandem Mass Spectral Library

    PubMed Central

    Lee, Hyeyoung; Cuthbertson, Daniel J.; Otter, Don E.; Barile, Daniela

    2018-01-01

    A bovine milk oligosaccharide (BMO) library, prepared from cow colostrum, with 34 structures was generated and used to rapidly screen oligosaccharides in domestic animal milks and a whey permeate powder. The novel library was entered into a custom Personal Compound Database and Library (PCDL) and included accurate mass, retention time, and tandem mass spectra. Oligosaccharides in minute-sized samples were separated using nanoliquid chromatography (nanoLC) coupled to a high resolution and sensitive quadrupole-Time of Flight (Q-ToF) MS system. Using the PCDL, 18 oligosaccharides were found in a BMO-enriched product obtained from whey permeate processing. The usefulness of the analytical system and BMO library was further validated using milks from domestic sheep and buffaloes. Through BMO PCDL searching, 15 and 13 oligosaccharides in the BMO library were assigned in sheep and buffalo milks, respectively, thus demonstrating significant overlap between oligosaccharides in bovine (cow and buffalo) and ovine (sheep) milks. This method was shown to be an efficient, reliable, and rapid tool to identify oligosaccharide structures using automated spectral matching. PMID:27428379

  7. An application of a relational database system for high-throughput prediction of elemental compositions from accurate mass values.

    PubMed

    Sakurai, Nozomu; Ara, Takeshi; Kanaya, Shigehiko; Nakamura, Yukiko; Iijima, Yoko; Enomoto, Mitsuo; Motegi, Takeshi; Aoki, Koh; Suzuki, Hideyuki; Shibata, Daisuke

    2013-01-15

    High-accuracy mass values detected by high-resolution mass spectrometry analysis enable prediction of elemental compositions, and thus are used for metabolite annotations in metabolomic studies. Here, we report an application of a relational database to significantly improve the rate of elemental composition predictions. By searching a database of pre-calculated elemental compositions with fixed kinds and numbers of atoms, the approach eliminates redundant evaluations of the same formula that occur in repeated calculations with other tools. When our approach is compared with HR2, which is one of the fastest tools available, our database search times were at least 109 times shorter than those of HR2. When a solid-state drive (SSD) was applied, the search time was 488 times shorter at 5 ppm mass tolerance and 1833 times at 0.1 ppm. Even if the search by HR2 was performed with 8 threads in a high-spec Windows 7 PC, the database search times were at least 26 and 115 times shorter without and with the SSD. These improvements were enhanced in a low spec Windows XP PC. We constructed a web service 'MFSearcher' to query the database in a RESTful manner. Available for free at http://webs2.kazusa.or.jp/mfsearcher. The web service is implemented in Java, MySQL, Apache and Tomcat, with all major browsers supported. Contact: sakurai@kazusa.or.jp. Supplementary data are available at Bioinformatics online.
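
    The core idea of this abstract (pre-calculate elemental compositions once, then answer each query with an indexed mass-range lookup) can be sketched as follows. The authors report using MySQL; this sketch swaps in Python's built-in sqlite3 module with an in-memory database. The table layout, function names, element limits, and the restriction to C, H, N, and O are assumptions for illustration and do not describe MFSearcher itself.

```python
import itertools
import sqlite3

# Monoisotopic masses (Da) of the most abundant isotopes.
MONO = {"C": 12.0, "H": 1.00782503, "N": 14.0030740, "O": 15.9949146}


def build_db(max_counts):
    """Pre-calculate every formula up to max_counts atoms per element."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE formula (mass REAL, formula TEXT)")
    elements = list(max_counts)
    rows = []
    for counts in itertools.product(
        *(range(max_counts[e] + 1) for e in elements)
    ):
        if not any(counts):
            continue  # skip the empty formula
        mass = sum(MONO[e] * n for e, n in zip(elements, counts))
        text = "".join(
            e + (str(n) if n > 1 else "")
            for e, n in zip(elements, counts) if n
        )
        rows.append((mass, text))
    conn.executemany("INSERT INTO formula VALUES (?, ?)", rows)
    # The index lets each query run as a range scan instead of a full scan.
    conn.execute("CREATE INDEX idx_mass ON formula (mass)")
    return conn


def search(conn, mass, tol_ppm):
    """Return formulas whose mass lies within tol_ppm of the query mass."""
    delta = mass * tol_ppm / 1e6
    cur = conn.execute(
        "SELECT formula FROM formula WHERE mass BETWEEN ? AND ? "
        "ORDER BY mass",
        (mass - delta, mass + delta),
    )
    return [row[0] for row in cur]


if __name__ == "__main__":
    db = build_db({"C": 10, "H": 20, "N": 4, "O": 8})
    print(search(db, 180.06339, 5.0))
```

    The enumeration here applies no chemical plausibility rules, so the table also holds formulas that cannot exist as molecules; a practical database would filter those out when it is built.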

  8. Identification of imidacloprid metabolites in onion (Allium cepa L.) using high-resolution mass spectrometry and accurate mass tools.

    PubMed

    Thurman, E Michael; Ferrer, Imma; Zavitsanos, Paul; Zweigenbaum, Jerry A

    2013-09-15

    Imidacloprid is a potent and widely used insecticide on vegetable crops, such as onion (Allium cepa L.). Because of possible toxicity to beneficial insects, imidacloprid and several metabolites have raised safety concerns for pollinating insects, such as honey bees. Thus, imidacloprid metabolites continue to be an important subject for new methods that better understand its dissipation and fate in plants, such as onions. One month after a single addition of imidacloprid to soil containing onion plants, imidacloprid and its metabolites were extracted from pulverized onion with a methanol/water-buffer mixture and analyzed by liquid chromatography/quadrupole time-of-flight mass spectrometry (LC/QTOF-MS) using a labeled imidacloprid internal standard and tandem mass spectrometric (MS/MS) analysis. Accurate mass tools were developed and applied to detect seven new metabolites of imidacloprid with the goal to better understand its fate in onion. The accurate mass tools include: database searching, diagnostic ions, chlorine mass filters, Mass Profiler software, and manual use of metabolic analogy. The new metabolites discovered included an amine reduction product (m/z 226.0854), and its methylated analogue (m/z 240.1010), and five other metabolites, all of unknown toxicity to insects. The accurate mass tools were combined with LC/QTOF-MS and were able to detect both known and new metabolites of imidacloprid using fragmentation studies of both parent and labeled standards. New metabolites and their structures were inferred from these MS/MS studies with accurate mass, which makes it possible to better understand imidacloprid metabolism in onion as well as new metabolite targets for toxicity studies. Copyright © 2013 John Wiley & Sons, Ltd.
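
    The abstract lists "chlorine mass filters" among its tools without describing them. One plausible form of such a filter, sketched below in Python, flags peaks that have a partner about 1.99705 m/z higher (the 37Cl minus 35Cl mass difference for a singly charged ion) at roughly 32% of the intensity (the approximate natural 37Cl/35Cl ratio). The function name, tolerances, and peak intensities are assumptions; only the m/z values 226.0854 and 240.1010 come from the abstract.

```python
CL_SPACING = 1.99705  # m/z difference 37Cl - 35Cl, singly charged ions
CL_RATIO = 0.32       # approximate natural 37Cl/35Cl abundance ratio


def chlorine_candidates(peaks, mz_tol=0.002, ratio_tol=0.08):
    """Return m/z values that show a mono-chlorine isotope partner.

    peaks is a list of (mz, intensity) tuples. Both tolerances are
    assumed values chosen for this illustration.
    """
    peaks = sorted(peaks)
    flagged = []
    for mz, intensity in peaks:
        if intensity <= 0:
            continue
        for mz2, intensity2 in peaks:
            spacing_ok = abs((mz2 - mz) - CL_SPACING) <= mz_tol
            ratio_ok = abs(intensity2 / intensity - CL_RATIO) <= ratio_tol
            if spacing_ok and ratio_ok:
                flagged.append(mz)
                break
    return flagged


if __name__ == "__main__":
    # Intensities are invented; they only illustrate the pattern test.
    spectrum = [
        (226.0854, 1000.0),
        (228.0825, 320.0),
        (240.1010, 500.0),
        (241.1040, 60.0),
    ]
    print(chlorine_candidates(spectrum))
```

    The fixed ratio only suits ions with a single chlorine atom; ions with two or more chlorines show different isotope ratios and would need a separate rule.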

  9. Application of the accurate mass and time tag approach in studies of the human blood lipidome

    PubMed Central

    Ding, Jie; Sorensen, Christina M.; Jaitly, Navdeep; Jiang, Hongliang; Orton, Daniel J.; Monroe, Matthew E.; Moore, Ronald J.; Smith, Richard D.; Metz, Thomas O.

    2008-01-01

    We report a preliminary demonstration of the accurate mass and time (AMT) tag approach for lipidomics. Initial data-dependent LC-MS/MS analyses of human plasma, erythrocyte, and lymphocyte lipids were performed in order to identify lipid molecular species in conjunction with complementary accurate mass and isotopic distribution information. Identified lipids were used to populate initial lipid AMT tag databases containing 250 and 45 entries for those species detected in positive and negative electrospray ionization (ESI) modes, respectively. The positive ESI database was then utilized to identify human plasma, erythrocyte, and lymphocyte lipids in high-throughput LC-MS analyses based on the AMT tag approach. We were able to define the lipid profiles of human plasma, erythrocytes, and lymphocytes based on qualitative and quantitative differences in lipid abundance. PMID:18502191

  10. Accurate mass measurement: terminology and treatment of data.

    PubMed

    Brenton, A Gareth; Godfrey, A Ruth

    2010-11-01

    High-resolution mass spectrometry has become ever more accessible with improvements in instrumentation, such as modern FT-ICR and Orbitrap mass spectrometers. This has resulted in an increase in the number of articles submitted for publication quoting accurate mass data. There is a plethora of terms related to accurate mass analysis that are in current usage, many employed incorrectly or inconsistently. This article is based on a set of notes prepared by the authors for research students and staff in our laboratories as a guide to the correct terminology and basic statistical procedures to apply in relation to mass measurement, particularly for accurate mass measurement. It elaborates on the editorial by Gross in 1994 regarding the use of accurate masses for structure confirmation. We have presented and defined the main terms in use with reference to the International Union of Pure and Applied Chemistry (IUPAC) recommendations for nomenclature and symbolism for mass spectrometry. The correct use of statistics and treatment of data is illustrated as a guide to new and existing mass spectrometry users with a series of examples as well as statistical methods to compare different experimental methods and datasets. Copyright © 2010. Published by Elsevier Inc.

  11. Accurate Mass Fragment Library for Rapid Analysis of Pesticides on Produce Using Ambient Pressure Desorption Ionization with High-Resolution Mass Spectrometry

    NASA Astrophysics Data System (ADS)

    Kern, Sara E.; Lin, Lora A.; Fricke, Frederick L.

    2014-08-01

    U.S. food imports have been increasing steadily for decades, intensifying the need for a rapid and sensitive screening technique. A method has been developed that uses foam disks to sample the surface of incoming produce. This work provides complementary information to the extensive amount of published pesticide fragmentation data collected using LCMS systems (Sack et al. Journal of Agricultural and Food Chemistry, 59, 6383-6411, 2011; Mol et al. Analytical and Bioanalytical Chemistry, 403, 2891-2908, 2012). The disks are directly analyzed using transmission-mode direct analysis in real time (DART) ambient pressure desorption ionization coupled to a high resolution accurate mass-mass spectrometer (HRAM-MS). In order to provide more certainty in the identification of the pesticides detected, a library of accurate mass fragments and isotopes of the protonated parent molecular ion (the [M+H]+) has been developed. The HRAM-MS is equipped with a quadrupole mass filter, providing the capability of "data-dependent" fragmentation, as opposed to "all-ion" fragmentation (where all of the ions enter a collision chamber and are fragmented at once). A temperature gradient for the DART helium stream and multiple collision energies were employed to detect and fragment 164 pesticides of varying chemical classes, sizes, and polarities. The accurate mass information of precursor ([M+H]+ ion) and fragment ions is essential in correctly identifying chemical contaminants on the surface of imported produce. Additionally, the inclusion of isotopes of the [M+H]+ in the database adds another metric to the confirmation process. The fragmentation data were collected using a Q-Exactive mass spectrometer and were added to a database used to process data collected with an Exactive mass spectrometer, an instrument that is more readily available for this screening application. The commodities investigated range from smooth-skinned produce such as apples to rougher surfaces like broccoli. The

  12. Screening of 485 Pesticide Residues in Fruits and Vegetables by Liquid Chromatography-Quadrupole-Time-of-Flight Mass Spectrometry Based on TOF Accurate Mass Database and QTOF Spectrum Library.

    PubMed

    Pang, Guo-Fang; Fan, Chun-Lin; Chang, Qiao-Ying; Li, Jian-Xun; Kang, Jian; Lu, Mei-Ling

    2018-03-22

    This paper uses the LC-quadrupole-time-of-flight MS technique to evaluate the behavioral characteristics of MS of 485 pesticides under different conditions and has developed an accurate mass database and spectra library. A high-throughput screening and confirmation method has been developed for the 485 pesticides in fruits and vegetables. Through the optimization of parameters such as accurate mass number, time of retention window, ionization forms, etc., the method has improved the accuracy of pesticide screening, thus avoiding the occurrence of false-positive and false-negative results. The method features a full scan of fragments, with 80% of pesticide qualitative points over 10, which helps increase pesticide qualitative accuracy. The abundant differences of fragment categories help realize the effective separation and qualitative identification of isomer pesticides. Four different fruits and vegetables (apples, grapes, celery, and tomatoes) were chosen to evaluate the efficiency of the method at three fortification levels of 5, 10, and 20 μg/kg, and satisfactory results were obtained. With this method, a national survey of pesticide residues was conducted between 2012 and 2015 for 12 551 samples of 146 different fruits and vegetables collected from 638 sampling points in 284 counties across 31 provincial capitals/cities directly under the central government, which provided scientific data backup for ensuring pesticide residue safety of the fruits and vegetables consumed daily by the public. Meanwhile, the big data statistical analysis of the new technique also further proves it to be of high speed, high throughput, high accuracy, high reliability, and high informatization.

  13. A new, high-resolution global mass coral bleaching database

    PubMed Central

    Rickbeil, Gregory J. M.; Heron, Scott F.

    2017-01-01

    Episodes of mass coral bleaching have been reported in recent decades and have raised concerns about the future of coral reefs on a warming planet. Despite the efforts to enhance and coordinate coral reef monitoring within and across countries, our knowledge of the geographic extent of mass coral bleaching over the past few decades is incomplete. Existing databases, like ReefBase, are limited by the voluntary nature of contributions, geographical biases in data collection, and the variations in the spatial scale of bleaching reports. In this study, we have developed the first-ever gridded, global-scale historical coral bleaching database. First, we conducted a targeted search for bleaching reports not included in ReefBase by personally contacting scientists and divers conducting monitoring in under-reported locations and by extracting data from the literature. This search increased the number of observed bleaching reports by 79%, from 4146 to 7429. Second, we employed spatial interpolation techniques to develop annual 0.04° × 0.04° latitude-longitude global maps of the probability that bleaching occurred for 1985 through 2010. Initial results indicate that the area of coral reefs with a more likely than not (>50%) or likely (>66%) probability of bleaching was eight times higher in the second half of the assessed time period, after the 1997/1998 El Niño. The results also indicate that annual maximum Degree Heating Weeks, a measure of thermal stress, for coral reefs with a high probability of bleaching increased over time. The database will help the scientific community more accurately assess the change in the frequency of mass coral bleaching events, validate methods of predicting mass coral bleaching, and test whether coral reefs are adjusting to rising ocean temperatures. PMID:28445534

  14. A new, high-resolution global mass coral bleaching database.

    PubMed

    Donner, Simon D; Rickbeil, Gregory J M; Heron, Scott F

    2017-01-01

    Episodes of mass coral bleaching have been reported in recent decades and have raised concerns about the future of coral reefs on a warming planet. Despite the efforts to enhance and coordinate coral reef monitoring within and across countries, our knowledge of the geographic extent of mass coral bleaching over the past few decades is incomplete. Existing databases, like ReefBase, are limited by the voluntary nature of contributions, geographical biases in data collection, and the variations in the spatial scale of bleaching reports. In this study, we have developed the first-ever gridded, global-scale historical coral bleaching database. First, we conducted a targeted search for bleaching reports not included in ReefBase by personally contacting scientists and divers conducting monitoring in under-reported locations and by extracting data from the literature. This search increased the number of observed bleaching reports by 79%, from 4146 to 7429. Second, we employed spatial interpolation techniques to develop annual 0.04° × 0.04° latitude-longitude global maps of the probability that bleaching occurred for 1985 through 2010. Initial results indicate that the area of coral reefs with a more likely than not (>50%) or likely (>66%) probability of bleaching was eight times higher in the second half of the assessed time period, after the 1997/1998 El Niño. The results also indicate that annual maximum Degree Heating Weeks, a measure of thermal stress, for coral reefs with a high probability of bleaching increased over time. The database will help the scientific community more accurately assess the change in the frequency of mass coral bleaching events, validate methods of predicting mass coral bleaching, and test whether coral reefs are adjusting to rising ocean temperatures.

  15. Mass spectrometry-based protein identification with accurate statistical significance assignment.

    PubMed

    Alves, Gelio; Yu, Yi-Kuo

    2015-03-01

    Assigning statistical significance accurately has become increasingly important as metadata of many types, often assembled in hierarchies, are constructed and combined for further biological analyses. Statistical inaccuracy of metadata at any level may propagate to downstream analyses, undermining the validity of scientific conclusions thus drawn. From the perspective of mass spectrometry-based proteomics, even though accurate statistics for peptide identification can now be achieved, accurate protein level statistics remain challenging. We have constructed a protein ID method that combines peptide evidences of a candidate protein based on a rigorous formula derived earlier; in this formula the database P-value of every peptide is weighted, prior to the final combination, according to the number of proteins it maps to. We have also shown that this protein ID method provides accurate protein level E-value, eliminating the need of using empirical post-processing methods for type-I error control. Using a known protein mixture, we find that this protein ID method, when combined with the Sorić formula, yields accurate values for the proportion of false discoveries. In terms of retrieval efficacy, the results from our method are comparable with other methods tested. The source code, implemented in C++ on a linux system, is available for download at ftp://ftp.ncbi.nlm.nih.gov/pub/qmbp/qmbp_ms/RAId/RAId_Linux_64Bit. Published by Oxford University Press 2014. This work is written by US Government employees and is in the public domain in the US.

  16. Libraries of Peptide Fragmentation Mass Spectra Database

    National Institute of Standards and Technology Data Gateway

    SRD 1C NIST Libraries of Peptide Fragmentation Mass Spectra Database (Web, free access)   The purpose of the library is to provide peptide reference data for laboratories employing mass spectrometry-based proteomics methods for protein analysis. Mass spectral libraries identify these compounds in a more sensitive and robust manner than alternative methods. These databases are freely available for testing and development of new applications.

  17. The BDNYC database of low-mass stars, brown dwarfs, and planetary mass companions

    NASA Astrophysics Data System (ADS)

    Cruz, Kelle; Rodriguez, David; Filippazzo, Joseph; Gonzales, Eileen; Faherty, Jacqueline K.; Rice, Emily; BDNYC

    2018-01-01

    We present a web-interface to a database of low-mass stars, brown dwarfs, and planetary mass companions. Users can send SELECT SQL queries to the database, perform searches by coordinates or name, check the database inventory on specified objects, and even plot spectra interactively. The initial version of this database contains information for 198 objects and version 2 will contain over 1000 objects. The database currently includes photometric data from 2MASS, WISE, and Spitzer and version 2 will include a significant portion of the publicly available optical and NIR spectra for brown dwarfs. The database is maintained and curated by the BDNYC research group and we welcome contributions from other researchers via GitHub.

  18. Evaluation of MALDI-TOF mass spectrometry for identification of environmental yeasts and development of supplementary database.

    PubMed

    Agustini, Bruna Carla; Silva, Luciano Paulino; Bloch, Carlos; Bonfim, Tania M B; da Silva, Gildo Almeida

    2014-06-01

    Yeast identification using traditional methods, which employ morphological, physiological, and biochemical characteristics, can be considered a hard task, as it requires experienced microbiologists and rigorous control of culture conditions that could otherwise lead to different outcomes. Considering clinical or industrial applications, the fast and accurate identification of microorganisms is a growing demand. Hence, molecular biology approaches have been extensively used and, more recently, protein profiling using matrix-assisted laser desorption/ionization time-of-flight mass spectrometry (MALDI-TOF MS) has proved to be an even more efficient tool for taxonomic purposes. Nonetheless, with regard to mass spectrometry, the data available for the differentiation of yeast species for industrial purposes are limited, and commercially available reference databases comprise almost exclusively clinical microorganisms. In this context, studies focusing on environmental isolates are required to extend the existing databases. The development of a supplementary database and the assessment of a commercial database for taxonomic identification of environmental yeasts are the aims of this study. We challenged MALDI-TOF MS to create protein profiles for 845 yeast strains isolated from grape must, and 67.7 % of the strains were successfully identified according to the previously available manufacturer database. The remaining 32.3 % of strains were not identified due to the absence of a reference spectrum. After matching the correct taxon for these strains by using molecular biology approaches, the spectra for the missing species were added to a supplementary database. This new library was able to accurately predict previously unidentified species at first instance by MALDI-TOF MS, proving it is a powerful tool for the identification of environmental yeasts.

  19. Analysis of hydraulic fracturing flowback and produced waters using accurate mass: identification of ethoxylated surfactants.

    PubMed

    Thurman, E Michael; Ferrer, Imma; Blotevogel, Jens; Borch, Thomas

    2014-10-07

    Two series of ethylene oxide (EO) surfactants, polyethylene glycols (PEGs from EO3 to EO33) and linear alkyl ethoxylates (LAEs C-9 to C-15 with EO3-EO28), were identified in hydraulic fracturing flowback and produced water using a new application of the Kendrick mass defect and liquid chromatography/quadrupole-time-of-flight mass spectrometry. The Kendrick mass defect differentiates the proton, ammonium, and sodium adducts in both singly and doubly charged forms. A structural model of adduct formation is presented, and binding constants are calculated, which is based on a spherical cagelike conformation, where the central cation (NH4(+) or Na(+)) is coordinated with ether oxygens. A major purpose of the study was the identification of the ethylene oxide (EO) surfactants and the construction of a database with accurate masses and retention times in order to unravel the mass spectral complexity of surfactant mixtures used in hydraulic fracturing fluids. For example, over 500 accurate mass assignments are made in a few seconds of computer time, which then is used as a fingerprint chromatogram of the water samples. This technique is applied to a series of flowback and produced water samples to illustrate the usefulness of ethoxylate "fingerprinting", in a first application to monitor water quality that results from fluids used in hydraulic fracturing.
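
    The Kendrick mass defect mentioned in this abstract can be illustrated with a short Python sketch. It rescales the mass axis so that one repeat unit has an exact integer mass, which makes all members of a homologous series share the same defect. The sketch assumes the ethylene oxide unit (C2H4O, nominal mass 44) as the Kendrick base and uses one common sign convention (rounded Kendrick mass minus Kendrick mass); the abstract does not state which base or convention the authors used, and the function names are hypothetical.

```python
# Monoisotopic mass of one ethylene oxide unit, C2H4O (Da).
EO_EXACT = 2 * 12.0 + 4 * 1.00782503 + 15.9949146
EO_NOMINAL = 44

WATER = 2 * 1.00782503 + 15.9949146  # H2O end groups of a PEG chain
SODIUM_ION = 22.98922070             # Na+ (atom mass minus one electron)


def kendrick_mass_defect(mz, base_exact=EO_EXACT, base_nominal=EO_NOMINAL):
    """Kendrick mass defect of mz on the chosen repeat-unit scale."""
    kendrick_mass = mz * base_nominal / base_exact
    return round(kendrick_mass) - kendrick_mass


def peg_sodium_adduct_mz(n_units):
    """m/z of the singly charged [M+Na]+ ion of PEG with n_units EO units."""
    return WATER + n_units * EO_EXACT + SODIUM_ION


if __name__ == "__main__":
    for n in (3, 10, 33):
        mz = peg_sodium_adduct_mz(n)
        print(n, round(mz, 4), round(kendrick_mass_defect(mz), 4))
```

    Because every member of the series gives the same defect, grouping peaks by defect separates one adduct series from another, which matches the abstract's statement that the defect differentiates proton, ammonium, and sodium adducts.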

  20. Integrated database for rapid mass movements in Norway

    NASA Astrophysics Data System (ADS)

    Jaedicke, C.; Lied, K.; Kronholm, K.

    2009-03-01

    Rapid gravitational slope mass movements include all kinds of short term relocation of geological material, snow or ice. Traditionally, information about such events is collected separately in different databases covering selected geographical regions and types of movement. In Norway the terrain is susceptible to all types of rapid gravitational slope mass movements ranging from single rocks hitting roads and houses to large snow avalanches and rock slides where entire mountainsides collapse into fjords creating flood waves and endangering large areas. In addition, quick clay slides occur in desalinated marine sediments in South Eastern and Mid Norway. For the authorities and inhabitants of endangered areas, the type of threat is of minor importance and mitigation measures have to consider several types of rapid mass movements simultaneously. An integrated national database for all types of rapid mass movements built around individual events has been established. Only three data entries are mandatory: time, location and type of movement. The remaining optional parameters enable recording of detailed information about the terrain, materials involved and damages caused. Pictures, movies and other documentation can be uploaded into the database. A web-based graphical user interface has been developed allowing new events to be entered, as well as editing and querying for all events. An integration of the database into a GIS system is currently under development. Datasets from various national sources like the road authorities and the Geological Survey of Norway were imported into the database. Today, the database contains 33 000 rapid mass movement events from the last five hundred years covering the entire country. A first analysis of the data shows that the most frequent types of recorded rapid mass movement are rock slides and snow avalanches, followed by debris slides in third place. Most events are recorded in the steep fjord terrain of the Norwegian west coast, but

  1. A Comprehensive Strategy to Construct In-house Database for Accurate and Batch Identification of Small Molecular Metabolites.

    PubMed

    Zhao, Xinjie; Zeng, Zhongda; Chen, Aiming; Lu, Xin; Zhao, Chunxia; Hu, Chunxiu; Zhou, Lina; Liu, Xinyu; Wang, Xiaolin; Hou, Xiaoli; Ye, Yaorui; Xu, Guowang

    2018-05-29

    Identification of metabolites is an essential step in metabolomics studies for interpreting the regulatory mechanisms of pathological and physiological processes. However, it remains a major challenge in LC-MSn-based studies because of the complexity of mass spectrometry, the chemical diversity of metabolites, and the lack of adequate standards databases. In this work, a comprehensive strategy is developed for accurate and batch metabolite identification in non-targeted metabolomics studies. First, a well-defined procedure was applied to generate reliable, standardized LC-MS2 data, including tR, MS1 and MS2 information, following a standard operating procedure (SOP). An in-house database of about 2000 metabolites was constructed and used to identify metabolites in non-targeted metabolic profiling by retention time calibration using internal standards, precursor ion alignment and ion fusion, auto-MS2 information extraction and selection, and database batch searching and scoring. As an application example, a pooled serum sample was analyzed to demonstrate the strategy; 202 metabolites were identified in the positive ion mode. The results show that the strategy is useful for LC-MSn-based non-targeted metabolomics studies.

  2. SwePep, a database designed for endogenous peptides and mass spectrometry.

    PubMed

    Fälth, Maria; Sköld, Karl; Norrman, Mathias; Svensson, Marcus; Fenyö, David; Andren, Per E

    2006-06-01

    A new database, SwePep, specifically designed for endogenous peptides, has been constructed to significantly speed up the identification process from complex tissue samples utilizing mass spectrometry. In the identification process the experimental peptide masses are compared with the peptide masses stored in the database both with and without possible post-translational modifications. This intermediate identification step is fast and singles out peptides that are potential endogenous peptides and can later be confirmed with tandem mass spectrometry data. Successful applications of this methodology are presented. The SwePep database is a relational database developed using MySQL and Java. The database contains 4180 annotated endogenous peptides from different tissues originating from 394 different species as well as 50 novel peptides from brain tissue identified in our laboratory. Information about the peptides, including mass, isoelectric point, sequence, and precursor protein, is also stored in the database. This new approach holds great potential for removing the bottleneck that occurs during the identification process in the field of peptidomics. The SwePep database is available to the public.
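
    The mass comparison "with and without possible post-translational modifications" can be illustrated with a minimal sketch. This is not the SwePep implementation: the function name, the tolerance, the choice of modifications, and the stored masses are all assumptions made for demonstration. Only the monoisotopic mass shifts are standard values.

```python
# Hypothetical illustration; not the SwePep implementation.
# Monoisotopic mass shifts (Da) for a few common modifications.
PTM_SHIFTS = {
    "none": 0.0,
    "acetylation": 42.010565,
    "amidation": -0.984016,
    "phosphorylation": 79.966331,
}


def match_peptides(observed_mass, stored, tol_da=0.01):
    """Return (peptide, modification) pairs whose mass, with or without
    a modification, lies within tol_da of the observed mass."""
    hits = []
    for name, mass in stored.items():
        for ptm, shift in PTM_SHIFTS.items():
            if abs(mass + shift - observed_mass) <= tol_da:
                hits.append((name, ptm))
    return hits


# Made-up stored masses, for demonstration only.
stored = {"peptide_A": 1000.5000, "peptide_B": 1042.5106}
hits = match_peptides(1042.5106, stored)
```

    Here one observed mass is consistent with two candidates (an acetylated peptide_A and an unmodified peptide_B), which is why such a mass-only step singles out candidates that still need confirmation by tandem mass spectrometry.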

  3. A statistical method for assessing peptide identification confidence in accurate mass and time tag proteomics

    PubMed Central

    Stanley, Jeffrey R.; Adkins, Joshua N.; Slysz, Gordon W.; Monroe, Matthew E.; Purvine, Samuel O.; Karpievitch, Yuliya V.; Anderson, Gordon A.; Smith, Richard D.; Dabney, Alan R.

    2011-01-01

    Current algorithms for quantifying peptide identification confidence in the accurate mass and time (AMT) tag approach assume that the AMT tags themselves have been correctly identified. However, there is uncertainty in the identification of AMT tags, as this is based on matching LC-MS/MS fragmentation spectra to peptide sequences. In this paper, we incorporate confidence measures for the AMT tag identifications into the calculation of probabilities for correct matches to an AMT tag database, resulting in a more accurate overall measure of identification confidence for the AMT tag approach. The method is referred to as Statistical Tools for AMT tag Confidence (STAC). STAC additionally provides a Uniqueness Probability (UP) to help distinguish between multiple matches to an AMT tag and a method to calculate an overall false discovery rate (FDR). STAC is freely available for download as both a command line and a Windows graphical application. PMID:21692516
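
    As a rough illustration of how per-match probabilities can yield an overall false discovery rate, the sketch below uses a common generic estimate: the mean of (1 - probability) over the accepted matches. This is an assumption for illustration and is not claimed to be the calculation STAC performs; the function name, threshold, and probabilities are hypothetical.

```python
def estimated_fdr(probabilities, threshold):
    """Estimate the FDR among matches accepted at a probability threshold.

    Each accepted match with probability p of being correct contributes
    an expected (1 - p) false discoveries.
    """
    accepted = [p for p in probabilities if p >= threshold]
    if not accepted:
        return 0.0
    return sum(1.0 - p for p in accepted) / len(accepted)


# Made-up match probabilities, for demonstration only.
probs = [0.99, 0.95, 0.90, 0.60, 0.30]
fdr = estimated_fdr(probs, threshold=0.9)
```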

  4. The volatile compound BinBase mass spectral database.

    PubMed

    Skogerson, Kirsten; Wohlgemuth, Gert; Barupal, Dinesh K; Fiehn, Oliver

    2011-08-04

    Volatile compounds comprise diverse chemical groups with wide-ranging sources and functions. These compounds originate from major pathways of secondary metabolism in many organisms and play essential roles in chemical ecology in both plant and animal kingdoms. In past decades, sampling methods and instrumentation for the analysis of complex volatile mixtures have improved; however, design and implementation of database tools to process and store the complex datasets have lagged behind. The volatile compound BinBase (vocBinBase) is an automated peak annotation and database system developed for the analysis of GC-TOF-MS data derived from complex volatile mixtures. The vocBinBase DB is an extension of the previously reported metabolite BinBase software developed to track and identify derivatized metabolites. The BinBase algorithm uses deconvoluted spectra and peak metadata (retention index, unique ion, spectral similarity, peak signal-to-noise ratio, and peak purity) from the Leco ChromaTOF software, and annotates peaks using a multi-tiered filtering system with stringent thresholds. The vocBinBase algorithm assigns the identity of compounds existing in the database. Volatile compound assignments are supported by the Adams mass spectral-retention index library, which contains over 2,000 plant-derived volatile compounds. Novel molecules that are not found within vocBinBase are automatically added using strict mass spectral and experimental criteria. Users obtain fully annotated data sheets with quantitative information for all volatile compounds for studies that may consist of thousands of chromatograms. The vocBinBase database may also be queried across different studies, comprising currently 1,537 unique mass spectra generated from 1.7 million deconvoluted mass spectra of 3,435 samples (18 species). Mass spectra with retention indices and volatile profiles are available as free download under the CC-BY agreement (http://vocbinbase.fiehnlab.ucdavis.edu). The Bin

  5. The volatile compound BinBase mass spectral database

    PubMed Central

    2011-01-01

    Background: Volatile compounds comprise diverse chemical groups with wide-ranging sources and functions. These compounds originate from major pathways of secondary metabolism in many organisms and play essential roles in chemical ecology in both plant and animal kingdoms. In past decades, sampling methods and instrumentation for the analysis of complex volatile mixtures have improved; however, design and implementation of database tools to process and store the complex datasets have lagged behind. Description: The volatile compound BinBase (vocBinBase) is an automated peak annotation and database system developed for the analysis of GC-TOF-MS data derived from complex volatile mixtures. The vocBinBase DB is an extension of the previously reported metabolite BinBase software developed to track and identify derivatized metabolites. The BinBase algorithm uses deconvoluted spectra and peak metadata (retention index, unique ion, spectral similarity, peak signal-to-noise ratio, and peak purity) from the Leco ChromaTOF software, and annotates peaks using a multi-tiered filtering system with stringent thresholds. The vocBinBase algorithm assigns the identity of compounds existing in the database. Volatile compound assignments are supported by the Adams mass spectral-retention index library, which contains over 2,000 plant-derived volatile compounds. Novel molecules that are not found within vocBinBase are automatically added using strict mass spectral and experimental criteria. Users obtain fully annotated data sheets with quantitative information for all volatile compounds for studies that may consist of thousands of chromatograms. The vocBinBase database may also be queried across different studies, comprising currently 1,537 unique mass spectra generated from 1.7 million deconvoluted mass spectra of 3,435 samples (18 species). Mass spectra with retention indices and volatile profiles are available as free download under the CC-BY agreement (http

  6. Fast and accurate mock catalogue generation for low-mass galaxies

    NASA Astrophysics Data System (ADS)

    Koda, Jun; Blake, Chris; Beutler, Florian; Kazin, Eyal; Marin, Felipe

    2016-06-01

    We present an accurate and fast framework for generating mock catalogues including low-mass haloes, based on an implementation of the COmoving Lagrangian Acceleration (COLA) technique. Multiple realisations of mock catalogues are crucial for analyses of large-scale structure, but conventional N-body simulations are too computationally expensive for the production of thousands of realizations. We show that COLA simulations can produce accurate mock catalogues with a moderate computation resource for low- to intermediate-mass galaxies in 10^12 M⊙ haloes, both in real and redshift space. COLA simulations have accurate peculiar velocities, without systematic errors in the velocity power spectra for k ≤ 0.15 h Mpc^-1, and with only 3 per cent error for k ≤ 0.2 h Mpc^-1. We use COLA with 10 time steps and a Halo Occupation Distribution to produce 600 mock galaxy catalogues of the WiggleZ Dark Energy Survey. Our parallelized code for efficient generation of accurate halo catalogues is publicly available at github.com/junkoda/cola_halo.

  7. Metabolite signal identification in accurate mass metabolomics data with MZedDB, an interactive m/z annotation tool utilising predicted ionisation behaviour 'rules'

    PubMed Central

    Draper, John; Enot, David P; Parker, David; Beckmann, Manfred; Snowdon, Stuart; Lin, Wanchang; Zubair, Hassan

    2009-01-01

    Background: Metabolomics experiments using Mass Spectrometry (MS) technology measure the mass to charge ratio (m/z) and intensity of ionised molecules in crude extracts of complex biological samples to generate high dimensional metabolite 'fingerprint' or metabolite 'profile' data. High resolution MS instruments routinely perform with a mass accuracy of < 5 ppm (parts per million), thus potentially providing a direct method for putative signal annotation using databases containing metabolite mass information. Most database interfaces support only simple queries with the default assumption that molecules either gain or lose a single proton when ionised. In reality the annotation process is confounded by the fact that many ionisation products will be not only molecular isotopes but also salt/solvent adducts and neutral loss fragments of original metabolites. This report describes an annotation strategy that will allow searching based on all potential ionisation products predicted to form during electrospray ionisation (ESI). Results: Metabolite 'structures' harvested from publicly accessible databases were converted into a common format to generate a comprehensive archive in MZedDB. 'Rules' were derived from chemical information that allowed MZedDB to generate a list of adducts and neutral loss fragments putatively able to form for each structure and calculate, on the fly, the exact molecular weight of every potential ionisation product to provide targets for annotation searches based on accurate mass. We demonstrate that data matrices representing populations of ionisation products generated from different biological matrices contain a large proportion (sometimes > 50%) of molecular isotopes, salt adducts and neutral loss fragments. Correlation analysis of ESI-MS data features confirmed the predicted relationships of m/z signals. An integrated isotope enumerator in MZedDB allowed verification of exact isotopic pattern distributions to corroborate experimental data

  8. Mass and Reliability Source (MaRS) Database

    NASA Technical Reports Server (NTRS)

    Valdenegro, Wladimir

    2017-01-01

    The Mass and Reliability Source (MaRS) Database consolidates component mass and reliability data for all Orbital Replacement Units (ORU) on the International Space Station (ISS) into a single database. It was created to help engineers develop a parametric model that relates hardware mass and reliability. MaRS supplies relevant failure data at the lowest possible component level while providing support for risk, reliability, and logistics analysis. Random-failure data is usually linked to the ORU assembly. MaRS uses this data to identify and display the lowest possible component failure level. As seen in Figure 1, the failure point is identified to the lowest level: Component 2.1. This is useful for efficient planning of spare supplies, supporting long duration crewed missions, allowing quicker trade studies, and streamlining diagnostic processes. MaRS is composed of information from various databases: MADS (operating hours), VMDB (indentured part lists), and ISS PART (failure data). This information is organized in Microsoft Excel and accessed through a program made in Microsoft Access (Figure 2). The focus of the Fall 2017 internship tour was to identify the components that were the root cause of failure from the given random-failure data, develop a taxonomy for the database, and attach material headings to the component list. Secondary objectives included verifying the integrity of the data in MaRS, eliminating any part discrepancies, and generating documentation for future reference. Due to the nature of the random-failure data, data mining had to be done manually without the assistance of an automated program to ensure positive identification.

  9. Molecular formula and METLIN Personal Metabolite Database matching applied to the identification of compounds generated by LC/TOF-MS.

    PubMed

    Sana, Theodore R; Roark, Joseph C; Li, Xiangdong; Waddell, Keith; Fischer, Steven M

    2008-09-01

    In an effort to simplify and streamline compound identification from metabolomics data generated by liquid chromatography time-of-flight mass spectrometry, we have created software for constructing Personalized Metabolite Databases with content from over 15,000 compounds pulled from the public METLIN database (http://metlin.scripps.edu/). Moreover, we have added extra functionalities to the database that (a) permit the addition of user-defined retention times as an orthogonal searchable parameter to complement accurate mass data; and (b) allow interfacing to separate software, a Molecular Formula Generator (MFG), that facilitates reliable interpretation of any database matches from the accurate mass spectral data. To test the utility of this identification strategy, we added retention times to a subset of masses in this database, representing a mixture of 78 synthetic urine standards. The synthetic mixture was analyzed and screened against this METLIN urine database, resulting in 46 accurate mass and retention time matches. Human urine samples were subsequently analyzed under the same analytical conditions and screened against this database. A total of 1387 ions were detected in human urine; 16 of these ions matched both accurate mass and retention time parameters for the 78 urine standards in the database. Another 374 had only an accurate mass match to the database, with 163 of those masses also having the highest MFG score. Furthermore, MFG calculated a formula for a further 849 ions that had no match to the database. Taken together, these results suggest that the METLIN Personal Metabolite database and MFG software offer a robust strategy for confirming the formula of database matches. In the event of no database match, it also suggests possible formulas that may be helpful in interpreting the experimental results.
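
    The screening described above, accurate mass with retention time as an optional orthogonal parameter, can be sketched as follows. This is a minimal illustration and not the METLIN Personal Metabolite Database software: the function names, the tolerances, and the retention times are assumptions. The neutral monoisotopic masses of caffeine and creatinine are standard values.

```python
def ppm_error(observed, theoretical):
    """Relative mass deviation in parts per million."""
    return (observed - theoretical) / theoretical * 1e6


def match(ion_mass, ion_rt, database, ppm_tol=5.0, rt_tol=0.2):
    """Return (name, match_type) for entries within the mass tolerance.

    Entries that also carry a retention time within rt_tol minutes are
    reported as "mass+rt"; all other mass matches as "mass only".
    """
    results = []
    for name, (mass, rt) in database.items():
        if abs(ppm_error(ion_mass, mass)) > ppm_tol:
            continue
        rt_ok = rt is not None and abs(ion_rt - rt) <= rt_tol
        results.append((name, "mass+rt" if rt_ok else "mass only"))
    return results


# name: (neutral monoisotopic mass in Da, retention time in min or None).
# Retention times are made up for demonstration.
database = {
    "caffeine": (194.080376, 5.10),
    "creatinine": (113.058912, None),
}

both = match(194.0806, 5.15, database)
mass_only = match(113.0590, 2.00, database)
```

    An entry without a stored retention time can only ever produce a mass-only match, which mirrors the distinction the abstract draws between ions matching both parameters and ions matching accurate mass alone.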

  10. Application of kernel functions for accurate similarity search in large chemical databases.

    PubMed

    Wang, Xiaohong; Huan, Jun; Smalter, Aaron; Lushington, Gerald H

    2010-04-29

    Similarity search in chemical structure databases is an important problem with many applications in chemical genomics, drug design, and efficient chemical probe screening among others. It is widely believed that structure based methods provide an efficient way to do the query. Recently various graph kernel functions have been designed to capture the intrinsic similarity of graphs. Though successful in constructing accurate predictive and classification models, graph kernel functions cannot be applied to large chemical compound databases due to the high computational complexity and the difficulties in indexing similarity search for large databases. To bridge graph kernel function and similarity search in chemical databases, we applied a novel kernel-based similarity measurement, developed in our team, to measure similarity of graph represented chemicals. In our method, we utilize a hash table to support new graph kernel function definition, efficient storage and fast search. We have applied our method, named G-hash, to large chemical databases. Our results show that the G-hash method achieves state-of-the-art performance for k-nearest neighbor (k-NN) classification. Moreover, the similarity measurement and the index structure are scalable to large chemical databases, with smaller indexing size and faster query processing time than state-of-the-art indexing methods such as Daylight fingerprints, C-tree and GraphGrep. Efficient similarity query processing for large chemical databases is challenging since we need to balance running time efficiency and similarity search accuracy. Our previous similarity search method, G-hash, provides a new way to perform similarity search in chemical databases. Experimental study validates the utility of G-hash in chemical databases.

  11. A Web-based Tool for SDSS and 2MASS Database Searches

    NASA Astrophysics Data System (ADS)

    Hendrickson, M. A.; Uomoto, A.; Golimowski, D. A.

    We have developed a web site using HTML, PHP, Python, and MySQL that extracts, processes, and displays data from the Sloan Digital Sky Survey (SDSS) and the Two-Micron All-Sky Survey (2MASS). The goal is to locate brown dwarf candidates in the SDSS database by looking at color cuts; however, this site could also be useful for targeted searches of other databases. MySQL databases are created from broad searches of SDSS and 2MASS data. Broad queries on the SDSS and 2MASS database servers are run weekly so that observers have the most up-to-date information from which to select candidates for observation. Observers can look at detailed information about specific objects including finding charts, images, and available spectra. In addition, updates from previous observations can be added by any collaborators; this format makes observational collaboration simple. Observers can also restrict the database search, just before or during an observing run, to select objects of special interest.

  12. Development and practical application of a library of CID accurate mass spectra of more than 2,500 toxic compounds for systematic toxicological analysis by LC-QTOF-MS with data-dependent acquisition.

    PubMed

    Broecker, Sebastian; Herre, Sieglinde; Wüst, Bernhard; Zweigenbaum, Jerry; Pragst, Fritz

    2011-04-01

    A library of collision-induced dissociation (CID) accurate mass spectra has been developed for efficient use of liquid chromatography in combination with hybrid quadrupole time-of-flight mass spectrometry (LC-QTOF-MS) as a tool in systematic toxicological analysis. The mass spectra (Δm < 3 ppm) of more than 2,500 illegal and therapeutic drugs, pesticides, alkaloids, other toxic chemicals and metabolites were measured, by use of an Agilent 6530 instrument, by flow-injection of 1 ng of the pure substances in aqueous ammonium formate-formic acid-methanol, with positive and negative electrospray-ionization (ESI), selection of the protonated or deprotonated molecules [M+H]+ or [M-H]- by the quadrupole, and collision induced dissociation (CID) with nitrogen as collision gas at CID energies of 10, 20, and 40 eV. The fragment mass spectra were controlled for structural plausibility, corrected by recalculation to the theoretical fragment masses and added to a database of accurate mass data and molecular formulas of more than 7,500 toxicologically relevant substances to form the "database and library of toxic compounds". For practical evaluation, blood and urine samples were spiked with a mixture of 33 drugs at seven concentrations between 0.5 and 500 ng mL^-1, prepared by dichloromethane extraction or protein precipitation, and analyzed by LC-QTOF-MS in data-dependent acquisition mode. Unambiguous identification by library search was possible for typical basic drugs down to 0.5-2 ng mL^-1 and for benzodiazepines down to 2-20 ng mL^-1. The efficiency of the method was also demonstrated by re-analysis of venous blood samples from 50 death cases and comparison with previous results. In conclusion, LC-QTOF-MS in data-dependent acquisition mode combined with an accurate mass database and CID spectra library seemed to be one of the most efficient tools for systematic toxicological analysis.

  13. MASS MEASUREMENTS BY AN ACCURATE AND SENSITIVE SELECTED ION RECORDING TECHNIQUE

    EPA Science Inventory

    Trace-level components of mixtures were successfully identified or confirmed by mass spectrometric accurate mass measurements, made at high resolution with selected ion recording, using GC and LC sample introduction. Measurements were made at 20 000 or 10 000 resolution, respecti...

  14. Identification of phlebotomine sand flies using one MALDI-TOF MS reference database and two mass spectrometer systems.

    PubMed

    Mathis, Alexander; Depaquit, Jérôme; Dvořák, Vit; Tuten, Holly; Bañuls, Anne-Laure; Halada, Petr; Zapata, Sonia; Lehrter, Véronique; Hlavačková, Kristýna; Prudhomme, Jorian; Volf, Petr; Sereno, Denis; Kaufmann, Christian; Pflüger, Valentin; Schaffner, Francis

    2015-05-10

    Rapid, accurate and high-throughput identification of vector arthropods is of paramount importance in surveillance programmes that are becoming more common due to the changing geographic occurrence and extent of many arthropod-borne diseases. Protein profiling by MALDI-TOF mass spectrometry fulfils these requirements for identification, and reference databases have recently been established for several vector taxa, mostly with specimens from laboratory colonies. We established and validated a reference database containing 20 phlebotomine sand fly (Diptera: Psychodidae, Phlebotominae) species by using specimens from colonies or field-collections that had been stored for various periods of time. Identical biomarker mass patterns ('superspectra') were obtained with colony- or field-derived specimens of the same species. In the validation study, high quality spectra (i.e. more than 30 evaluable masses) were obtained with all fresh insects from colonies, and with 55/59 insects deep-frozen (liquid nitrogen/-80 °C) for up to 25 years. In contrast, only 36/52 specimens stored in ethanol could be identified. This resulted in an overall sensitivity of 87 % (140/161); specificity was 100 %. Duration of storage impaired data counts in the high mass range, and thus cluster analyses of closely related specimens might reflect their storage conditions rather than phenotypic distinctness. A major drawback of MALDI-TOF MS is the restricted availability of in-house databases and the fact that mass spectrometers from 2 companies (Bruker, Shimadzu) are widely being used. We have analysed fingerprints of phlebotomine sand flies obtained by automatic routine procedure on a Bruker instrument by using our database and the software established on a Shimadzu system. The sensitivity with 312 specimens from 8 sand fly species from laboratory colonies when evaluating only high quality spectra was 98.3 %; the specificity was 100 %. The corresponding diagnostic values with 55 field

  15. Building phytochemical mass spec identification protocols and database libraries

    USDA-ARS's Scientific Manuscript database

    An optimized single LC-MS evaluation that would accurately determine the elemental composition of as many compounds present in an extract would greatly aid in the evaluation of plant tissues. For phytochemicals, we have used accurate mass analysis to quickly characterize the potential chemical formu...

  16. Peptide Identification by Database Search of Mixture Tandem Mass Spectra

    PubMed Central

    Wang, Jian; Bourne, Philip E.; Bandeira, Nuno

    2011-01-01

    In high-throughput proteomics the development of computational methods and novel experimental strategies often rely on each other. In certain areas, mass spectrometry methods for data acquisition are ahead of computational methods to interpret the resulting tandem mass spectra. Particularly, although there are numerous situations in which a mixture tandem mass spectrum can contain fragment ions from two or more peptides, nearly all database search tools still make the assumption that each tandem mass spectrum comes from one peptide. Common examples include mixture spectra from co-eluting peptides in complex samples, spectra generated from data-independent acquisition methods, and spectra from peptides with complex post-translational modifications. We propose a new database search tool (MixDB) that is able to identify mixture tandem mass spectra from more than one peptide. We show that peptides can be reliably identified with up to 95% accuracy from mixture spectra while considering only 0.01% of all possible peptide pairs (four orders of magnitude speedup). Comparison with current database search methods indicates that our approach has better or comparable sensitivity and precision at identifying single-peptide spectra while simultaneously being able to identify 38% more peptides from mixture spectra at significantly higher precision. PMID:21862760

  17. Mutation databases for inherited renal disease: are they complete, accurate, clinically relevant, and freely available?

    PubMed

    Savige, Judy; Dagher, Hayat; Povey, Sue

    2014-07-01

    This study examined whether gene-specific DNA variant databases for inherited diseases of the kidney fulfilled the Human Variome Project recommendations of being complete, accurate, clinically relevant and freely available. A recent review identified 60 inherited renal diseases caused by mutations in 132 genes. The disease name, MIM number, gene name, together with "mutation" or "database," were used to identify web-based databases. Fifty-nine diseases (98%) due to mutations in 128 genes had a variant database. Altogether there were 349 databases (a median of 3 per gene, range 0-6), but no gene had two databases with the same number of variants, and 165 (50%) databases included fewer than 10 variants. About half the databases (180, 54%) had been updated in the previous year. Few (77, 23%) were curated by "experts" but these included nine of the 11 with the most variants. Even fewer databases (41, 12%) included clinical features apart from the name of the associated disease. Most (223, 67%) could be accessed without charge, including those for 50 genes (40%) with the maximum number of variants. Future efforts should focus on encouraging experts to collaborate on a single database for each gene affected in inherited renal disease, including both unpublished variants, and clinical phenotypes. © 2014 WILEY PERIODICALS, INC.

  18. Application of proteotyping Strain Solution™ ver. 2 software and theoretically calculated mass database in MALDI-TOF MS typing of Salmonella serotype.

    PubMed

    Ojima-Kato, Teruyo; Yamamoto, Naomi; Nagai, Satomi; Shima, Keisuke; Akiyama, Yumi; Ota, Junji; Tamura, Hiroto

    2017-12-01

    Matrix-assisted laser desorption/ionization time-of-flight mass spectrometry (MALDI-TOF MS)-based microbial identification is a popular analytical method. Strain Solution proteotyping software available for MALDI-TOF MS has great potential for the precise and detailed discrimination of microorganisms at serotype- or strain-level, beyond the conventional mass fingerprinting approaches. Here, we constructed a theoretically calculated mass database of Salmonella enterica subspecies enterica consisting of 12 biomarker proteins: ribosomal proteins S8, L15, L17, L21, L25, and S7, Mn-cofactor-containing superoxide dismutase (SodA), peptidyl-prolyl cis-trans isomerase C (PPIase C), and protein Gns, and uncharacterized proteins YibT, YaiA, and YciF, that can allow serotyping of Salmonella. Strain Solution ver. 2 software with the novel database constructed in this study demonstrated that 109 strains (94%), including the major outbreak-associated serotypes, Enteritidis, Typhimurium, and Infantis, could be correctly identified from others by colony-directed MALDI-TOF MS using 116 strains belonging to 23 kinds of typed and untyped serotypes of S. enterica from culture collections, patients, and foods. We conclude that Strain Solution ver. 2 software integrated with the accurate mass database will be useful for the bacterial proteotyping by MALDI-TOF MS-based microbial classification in the clinical and food safety fields.

  19. Choosing an Optimal Database for Protein Identification from Tandem Mass Spectrometry Data.

    PubMed

    Kumar, Dhirendra; Yadav, Amit Kumar; Dash, Debasis

    2017-01-01

    Database searching is the preferred method for protein identification from digital spectra of mass to charge ratios (m/z) detected for protein samples through mass spectrometers. The search database is one of the major influencing factors in discovering proteins present in the sample and thus in deriving biological conclusions. In most cases the choice of search database is arbitrary. Here we describe common search databases used in proteomic studies and their impact on final list of identified proteins. We also elaborate upon factors like composition and size of the search database that can influence the protein identification process. In conclusion, we suggest that choice of the database depends on the type of inferences to be derived from proteomics data. However, making additional efforts to build a compact and concise database for a targeted question should generally be rewarding in achieving confident protein identifications.

  20. Improvement of identification of Capnocytophaga canimorsus by matrix-assisted laser desorption ionization-time of flight mass spectrometry using enriched database.

    PubMed

    Magnette, Amandine; Huang, Te-Din; Renzi, Francesco; Bogaerts, Pierre; Cornelis, Guy R; Glupczynski, Youri

    2016-01-01

    Capnocytophaga canimorsus and Capnocytophaga cynodegmi can be transmitted from dogs or cats and cause serious human infections. We aimed to evaluate the ability of matrix-assisted laser desorption ionization-time of flight mass spectrometry (MALDI-TOF MS) to identify these two Capnocytophaga species. Ninety-four C. canimorsus and 10 C. cynodegmi isolates identified by 16S rRNA gene sequencing were analyzed. Using the MALDI BioTyper database, correct identification was achieved for only 16 of 94 (17%) C. canimorsus and all 10 C. cynodegmi strains, according to the manufacturer's log score specifications. Following the establishment of a complementary homemade reference database by addition of 51 C. canimorsus and 8 C. cynodegmi mass spectra, MALDI-TOF MS provided reliable identification to the species level for 100% of the 45 blind-coded Capnocytophaga isolates tested. MALDI-TOF MS can accurately identify C. canimorsus and C. cynodegmi using an enriched database and thus constitutes a valuable diagnostic tool in the clinical laboratory. Copyright © 2016 Elsevier Inc. All rights reserved.

  1. pyQms enables universal and accurate quantification of mass spectrometry data.

    PubMed

    Leufken, Johannes; Niehues, Anna; Sarin, L Peter; Wessel, Florian; Hippler, Michael; Leidel, Sebastian A; Fufezan, Christian

    2017-10-01

    Quantitative mass spectrometry (MS) is a key technique in many research areas (1), including proteomics, metabolomics, glycomics, and lipidomics. Because all of the corresponding molecules can be described by chemical formulas, universal quantification tools are highly desirable. Here, we present pyQms, an open-source software for accurate quantification of all types of molecules measurable by MS. pyQms uses isotope pattern matching that offers an accurate quality assessment of all quantifications and the ability to directly incorporate mass spectrometer accuracy. pyQms is, due to its universal design, applicable to every research field, labeling strategy, and acquisition technique. This opens ultimate flexibility for researchers to design experiments employing innovative and hitherto unexplored labeling strategies. Importantly, pyQms performs very well to accurately quantify partially labeled proteomes in large scale and high throughput, the most challenging task for a quantification algorithm. © 2017 by The American Society for Biochemistry and Molecular Biology, Inc.

  2. Accurate mass and velocity functions of dark matter haloes

    NASA Astrophysics Data System (ADS)

    Comparat, Johan; Prada, Francisco; Yepes, Gustavo; Klypin, Anatoly

    2017-08-01

    N-body cosmological simulations are an essential tool to understand the observed distribution of galaxies. We use the MultiDark simulation suite, run with the Planck cosmological parameters, to revisit the mass and velocity functions. At redshift z = 0, the simulations cover four orders of magnitude in halo mass from ~10^11 M⊙ with 8 783 874 distinct haloes and 532 533 subhaloes. The total volume used is ~515 Gpc^3, more than eight times larger than in previous studies. We measure and model the halo mass function, its covariance matrix w.r.t. halo mass and the large-scale halo bias. With the formalism of the excursion-set mass function, we make explicit the tight interconnection between the covariance matrix, bias and halo mass function. We obtain a very accurate (<2 per cent level) model of the distinct halo mass function. We also model the subhalo mass function and its relation to the distinct halo mass function. The set of models obtained provides a complete and precise framework for the description of haloes in the concordance Planck cosmology. Finally, we provide precise analytical fits of the Vmax maximum velocity function up to redshift z < 2.3 to push for the development of halo occupation distribution using Vmax. The data and the analysis code are made publicly available in the Skies and Universes database.

  3. Gapped Spectral Dictionaries and Their Applications for Database Searches of Tandem Mass Spectra*

    PubMed Central

    Jeong, Kyowon; Kim, Sangtae; Bandeira, Nuno; Pevzner, Pavel A.

    2011-01-01

    Generating all plausible de novo interpretations of a peptide tandem mass (MS/MS) spectrum (Spectral Dictionary) and quickly matching them against the database represent a recently emerged alternative approach to peptide identification. However, the sizes of the Spectral Dictionaries quickly grow with the peptide length making their generation impractical for long peptides. We introduce Gapped Spectral Dictionaries (all plausible de novo interpretations with gaps) that can be easily generated for any peptide length thus addressing the limitation of the Spectral Dictionary approach. We show that Gapped Spectral Dictionaries are small thus opening a possibility of using them to speed-up MS/MS searches. Our MS-GappedDictionary algorithm (based on Gapped Spectral Dictionaries) enables proteogenomics applications (such as searches in the six-frame translation of the human genome) that are prohibitively time consuming with existing approaches. MS-GappedDictionary generates gapped peptides that occupy a niche between accurate but short peptide sequence tags and long but inaccurate full length peptide reconstructions. We show that, contrary to conventional wisdom, some high-quality spectra do not have good peptide sequence tags and introduce gapped tags that have advantages over the conventional peptide sequence tags in MS/MS database searches. PMID:21444829

  4. Accurate EPR radiosensitivity calibration using small sample masses

    NASA Astrophysics Data System (ADS)

    Hayes, R. B.; Haskell, E. H.; Barrus, J. K.; Kenner, G. H.; Romanyukha, A. A.

    2000-03-01

    We demonstrate a procedure in retrospective EPR dosimetry which allows for virtually nondestructive sample evaluation in terms of sample irradiations. For this procedure to work, it is shown that corrections must be made for cavity response characteristics when using variable mass samples. Likewise, methods are employed to correct for empty tube signals, sample anisotropy and frequency drift while considering the effects of dose distribution optimization. A demonstration of the method's utility is given by comparing sample portions evaluated using both the described methodology and standard full sample additive dose techniques. The samples used in this study are tooth enamel from teeth removed during routine dental care. We show that by making all the recommended corrections, very small masses can be both accurately measured and correlated with measurements of other samples. Some issues relating to dose distribution optimization are also addressed.

  5. Tandem Mass Spectrum Sequencing: An Alternative to Database Search Engines in Shotgun Proteomics.

    PubMed

    Muth, Thilo; Rapp, Erdmann; Berven, Frode S; Barsnes, Harald; Vaudel, Marc

    2016-01-01

    Protein identification via database searches has become the gold standard in mass spectrometry based shotgun proteomics. However, as the quality of tandem mass spectra improves, direct mass spectrum sequencing gains interest as a database-independent alternative. In this chapter, the general principle of this so-called de novo sequencing is introduced along with pitfalls and challenges of the technique. The main tools available are presented with a focus on user friendly open source software which can be directly applied in everyday proteomic workflows.

  6. Invited article: Time accurate mass flow measurements of solid-fueled systems.

    PubMed

    Olliges, Jordan D; Lilly, Taylor C; Joslyn, Thomas B; Ketsdever, Andrew D

    2008-10-01

    A novel diagnostic method is described that utilizes a thrust stand mass balance (TSMB) to directly measure time-accurate mass flow from a solid-fuel thruster. The accuracy of the TSMB mass flow measurement technique was demonstrated in three ways including the use of an idealized numerical simulation, verifying a fluid mass calibration with high-speed digital photography, and by measuring mass loss in more than 30 hybrid rocket motor firings. Dynamic response of the mass balance was assessed through weight calibration and used to derive spring, damping, and mass moment of inertia coefficients for the TSMB. These dynamic coefficients were used to determine the mass flow rate and total mass loss within an acrylic and gaseous oxygen hybrid rocket motor firing. Intentional variations in the oxygen flow rate resulted in corresponding variations in the total propellant mass flow as expected. The TSMB was optimized to determine mass losses of up to 2.5 g and measured total mass loss to within 2.5% of that calculated by a NIST-calibrated digital scale. Using this method, a mass flow resolution of 0.0011 g/s or 2% of the average mass flow in this study has been achieved.

  7. Invited Article: Time accurate mass flow measurements of solid-fueled systems

    NASA Astrophysics Data System (ADS)

    Olliges, Jordan D.; Lilly, Taylor C.; Joslyn, Thomas B.; Ketsdever, Andrew D.

    2008-10-01

    A novel diagnostic method is described that utilizes a thrust stand mass balance (TSMB) to directly measure time-accurate mass flow from a solid-fuel thruster. The accuracy of the TSMB mass flow measurement technique was demonstrated in three ways including the use of an idealized numerical simulation, verifying a fluid mass calibration with high-speed digital photography, and by measuring mass loss in more than 30 hybrid rocket motor firings. Dynamic response of the mass balance was assessed through weight calibration and used to derive spring, damping, and mass moment of inertia coefficients for the TSMB. These dynamic coefficients were used to determine the mass flow rate and total mass loss within an acrylic and gaseous oxygen hybrid rocket motor firing. Intentional variations in the oxygen flow rate resulted in corresponding variations in the total propellant mass flow as expected. The TSMB was optimized to determine mass losses of up to 2.5 g and measured total mass loss to within 2.5% of that calculated by a NIST-calibrated digital scale. Using this method, a mass flow resolution of 0.0011 g/s or 2% of the average mass flow in this study has been achieved.

  8. Accurate mass measurement by matrix-assisted laser desorption/ionisation time-of-flight mass spectrometry. II. Measurement of negative radical ions using porphyrin and fullerene standard reference materials.

    PubMed

    Shao, Zhecheng; Wyatt, Mark F; Stein, Bridget K; Brenton, A Gareth

    2010-10-30

    A method for the accurate mass measurement of negative radical ions by matrix-assisted laser desorption/ionisation time-of-flight mass spectrometry (MALDI-TOFMS) is described. This is an extension to our previously described method for the accurate mass measurement of positive radical ions (Griffiths NW, Wyatt MF, Kean SD, Graham AE, Stein BK, Brenton AG. Rapid Commun. Mass Spectrom. 2010; 24: 1629). The porphyrin standard reference materials (SRMs) developed for positive mode measurements cannot be observed in negative ion mode, so fullerene and fluorinated porphyrin compounds were identified as effective SRMs. The method is of immediate practical use for the accurate mass measurement of functionalised fullerenes, for which negative ion MALDI-TOFMS is the principal mass spectrometry characterisation technique. This was demonstrated by the accurate mass measurement of six functionalised C(60) compounds. Copyright © 2010 John Wiley & Sons, Ltd.

  9. Peptide Mass Fingerprinting of Egg White Proteins

    ERIC Educational Resources Information Center

    Alty, Lisa T.; LaRiviere, Frederick J.

    2016-01-01

    Use of advanced mass spectrometry techniques in the undergraduate setting has burgeoned in the past decade. However, relatively few undergraduate experiments examine the proteomics tools of protein digestion, peptide accurate mass determination, and database searching, also known as peptide mass fingerprinting. In this experiment, biochemistry…
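
    The three steps named in this abstract (protein digestion, peptide accurate mass determination, database searching) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the experiment: the toy sequence, the function names, the 10 ppm tolerance, and the simplified trypsin rule (cleave after K or R unless the next residue is P, no missed cleavages, no modifications).

```python
import re

# Monoisotopic residue masses in Da (standard values, rounded to 5 decimals).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056   # added once per peptide (free N- and C-termini)
PROTON = 1.00728   # converts neutral mass M to singly charged [M+H]+


def tryptic_digest(sequence: str) -> list[str]:
    """Split after K or R, except when the next residue is P."""
    return [p for p in re.split(r"(?<=[KR])(?!P)", sequence) if p]


def peptide_mass(peptide: str) -> float:
    """Neutral monoisotopic mass of an unmodified peptide."""
    return sum(RESIDUE_MASS[aa] for aa in peptide) + WATER


def match_peaks(observed_mz, peptides, tol_ppm=10.0):
    """Pair each observed [M+H]+ peak with peptides inside the ppm window."""
    hits = []
    for mz in observed_mz:
        for pep in peptides:
            theoretical = peptide_mass(pep) + PROTON
            if abs(mz - theoretical) / theoretical * 1e6 <= tol_ppm:
                hits.append((mz, pep))
    return hits


# Hypothetical toy sequence, not a real egg white protein.
peptides = tryptic_digest("GAKRPSVKLMR")   # ['GAK', 'RPSVK', 'LMR']
hits = match_peaks([275.1714], peptides)   # matches the [M+H]+ of 'GAK'
```

    In a real peptide mass fingerprinting search the same comparison is repeated for every protein in the database, and proteins are ranked by how many observed peaks they explain.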

  10. The establishment and use of the point source catalog database of the 2MASS near infrared survey

    NASA Astrophysics Data System (ADS)

    Gao, Y. F.; Shan, H. G.; Cheng, D.

    2003-02-01

    The 2MASS near infrared survey project is introduced briefly. The 2MASS point source catalog (2MASS PSC) database and the network query system are established by using the PHP Hypertext Preprocessor and MySQL database server. By using the system, one can not only query information on sources listed in the catalog, but also draw the related plots. Moreover, after the 2MASS data are diagnosed, some research fields that can benefit from this database are suggested.

  11. Investigation into accurate mass capability of matrix-assisted laser desorption/ionization time-of-flight mass spectrometry, with respect to radical ion species.

    PubMed

    Wyatt, Mark F; Stein, Bridget K; Brenton, A Gareth

    2006-05-01

    Matrix-assisted laser desorption/ionization time-of-flight mass spectrometry (MALDI-TOFMS) has been shown to be an effective technique for the characterization of organometallic, coordination, and highly conjugated compounds. The preferred matrix is 2-[(2E)-3-(4-tert-butylphenyl)-2-methylprop-2-enylidene]malononitrile (DCTB), with radical ions observed. However, MALDI-TOFMS is generally not favored for accurate mass measurement. A specific method had to be developed for such compounds to assure the quality of our accurate mass results. Therefore, in this preliminary study, two methods of data acquisition, and both even-electron (EE+) ion and odd-electron (OE+.) radical ion mass calibration standards, have been investigated to establish the basic measurement technique. The benefit of this technique is demonstrated for a copper compound for which ions were observed by MALDI, but not by electrospray (ESI) or liquid secondary ion mass spectrometry (LSIMS); a mean mass accuracy error of -1.2 ppm was obtained.
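
    The mean mass accuracy error quoted above is a signed relative deviation expressed in parts per million. A minimal sketch of that arithmetic follows; the function names and the m/z values are hypothetical and are not data from the study.

```python
def ppm_error(measured_mz: float, theoretical_mz: float) -> float:
    """Signed mass error in parts per million (negative = measured too low)."""
    return (measured_mz - theoretical_mz) / theoretical_mz * 1e6


def mean_ppm_error(measurements, theoretical_mz: float) -> float:
    """Mean signed ppm error over repeated measurements of one ion."""
    errors = [ppm_error(m, theoretical_mz) for m in measurements]
    return sum(errors) / len(errors)


def within_tolerance(measured_mz, theoretical_mz, tol_ppm=5.0) -> bool:
    """True if the absolute ppm error is inside the chosen window."""
    return abs(ppm_error(measured_mz, theoretical_mz)) <= tol_ppm


# Hypothetical example: an ion expected at m/z 500.0000 and measured
# 0.0006 too low has an error of about -1.2 ppm.
error = ppm_error(499.9994, 500.0000)
```

    Because the error is relative, the same ppm window corresponds to a wider absolute m/z window at higher mass.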

  12. Negative chemical ionization gas chromatography coupled to hybrid quadrupole time-of-flight mass spectrometry and automated accurate mass data processing for determination of pesticides in fruit and vegetables.

    PubMed

    Besil, Natalia; Uclés, Samanta; Mezcúa, Milagros; Heinzen, Horacio; Fernández-Alba, Amadeo R

    2015-08-01

    Gas chromatography coupled to high resolution hybrid quadrupole time-of-flight mass spectrometry (GC-QTOF MS), operating in negative chemical ionization (NCI) mode and combining full scan with MSMS experiments using accurate mass analysis, has been explored for the automated determination of pesticide residues in fruit and vegetables. Seventy compounds were included in this approach, 50 % of which are not approved under EU legislation. Overall, 76 % of the analytes could be identified at 1 μg kg(-1). Recovery studies were developed at three concentration levels (1, 5, and 10 μg kg(-1)). Seventy-seven percent of the detected pesticides at the lowest level yielded recoveries within the 70 %-120 % range, whereas 94 % could be quantified at 5 μg kg(-1), and 100 % were determined at 10 μg kg(-1). Good repeatability, expressed as relative standard deviation (RSD <20 %), was obtained for all compounds. The main drawback of the method was the limited dynamic range observed for some analytes, which can be overcome by either diluting the sample or lowering the injection volume. A home-made database was developed and applied to automatic accurate mass data processing. Measured mass accuracies of the generated ions were mainly less than 5 ppm for at least one diagnostic ion. When only one ion was obtained in the single-stage NCI-MS, a representative product ion from MSMS experiments was used as identification criterion. A total of 30 real samples were analyzed and 67 % of the samples were positive for 12 different pesticides in the range 1.0-1321.3 μg kg(-1).

  13. Seven Golden Rules for heuristic filtering of molecular formulas obtained by accurate mass spectrometry

    PubMed Central

    Kind, Tobias; Fiehn, Oliver

    2007-01-01

    Background: Structure elucidation of unknown small molecules by mass spectrometry is a challenge despite advances in instrumentation. The first crucial step is to obtain correct elemental compositions. In order to automatically constrain the thousands of possible candidate structures, rules need to be developed to select the most likely and chemically correct molecular formulas. Results: An algorithm for filtering molecular formulas is derived from seven heuristic rules: (1) restrictions for the number of elements, (2) LEWIS and SENIOR chemical rules, (3) isotopic patterns, (4) hydrogen/carbon ratios, (5) element ratio of nitrogen, oxygen, phosphorus, and sulphur versus carbon, (6) element ratio probabilities and (7) presence of trimethylsilylated compounds. Formulas are ranked according to their isotopic patterns and subsequently constrained by presence in public chemical databases. The seven rules were developed on 68,237 existing molecular formulas and were validated in four experiments. First, 432,968 formulas covering five million PubChem database entries were checked for consistency. Only 0.6% of these compounds did not pass all rules. Next, the rules were shown to effectively reduce the complement of all eight billion theoretically possible C, H, N, S, O, P-formulas up to 2000 Da to only 623 million most probable elemental compositions. Third, 6,000 pharmaceutical, toxic and natural compounds were selected from DrugBank, TSCA and DNP databases. The correct formulas were retrieved as top hit at 80–99% probability when assuming data acquisition with complete resolution of unique compounds and 5% absolute isotope ratio deviation and 3 ppm mass accuracy. Last, some exemplary compounds were analyzed by Fourier transform ion cyclotron resonance mass spectrometry and by gas chromatography-time of flight mass spectrometry. In each case, the correct formula was ranked as top hit when combining the seven rules with database queries. Conclusion: The seven rules enable an
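
    To make the idea of heuristic formula filtering concrete, here is a rough sketch covering only a mass-accuracy window plus element-ratio checks in the spirit of rules (4) and (5). It is not the authors' algorithm: the ratio limits are illustrative placeholders that should be replaced with the published values, the function names are hypothetical, and candidates are ranked by mass error here rather than by isotopic pattern as the abstract describes.

```python
# Monoisotopic masses in Da of the most abundant isotopes.
MONO = {"C": 12.0, "H": 1.007825, "N": 14.003074,
        "O": 15.994915, "P": 30.973762, "S": 31.972071}

# Placeholder (min, max) element-to-carbon ratios; assumptions for
# illustration, not values quoted from the abstract.
RATIO_LIMITS = {"H": (0.2, 3.1), "N": (0.0, 1.3), "O": (0.0, 1.2),
                "P": (0.0, 0.3), "S": (0.0, 0.8)}


def monoisotopic_mass(formula: dict) -> float:
    """Neutral monoisotopic mass of a formula given as {element: count}."""
    return sum(MONO[el] * n for el, n in formula.items())


def passes_ratio_checks(formula: dict) -> bool:
    """Reject formulas whose element/carbon ratios fall outside the limits."""
    carbons = formula.get("C", 0)
    if carbons == 0:
        return False
    return all(lo <= formula.get(el, 0) / carbons <= hi
               for el, (lo, hi) in RATIO_LIMITS.items())


def filter_candidates(candidates, measured_mass, tol_ppm=3.0):
    """Keep candidates inside the ppm window that also pass the ratio checks."""
    kept = []
    for formula in candidates:
        mass = monoisotopic_mass(formula)
        err = (measured_mass - mass) / mass * 1e6
        if abs(err) <= tol_ppm and passes_ratio_checks(formula):
            kept.append((formula, err))
    return sorted(kept, key=lambda item: abs(item[1]))


glucose = {"C": 6, "H": 12, "O": 6}
ranked = filter_candidates([glucose, {"C": 1, "H": 8}], measured_mass=180.0634)
```

    The 3 ppm default mirrors the mass accuracy assumed in the validation described above; a full implementation would add the remaining rules and the database lookup.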

  14. Accurate mass measurements and their appropriate use for reliable analyte identification.

    PubMed

    Godfrey, A Ruth; Brenton, A Gareth

    2012-09-01

    Accurate mass instrumentation is becoming increasingly available to non-expert users. This data can be mis-used, particularly for analyte identification. Current best practice in assigning potential elemental formula for reliable analyte identification has been described with modern informatic approaches to analyte elucidation, including chemometric characterisation, data processing and searching using facilities such as the Chemical Abstracts Service (CAS) Registry and Chemspider.

  15. Anatomy and evolution of database search engines-a central component of mass spectrometry based proteomic workflows.

    PubMed

    Verheggen, Kenneth; Raeder, Helge; Berven, Frode S; Martens, Lennart; Barsnes, Harald; Vaudel, Marc

    2017-09-13

    Sequence database search engines are bioinformatics algorithms that identify peptides from tandem mass spectra using a reference protein sequence database. Two decades of development, notably driven by advances in mass spectrometry, have provided scientists with more than 30 published search engines, each with its own properties. In this review, we present the common paradigm behind the different implementations, and its limitations for modern mass spectrometry datasets. We also detail how the search engines attempt to alleviate these limitations, and provide an overview of the different software frameworks available to the researcher. Finally, we highlight alternative approaches for the identification of proteomic mass spectrometry datasets, either as a replacement for, or as a complement to, sequence database search engines. © 2017 Wiley Periodicals, Inc.

  16. Matrix-assisted laser desorption ionization time-of-flight mass spectrometry and database for identification of Legionella species.

    PubMed

    He, Ying; Chang, Tsung C; Li, Haijing; Shi, Gongyi; Tang, Yi-Wei

    2011-07-01

    More than 20 species of Legionella have been identified in relation to human infections. Rapid detection and identification of Legionella isolates is clinically useful to differentiate between infection and contamination and to determine treatment regimens. We explored the use of matrix-assisted laser desorption ionization time-of-flight mass spectrometry (MALDI-TOF MS) Biotyper system (Bruker Daltonik GmbH, Bremen, Germany) for the identification of Legionella species. The MALDI MS spectra were generated and compared with the Biotyper database, which includes 25 Legionella strains covering 22 species and four Legionella pneumophila serogroups. A total of 83 blind-coded Legionella strains, consisting of 54 reference and 29 clinical strains, were analyzed in the study. Overall, the Biotyper system correctly identified 51 (61.4%) of all strains and isolates to the species level. For species included in the Biotyper database, the method identified 51 (86.4%) strains out of 59 Legionella strains to the correct species level, including 24 (100%) L. pneumophila and 27 (77.1%) non-L. pneumophila strains. The remaining 24 Legionella strains, belonging to species not covered by the Biotyper database, were either identified to the Legionella genus level or had no reliable identification. The Biotyper system produces constant and reproducible MALDI MS spectra for Legionella strains and can be used for rapid and accurate Legionella identification. More Legionella strains, especially the non-L. pneumophila strains, need to be included in the current Biotyper database to cover varieties of Legionella species and to increase identification accuracy.

  17. Dispersive liquid-liquid microextraction and gas chromatography accurate mass spectrometry for extraction and non-targeted profiling of volatile and semi-volatile compounds in grape marc distillates.

    PubMed

    Fontana, Ariel; Rodríguez, Isaac; Cela, Rafael

    2018-04-20

    The suitability of dispersive liquid-liquid microextraction (DLLME) and gas chromatography accurate mass spectrometry (GC-MS), based on a time-of-flight (TOF) MS analyzer and using electron ionization (EI), for the characterization of volatile and semi-volatile profiles of grape marc distillates (grappa) is evaluated. DLLME conditions are optimized with a selection of compounds, from different chemical families, present in the distillate spirit. Under final working conditions, 2.5 mL of sample and 0.5 mL of organic solvents are consumed in the sample preparation process. The absolute extraction efficiencies ranged from 30 to 100%, depending on the compound. For the same sample volume, DLLME provided higher responses than solid-phase microextraction (SPME) for most of the model compounds. The GC-EI-TOF-MS records of grappa samples were processed using a data mining non-targeted search algorithm. In this way, chromatographic peaks and accurate EI-MS spectra of sample components were linked. The identities of more than 140 of these components are proposed from comparison of their accurate spectra with those in a low resolution EI-MS database, accurate masses of the most intense fragment ions of known structure, and available chromatographic retention indices. The use of chromatographic and spectral data, associated with the set of components mined from different grappa samples, for multivariate analysis purposes is also illustrated in the study. Copyright © 2018 Elsevier B.V. All rights reserved.

  18. Accurate wavelengths for X-ray spectroscopy and the NIST hydrogen-like ion database

    NASA Astrophysics Data System (ADS)

    Kotochigova, S. A.; Kirby, K. P.; Brickhouse, N. S.; Mohr, P. J.; Tupitsyn, I. I.

    2005-06-01

    We have developed an ab initio multi-configuration Dirac-Fock-Sturm method for the precise calculation of X-ray emission spectra, including energies, transition wavelengths and transition probabilities. The calculations are based on non-orthogonal basis sets, generated by solving the Dirac-Fock and Dirac-Fock-Sturm equations. Inclusion of Sturm functions into the basis set provides an efficient description of correlation effects in highly charged ions and fast convergence of the configuration interaction procedure. A second part of our study is devoted to developing a theoretical procedure and creating an interactive database to generate energies and transition frequencies for hydrogen-like ions. This procedure is highly accurate and based on current knowledge of the relevant theory, which includes relativistic, quantum electrodynamic, recoil, and nuclear size effects.

  19. AtomDB: Expanding an Accessible and Accurate Atomic Database for X-ray Astronomy

    NASA Astrophysics Data System (ADS)

    Smith, Randall

    Since its inception in 2001, the AtomDB has become the standard repository of accurate and accessible atomic data for the X-ray astrophysics community, including laboratory astrophysicists, observers, and modelers. Modern calculations of collisional excitation rates now exist - and are in AtomDB - for all abundant ions in a hot plasma. AtomDB has expanded beyond providing just a collisional model, and now also contains photoionization data from XSTAR as well as a charge exchange model, amongst others. However, building and maintaining an accurate and complete database that can fully exploit the diagnostic potential of high-resolution X-ray spectra requires further work. The Hitomi results, sadly limited as they were, demonstrated the urgent need for the best possible wavelength and rate data, not merely for the strongest lines but for the diagnostic features that may have 1% or less of the flux of the strong lines. In particular, incorporation of weak but powerfully diagnostic satellite lines will be crucial to understanding the spectra expected from upcoming deep observations with Chandra and XMM-Newton, as well as the XARM and Athena satellites. Beyond incorporating this new data, a number of groups, both experimental and theoretical, have begun to produce data with errors and/or sensitivity estimates. We plan to use this to create statistically meaningful spectral errors on collisional plasmas, providing practical uncertainties together with model spectra. We propose to continue to (1) engage the X-ray astrophysics community regarding their issues and needs, notably by a critical comparison with other related databases and tools, (2) enhance AtomDB to incorporate a large number of satellite lines as well as updated wavelengths with error estimates, (3) continue to update the AtomDB with the latest calculations and laboratory measurements, in particular velocity-dependent charge exchange rates, and (4) enhance existing tools, and create new ones as needed to

  20. YPED: An Integrated Bioinformatics Suite and Database for Mass Spectrometry-based Proteomics Research

    PubMed Central

    Colangelo, Christopher M.; Shifman, Mark; Cheung, Kei-Hoi; Stone, Kathryn L.; Carriero, Nicholas J.; Gulcicek, Erol E.; Lam, TuKiet T.; Wu, Terence; Bjornson, Robert D.; Bruce, Can; Nairn, Angus C.; Rinehart, Jesse; Miller, Perry L.; Williams, Kenneth R.

    2015-01-01

    We report a significantly-enhanced bioinformatics suite and database for proteomics research called Yale Protein Expression Database (YPED) that is used by investigators at more than 300 institutions worldwide. YPED meets the data management, archival, and analysis needs of a high-throughput mass spectrometry-based proteomics research ranging from a single laboratory, group of laboratories within and beyond an institution, to the entire proteomics community. The current version is a significant improvement over the first version in that it contains new modules for liquid chromatography–tandem mass spectrometry (LC–MS/MS) database search results, label and label-free quantitative proteomic analysis, and several scoring outputs for phosphopeptide site localization. In addition, we have added both peptide and protein comparative analysis tools to enable pairwise analysis of distinct peptides/proteins in each sample and of overlapping peptides/proteins between all samples in multiple datasets. We have also implemented a targeted proteomics module for automated multiple reaction monitoring (MRM)/selective reaction monitoring (SRM) assay development. We have linked YPED’s database search results and both label-based and label-free fold-change analysis to the Skyline Panorama repository for online spectra visualization. In addition, we have built enhanced functionality to curate peptide identifications into an MS/MS peptide spectral library for all of our protein database search identification results. PMID:25712262

  1. YPED: an integrated bioinformatics suite and database for mass spectrometry-based proteomics research.

    PubMed

    Colangelo, Christopher M; Shifman, Mark; Cheung, Kei-Hoi; Stone, Kathryn L; Carriero, Nicholas J; Gulcicek, Erol E; Lam, TuKiet T; Wu, Terence; Bjornson, Robert D; Bruce, Can; Nairn, Angus C; Rinehart, Jesse; Miller, Perry L; Williams, Kenneth R

    2015-02-01

    We report a significantly-enhanced bioinformatics suite and database for proteomics research called Yale Protein Expression Database (YPED) that is used by investigators at more than 300 institutions worldwide. YPED meets the data management, archival, and analysis needs of a high-throughput mass spectrometry-based proteomics research ranging from a single laboratory, group of laboratories within and beyond an institution, to the entire proteomics community. The current version is a significant improvement over the first version in that it contains new modules for liquid chromatography-tandem mass spectrometry (LC-MS/MS) database search results, label and label-free quantitative proteomic analysis, and several scoring outputs for phosphopeptide site localization. In addition, we have added both peptide and protein comparative analysis tools to enable pairwise analysis of distinct peptides/proteins in each sample and of overlapping peptides/proteins between all samples in multiple datasets. We have also implemented a targeted proteomics module for automated multiple reaction monitoring (MRM)/selective reaction monitoring (SRM) assay development. We have linked YPED's database search results and both label-based and label-free fold-change analysis to the Skyline Panorama repository for online spectra visualization. In addition, we have built enhanced functionality to curate peptide identifications into an MS/MS peptide spectral library for all of our protein database search identification results. Copyright © 2015 The Authors. Production and hosting by Elsevier Ltd.. All rights reserved.

  2. Accurate mass replacement method for the sediment concentration measurement with a constant volume container

    NASA Astrophysics Data System (ADS)

    Ban, Yunyun; Chen, Tianqin; Yan, Jun; Lei, Tingwu

    2017-04-01

    The measurement of sediment concentration in water is of great importance in soil erosion research and soil and water loss monitoring systems. The traditional weighing method has long been the foundation of all the other measuring methods and instrument calibration. The development of a new method to replace the traditional oven-drying method is of interest in research and practice for the quick and efficient measurement of sediment concentration, especially field measurements. A new method is advanced in this study for accurately measuring the sediment concentration based on the accurate measurement of the mass of the sediment-water mixture in the confined constant volume container (CVC). A sediment-laden water sample is put into the CVC to determine its mass before the CVC is filled with water and weighed again for the total mass of the water and sediments in the container. The known volume of the CVC, the mass of sediment-laden water, and sediment particle density are used to calculate the mass of water that is replaced by sediments, and from this the sediment concentration of the sample is calculated. The influence of water temperature was corrected by measuring water density to determine the temperature of water before measurements were conducted. The CVC was used to eliminate the surface tension effect so as to obtain the accurate volume of water and sediment mixture. Experimental results showed that the method was capable of measuring the sediment concentration from 0.5 up to 1200 kg m^-3. A good linear relationship existed between the designed and measured sediment concentrations with all the coefficients of determination greater than 0.999 and the average relative error less than 0.2%. All of these seem to indicate that the new method is capable of measuring a full range of sediment concentration above 0.5 kg m^-3 to replace the traditional oven-drying method as a standard method for evaluating and calibrating other methods.
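
    The mass balance behind the replacement idea can be worked through in a short sketch. This is one reading of the abstract, not the authors' published procedure: it assumes the filled container holds only water (density rho_w) and sediment particles (density rho_s) in a known volume V, so the total content mass is M = rho_w * (V - m_s / rho_s) + m_s, which can be solved for the sediment mass m_s. The function names and all numeric values are hypothetical.

```python
def sediment_mass(total_mass, cvc_volume, rho_w, rho_s):
    """Sediment mass (kg) from the mass of the filled container contents.

    Solves M = rho_w * (V - m_s / rho_s) + m_s for m_s.
    """
    return (total_mass - rho_w * cvc_volume) / (1.0 - rho_w / rho_s)


def sediment_concentration(sample_mass, total_mass, cvc_volume, rho_w, rho_s):
    """Sediment concentration (kg per m^3 of the original sample)."""
    m_s = sediment_mass(total_mass, cvc_volume, rho_w, rho_s)
    # Sample volume = volume of its water + volume of its sediment particles.
    sample_volume = (sample_mass - m_s) / rho_w + m_s / rho_s
    return m_s / sample_volume


# Hypothetical numbers: a 1 L container, water at 1000 kg/m^3, and a
# quartz-like particle density of 2650 kg/m^3 (all assumptions).
rho_w, rho_s, volume = 1000.0, 2650.0, 1.0e-3
true_sediment = 0.1                                   # kg, used to build the example
total = rho_w * (volume - true_sediment / rho_s) + true_sediment
conc = sediment_concentration(0.5, total, volume, rho_w, rho_s)
```

    The denominator 1 - rho_w / rho_s shows why the water density, and hence the temperature correction mentioned in the abstract, matters: a small error in rho_w is multiplied by the full container volume.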

  3. FlavonoidSearch: A system for comprehensive flavonoid annotation by mass spectrometry.

    PubMed

    Akimoto, Nayumi; Ara, Takeshi; Nakajima, Daisuke; Suda, Kunihiro; Ikeda, Chiaki; Takahashi, Shingo; Muneto, Reiko; Yamada, Manabu; Suzuki, Hideyuki; Shibata, Daisuke; Sakurai, Nozomu

    2017-04-28

    Currently, in mass spectrometry-based metabolomics, limited reference mass spectra are available for flavonoid identification. In the present study, a database of probable mass fragments for 6,867 known flavonoids (FsDatabase) was manually constructed based on new structure- and fragmentation-related rules using new heuristics to overcome flavonoid complexity. We developed the FlavonoidSearch system for flavonoid annotation, which consists of the FsDatabase and a computational tool (FsTool) to automatically search the FsDatabase using the mass spectra of metabolite peaks as queries. This system showed the highest identification accuracy for the flavonoid aglycone when compared to existing tools and revealed accurate discrimination between the flavonoid aglycone and other compounds. Sixteen new flavonoids were found from parsley, and the diversity of the flavonoid aglycone among different fruits and vegetables was investigated.

  4. Accurate mass measurement by matrix-assisted laser desorption/ionisation time-of-flight mass spectrometry. I. Measurement of positive radical ions using porphyrin standard reference materials.

    PubMed

    Griffiths, Nia W; Wyatt, Mark F; Kean, Suzanna D; Graham, Andrew E; Stein, Bridget K; Brenton, A Gareth

    2010-06-15

    A method for the accurate mass measurement of positive radical ions by matrix-assisted laser desorption/ionisation time-of-flight mass spectrometry (MALDI-TOFMS) is described. Initial use of a conjugated oligomeric calibration material was rejected in favour of a series of meso-tetraalkyl/tetraalkylaryl-functionalised porphyrins, from which the two calibrants required for a particular accurate mass measurement were chosen. While all measurements of monoisotopic species were within +/-5 ppm, and the method was rigorously validated using chemometrics, mean values of five measurements were used for extra confidence in the generation of potential elemental formulae. Potential difficulties encountered when measuring compounds containing multi-isotopic elements are discussed, where the monoisotopic peak is no longer the lowest mass peak, and a simple mass-correction solution can be applied. The method requires no significant expertise to implement, but care and attention is required to obtain valid measurements. The method is operationally simple and will prove useful to the analytical chemistry community. Copyright (c) 2010 John Wiley & Sons, Ltd.
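
    The ±5 ppm criterion and the averaging of five measurements can be illustrated with a short sketch. The function names and the example m/z values are hypothetical; only the ppm definition and the idea of averaging replicates come from the abstract.

```python
from statistics import mean


def ppm_error(measured_mz, theoretical_mz):
    """Signed mass error in parts per million."""
    return (measured_mz - theoretical_mz) / theoretical_mz * 1e6


def mean_within_tolerance(measurements, theoretical_mz, tol_ppm=5.0):
    """Average replicate measurements and test the mean against a ppm window.

    Returns (mean m/z, ppm error of the mean, True if within tolerance).
    """
    avg = mean(measurements)
    err = ppm_error(avg, theoretical_mz)
    return avg, err, abs(err) <= tol_ppm


# Hypothetical replicate readings for an ion with theoretical m/z 614.2500.
readings = [614.2500, 614.2510, 614.2490, 614.2505, 614.2495]
avg, err, ok = mean_within_tolerance(readings, 614.2500)
```

    Averaging reduces the random scatter of individual readings, which narrows the list of elemental formulae that fall inside the tolerance window.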

  5. Ariadne: a database search engine for identification and chemical analysis of RNA using tandem mass spectrometry data.

    PubMed

    Nakayama, Hiroshi; Akiyama, Misaki; Taoka, Masato; Yamauchi, Yoshio; Nobe, Yuko; Ishikawa, Hideaki; Takahashi, Nobuhiro; Isobe, Toshiaki

    2009-04-01

    We present here a method to correlate tandem mass spectra of sample RNA nucleolytic fragments with an RNA nucleotide sequence in a DNA/RNA sequence database, thereby allowing tandem mass spectrometry (MS/MS)-based identification of RNA in biological samples. Ariadne, a unique web-based database search engine, identifies RNA by two probability-based evaluation steps of MS/MS data. In the first step, the software evaluates the matches between the masses of product ions generated by MS/MS of an RNase digest of sample RNA and those calculated from a candidate nucleotide sequence in a DNA/RNA sequence database, which then predicts the nucleotide sequences of these RNase fragments. In the second step, the candidate sequences are mapped for all RNA entries in the database, and each entry is scored for a function of occurrences of the candidate sequences to identify a particular RNA. Ariadne can also predict post-transcriptional modifications of RNA, such as methylation of nucleotide bases and/or ribose, by estimating mass shifts from the theoretical mass values. The method was validated with MS/MS data of RNase T1 digests of in vitro transcripts. It was applied successfully to identify an unknown RNA component in a tRNA mixture and to analyze post-transcriptional modification in yeast tRNA(Phe-1).

  6. Accurate Mass Determination of Organotrifluoroborates

    PubMed Central

    Petrillo, Daniel E.; Kohli, Rakesh K.; Molander, Gary A.

    2007-01-01

    Exact mass measurements were obtained for a variety of potassium- and tetra-n-butylammonium organotrifluoroborates using commercially available organic sulfate salts as internal reference standards. Accuracies were determined within 5 ppm using a sector ESI mass spectrometer operating in the negative ionization mode. PMID:17112738

  7. A compatible exon-exon junction database for the identification of exon skipping events using tandem mass spectrum data.

    PubMed

    Mo, Fan; Hong, Xu; Gao, Feng; Du, Lin; Wang, Jun; Omenn, Gilbert S; Lin, Biaoyang

    2008-12-16

    Alternative splicing is an important gene regulation mechanism. It is estimated that about 74% of multi-exon human genes have alternative splicing. High throughput tandem (MS/MS) mass spectrometry provides valuable information for rapidly identifying potentially novel alternatively-spliced protein products from experimental datasets. However, the ability to identify alternative splicing events through tandem mass spectrometry depends on the database against which the spectra are searched. We wrote scripts using Perl, BioPerl, MySQL and the Ensembl API and built a theoretical exon-exon junction protein database to account for all possible combinations of exons for a gene while keeping the frame of translation (i.e., keeping only in-phase exon-exon combinations) from the Ensembl Core Database. Using our liver cancer MS/MS dataset, we identified a total of 488 non-redundant peptides that represent putative exon skipping events. Our exon-exon junction database provides the scientific community with an efficient means to identify novel alternatively spliced (exon skipping) protein isoforms using mass spectrometry data. This database will be useful in annotating genome structures using rapidly accumulating proteomics data.
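
    The idea of keeping only in-phase exon-exon combinations can be sketched in a few lines. This is a simplified illustration in Python rather than the authors' Perl scripts, and it assumes that each exon is described only by its coding length in nucleotides: a junction between exon i and a later exon j keeps the reading frame when the total length of the skipped exons is a multiple of three. The function name and the example lengths are hypothetical.

```python
from itertools import combinations


def in_frame_junctions(exon_lengths):
    """Return (i, j) pairs of 0-based exon indices, i < j, whose junction
    preserves the reading frame when exons i+1 .. j-1 are skipped."""
    pairs = []
    for i, j in combinations(range(len(exon_lengths)), 2):
        skipped = sum(exon_lengths[i + 1:j])  # 0 for adjacent exons
        if skipped % 3 == 0:
            pairs.append((i, j))
    return pairs


# Hypothetical gene with five coding exons (lengths in nucleotides).
junctions = in_frame_junctions([120, 90, 100, 50, 75])
# -> [(0, 1), (0, 2), (0, 4), (1, 2), (1, 4), (2, 3), (3, 4)]
```

    Adjacent pairs are the canonical junctions; the remaining pairs are the candidate exon-skipping junctions whose translated sequences would be added to a search database.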

  8. Accurately Mapping M31's Microlensing Population

    NASA Astrophysics Data System (ADS)

    Crotts, Arlin

    2004-07-01

    We propose to augment an existing microlensing survey of M31 with source identifications provided by a modest amount of ACS (and WFPC2 parallel) observations to yield an accurate measurement of the masses responsible for microlensing in M31, and presumably much of its dark matter. The main benefit of these data is the determination of the physical (or "Einstein") timescale of each microlensing event, rather than an effective ("FWHM") timescale, allowing masses to be determined more than twice as accurately as without HST data. The Einstein timescale is the ratio of the lensing cross-sectional radius and relative velocities. Velocities are known from kinematics, and the cross-section is directly proportional to the (unknown) lensing mass. We cannot easily measure these quantities without knowing the amplification, hence the baseline magnitude, which requires the resolution of HST to find the source star. This makes a crucial difference because M31 lens mass determinations can be more accurate than those towards the Magellanic Clouds through our Galaxy's halo (for the same number of microlensing events) due to the better constrained geometry in the M31 microlensing situation. Furthermore, our larger survey, just completed, should yield at least 100 M31 microlensing events, more than any Magellanic survey. A small amount of ACS+WFPC2 imaging will deliver the potential of this large database (about 350 nights). For the whole survey (and a delta-function mass distribution) the mass error should approach only about 15%, or about 6% error in slope for a power-law distribution. These results will better allow us to pinpoint the lens halo fraction, and the shape of the halo lens spatial distribution, and allow generalization/comparison of the nature of halo dark matter in spiral galaxies. In addition, we will be able to establish the baseline magnitude for about 50,000 variable stars, as well as measure an unprecedentedly detailed color-magnitude diagram and luminosity

  9. Matrix-assisted laser desorption ionization time-of-flight mass spectrometry for fast and accurate identification of clinically relevant Aspergillus species.

    PubMed

    Alanio, A; Beretti, J-L; Dauphin, B; Mellado, E; Quesne, G; Lacroix, C; Amara, A; Berche, P; Nassif, X; Bougnoux, M-E

    2011-05-01

    New Aspergillus species have recently been described with the use of multilocus sequencing in refractory cases of invasive aspergillosis. The classical phenotypic identification methods routinely used in clinical laboratories failed to identify them adequately. Some of these Aspergillus species have specific patterns of susceptibility to antifungal agents, and misidentification may lead to inappropriate therapy. We developed a matrix-assisted laser desorption ionization time-of-flight (MALDI-TOF) mass spectrometry (MS)-based strategy to adequately identify Aspergillus species to the species level. A database including the reference spectra of 28 clinically relevant species from seven Aspergillus sections (five common and 23 unusual species) was engineered. The profiles of young and mature colonies were analysed for each reference strain, and species-specific spectral fingerprints were identified. The performance of the database was then tested on 124 clinical and 16 environmental isolates previously characterized by partial sequencing of the β-tubulin and calmodulin genes. One hundred and thirty-eight isolates of 140 (98.6%) were correctly identified. Two atypical isolates could not be identified, but no isolate was misidentified (specificity: 100%). The database, including species-specific spectral fingerprints of young and mature colonies of the reference strains, allowed identification regardless of the maturity of the clinical isolate. These results indicate that MALDI-TOF MS is a powerful tool for rapid and accurate identification of both common and unusual species of Aspergillus. It can give better results than morphological identification in clinical laboratories. © 2010 The Authors. Clinical Microbiology and Infection © 2010 European Society of Clinical Microbiology and Infectious Diseases.

  10. Rapid identification of fatty acid methyl esters using a multidimensional gas chromatography-mass spectrometry database.

    PubMed

    Härtig, Claus

    2008-01-04

    A multidimensional approach for the identification of fatty acid methyl esters (FAME) based on GC/MS analysis is described. Mass spectra and retention data of more than 130 FAME from various sources (chain lengths in the range from 4 to 24 carbon atoms) were collected in a database. Hints for the interpretation of FAME mass spectra are given and relevant diagnostic marker ions are deduced indicating specific groups of fatty acids. To verify the identity of single species and to ensure an optimized chromatographic resolution, the database was compiled with retention data libraries acquired on columns of different polarity (HP-5, DB-23, and HP-88). For a combined use of mass spectra and retention data standardized methods of measurement for each of these columns are required. Such master methods were developed and always applied under the conditions of retention time locking (RTL) which allowed an excellent reproducibility and comparability of absolute retention times. Moreover, as a relative retention index system, equivalent chain lengths (ECL) of FAME were determined by linear interpolation. To compare and to predict ECL values by means of structural features, fractional chain lengths (FCL) were calculated and fitted as well. As shown in an example, the use of retention data and mass spectral information together in a database search leads to an improved and reliable identification of FAME (including positional and geometrical isomers) without further derivatizations.
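
    The determination of equivalent chain lengths (ECL) by linear interpolation can be sketched as below. The function name, the reference retention times, and the choice of saturated straight-chain FAME as bracketing references are assumptions made for illustration; the abstract only states that linear interpolation was used.

```python
from bisect import bisect_right


def equivalent_chain_length(rt, reference):
    """Interpolate an ECL value from a retention time.

    reference -- list of (carbon_number, retention_time) tuples for the
                 reference FAME, sorted by increasing retention time
                 (at least two entries).
    """
    times = [t for _, t in reference]
    if rt < times[0] or rt > times[-1]:
        raise ValueError("retention time outside the calibrated range")
    # Index of the upper bracketing reference, clamped to the last entry.
    k = min(bisect_right(times, rt), len(reference) - 1)
    (n_lo, t_lo), (n_hi, t_hi) = reference[k - 1], reference[k]
    return n_lo + (n_hi - n_lo) * (rt - t_lo) / (t_hi - t_lo)


# Hypothetical retention times (min) for 14:0, 16:0 and 18:0 methyl esters.
refs = [(14, 8.0), (16, 10.0), (18, 14.0)]
ecl = equivalent_chain_length(12.0, refs)  # -> 17.0
```

    An ECL computed this way is only comparable between runs when the absolute retention times are reproducible, which is the role retention time locking plays in the described method.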

  11. Fatty acids composition of Caenorhabditis elegans using accurate mass GCMS-QTOF

    PubMed Central

    Henry, Parise; Owopetu, Olufunmilayo; Adisa, Demilade; Nguyen, Thao; Anthony, Kevin; Ijoni-Animadu, David; Jamadar, Sakha; Abdel-Rahman, Fawzia; Saleh, Mahmoud A.

    2016-01-01

    The free-living nematode Caenorhabditis elegans is a proven model organism for lipid metabolism research. Total lipids of C. elegans were extracted using chloroform/methanol 2:1 (v/v). The fatty acids of the extracted total lipids were converted to their corresponding methyl esters (FAMEs) and analyzed by gas chromatography/accurate mass quadrupole time of flight mass spectrometry (GCMS-QTOF) using both electron ionization (EI) and chemical ionization (CI) techniques. Twenty-eight fatty acids consisting of 12 to 22 carbon atoms were identified, 65% of which were unsaturated. Fatty acids containing 12 to 17 carbons were mostly saturated with stearic acid (18:0) as the major constituent. Several branched-chain fatty acids were identified. Methyl-14-methylhexadecanoate (iso-17:0) was the major identified branched fatty acid. This is the first report to detect the intact molecular parent ions of the identified fatty acids using chemical ionization, in contrast to electron ionization, which fragmented the fatty acid methyl esters (FAMEs). PMID:27166662

  12. Accurate physical laws can permit new standard units: The two laws $\vec{F}=m\vec{a}$ and the proportionality of weight to mass

    NASA Astrophysics Data System (ADS)

    Saslow, Wayne M.

    2014-04-01

    Three common approaches to $\vec{F}=m\vec{a}$ are: (1) as an exactly true definition of force $\vec{F}$ in terms of measured inertial mass m and measured acceleration $\vec{a}$; (2) as an exactly true axiom relating measured values of $\vec{a}$, $\vec{F}$ and m; and (3) as an imperfect but accurately true physical law relating measured $\vec{a}$ to measured $\vec{F}$, with m an experimentally determined, matter-dependent constant, in the spirit of the resistance R in Ohm's law. In the third case, the natural units are those of $\vec{a}$ and $\vec{F}$, where $\vec{a}$ is normally specified using distance and time as standard units, and $\vec{F}$ from a spring scale as a standard unit; thus mass units are derived from force, distance, and time units such as newtons, meters, and seconds. The present work develops the third approach when one includes a second physical law (again, imperfect but accurate), namely that balance-scale weight W is proportional to m, and the fact that balance-scale measurements of relative weight are more accurate than those of absolute force. When distance and time also are more accurately measurable than absolute force, this second physical law permits a shift to standards of mass, distance, and time units, such as kilograms, meters, and seconds, with the unit of force (the newton) a derived unit. However, were force and distance more accurately measurable than time (e.g., time measured with an hourglass), this second physical law would permit a shift to standards of force, mass, and distance units such as newtons, kilograms, and meters, with the unit of time (the second) a derived unit. Therefore, the choice of the most accurate standard units depends both on what is most accurately measurable and on the accuracy of physical law.

  13. Profiling monoterpenol glycoconjugation in Vitis vinifera L. cv. Muscat of Alexandria using a novel putative compound database approach, high resolution mass spectrometry and collision induced dissociation fragmentation analysis.

    PubMed

    Hjelmeland, Anna K; Zweigenbaum, Jerry; Ebeler, Susan E

    2015-08-05

    In this work we present a novel approach for the identification of plant metabolites using ultrahigh performance liquid chromatography coupled to accurate mass time-of-flight mass spectrometry. The workflow involves developing an in-house compound database consisting of exact masses of previously identified as well as putative compounds. The database is used to screen accurate mass spectrometry (MS) data to identify possible compound matches. Subsequent tandem MS data is acquired for possible matches and used for structural elucidation. The methodology is applied to profile monoterpene glycosides in Vitis vinifera cv. Muscat of Alexandria grape berries over three developmental stages. Monoterpenes are a subclass of terpenes, the largest class of plant secondary metabolites, and are found in two major forms in the plant, "bound" to one or more sugar moieties or "free" of said sugar moieties. In the free form, monoterpenes are noted for their fragrance and play important roles in plant defense and as attractants for pollinators. However, glycoconjugation renders these compounds odorless, and it is this form that the plant uses for monoterpene storage. In order to gain insight into monoterpene biochemistry and their fate in the plant, an analysis of intact glycosides is essential. Eighteen monoterpene glycosides were identified including a monoterpene trisaccharide glycoside, which is tentatively identified here for the first time in any plant. Additionally, while previous studies have identified monoterpene malonylated glucosides in other grapevine tissue, we tentatively identify them for the first time in grape berries. This analytical approach can be readily applied to other plants and the workflow approach can also be used for other classes of compounds. This approach, in general, provides researchers with data to support the identification of putative compounds, which is especially useful when no standard is available. Copyright © 2015 Elsevier B.V. All rights

  14. Archive of mass spectral data files on recordable CD-ROMs and creation and maintenance of a searchable computerized database.

    PubMed

    Amick, G D

    1999-01-01

    A database containing names of mass spectral data files generated in a forensic toxicology laboratory and two Microsoft Visual Basic programs to maintain and search this database is described. The data files (approximately 0.5 KB/each) were collected from six mass spectrometers during routine casework. Data files were archived on 650 MB (74 min) recordable CD-ROMs. Each recordable CD-ROM was given a unique name, and its list of data file names was placed into the database. The present manuscript describes the use of search and maintenance programs for searching and routine upkeep of the database and creation of CD-ROMs for archiving of data files.

  15. Identification of volatile and semivolatile compounds in chemical ionization GC-MS using a mass-to-structure (MTS) Search Engine with integral isotope pattern ranking.

    PubMed

    Liao, Wenta; Draper, William M

    2013-02-21

    The mass-to-structure or MTS Search Engine is an Access 2010 database containing theoretical molecular mass information for 19,438 compounds assembled from common sources such as the Merck Index, pesticide and pharmaceutical compilations, and chemical catalogues. This database, which contains no experimental mass spectral data, was developed as an aid to identification of compounds in atmospheric pressure ionization (API)-LC-MS. This paper describes a powerful upgrade to this database, a fully integrated utility for filtering or ranking candidates based on isotope ratios and patterns. The new MTS Search Engine is applied here to the identification of volatile and semivolatile compounds including pesticides, nitrosoamines and other pollutants. Methane and isobutane chemical ionization (CI) GC-MS spectra were obtained from unit mass resolution mass spectrometers to determine MH(+) masses and isotope ratios. Isotopes were measured accurately with errors of <4% and <6%, respectively, for A + 1 and A + 2 peaks. Deconvolution of interfering isotope clusters (e.g., M(+) and [M - H](+)) was required for accurate determination of the A + 1 isotope in halogenated compounds. Integrating the isotope data greatly improved the speed and accuracy of the database identifications. The database accurately identified unknowns from isobutane CI spectra in 100% of cases where as many as 40 candidates satisfied the mass tolerance. The paper describes the development and basic operation of the new MTS Search Engine and details performance testing with over 50 model compounds.
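
    The two-stage search described here, a mass tolerance filter followed by isotope pattern ranking, can be sketched as follows. This is not the MTS Search Engine itself (which is an Access database): the record layout, the scoring function (sum of absolute differences of the A + 1 and A + 2 relative abundances), and the candidate entries are all hypothetical.

```python
def rank_candidates(candidates, measured_mass, measured_a1, measured_a2,
                    mass_tol=0.5):
    """Filter candidates by MH+ mass, then rank them by isotope pattern fit.

    candidates -- list of dicts with keys 'name', 'mh_mass', 'a1', 'a2',
                  where a1 and a2 are the theoretical A+1 and A+2
                  abundances as a percentage of the A peak.
    Returns the surviving candidates, best isotope match first.
    """
    hits = [c for c in candidates
            if abs(c["mh_mass"] - measured_mass) <= mass_tol]

    def isotope_distance(c):
        # Smaller distance means a closer match to the measured pattern.
        return abs(c["a1"] - measured_a1) + abs(c["a2"] - measured_a2)

    return sorted(hits, key=isotope_distance)


# Hypothetical database entries, unit-mass-resolution query.
database = [
    {"name": "candidate A", "mh_mass": 201.1, "a1": 11.0, "a2": 0.8},
    {"name": "candidate B", "mh_mass": 201.0, "a1": 9.0, "a2": 32.5},
    {"name": "candidate C", "mh_mass": 230.0, "a1": 12.0, "a2": 1.0},
]
ranked = rank_candidates(database, 201.0, measured_a1=9.3, measured_a2=31.0)
```

    At unit mass resolution many candidates can pass the mass filter, so the isotope ranking carries most of the discriminating power, particularly for halogenated compounds with a pronounced A + 2 peak.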

  16. Accurate mass analysis of ethanesulfonic acid degradates of acetochlor and alachlor using high-performance liquid chromatography and time-of-flight mass spectrometry

    USGS Publications Warehouse

    Thurman, E.M.; Ferrer, I.; Parry, R.

    2002-01-01

    Degradates of acetochlor and alachlor (ethanesulfonic acids, ESAs) were analyzed in both standards and in a groundwater sample using high-performance liquid chromatography-time-of-flight mass spectrometry with electrospray ionization. The negative pseudomolecular ion of the secondary amide of acetochlor ESA and alachlor ESA gave average masses of 256.0750±0.0049 amu and 270.0786±0.0064 amu respectively. Acetochlor and alachlor ESA gave similar masses of 314.1098±0.0061 amu and 314.1153±0.0048 amu; however, they could not be distinguished by accurate mass because they have the same empirical formula. On the other hand, they may be distinguished using positive-ion electrospray because of different fragmentation spectra, which did not occur using negative-ion electrospray.

  17. Accurate mass analysis of ethanesulfonic acid degradates of acetochlor and alachlor using high-performance liquid chromatography and time-of-flight mass spectrometry

    USGS Publications Warehouse

    Thurman, E.M.; Ferrer, Imma; Parry, R.

    2002-01-01

    Degradates of acetochlor and alachlor (ethanesulfonic acids, ESAs) were analyzed in both standards and in a groundwater sample using high-performance liquid chromatography-time-of-flight mass spectrometry with electrospray ionization. The negative pseudomolecular ion of the secondary amide of acetochlor ESA and alachlor ESA gave average masses of 256.0750+/-0.0049 amu and 270.0786+/-0.0064 amu respectively. Acetochlor and alachlor ESA gave similar masses of 314.1098+/-0.0061 amu and 314.1153+/-0.0048 amu; however, they could not be distinguished by accurate mass because they have the same empirical formula. On the other hand, they may be distinguished using positive-ion electrospray because of different fragmentation spectra, which did not occur using negative-ion electrospray.

  18. Multidimensional gas chromatography in combination with accurate mass, tandem mass spectrometry, and element-specific detection for identification of sulfur compounds in tobacco smoke.

    PubMed

    Ochiai, Nobuo; Mitsui, Kazuhisa; Sasamoto, Kikuo; Yoshimura, Yuta; David, Frank; Sandra, Pat

    2014-09-05

    A method is developed for identification of sulfur compounds in tobacco smoke extract. The method is based on large volume injection (LVI) of 10 μL of tobacco smoke extract followed by selectable one-dimensional ((1)D) or two-dimensional ((2)D) gas chromatography (GC) coupled to a hybrid quadrupole time-of-flight mass spectrometer (Q-TOF-MS) using electron ionization (EI) and positive chemical ionization (PCI), with parallel sulfur chemiluminescence detection (SCD). In order to identify each individual sulfur compound, sequential heart-cuts of 28 sulfur fractions from (1)D GC to (2)D GC were performed with the three MS detection modes (SCD/EI-TOF-MS, SCD/PCI-TOF-MS, and SCD/PCI-Q-TOF-MS). Thirty sulfur compounds were positively identified by MS library search, linear retention indices (LRI), molecular mass determination using PCI accurate mass spectra, formula calculation using EI and PCI accurate mass spectra, and structure elucidation using collision activated dissociation (CAD) of the protonated molecule. Additionally, 11 molecular formulas were obtained for unknown sulfur compounds. The determined values of the identified and unknown sulfur compounds were in the range of 10-740 ng/mg total particulate matter (TPM) (RSD: 1.2-12%, n=3). Copyright © 2014 The Authors. Published by Elsevier B.V. All rights reserved.

  19. OBSIFRAC: database-supported software for 3D modeling of rock mass fragmentation

    NASA Astrophysics Data System (ADS)

    Empereur-Mot, Luc; Villemin, Thierry

    2003-03-01

    Under stress, fractures in rock masses tend to form fully connected networks. The mass can thus be thought of as a 3D series of blocks produced by fragmentation processes. A numerical model has been developed that uses a relational database to describe such a mass. The model, which assumes the fractures to be plane, allows data from natural networks to test theories concerning fragmentation processes. In the model, blocks are bordered by faces that are composed of edges and vertices. A fracture can originate from a seed point, its orientation being controlled by the stress field specified by an orientation matrix. Alternatively, it can be generated from a discrete set of given orientations and positions. Both kinds of fracture can occur together in a model. From an original simple block, a given fracture produces two simple polyhedral blocks, and the original block becomes compound. Compound and simple blocks created throughout fragmentation are stored in the database. Several fragmentation processes have been studied. In one scenario, a constant proportion of blocks is fragmented at each step of the process. The resulting distribution appears to be fractal, although seed points are random in each fragmented block. In a second scenario, division affects only one random block at each stage of the process, and gives a Weibull volume distribution law. This software can be used for a large number of other applications.

  20. Dereplication of peptidic natural products through database search of mass spectra

    PubMed Central

    Mohimani, Hosein; Gurevich, Alexey; Mikheenko, Alla; Garg, Neha; Nothias, Louis-Felix; Ninomiya, Akihiro; Takada, Kentaro; Dorrestein, Pieter C.; Pevzner, Pavel A.

    2016-01-01

    Peptidic Natural Products (PNPs) are widely used compounds that include many antibiotics and a variety of other bioactive peptides. While recent breakthroughs in PNP discovery raised the challenge of developing new algorithms for their analysis, identification of PNPs via database search of tandem mass spectra remains an open problem. To address this problem, natural product researchers utilize dereplication strategies that identify known PNPs and lead to the discovery of new ones even in cases when the reference spectra are not present in existing spectral libraries. DEREPLICATOR is a new dereplication algorithm that enabled high-throughput PNP identification and that is compatible with large-scale mass spectrometry-based screening platforms for natural product discovery. After searching nearly one hundred million tandem mass spectra in the Global Natural Products Social (GNPS) molecular networking infrastructure, DEREPLICATOR identified an order of magnitude more PNPs (and their new variants) than any previous dereplication efforts. PMID:27820803

  1. Accurate mass screening and identification of emerging contaminants in environmental samples by liquid chromatography-hybrid linear ion trap Orbitrap mass spectrometry.

    PubMed

    Hogenboom, A C; van Leerdam, J A; de Voogt, P

    2009-01-16

    The European REACH legislation will possibly drive producers to develop newly designed chemicals that will be less persistent, bioaccumulative or toxic. If this innovation leads to an increased use of more hydrophilic chemicals it may result in higher mobilities of chemicals in the aqueous environment. As a result, the drinking water companies may face stronger demands on removal processes as the hydrophilic compounds inherently are more difficult to remove. Monitoring efforts will also experience a shift in focus to more water-soluble compounds. Screening source waters for the presence of (emerging) contaminants is an essential step in the control of the water cycle from source to tap water. In this article, some of our experiences are presented with the hybrid linear ion trap (LTQ) FT Orbitrap mass spectrometer, in the area of chemical water analysis. A two-pronged strategy in mass spectrometric research was employed: (i) exploring effluent, surface, ground- and drinking-water samples searching for accurate masses corresponding to target compounds (and their product ions) known from, e.g. priority lists or the scientific literature and (ii) full-scan screening of water samples in search of 'unknown' or unexpected masses, followed by MS(n) experiments to elucidate the structure of the unknowns. Applications of both approaches to emerging water contaminants are presented and discussed. Results are presented for target analysis search for pharmaceuticals, benzotriazoles, illicit drugs and for the identification of unknown compounds in a groundwater sample and in a polar extract of a landfill soil sample (a toxicity identification evaluation bioassay sample). The applications of accurate mass screening and identification described in this article demonstrate that the LC-LTQ FT Orbitrap MS is well equipped to meet the challenges posed by newly emerging polar contaminants.

  2. A database application for pre-processing, storage and comparison of mass spectra derived from patients and controls

    PubMed Central

    Titulaer, Mark K; Siccama, Ivar; Dekker, Lennard J; van Rijswijk, Angelique LCT; Heeren, Ron MA; Sillevis Smitt, Peter A; Luider, Theo M

    2006-01-01

    Background: Statistical comparison of peptide profiles in biomarker discovery requires fast, user-friendly software for high throughput data analysis. Important features are flexibility in changing input variables and statistical analysis of peptides that are differentially expressed between patient and control groups. In addition, integration of the mass spectrometry data with the results of other experiments, such as microarray analysis, and information from other databases requires a central storage of the profile matrix, where protein IDs can be added to peptide masses of interest. Results: A new database application is presented to detect and identify significantly differentially expressed peptides in peptide profiles obtained from body fluids of patient and control groups. The presented modular software is capable of central storage of mass spectra and results in fast analysis. The software architecture consists of 4 pillars: 1) a Graphical User Interface written in Java, 2) a MySQL database, which contains all metadata, such as experiment numbers and sample codes, 3) an FTP (File Transfer Protocol) server to store all raw mass spectrometry files and processed data, and 4) the software package R, which is used for modular statistical calculations, such as the Wilcoxon-Mann-Whitney rank sum test. Statistical analysis by the Wilcoxon-Mann-Whitney test in R demonstrates that peptide profiles of two patient groups, 1) breast cancer patients with leptomeningeal metastases and 2) prostate cancer patients in end stage disease, can be distinguished from those of control groups. Conclusion: The database application is capable of distinguishing patient Matrix Assisted Laser Desorption Ionization (MALDI-TOF) peptide profiles from those of control groups using large datasets. The modular architecture of the application makes it possible to adapt the application to also handle large data from MS/MS- and Fourier Transform Ion Cyclotron Resonance (FT-ICR) mass spectrometry

  3. A database application for pre-processing, storage and comparison of mass spectra derived from patients and controls.

    PubMed

    Titulaer, Mark K; Siccama, Ivar; Dekker, Lennard J; van Rijswijk, Angelique L C T; Heeren, Ron M A; Sillevis Smitt, Peter A; Luider, Theo M

    2006-09-05

    Statistical comparison of peptide profiles in biomarker discovery requires fast, user-friendly software for high throughput data analysis. Important features are flexibility in changing input variables and statistical analysis of peptides that are differentially expressed between patient and control groups. In addition, integration of the mass spectrometry data with the results of other experiments, such as microarray analysis, and information from other databases requires a central storage of the profile matrix, where protein IDs can be added to peptide masses of interest. A new database application is presented to detect and identify significantly differentially expressed peptides in peptide profiles obtained from body fluids of patient and control groups. The presented modular software is capable of central storage of mass spectra and results in fast analysis. The software architecture consists of 4 pillars: 1) a Graphical User Interface written in Java, 2) a MySQL database, which contains all metadata, such as experiment numbers and sample codes, 3) an FTP (File Transfer Protocol) server to store all raw mass spectrometry files and processed data, and 4) the software package R, which is used for modular statistical calculations, such as the Wilcoxon-Mann-Whitney rank sum test. Statistical analysis by the Wilcoxon-Mann-Whitney test in R demonstrates that peptide profiles of two patient groups, 1) breast cancer patients with leptomeningeal metastases and 2) prostate cancer patients in end stage disease, can be distinguished from those of control groups. The database application is capable of distinguishing patient Matrix Assisted Laser Desorption Ionization (MALDI-TOF) peptide profiles from those of control groups using large datasets. The modular architecture of the application makes it possible to adapt the application to also handle large data from MS/MS- and Fourier Transform Ion Cyclotron Resonance (FT-ICR) mass spectrometry experiments. It is expected that the

  4. Identifying Gel-Separated Proteins Using In-Gel Digestion, Mass Spectrometry, and Database Searching: Consider the Chemistry

    ERIC Educational Resources Information Center

    Albright, Jessica C.; Dassenko, David J.; Mohamed, Essa A.; Beussman, Douglas J.

    2009-01-01

    Matrix-assisted laser desorption/ionization (MALDI) mass spectrometry is an important bioanalytical technique in drug discovery, proteomics, and research at the biology-chemistry interface. This is an especially powerful tool when combined with gel separation of proteins and database mining using the mass spectral data. Currently, few hands-on…

  5. High-performance hardware implementation of a parallel database search engine for real-time peptide mass fingerprinting

    PubMed Central

    Bogdán, István A.; Rivers, Jenny; Beynon, Robert J.; Coca, Daniel

    2008-01-01

    Motivation: Peptide mass fingerprinting (PMF) is a method for protein identification in which a protein is fragmented by a defined cleavage protocol (usually proteolysis with trypsin), and the masses of these products constitute a ‘fingerprint’ that can be searched against theoretical fingerprints of all known proteins. In the first stage of PMF, the raw mass spectrometric data are processed to generate a peptide mass list. In the second stage this protein fingerprint is used to search a database of known proteins for the best protein match. Although current software solutions can typically deliver a match in a relatively short time, a system that can find a match in real time could change the way in which PMF is deployed and presented. In a paper published earlier we presented a hardware design of a raw mass spectra processor that, when implemented in Field Programmable Gate Array (FPGA) hardware, achieves almost 170-fold speed gain relative to a conventional software implementation running on a dual processor server. In this article we present a complementary hardware realization of a parallel database search engine that, when running on a Xilinx Virtex 2 FPGA at 100 MHz, delivers 1800-fold speed-up compared with an equivalent C software routine, running on a 3.06 GHz Xeon workstation. The inherent scalability of the design means that processing speed can be multiplied by deploying the design on multiple FPGAs. The database search processor and the mass spectra processor, running on a reconfigurable computing platform, provide a complete real-time PMF protein identification solution. Contact: d.coca@sheffield.ac.uk PMID:18453553
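
    The two PMF stages described above (theoretical digestion, then matching a peptide mass list against a protein database) can be sketched in software as follows. This is a minimal illustration, not the FPGA design from the article: the cleavage rule (after K or R, not before P), the ppm tolerance, the hit-count score and the two-protein database are all assumptions made for the example.

```python
# Monoisotopic amino acid residue masses in Da.
RESIDUE_MASS = {
    'G': 57.02146, 'A': 71.03711, 'S': 87.03203, 'P': 97.05276,
    'V': 99.06841, 'T': 101.04768, 'C': 103.00919, 'L': 113.08406,
    'I': 113.08406, 'N': 114.04293, 'D': 115.02694, 'Q': 128.05858,
    'K': 128.09496, 'E': 129.04259, 'M': 131.04049, 'H': 137.05891,
    'F': 147.06841, 'R': 156.10111, 'Y': 163.06333, 'W': 186.07931,
}
WATER = 18.01056


def tryptic_digest(sequence):
    """Cleave after K or R unless the next residue is P (no missed cleavages)."""
    peptides, start = [], 0
    for i, residue in enumerate(sequence):
        next_residue = sequence[i + 1] if i + 1 < len(sequence) else ''
        if residue in 'KR' and next_residue != 'P':
            peptides.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        peptides.append(sequence[start:])
    return peptides


def peptide_mass(peptide):
    """Neutral monoisotopic mass: residue masses plus one water."""
    return sum(RESIDUE_MASS[r] for r in peptide) + WATER


def count_matches(observed, protein_sequence, tol_ppm=50.0):
    """Count observed masses that match any theoretical peptide mass."""
    theoretical = [peptide_mass(p) for p in tryptic_digest(protein_sequence)]
    return sum(
        1 for m in observed
        if any(abs(m - t) / t * 1e6 <= tol_ppm for t in theoretical)
    )


def best_protein(observed, database, tol_ppm=50.0):
    """Return the database entry with the most matching peptide masses."""
    return max(database,
               key=lambda name: count_matches(observed, database[name], tol_ppm))


# Hypothetical two-entry database and a mass list derived from entry P1.
database = {"P1": "AGKPLRMSK", "P2": "GGGGKAAAAR"}
observed = [peptide_mass("AGKPLR"), peptide_mass("MSK")]
winner = best_protein(observed, database)
```

    Each database entry is scored independently of the others, which is the property that makes the search easy to run in parallel, as the abstract's hardware design does.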

  6. Purification of pharmaceutical preparations using thin-layer chromatography to obtain mass spectra with Direct Analysis in Real Time and accurate mass spectrometry.

    PubMed

    Wood, Jessica L; Steiner, Robert R

    2011-06-01

    Forensic analysis of pharmaceutical preparations requires a comparative analysis with a standard of the suspected drug in order to identify the active ingredient. Purchasing analytical standards can be expensive or unattainable from the drug manufacturers. Direct Analysis in Real Time (DART™) is a novel, ambient ionization technique, typically coupled with a JEOL AccuTOF™ (accurate mass) mass spectrometer. While a fast and easy technique to perform, a drawback of using DART™ is the lack of component separation of mixtures prior to ionization. Various in-house pharmaceutical preparations were purified using thin-layer chromatography (TLC) and mass spectra were subsequently obtained using the AccuTOF™- DART™ technique. Utilizing TLC prior to sample introduction provides a simple, low-cost solution to acquiring mass spectra of the purified preparation. Each spectrum was compared against an in-house molecular formula list to confirm the accurate mass elemental compositions. Spectra of purified ingredients of known pharmaceuticals were added to an in-house library for use as comparators for casework samples. Resolving isomers from one another can be accomplished using collision-induced dissociation after ionization. Challenges arose when the pharmaceutical preparation required an optimized TLC solvent to achieve proper separation and purity of the standard. Purified spectra were obtained for 91 preparations and included in an in-house drug standard library. Primary standards would only need to be purchased when pharmaceutical preparations not previously encountered are submitted for comparative analysis. TLC prior to DART™ analysis demonstrates a time efficient and cost saving technique for the forensic drug analysis community. Copyright © 2011 John Wiley & Sons, Ltd.
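
    Comparing a measured accurate mass against a molecular formula list, as the abstract above describes, reduces to computing a theoretical m/z per formula and checking the error in ppm. The sketch below assumes protonated [M+H]+ ions, a 5 ppm window and a two-entry formula list; none of these come from the abstract.

```python
import re

# Monoisotopic masses (Da) of the most abundant isotope of each element.
MONOISOTOPIC = {
    'C': 12.0, 'H': 1.00782503, 'N': 14.0030740,
    'O': 15.9949146, 'S': 31.9720707,
}
PROTON = 1.00727646  # mass of H+ (hydrogen atom minus one electron)


def formula_mass(formula):
    """Monoisotopic mass of a simple formula such as 'C8H9NO2'."""
    total = 0.0
    for element, count in re.findall(r'([A-Z][a-z]?)(\d*)', formula):
        total += MONOISOTOPIC[element] * (int(count) if count else 1)
    return total


def ppm_error(measured, theoretical):
    """Signed mass error in parts per million."""
    return (measured - theoretical) / theoretical * 1e6


def match_formulas(measured_mz, formulas, tol_ppm=5.0):
    """Return (formula, ppm error) pairs whose [M+H]+ lies within tol_ppm."""
    hits = []
    for formula in formulas:
        theoretical_mz = formula_mass(formula) + PROTON
        error = ppm_error(measured_mz, theoretical_mz)
        if abs(error) <= tol_ppm:
            hits.append((formula, round(error, 2)))
    return hits


# Hypothetical formula list; 152.0706 is close to [M+H]+ of C8H9NO2.
hits = match_formulas(152.0706, ["C8H9NO2", "C9H13NO"])
```

    The two formulas share a nominal mass of 151 but differ by about 0.036 Da, so only C8H9NO2 falls inside the window; unit-resolution data could not separate them.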

  7. Determination of the presence or absence of sulfur materials in drywall using direct analysis in real time in conjunction with an accurate-mass time-of-flight mass spectrometer.

    PubMed

    Curtis, Matthew E; Jones, Patrick R; Sparkman, O David; Cody, Robert B

    2009-11-01

    Based on the concern about the presence of sulfur materials being in drywall (wallboard), a quick and reliable test to confirm the presence or absence of these materials using direct analysis in real time (DART) mass spectrometry in conjunction with an accurate-mass time-of-flight (TOF) mass spectrometer has been developed and is described here.

  8. Using semantic data modeling techniques to organize an object-oriented database for extending the mass storage model

    NASA Technical Reports Server (NTRS)

    Campbell, William J.; Short, Nicholas M., Jr.; Roelofs, Larry H.; Dorfman, Erik

    1991-01-01

    A methodology for optimizing organization of data obtained by NASA earth and space missions is discussed. The methodology uses a concept based on semantic data modeling techniques implemented in a hierarchical storage model. The modeling is used to organize objects in mass storage devices, relational database systems, and object-oriented databases. The semantic data modeling at the metadata record level is examined, including the simulation of a knowledge base and semantic metadata storage issues. The semantic data model hierarchy and its application for efficient data storage is addressed, as is the mapping of the application structure to the mass storage.

  9. Ongoing revolution in bacteriology: routine identification of bacteria by matrix-assisted laser desorption ionization time-of-flight mass spectrometry.

    PubMed

    Seng, Piseth; Drancourt, Michel; Gouriet, Frédérique; La Scola, Bernard; Fournier, Pierre-Edouard; Rolain, Jean Marc; Raoult, Didier

    2009-08-15

    Matrix-assisted laser desorption ionization time-of-flight (MALDI-TOF) mass spectrometry accurately identifies both selected bacteria and bacteria in select clinical situations. It has not been evaluated for routine use in the clinic. We prospectively analyzed routine MALDI-TOF mass spectrometry identification in parallel with conventional phenotypic identification of bacteria regardless of phylum or source of isolation. Discrepancies were resolved by 16S ribosomal RNA and rpoB gene sequence-based molecular identification. Colonies (4 spots per isolate directly deposited on the MALDI-TOF plate) were analyzed using an Autoflex II Bruker Daltonik mass spectrometer. Peptidic spectra were compared with the Bruker BioTyper database, version 2.0, and the identification score was noted. Delays and costs of identification were measured. Of 1660 bacterial isolates analyzed, 95.4% were correctly identified by MALDI-TOF mass spectrometry; 84.1% were identified at the species level, and 11.3% were identified at the genus level. In most cases, absence of identification (2.8% of isolates) and erroneous identification (1.7% of isolates) were due to improper database entries. Accurate MALDI-TOF mass spectrometry identification was significantly correlated with having 10 reference spectra in the database (P=.01). The mean time required for MALDI-TOF mass spectrometry identification of 1 isolate was 6 minutes for an estimated 22%-32% cost of current methods of identification. MALDI-TOF mass spectrometry is a cost-effective, accurate method for routine identification of bacterial isolates in <1 h using a database comprising ≥10 reference spectra per bacterial species and a 1.9 identification score (Bruker system). It may replace Gram staining and biochemical identification in the near future.

  10. Mass spectrometry-based protein identification by integrating de novo sequencing with database searching.

    PubMed

    Wang, Penghao; Wilson, Susan R

    2013-01-01

    Mass spectrometry-based protein identification is a very challenging task. The main identification approaches include de novo sequencing and database searching. Both approaches have shortcomings, so an integrative approach has been developed. The integrative approach firstly infers partial peptide sequences, known as tags, directly from tandem spectra through de novo sequencing, and then puts these sequences into a database search to see if a close peptide match can be found. However the current implementation of this integrative approach has several limitations. Firstly, simplistic de novo sequencing is applied and only very short sequence tags are used. Secondly, most integrative methods apply an algorithm similar to BLAST to search for exact sequence matches and do not accommodate sequence errors well. Thirdly, by applying these methods the integrated de novo sequencing makes a limited contribution to the scoring model which is still largely based on database searching. We have developed a new integrative protein identification method which can integrate de novo sequencing more efficiently into database searching. Evaluated on large real datasets, our method outperforms popular identification methods.

  11. 'Isotopo' a database application for facile analysis and management of mass isotopomer data.

    PubMed

    Ahmed, Zeeshan; Zeeshan, Saman; Huber, Claudia; Hensel, Michael; Schomburg, Dietmar; Münch, Richard; Eylert, Eva; Eisenreich, Wolfgang; Dandekar, Thomas

    2014-01-01

    The composition of stable-isotope labelled isotopologues/isotopomers in metabolic products can be measured by mass spectrometry and supports the analysis of pathways and fluxes. As a prerequisite, the original mass spectra have to be processed, managed and stored to rapidly calculate, analyse and compare isotopomer enrichments to study, for instance, bacterial metabolism in infection. For such applications, we provide here the database application 'Isotopo'. This software package includes (i) a database to store and process isotopomer data, (ii) a parser to upload and translate different data formats for such data and (iii) an improved application to process and convert signal intensities from mass spectra of (13)C-labelled metabolites such as tert-butyldimethylsilyl derivatives of amino acids. Relative mass intensities and isotopomer distributions are calculated applying a partial least square method with iterative refinement for high precision data. The data output includes formats such as graphs for overall enrichments in amino acids. The package is user-friendly for easy and robust data management of multiple experiments. The 'Isotopo' software is available at the following web link (section Download): http://spp1316.uni-wuerzburg.de/bioinformatics/isotopo/. The package contains three additional files: software executable setup (installer), one data set file (discussed in this article) and one Excel file (which can be used to convert data from Excel to '.iso' format). The 'Isotopo' software is compatible only with the Microsoft Windows operating system. © The Author(s) 2014. Published by Oxford University Press.

  12. The INFN-CNAF Tier-1 GEMSS Mass Storage System and database facility activity

    NASA Astrophysics Data System (ADS)

    Ricci, Pier Paolo; Cavalli, Alessandro; Dell'Agnello, Luca; Favaro, Matteo; Gregori, Daniele; Prosperini, Andrea; Pezzi, Michele; Sapunenko, Vladimir; Zizzi, Giovanni; Vagnoni, Vincenzo

    2015-05-01

    The consolidation of Mass Storage services at the INFN-CNAF Tier1 Storage department that has occurred during the last 5 years, resulted in a reliable, high performance and moderately easy-to-manage facility that provides data access, archive, backup and database services to several different use cases. At present, the GEMSS Mass Storage System, developed and installed at CNAF and based upon an integration between the IBM GPFS parallel filesystem and the Tivoli Storage Manager (TSM) tape management software, is one of the largest hierarchical storage sites in Europe. It provides storage resources for about 12% of LHC data, as well as for data of other non-LHC experiments. Files are accessed using standard SRM Grid services provided by the Storage Resource Manager (StoRM), also developed at CNAF. Data access is also provided by XRootD and HTTP/WebDaV endpoints. Besides these services, an Oracle database facility is in production characterized by an effective level of parallelism, redundancy and availability. This facility is running databases for storing and accessing relational data objects and for providing database services to the currently active use cases. It takes advantage of several Oracle technologies, like Real Application Cluster (RAC), Automatic Storage Manager (ASM) and Enterprise Manager centralized management tools, together with other technologies for performance optimization, ease of management and downtime reduction. The aim of the present paper is to illustrate the state-of-the-art of the INFN-CNAF Tier1 Storage department infrastructures and software services, and to give a brief outlook to forthcoming projects. A description of the administrative, monitoring and problem-tracking tools that play a primary role in managing the whole storage framework is also given.

  13. GlycoDeNovo - an Efficient Algorithm for Accurate de novo Glycan Topology Reconstruction from Tandem Mass Spectra

    NASA Astrophysics Data System (ADS)

    Hong, Pengyu; Sun, Hui; Sha, Long; Pu, Yi; Khatri, Kshitij; Yu, Xiang; Tang, Yang; Lin, Cheng

    2017-08-01

    A major challenge in glycomics is the characterization of complex glycan structures that are essential for understanding their diverse roles in many biological processes. We present a novel efficient computational approach, named GlycoDeNovo, for accurate elucidation of the glycan topologies from their tandem mass spectra. Given a spectrum, GlycoDeNovo first builds an interpretation-graph specifying how to interpret each peak using preceding interpreted peaks. It then reconstructs the topologies of peaks that contribute to interpreting the precursor ion. We theoretically prove that GlycoDeNovo is highly efficient. A major innovative feature added to GlycoDeNovo is a data-driven IonClassifier which can be used to effectively rank candidate topologies. IonClassifier is automatically learned from experimental spectra of known glycans to distinguish B- and C-type ions from all other ion types. Our results showed that GlycoDeNovo is robust and accurate for topology reconstruction of glycans from their tandem mass spectra.

  14. MS2Analyzer: A Software for Small Molecule Substructure Annotations from Accurate Tandem Mass Spectra

    PubMed Central

    2015-01-01

    Systematic analysis and interpretation of the large number of tandem mass spectra (MS/MS) obtained in metabolomics experiments is a bottleneck in discovery-driven research. MS/MS mass spectral libraries are small compared to all known small molecule structures and are often not freely available. MS2Analyzer was therefore developed to enable user-defined searches of thousands of spectra for mass spectral features such as neutral losses, m/z differences, and product and precursor ions from MS/MS spectra in MSP/MGF files. The software is freely available at http://fiehnlab.ucdavis.edu/projects/MS2Analyzer/. As the reference query set, 147 literature-reported neutral losses and their corresponding substructures were collected. This set was tested for accuracy of linking neutral loss analysis to substructure annotations using 19 329 accurate mass tandem mass spectra of structurally known compounds from the NIST11 MS/MS library. Validation studies showed that 92.1 ± 6.4% of 13 typical neutral losses such as acetylations, cysteine conjugates, or glycosylations correctly annotated the associated substructures, while the absence of mass spectral features does not necessarily imply the absence of such substructures. Use of this tool has been successfully demonstrated for complex lipids in microalgae. PMID:25263576
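
    A neutral loss search of the kind described above amounts to subtracting each fragment m/z from the precursor m/z and looking the difference up in a table. The sketch below is not MS2Analyzer's code: the five-entry table, the absolute tolerance and the assumption of singly charged ions are illustrative only.

```python
# Hypothetical excerpt of a neutral loss table (monoisotopic masses in Da).
NEUTRAL_LOSSES = {
    "H2O": 18.01056,
    "NH3": 17.02655,
    "CO2": 43.98983,
    "C2H2O (acetyl)": 42.01056,
    "C6H10O5 (hexose)": 162.05282,
}


def find_neutral_losses(precursor_mz, fragment_mzs, tol_da=0.005):
    """Return (fragment m/z, loss name) pairs for singly charged ions."""
    hits = []
    for fragment in fragment_mzs:
        delta = precursor_mz - fragment
        for name, mass in NEUTRAL_LOSSES.items():
            if abs(delta - mass) <= tol_da:
                hits.append((fragment, name))
    return hits


# Hypothetical spectrum: precursor at m/z 300.1000 with three fragments.
annotations = find_neutral_losses(300.1000, [282.0894, 138.0472, 200.0])
```

    As the abstract cautions, a missing entry in the result does not show that the substructure is absent; it only means no fragment at the expected mass difference was observed.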

  15. DETERMINING ION COMPOSITIONS USING AN ACCURATE MASS, TRIPLE QUADRUPOLE MASS SPECTROMETER

    EPA Science Inventory

    For the past decade, we have used double focusing mass spectrometers to determine
    compositions of ions observed in mass spectra produced from compounds introduced by GC
    based on measured exact masses of the ions and their +1 and +2 isotopic profiles arising from atoms of ...

  16. Towards the Interoperability of Web, Database, and Mass Storage Technologies for Petabyte Archives

    NASA Technical Reports Server (NTRS)

    Moore, Reagan; Marciano, Richard; Wan, Michael; Sherwin, Tom; Frost, Richard

    1996-01-01

    At the San Diego Supercomputer Center, a massive data analysis system (MDAS) is being developed to support data-intensive applications that manipulate terabyte sized data sets. The objective is to support scientific application access to data whether it is located at a Web site, stored as an object in a database, and/or stored in an archival storage system. We are developing a suite of demonstration programs which illustrate how Web, database (DBMS), and archival storage (mass storage) technologies can be integrated. An application presentation interface is being designed that integrates data access to all of these sources. We have developed a data movement interface between the Illustra object-relational database and the NSL UniTree archival storage system running in a production mode at the San Diego Supercomputer Center. With this interface, an Illustra client can transparently access data on UniTree under the control of the Illustra DBMS server. The current implementation is based on the creation of a new DBMS storage manager class, and a set of library functions that allow the manipulation and migration of data stored as Illustra 'large objects'. We have extended this interface to allow a Web client application to control data movement between its local disk, the Web server, the DBMS Illustra server, and the UniTree mass storage environment. This paper describes some of the current approaches for successfully integrating these technologies. This framework is measured against a representative sample of environmental data extracted from the San Diego Bay Environmental Data Repository. Practical lessons are drawn and critical research areas are highlighted.

  17. Identification and accurate quantification of structurally related peptide impurities in synthetic human C-peptide by liquid chromatography-high resolution mass spectrometry.

    PubMed

    Li, Ming; Josephs, Ralf D; Daireaux, Adeline; Choteau, Tiphaine; Westwood, Steven; Wielgosz, Robert I; Li, Hongmei

    2018-06-04

    Peptides are an increasingly important group of biomarkers and pharmaceuticals. The accurate purity characterization of peptide calibrators is critical for the development of reference measurement systems for laboratory medicine and quality control of pharmaceuticals. The peptides used for these purposes are increasingly produced through peptide synthesis. Various approaches (for example mass balance, amino acid analysis, qNMR, and nitrogen determination) can be applied to accurately value assign the purity of peptide calibrators. However, all purity assessment approaches require a correction for structurally related peptide impurities in order to avoid biases. Liquid chromatography coupled to high resolution mass spectrometry (LC-hrMS) has become the key technique for the identification and accurate quantification of structurally related peptide impurities in intact peptide calibrator materials. In this study, LC-hrMS-based methods were developed and validated in-house for the identification and quantification of structurally related peptide impurities in a synthetic human C-peptide (hCP) material, which served as a study material for an international comparison looking at the competencies of laboratories to perform peptide purity mass fraction assignments. More than 65 impurities were identified, confirmed, and accurately quantified by using LC-hrMS. The total mass fraction of all structurally related peptide impurities in the hCP study material was estimated to be 83.3 mg/g with an associated expanded uncertainty of 3.0 mg/g (k = 2). The calibration hierarchy concept used for the quantification of individual impurities is described in detail.

  18. Mass spectrometry in Earth sciences: the precise and accurate measurement of time.

    PubMed

    Schaltegger, Urs; Wotzlaw, Jörn-Frederik; Ovtcharova, Maria; Chiaradia, Massimo; Spikings, Richard

    2014-01-01

    Precise determinations of the isotopic compositions of a variety of elements is a widely applied tool in Earth sciences. Isotope ratios are used to quantify rates of geological processes that occurred during the previous 4.5 billion years, and also at the present time. An outstanding application is geochronology, which utilizes the production of radiogenic daughter isotopes by the radioactive decay of parent isotopes. Geochronological tools, involving isotopic analysis of selected elements from smallest volumes of minerals by thermal ionization mass spectrometry, provide precise and accurate measurements of time throughout the geological history of our planet over nine orders of magnitude, from the accretion of the proto-planetary disk, to the timing of the last glaciation. This article summarizes the recent efforts of the Isotope Geochemistry, Geochronology and Thermochronology research group at the University of Geneva to advance the U-Pb geochronological tool to achieve unprecedented precision and accuracy, and presents two examples of its application to two significant open questions in Earth sciences: what are the triggers and timescales of volcanic supereruptions, and what were the causes of mass extinctions in the geological past, driven by global climatic and environmental deterioration?

  19. 2DB: a Proteomics database for storage, analysis, presentation, and retrieval of information from mass spectrometric experiments.

    PubMed

    Allmer, Jens; Kuhlgert, Sebastian; Hippler, Michael

    2008-07-07

    The amount of information stemming from proteomics experiments involving (multi dimensional) separation techniques, mass spectrometric analysis, and computational analysis is ever-increasing. Data from such an experimental workflow needs to be captured, related and analyzed. Biological experiments within this scope produce heterogenic data ranging from pictures of one or two-dimensional protein maps and spectra recorded by tandem mass spectrometry to text-based identifications made by algorithms which analyze these spectra. Additionally, peptide and corresponding protein information needs to be displayed. In order to handle the large amount of data from computational processing of mass spectrometric experiments, automatic import scripts are available and the necessity for manual input to the database has been minimized. Information is in a generic format which abstracts from specific software tools typically used in such an experimental workflow. The software is therefore capable of storing and cross analysing results from many algorithms. A novel feature and a focus of this database is to facilitate protein identification by using peptides identified from mass spectrometry and link this information directly to respective protein maps. Additionally, our application employs spectral counting for quantitative presentation of the data. All information can be linked to hot spots on images to place the results into an experimental context. A summary of identified proteins, containing all relevant information per hot spot, is automatically generated, usually upon either a change in the underlying protein models or due to newly imported identifications. The supporting information for this report can be accessed in multiple ways using the user interface provided by the application. We present a proteomics database which aims to greatly reduce evaluation time of results from mass spectrometric experiments and enhance result quality by allowing consistent data handling

  20. Identification of Microorganisms by High Resolution Tandem Mass Spectrometry with Accurate Statistical Significance

    NASA Astrophysics Data System (ADS)

    Alves, Gelio; Wang, Guanghui; Ogurtsov, Aleksey Y.; Drake, Steven K.; Gucek, Marjan; Suffredini, Anthony F.; Sacks, David B.; Yu, Yi-Kuo

    2016-02-01

    Correct and rapid identification of microorganisms is the key to the success of many important applications in health and safety, including, but not limited to, infection treatment, food safety, and biodefense. With the advance of mass spectrometry (MS) technology, the speed of identification can be greatly improved. However, the increasing number of microbes sequenced is challenging correct microbial identification because of the large number of choices present. To properly disentangle candidate microbes, one needs to go beyond apparent morphology or simple 'fingerprinting'; to correctly prioritize the candidate microbes, one needs to have accurate statistical significance in microbial identification. We meet these challenges by using peptidome profiles of microbes to better separate them and by designing an analysis method that yields accurate statistical significance. Here, we present an analysis pipeline that uses tandem MS (MS/MS) spectra for microbial identification or classification. We have demonstrated, using MS/MS data of 81 samples, each composed of a single known microorganism, that the proposed pipeline can correctly identify microorganisms at least at the genus and species levels. We have also shown that the proposed pipeline computes accurate statistical significances, i.e., E-values for identified peptides and unified E-values for identified microorganisms. The proposed analysis pipeline has been implemented in MiCId, a freely available software for Microorganism Classification and Identification. MiCId is available for download at http://www.ncbi.nlm.nih.gov/CBBresearch/Yu/downloads.html.

  1. Accurate recognition and feature qualify for flavonoid extracts from Liang-wai Gan Cao by liquid chromatography-high resolution-mass spectrometry and computational MS/MS fragmentation.

    PubMed

    He, Min; Wu, Hai; Nie, Juan; Yan, Pan; Yang, Tian-Biao; Yang, Zhi-Yu; Pei, Rui

    2017-11-30

    In this study, Liquid Chromatography (LC) separation combined with quadrupole-Time-Of-Flight Mass Spectrometry (qTOF-MS) detection was used to analyze the characteristic ions of the flavonoids from Liang-wai Gan Cao (Radix Glycyrrhizae uralensis). First, accurate mass measurement and isotope curve optimization could provide reliable molecular prediction after noise deduction, baseline calibration and "ghost peak recognition". Thus, some spectral features in the LC-MS data could be clearly explained. Secondly, the chemical structure of flavonoids was deduced by MS/MS fragment ions, and the in-silico spectra by MS-FINDER program provided strong support for overcoming the bottleneck of phytochemical identification. For a predicted formula and experimental MS/MS spectrum, the MS-FINDER program could sort the candidate compounds in the public database based on a comprehensive weighted score, and we took the first 20 reliable compounds to seek the target compound in an in-house database. Certainly, those fragmentation pathways could also be deduced and described as Retro-Diels-Alder (RDA) fragmentation reaction, losses of C4H8, C5H8, CH3, CO, CO2 and others. Accordingly, 63 flavonoids were identified, and their in-silico bioactivity were clearly disclosed by some bioinformatics tools. In this experiment, the flavonoids obtained by the four extraction processes were tested by LC-qTOF-MS. We looked for possible Q-markers from these data matrices and then quantified them; their similarities/differences were also described. The results also indicated that the Macroporous Adsorption Resins (MARs) purification is a low cost, environmentally friendly and effective approach. Copyright © 2017 Elsevier B.V. All rights reserved.

  2. Usefulness and accuracy of MALDI-TOF mass spectrometry as a supplementary tool to identify mosquito vector species and to invest in development of international database.

    PubMed

    Raharimalala, F N; Andrianinarivomanana, T M; Rakotondrasoa, A; Collard, J M; Boyer, S

    2017-09-01

    Arthropod-borne diseases are important causes of morbidity and mortality. The identification of vector species relies mainly on morphological features and/or molecular biology tools. The first method requires specific technical skills and may result in misidentifications, and the second method is time-consuming and expensive. The aim of the present study is to assess the usefulness and accuracy of matrix-assisted laser desorption/ionization time-of-flight mass spectrometry (MALDI-TOF MS) as a supplementary tool with which to identify mosquito vector species and to invest in the creation of an international database. A total of 89 specimens belonging to 10 mosquito species were selected for the extraction of proteins from legs and for the establishment of a reference database. A blind test with 123 mosquitoes was performed to validate the MS method. Results showed that: (a) the spectra obtained in the study with a given species differed from the spectra of the same species collected in another country, which highlights the need for an international database; (b) MALDI-TOF MS is an accurate method for the rapid identification of mosquito species that are referenced in a database; (c) MALDI-TOF MS allows the separation of groups or complex species, and (d) laboratory specimens undergo a loss of proteins compared with those isolated in the field. In conclusion, MALDI-TOF MS is a useful supplementary tool for mosquito identification and can help inform vector control. © 2017 The Royal Entomological Society.

  3. Quantitative analysis of naphthenic acids in water by liquid chromatography-accurate mass time-of-flight mass spectrometry.

    PubMed

    Hindle, Ralph; Noestheden, Matthew; Peru, Kerry; Headley, John

    2013-04-19

    This study details the development of a routine method for quantitative analysis of oil sands naphthenic acids, which are a complex class of compounds found naturally and as contaminants in oil sands process waters from Alberta's Athabasca region. Expanding beyond classical naphthenic acids (CnH2n-zO2), those compounds conforming to the formula CnH2n-zOx (where 2≤x≤4) were examined in commercial naphthenic acid and environmental water samples. HPLC facilitated a five-fold reduction in ion suppression when compared to the more commonly used flow injection analysis. A comparison of 39 model naphthenic acids revealed significant variability in response factors, demonstrating the necessity of using naphthenic acid mixtures for quantitation, rather than model compounds. It was also demonstrated that naphthenic acidic heterogeneity (commercial and environmental) necessitates establishing a single NA mix as the standard against which all quantitation is performed. The authors present the first ISO17025 accredited method for the analysis of naphthenic acids in water using HPLC high resolution accurate mass time-of-flight mass spectrometry. The method detection limit was 1 mg/L total oxy-naphthenic acids (Sigma technical mix). Copyright © 2013 Elsevier B.V. All rights reserved.
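
    The general formula CnH2n-zOx quoted above maps directly onto a list of target masses for an accurate-mass method. The sketch below enumerates formulas and computes monoisotopic masses; the use of deprotonated [M-H]- ions and the chosen ranges of n and z are assumptions for illustration, not details taken from the abstract.

```python
# Monoisotopic atomic masses (Da) and the proton mass.
C, H, O = 12.0, 1.00782503, 15.9949146
PROTON = 1.00727646


def na_formula(n, z, x):
    """Formula string for CnH(2n-z)Ox; z is the hydrogen deficiency."""
    return f"C{n}H{2 * n - z}O{x}"


def na_mass(n, z, x):
    """Neutral monoisotopic mass of CnH(2n-z)Ox."""
    return n * C + (2 * n - z) * H + x * O


def deprotonated_mz(n, z, x):
    """m/z of the singly charged [M-H]- ion (assumed ionization mode)."""
    return na_mass(n, z, x) - PROTON


def target_list(n_values, z_values, x_values=(2, 3, 4)):
    """Enumerate (formula, [M-H]- m/z) pairs over the given ranges."""
    return [
        (na_formula(n, z, x), round(deprotonated_mz(n, z, x), 4))
        for n in n_values for z in z_values for x in x_values
    ]


# Hypothetical ranges: 12 to 14 carbons, z of 0, 2 or 4, and 2 to 4 oxygens.
targets = target_list(range(12, 15), (0, 2, 4))
```

    Each added oxygen shifts the target by 15.9949 Da and each step of 2 in z by 2.0157 Da, so the oxy-naphthenic acid classes separate cleanly at accurate-mass resolution.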

  4. The Xeno-glycomics database (XDB): a relational database of qualitative and quantitative pig glycome repertoire.

    PubMed

    Park, Hae-Min; Park, Ju-Hyeong; Kim, Yoon-Woo; Kim, Kyoung-Jin; Jeong, Hee-Jin; Jang, Kyoung-Soon; Kim, Byung-Gee; Kim, Yun-Gon

    2013-11-15

    In recent years, the improvement of mass spectrometry-based glycomics techniques (i.e. highly sensitive, quantitative and high-throughput analytical tools) has enabled us to obtain a large dataset of glycans. Here we present a database named Xeno-glycomics database (XDB) that contains cell- or tissue-specific pig glycomes analyzed with mass spectrometry-based techniques, including comprehensive pig glycan information on chemical structures, mass values, types and relative quantities. It was designed with a user-friendly web-based interface that allows users to query the database according to pig tissue/cell types or glycan masses. This database will contribute by providing qualitative and quantitative information on glycomes characterized from various pig cells/organs in xenotransplantation and might eventually provide new targets in the era of α1,3-galactosyltransferase gene-knockout pigs. The database can be accessed on the web at http://bioinformatics.snu.ac.kr/xdb.

  5. Multiplexed Post-Experimental Monoisotopic Mass Refinement (mPE-MMR) to Increase Sensitivity and Accuracy in Peptide Identifications from Tandem Mass Spectra of Cofragmentation

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Madar, Inamul Hasan; Ko, Seung-Ik; Kim, Hokeun

    Mass spectrometry (MS)-based proteomics, which uses high-resolution hybrid mass spectrometers such as the quadrupole-orbitrap mass spectrometer, can yield tens of thousands of tandem mass (MS/MS) spectra of high resolution during a routine bottom-up experiment. Despite being a fundamental and key step in MS-based proteomics, the accurate determination and assignment of precursor monoisotopic masses to the MS/MS spectra remains difficult. The difficulties stem from imperfect isotopic envelopes of precursor ions, inaccurate charge states for precursor ions, and cofragmentation. We describe a composite method of utilizing MS data to assign accurate monoisotopic masses to MS/MS spectra, including those subject to cofragmentation. The method, “multiplexed post-experiment monoisotopic mass refinement” (mPE-MMR), consists of the following: multiplexing of precursor masses to assign multiple monoisotopic masses of cofragmented peptides to the corresponding multiplexed MS/MS spectra, multiplexing of charge states to assign correct charges to the precursor ions of MS/MS spectra with no charge information, and mass correction for inaccurate monoisotopic peak picking. When combined with MS-GF+, a database search algorithm based on fragment mass difference, mPE-MMR effectively increases both sensitivity and accuracy in peptide identification from complex high-throughput proteomics data compared to conventional methods.
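
    As a rough illustration of why charge-state multiplexing and monoisotopic correction multiply the number of precursor hypotheses, the sketch below enumerates candidate neutral masses for one precursor m/z. It is not the mPE-MMR algorithm: the function name, the charge range, and the one-peak isotope correction are assumptions made for this example, and only the proton mass and the 13C/12C spacing are standard reference values.

```python
# Illustrative only: enumerate neutral-mass hypotheses for one precursor m/z.
PROTON = 1.007276467        # mass of a proton (u)
ISOTOPE_SPACING = 1.0033548  # 13C - 12C mass difference (u)

def candidate_masses(precursor_mz, charges=(2, 3, 4), max_isotope_error=1):
    """Return (charge, isotope_offset, neutral_mass) hypotheses.

    charges and max_isotope_error are hypothetical defaults, not values
    taken from the abstract.
    """
    candidates = []
    for z in charges:
        for k in range(max_isotope_error + 1):
            # Neutral mass for charge z, shifted down by k isotope peaks in
            # case the picked peak was not the monoisotopic one.
            neutral = z * (precursor_mz - PROTON) - k * ISOTOPE_SPACING
            candidates.append((z, k, neutral))
    return candidates

for z, k, mass in candidate_masses(500.0):
    print(f"z={z} isotope_offset={k} neutral_mass={mass:.4f}")
```

    Each hypothesis would then be handed to a database search, which is where a tool such as MS-GF+ enters in the abstract above.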

  6. Phoenix Lander's Thermal Evolved Gas Analyzer: Differential Scanning Calorimeter and Mass Spectrometer Database Development

    NASA Technical Reports Server (NTRS)

    Sutter, B.; Lauer, H. V.; Golden, D. C.; Ming, D. W.; Boynton, W. V.

    2008-01-01

    The Mars Scout Phoenix lander will land in the north polar region of Mars in May, 2008. One objective of the Phoenix lander is to search for evidence of past life in the form of molecular organics that may be preserved in the subsurface soil. The Thermal Evolved Gas Analyzer (TEGA) was developed to detect these organics by coupling a simultaneous differential thermal analyzer (SDTA) with a mass spectrometer. Martian soil will be heated to approximately 1000 C and potential organic decomposition products such as CO2, CH4 etc. will be examined with the MS. TEGA's SDTA will also assess the presence of endothermic and exothermic reactions that are characteristic of soil organics and minerals as the soil is heated. The MS, in addition to detecting organic decomposition products, will also assess the levels of soil inorganic volatiles such as H2O, SO2, and CO2. Organic detection has a high priority for this mission; however, TEGA has the ability to provide valuable insight into the mineralogical composition of the soil. The overall goal of this work is to develop a TEGA database of minerals that will serve as a reference for the interpretation of Phoenix-TEGA. Previous databases for the ill-fated Mars Polar Lander (MPL)-TEGA instrument only went to 725 C. Furthermore, the MPL-TEGA could only detect CO2 and H2O while the Phoenix-TEGA MS can examine up to 144 atomic mass units. The higher temperature Phoenix-TEGA SDTA coupled with the more capable MS indicates that a higher temperature database is required for TEGA interpretation. The overall goal of this work is to develop a differential scanning calorimeter (DSC) database of minerals along with corresponding MS data of evolved gases that can be used to interpret TEGA data during and after mission operations. While SDTA and DSC measurement techniques are slightly different (SDTA does not use a reference pan), the results are fundamentally similar and thus DSC is a useful technique in providing comparative data for the TEGA

  7. Identification of Novel Perfluoroalkyl Ether Carboxylic Acids (PFECAs) and Sulfonic Acids (PFESAs) in Natural Waters Using Accurate Mass Time-of-Flight Mass Spectrometry (TOFMS).

    PubMed

    Strynar, Mark; Dagnino, Sonia; McMahen, Rebecca; Liang, Shuang; Lindstrom, Andrew; Andersen, Erik; McMillan, Larry; Thurman, Michael; Ferrer, Imma; Ball, Carol

    2015-10-06

    Recent scientific scrutiny and concerns over exposure, toxicity, and risk have led to international regulatory efforts resulting in the reduction or elimination of certain perfluorinated compounds from various products and waste streams. Some manufacturers have started producing shorter chain per- and polyfluorinated compounds to try to reduce the potential for bioaccumulation in humans and wildlife. Some of these new compounds contain central ether oxygens or other minor modifications of traditional perfluorinated structures. At present, there has been very limited information published on these "replacement chemistries" in the peer-reviewed literature. In this study we used a time-of-flight mass spectrometry detector (LC-ESI-TOFMS) to identify fluorinated compounds in natural waters collected from locations with historical perfluorinated compound contamination. Our workflow for discovery of chemicals included sequential sampling of surface water for identification of potential sources, nontargeted TOFMS analysis, molecular feature extraction (MFE) of samples, and evaluation of features unique to the sample with source inputs. Specifically, compounds were tentatively identified by (1) accurate mass determination of parent and/or related adducts and fragments from in-source collision-induced dissociation (CID), (2) in-depth evaluation of in-source adducts formed during analysis, and (3) confirmation with authentic standards when available. We observed groups of compounds in homologous series that differed by multiples of CF2 (m/z 49.9968) or CF2O (m/z 65.9917). Compounds in each series were chromatographically separated and had comparable fragments and adducts produced during analysis. We detected 12 novel perfluoroalkyl ether carboxylic and sulfonic acids in surface water in North Carolina, USA using this approach. A key piece of evidence was the discovery of accurate mass in-source n-mer formation (H(+) and Na(+)) differing by m/z 21.9819, corresponding to the
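
    The CF2 and CF2O spacings quoted in the abstract above follow directly from monoisotopic atomic masses, and a homologous-series check reduces to testing whether two m/z values differ by a whole number of such units. The sketch below assumes standard monoisotopic masses; the function name, the 0.002 Da tolerance, and the example m/z values are hypothetical and are not taken from the study.

```python
# Monoisotopic atomic masses (u); standard reference values.
MASS = {"C": 12.0, "F": 18.998403163, "O": 15.99491462}

CF2 = MASS["C"] + 2 * MASS["F"]   # repeat unit of a perfluoroalkyl chain
CF2O = CF2 + MASS["O"]            # repeat unit of a perfluoroether chain

def series_step(mz_a, mz_b, unit, tol=0.002, max_n=10):
    """Return n if the two m/z values differ by n * unit within tol (Da).

    Returns None when no multiple up to max_n fits. tol and max_n are
    illustrative assumptions.
    """
    diff = abs(mz_a - mz_b)
    for n in range(1, max_n + 1):
        if abs(diff - n * unit) <= tol:
            return n
    return None

print(f"CF2  = {CF2:.4f}")
print(f"CF2O = {CF2O:.4f}")
# Two made-up ions that differ by two CF2 units.
print(series_step(200.0000, 200.0000 + 2 * CF2, CF2))
```

    In practice the tolerance would be tied to the instrument's mass accuracy, and chromatographic separation of the series members would be checked alongside the mass spacing.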

  8. PSI/TM-Coffee: a web server for fast and accurate multiple sequence alignments of regular and transmembrane proteins using homology extension on reduced databases.

    PubMed

    Floden, Evan W; Tommaso, Paolo D; Chatzou, Maria; Magis, Cedrik; Notredame, Cedric; Chang, Jia-Ming

    2016-07-08

    The PSI/TM-Coffee web server performs multiple sequence alignment (MSA) of proteins by combining homology extension with a consistency based alignment approach. Homology extension is performed with Position Specific Iterative (PSI) BLAST searches against a choice of redundant and non-redundant databases. The main novelty of this server is to allow databases of reduced complexity to rapidly perform homology extension. This server also gives the possibility to use transmembrane proteins (TMPs) reference databases to allow even faster homology extension on this important category of proteins. Aside from an MSA, the server also outputs topological prediction of TMPs using the HMMTOP algorithm. Previous benchmarking of the method has shown this approach outperforms the most accurate alignment methods such as MSAProbs, Kalign, PROMALS, MAFFT, ProbCons and PRALINE™. The web server is available at http://tcoffee.crg.cat/tmcoffee. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.

  9. An accurate and adaptable photogrammetric approach for estimating the mass and body condition of pinnipeds using an unmanned aerial system

    PubMed Central

    Hinke, Jefferson T.; Perryman, Wayne L.; Goebel, Michael E.; LeRoi, Donald J.

    2017-01-01

    Measurements of body size and mass are fundamental to pinniped population management and research. Manual measurements tend to be accurate but are invasive and logistically challenging to obtain. Ground-based photogrammetric techniques are less invasive, but inherent limitations make them impractical for many field applications. The recent proliferation of unmanned aerial systems (UAS) in wildlife monitoring has provided a promising new platform for the photogrammetry of free-ranging pinnipeds. Leopard seals (Hydrurga leptonyx) are an apex predator in coastal Antarctica whose body condition could be a valuable indicator of ecosystem health. We aerially surveyed leopard seals of known body size and mass to test the precision and accuracy of photogrammetry from a small UAS. Flights were conducted in January and February of 2013 and 2014 and 50 photogrammetric samples were obtained from 15 unrestrained seals. UAS-derived measurements of standard length were accurate to within 2.01 ± 1.06%, and paired comparisons with ground measurements were statistically indistinguishable. An allometric linear mixed effects model predicted leopard seal mass within 19.40 kg (4.4% error for a 440 kg seal). Photogrammetric measurements from a single, vertical image obtained using UAS provide a noninvasive approach for estimating the mass and body condition of pinnipeds that may be widely applicable. PMID:29186134

  10. An accurate and adaptable photogrammetric approach for estimating the mass and body condition of pinnipeds using an unmanned aerial system.

    PubMed

    Krause, Douglas J; Hinke, Jefferson T; Perryman, Wayne L; Goebel, Michael E; LeRoi, Donald J

    2017-01-01

    Measurements of body size and mass are fundamental to pinniped population management and research. Manual measurements tend to be accurate but are invasive and logistically challenging to obtain. Ground-based photogrammetric techniques are less invasive, but inherent limitations make them impractical for many field applications. The recent proliferation of unmanned aerial systems (UAS) in wildlife monitoring has provided a promising new platform for the photogrammetry of free-ranging pinnipeds. Leopard seals (Hydrurga leptonyx) are an apex predator in coastal Antarctica whose body condition could be a valuable indicator of ecosystem health. We aerially surveyed leopard seals of known body size and mass to test the precision and accuracy of photogrammetry from a small UAS. Flights were conducted in January and February of 2013 and 2014 and 50 photogrammetric samples were obtained from 15 unrestrained seals. UAS-derived measurements of standard length were accurate to within 2.01 ± 1.06%, and paired comparisons with ground measurements were statistically indistinguishable. An allometric linear mixed effects model predicted leopard seal mass within 19.40 kg (4.4% error for a 440 kg seal). Photogrammetric measurements from a single, vertical image obtained using UAS provide a noninvasive approach for estimating the mass and body condition of pinnipeds that may be widely applicable.

  11. Accurate donor electron wave functions from a multivalley effective mass theory.

    NASA Astrophysics Data System (ADS)

    Pendo, Luke; Hu, Xuedong

    Multivalley effective mass (MEM) theories combine physical intuition with a marginal need for computational resources, but they tend to be insensitive to variations in the wavefunction. However, recent papers suggest full Bloch functions and suitable central cell donor potential corrections are essential to replicating qualitative and quantitative features of the wavefunction. In this talk, we consider a variational MEM method that can accurately predict both spectrum and wavefunction of isolated phosphorus donors. As per Gamble et al., we employ a truncated series representation of the Bloch function with a tetrahedrally symmetric central cell correction. We use a dynamic dielectric constant, a feature commonly seen in tight-binding methods. Uniquely, we use a freely extensible basis of either all Slater- or all Gaussian-type functions. With a large basis able to capture the influence of higher energy eigenstates, this method is well positioned to consider the influence of external perturbations, such as electric field or applied strain, on the charge density. This work is supported by the US Army Research Office (W911NF1210609).

  12. Microbalance accurately measures extremely small masses

    NASA Technical Reports Server (NTRS)

    Patashnick, H.

    1970-01-01

    Oscillating fiber microbalance has a vibrating quartz fiber as balance arm to hold the mass to be weighed. Increasing fiber weight decreases its resonant frequency. Scaler and timer measure magnitude of the shift. This instrument withstands considerable physical abuse and has calibration stability at normal room temperatures.

  13. Fourier Transform Mass Spectrometry and Nuclear Magnetic Resonance Analysis for the Rapid and Accurate Characterization of Hexacosanoylceramide.

    PubMed

    Ross, Charles W; Simonsick, William J; Bogusky, Michael J; Celikay, Recep W; Guare, James P; Newton, Randall C

    2016-06-28

    Ceramides are a central unit of all sphingolipids which have been identified as sites of biological recognition on cellular membranes mediating cell growth and differentiation. Several glycosphingolipids have been isolated, displaying immunomodulatory and anti-tumor activities. These molecules have generated considerable interest as potential vaccine adjuvants in humans. Accurate analyses of these and related sphingosine analogues are important for the characterization of structure, biological function, and metabolism. We report the complementary use of direct laser desorption ionization (DLDI), sheath flow electrospray ionization (ESI) Fourier transform ion cyclotron resonance mass spectrometry (FTICR MS) and high-field nuclear magnetic resonance (NMR) analysis for the rapid, accurate identification of hexacosanoylceramide and starting materials. DLDI does not require stringent sample preparation and yields representative ions. Sheath-flow ESI yields ions of the product and byproducts and was significantly better than monospray ESI due to improved compound solubility. Negative ion sheath flow ESI provided data of starting materials and products all in one acquisition as hexacosanoic acid does not ionize efficiently when ceramides are present. NMR provided characterization of these lipid molecules complementing the results obtained from MS analyses. NMR data was able to differentiate straight chain versus branched chain alkyl groups not easily obtained from mass spectrometry.

  14. Accurate inclusion mass screening: a bridge from unbiased discovery to targeted assay development for biomarker verification.

    PubMed

    Jaffe, Jacob D; Keshishian, Hasmik; Chang, Betty; Addona, Theresa A; Gillette, Michael A; Carr, Steven A

    2008-10-01

    Verification of candidate biomarker proteins in blood is typically done using multiple reaction monitoring (MRM) of peptides by LC-MS/MS on triple quadrupole MS systems. MRM assay development for each protein requires significant time and cost, much of which is likely to be of little value if the candidate biomarker is below the detection limit in blood or a false positive in the original discovery data. Here we present a new technology, accurate inclusion mass screening (AIMS), designed to provide a bridge from unbiased discovery to MS-based targeted assay development. Masses on the software inclusion list are monitored in each scan on the Orbitrap MS system, and MS/MS spectra for sequence confirmation are acquired only when a peptide from the list is detected with both the correct accurate mass and charge state. The AIMS experiment confirms that a given peptide (and thus the protein from which it is derived) is present in the plasma. Throughput of the method is sufficient to qualify up to a hundred proteins/week. The sensitivity of AIMS is similar to MRM on a triple quadrupole MS system using optimized sample preparation methods (low tens of ng/ml in plasma), and MS/MS data from the AIMS experiments on the Orbitrap can be directly used to configure MRM assays. The method was shown to be at least 4-fold more efficient at detecting peptides of interest than undirected LC-MS/MS experiments using the same instrumentation, and relative quantitation information can be obtained by AIMS in case versus control experiments. Detection by AIMS ensures that a quantitative MRM-based assay can be configured for that protein. The method has the potential to qualify a large number of biomarker candidates based on their detection in plasma prior to committing to the time- and resource-intensive steps of establishing a quantitative assay.
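
    The triggering rule described above (acquire MS/MS only when an observed precursor matches an inclusion-list entry in both accurate mass and charge state) can be sketched in a few lines. This is a simplified model, not the instrument's acquisition software: the function names, the 10 ppm tolerance, and the list entries are assumptions made for illustration.

```python
def ppm_error(observed_mz, expected_mz):
    """Signed mass error in parts per million."""
    return (observed_mz - expected_mz) / expected_mz * 1e6

def should_trigger_msms(obs_mz, obs_charge, inclusion_list, tol_ppm=10.0):
    """True when the observed ion matches any (m/z, charge) target.

    tol_ppm is a hypothetical tolerance; the abstract does not state one.
    """
    for target_mz, target_charge in inclusion_list:
        if obs_charge != target_charge:
            continue  # right mass but wrong charge state is not a match
        if abs(ppm_error(obs_mz, target_mz)) <= tol_ppm:
            return True
    return False

# Made-up inclusion list of (m/z, charge) pairs.
targets = [(500.2500, 2), (733.3800, 3)]
print(should_trigger_msms(500.2520, 2, targets))  # within tolerance, same charge
print(should_trigger_msms(500.2520, 3, targets))  # charge state differs
```

    Requiring the charge state as well as the mass is what keeps co-eluting ions of similar m/z from consuming MS/MS scans.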

  15. Accurate calibration of a molecular beam time-of-flight mass spectrometer for on-line analysis of high molecular weight species.

    PubMed

    Apicella, B; Wang, X; Passaro, M; Ciajolo, A; Russo, C

    2016-10-15

    Time-of-Flight (TOF) Mass Spectrometry is a powerful analytical technique, provided that an accurate calibration by standard molecules in the same m/z range of the analytes is performed. Calibration in a very large m/z range is a difficult task, particularly in studies focusing on the detection of high molecular weight clusters of different molecules or high molecular weight species. External calibration is the most common procedure used for TOF mass spectrometric analysis in the gas phase and, generally, the only available standards are made up of mixtures of noble gases, covering a small mass range for calibration, up to m/z 136 (higher mass isotope of xenon). In this work, an accurate calibration of a Molecular Beam Time-of-Flight Mass Spectrometer (MB-TOFMS) is presented, based on the use of water clusters up to m/z 3000. The advantages of calibrating a MB-TOFMS with water clusters for the detection of analytes with masses above those of the traditional calibrants such as noble gases were quantitatively shown by statistical calculations. A comparison of the water cluster and noble gases calibration procedures in attributing the masses to a test mixture extending up to m/z 800 is also reported. In the case of the analysis of combustion products, another important feature of water cluster calibration was shown, that is the possibility of using them as "internal standard" directly formed from the combustion water, under suitable experimental conditions. The water clusters calibration of a MB-TOFMS gives rise to a ten-fold reduction in error compared to the traditional calibration with noble gases. The consequent improvement in mass accuracy in the calibration of a MB-TOFMS has important implications in various fields where detection of high molecular mass species is required. In combustion products analysis, it is also possible to obtain a new calibration spectrum before the acquisition of each spectrum, only modifying some operative conditions.
Copyright © 2016
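
    To see why calibrants that span the analyte range matter, it helps to write the calibration down. The sketch below assumes the idealized textbook TOF relation t = t0 + k*sqrt(m/z) and fits t0 and k by ordinary least squares; this is not necessarily the calibration function the authors used, and the calibrant m/z values and flight times are synthetic.

```python
import math

def fit_tof_calibration(points):
    """Least-squares fit of t = t0 + k * sqrt(m/z).

    points: iterable of (mz, flight_time) calibrant pairs.
    Returns (t0, k).
    """
    xs = [math.sqrt(mz) for mz, _ in points]
    ts = [t for _, t in points]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_t = sum(ts) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxt = sum((x - mean_x) * (t - mean_t) for x, t in zip(xs, ts))
    k = sxt / sxx
    t0 = mean_t - k * mean_x
    return t0, k

def time_to_mz(t, t0, k):
    """Invert the calibration to assign m/z to a measured flight time."""
    return ((t - t0) / k) ** 2

# Synthetic calibrants generated from t0 = 0.5, k = 2.0 (arbitrary units).
calibrants = [(mz, 0.5 + 2.0 * math.sqrt(mz)) for mz in (19.0, 37.0, 55.0, 73.0)]
t0, k = fit_tof_calibration(calibrants)
print(t0, k)
```

    With noise-free synthetic data the fit extrapolates perfectly; with real spectra, any error in t0 or k grows with distance from the calibrated range, which is the motivation for calibrants that reach high m/z.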

  16. SPARC: MASS MODELS FOR 175 DISK GALAXIES WITH SPITZER PHOTOMETRY AND ACCURATE ROTATION CURVES

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Lelli, Federico; McGaugh, Stacy S.; Schombert, James M., E-mail: federico.lelli@case.edu

    2016-12-01

    We introduce SPARC (Spitzer Photometry and Accurate Rotation Curves): a sample of 175 nearby galaxies with new surface photometry at 3.6 μm and high-quality rotation curves from previous H I/Hα studies. SPARC spans a broad range of morphologies (S0 to Irr), luminosities (∼5 dex), and surface brightnesses (∼4 dex). We derive [3.6] surface photometry and study structural relations of stellar and gas disks. We find that both the stellar mass–H I mass relation and the stellar radius–H I radius relation have significant intrinsic scatter, while the H I mass–radius relation is extremely tight. We build detailed mass models and quantify the ratio of baryonic to observed velocity (V_bar/V_obs) for different characteristic radii and values of the stellar mass-to-light ratio (ϒ_⋆) at [3.6]. Assuming ϒ_⋆ ≃ 0.5 M_⊙/L_⊙ (as suggested by stellar population models), we find that (i) the gas fraction linearly correlates with total luminosity; (ii) the transition from star-dominated to gas-dominated galaxies roughly corresponds to the transition from spiral galaxies to dwarf irregulars, in line with density wave theory; and (iii) V_bar/V_obs varies with luminosity and surface brightness: high-mass, high-surface-brightness galaxies are nearly maximal, while low-mass, low-surface-brightness galaxies are submaximal. These basic properties are lost for low values of ϒ_⋆ ≃ 0.2 M_⊙/L_⊙ as suggested by the DiskMass survey. The mean maximum-disk limit in bright galaxies is ϒ_⋆ ≃ 0.7 M_⊙/L_⊙ at [3.6]. The SPARC data are publicly available and represent an ideal test bed for models of galaxy formation.

  17. Comparison of peak-picking workflows for untargeted liquid chromatography/high-resolution mass spectrometry metabolomics data analysis.

    PubMed

    Rafiei, Atefeh; Sleno, Lekha

    2015-01-15

    Data analysis is a key step in mass spectrometry based untargeted metabolomics, starting with the generation of generic peak lists from raw liquid chromatography/mass spectrometry (LC/MS) data. Due to the use of various algorithms by different workflows, the results of different peak-picking strategies often differ widely. Raw LC/HRMS data from two types of biological samples (bile and urine), as well as a standard mixture of 84 metabolites, were processed with four peak-picking software packages: Peakview®, Markerview™, MetabolitePilot™ and XCMS Online. The overlaps between the results of each peak-generating method were then investigated. To gauge the relevance of peak lists, a database search using the METLIN online database was performed to determine which features had accurate masses matching known metabolites as well as a secondary filtering based on MS/MS spectral matching. In this study, only a small proportion of all peaks (less than 10%) were common to all four software programs. Comparison of database searching results showed peaks found uniquely by one workflow have less chance of being found in the METLIN metabolomics database and are even less likely to be confirmed by MS/MS. It was shown that the performance of peak-generating workflows has a direct impact on untargeted metabolomics results. As it was demonstrated that the peaks found in more than one peak detection workflow have higher potential to be identified by accurate mass as well as MS/MS spectrum matching, it is suggested to use the overlap of different peak-picking workflows as preliminary peak lists for more rugged statistical analysis in global metabolomics investigations. Copyright © 2014 John Wiley & Sons, Ltd.
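
    Taking the overlap of several peak lists, as the abstract above recommends, requires a rule for deciding when two features are the same peak. A minimal sketch follows, assuming each feature is an (m/z, retention time) pair and that two features match when they agree within a ppm window and a retention-time window. The tolerances, function names, and peak lists are hypothetical; the study's own matching criteria are not given in the abstract.

```python
def same_feature(a, b, ppm=10.0, rt_tol=0.2):
    """True when two (mz, rt) features agree in mass and retention time."""
    mz_ok = abs(a[0] - b[0]) / b[0] * 1e6 <= ppm
    rt_ok = abs(a[1] - b[1]) <= rt_tol
    return mz_ok and rt_ok

def common_features(reference, others, ppm=10.0, rt_tol=0.2):
    """Features of `reference` that have a match in every other list."""
    return [
        f for f in reference
        if all(any(same_feature(f, g, ppm, rt_tol) for g in lst) for lst in others)
    ]

# Made-up peak lists from three workflows: (m/z, retention time in min).
list_a = [(180.0634, 1.20), (255.2330, 5.40), (300.1000, 3.00)]
list_b = [(180.0636, 1.25), (300.1100, 3.00)]
list_c = [(180.0633, 1.15), (255.2331, 5.38)]

print(common_features(list_a, [list_b, list_c]))
```

    Only the feature near m/z 180.063 survives here: the m/z 255.233 peak is missing from one list, and the m/z 300.1 peaks differ by far more than 10 ppm.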

  18. ACCURATE LOW-MASS STELLAR MODELS OF KOI-126

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Feiden, Gregory A.; Chaboyer, Brian; Dotter, Aaron, E-mail: gregory.a.feiden@dartmouth.edu

    2011-10-10

    The recent discovery of an eclipsing hierarchical triple system with two low-mass stars in a close orbit (KOI-126) by Carter et al. appeared to reinforce the evidence that theoretical stellar evolution models are not able to reproduce the observational mass-radius relation for low-mass stars. We present a set of stellar models for the three stars in the KOI-126 system that show excellent agreement with the observed radii. This agreement appears to be due to the equation of state implemented by our code. A significant dispersion in the observed mass-radius relation for fully convective stars is demonstrated; indicative of the influence of physics currently not incorporated in standard stellar evolution models. We also predict apsidal motion constants for the two M dwarf companions. These values should be observationally determined to within 1% by the end of the Kepler mission.

  19. Non-targeted analysis of electronics waste by comprehensive two-dimensional gas chromatography combined with high-resolution mass spectrometry: Using accurate mass information and mass defect analysis to explore the data.

    PubMed

    Ubukata, Masaaki; Jobst, Karl J; Reiner, Eric J; Reichenbach, Stephen E; Tao, Qingping; Hang, Jiliang; Wu, Zhanpin; Dane, A John; Cody, Robert B

    2015-05-22

    Comprehensive two-dimensional gas chromatography (GC×GC) and high-resolution mass spectrometry (HRMS) offer the best possible separation of their respective techniques. Recent commercialization of combined GC×GC-HRMS systems offers new possibilities for the analysis of complex mixtures. However, such experiments yield enormous data sets that require new informatics tools to facilitate the interpretation of the rich information content. This study reports on the analysis of dust obtained from an electronics recycling facility by using GC×GC in combination with a new high-resolution time-of-flight (TOF) mass spectrometer. New software tools for (non-traditional) Kendrick mass defect analysis were developed in this research and greatly aided in the identification of compounds containing chlorine and bromine, elements that feature in most persistent organic pollutants (POPs). In essence, the mass defect plot serves as a visual aid from which halogenated compounds are recognizable on the basis of their mass defect and isotope patterns. Mass chromatograms were generated based on specific ions identified in the plots as well as region of the plot predominantly occupied by halogenated contaminants. Tentative identification was aided by database searches, complementary electron-capture negative ionization experiments and elemental composition determinations from the exact mass data. These included known and emerging flame retardants, such as polybrominated diphenyl ethers (PBDEs), hexabromobenzene, tetrabromo bisphenol A and tris (1-chloro-2-propyl) phosphate (TCPP), as well as other legacy contaminants such as polychlorinated biphenyls (PCBs) and polychlorinated terphenyls (PCTs). Copyright © 2015 Elsevier B.V. All rights reserved.
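
    The idea behind a Kendrick mass defect plot is that rescaling the mass axis so a chosen repeat unit has an integer mass makes all members of a homologous series share one mass defect. The sketch below uses the classical CH2 base unit as its default; the abstract does not say which base units the authors' "non-traditional" variant uses, so the default, the function name, and the example m/z value are assumptions. Sign and rounding conventions for the defect also vary between tools.

```python
# Monoisotopic masses (u) of carbon and hydrogen; standard reference values.
C, H = 12.0, 1.00782503207

def kendrick_mass_defect(mz, base_exact=C + 2 * H, base_nominal=14):
    """Kendrick mass defect of mz for a chosen repeat unit (default CH2).

    Uses the convention defect = nominal Kendrick mass - Kendrick mass.
    """
    kendrick_mass = mz * base_nominal / base_exact
    return round(kendrick_mass) - kendrick_mass

# Hypothetical ion and its homolog one CH2 unit heavier.
ion = 200.1000
homolog = ion + (C + 2 * H)

print(f"{kendrick_mass_defect(ion):.5f}")
print(f"{kendrick_mass_defect(homolog):.5f}")
```

    Passing a different `base_exact` and `base_nominal` pair adapts the same function to other repeat units, so that series differing by that unit line up horizontally in the plot.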

  20. MINEs: open access databases of computationally predicted enzyme promiscuity products for untargeted metabolomics.

    PubMed

    Jeffryes, James G; Colastani, Ricardo L; Elbadawi-Sidhu, Mona; Kind, Tobias; Niehaus, Thomas D; Broadbelt, Linda J; Hanson, Andrew D; Fiehn, Oliver; Tyo, Keith E J; Henry, Christopher S

    2015-01-01

    In spite of its great promise, metabolomics has proven difficult to execute in an untargeted and generalizable manner. Liquid chromatography-mass spectrometry (LC-MS) has made it possible to gather data on thousands of cellular metabolites. However, matching metabolites to their spectral features continues to be a bottleneck, meaning that much of the collected information remains uninterpreted and that new metabolites are seldom discovered in untargeted studies. These challenges require new approaches that consider compounds beyond those available in curated biochemistry databases. Here we present Metabolic In silico Network Expansions (MINEs), an extension of known metabolite databases to include molecules that have not been observed, but are likely to occur based on known metabolites and common biochemical reactions. We utilize an algorithm called the Biochemical Network Integrated Computational Explorer (BNICE) and expert-curated reaction rules based on the Enzyme Commission classification system to propose the novel chemical structures and reactions that comprise MINE databases. Starting from the Kyoto Encyclopedia of Genes and Genomes (KEGG) COMPOUND database, the MINE contains over 571,000 compounds, of which 93% are not present in the PubChem database. However, these MINE compounds have on average higher structural similarity to natural products than compounds from KEGG or PubChem. MINE databases were able to propose annotations for 98.6% of a set of 667 MassBank spectra, 14% more than KEGG alone and equivalent to PubChem while returning far fewer candidates per spectra than PubChem (46 vs. 1715 median candidates). Application of MINEs to LC-MS accurate mass data enabled the identity of an unknown peak to be confidently predicted. MINE databases are freely accessible for non-commercial use via user-friendly web-tools at http://minedatabase.mcs.anl.gov and developer-friendly APIs. 
MINEs improve metabolomics peak identification as compared to general chemical

  1. MINEs: Open access databases of computationally predicted enzyme promiscuity products for untargeted metabolomics

    DOE PAGES

    Jeffryes, James G.; Colastani, Ricardo L.; Elbadawi-Sidhu, Mona; ...

    2015-08-28

    Metabolomics has proven difficult to execute in an untargeted and generalizable manner. Liquid chromatography–mass spectrometry (LC–MS) has made it possible to gather data on thousands of cellular metabolites. However, matching metabolites to their spectral features continues to be a bottleneck, meaning that much of the collected information remains uninterpreted and that new metabolites are seldom discovered in untargeted studies. These challenges require new approaches that consider compounds beyond those available in curated biochemistry databases. Here we present Metabolic In silico Network Expansions (MINEs), an extension of known metabolite databases to include molecules that have not been observed, but are likely to occur based on known metabolites and common biochemical reactions. We utilize an algorithm called the Biochemical Network Integrated Computational Explorer (BNICE) and expert-curated reaction rules based on the Enzyme Commission classification system to propose the novel chemical structures and reactions that comprise MINE databases. Starting from the Kyoto Encyclopedia of Genes and Genomes (KEGG) COMPOUND database, the MINE contains over 571,000 compounds, of which 93% are not present in the PubChem database. However, these MINE compounds have on average higher structural similarity to natural products than compounds from KEGG or PubChem. MINE databases were able to propose annotations for 98.6% of a set of 667 MassBank spectra, 14% more than KEGG alone and equivalent to PubChem while returning far fewer candidates per spectra than PubChem (46 vs. 1715 median candidates). Application of MINEs to LC–MS accurate mass data enabled the identity of an unknown peak to be confidently predicted. MINE databases are freely accessible for non-commercial use via user-friendly web-tools at http://minedatabase.mcs.anl.gov and developer-friendly APIs. MINEs improve metabolomics peak identification as compared to general chemical databases whose

  2. 78 FR 60861 - Native American Tribal Insignia Database

    Federal Register 2010, 2011, 2012, 2013, 2014

    2013-10-02

    ... Database ACTION: Proposed collection; comment request. SUMMARY: The United States Patent and Trademark... the report was that the USPTO create and maintain an accurate and comprehensive database containing... this recommendation, the Senate Committee on Appropriations directed the USPTO to create this database...

  3. Curation accuracy of model organism databases

    PubMed Central

    Keseler, Ingrid M.; Skrzypek, Marek; Weerasinghe, Deepika; Chen, Albert Y.; Fulcher, Carol; Li, Gene-Wei; Lemmer, Kimberly C.; Mladinich, Katherine M.; Chow, Edmond D.; Sherlock, Gavin; Karp, Peter D.

    2014-01-01

    Manual extraction of information from the biomedical literature, or biocuration, is the central methodology used to construct many biological databases. For example, the UniProt protein database, the EcoCyc Escherichia coli database and the Candida Genome Database (CGD) are all based on biocuration. Biological databases are used extensively by life science researchers, as online encyclopedias, as aids in the interpretation of new experimental data and as gold standards for the development of new bioinformatics algorithms. Although manual curation has been assumed to be highly accurate, we are aware of only one previous study of biocuration accuracy. We assessed the accuracy of EcoCyc and CGD by manually selecting curated assertions within randomly chosen EcoCyc and CGD gene pages and by then validating that the data found in the referenced publications supported those assertions. A database assertion is considered to be in error if that assertion could not be found in the publication cited for that assertion. We identified 10 errors in the 633 facts that we validated across the two databases, for an overall error rate of 1.58%, and individual error rates of 1.82% for CGD and 1.40% for EcoCyc. These data suggest that manual curation of the experimental literature by Ph.D.-level scientists is highly accurate. Database URL: http://ecocyc.org/, http://www.candidagenome.org// PMID:24923819
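
    As a check on the arithmetic in this record, the overall error rate can be recomputed from the reported counts (10 errors in 633 validated facts). The Wilson score interval in the sketch below is an addition for illustration only; the abstract does not report a confidence interval, and the function names are hypothetical.

```python
import math

def error_rate(errors, total):
    """Fraction of validated assertions found to be in error."""
    return errors / total

def wilson_interval(errors, total, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 is roughly 95%).

    Not part of the cited study; shown only to indicate the uncertainty
    attached to a rate estimated from a few hundred samples.
    """
    p = errors / total
    denom = 1.0 + z * z / total
    center = (p + z * z / (2.0 * total)) / denom
    half = z * math.sqrt(p * (1.0 - p) / total + z * z / (4.0 * total * total)) / denom
    return center - half, center + half

# Counts taken from the abstract: 10 errors among 633 validated facts
rate = error_rate(10, 633)
low, high = wilson_interval(10, 633)
print(f"overall error rate: {100 * rate:.2f}%")
print(f"approx. 95% interval: {100 * low:.2f}% to {100 * high:.2f}%")
```

    The per-database counts are not given in the abstract, so the 1.82% and 1.40% figures cannot be recomputed here.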

  4. Short communication: Evaluation of MALDI-TOF mass spectrometry and a custom reference spectra expanded database for the identification of bovine-associated coagulase-negative staphylococci.

    PubMed

    Cameron, M; Perry, J; Middleton, J R; Chaffer, M; Lewis, J; Keefe, G P

    2018-01-01

    This study evaluated MALDI-TOF mass spectrometry and a custom reference spectra expanded database for the identification of bovine-associated coagulase-negative staphylococci (CNS). A total of 861 CNS isolates were used in the study, covering 21 different CNS species. The majority of the isolates were previously identified by rpoB gene sequencing (n = 804) and the remainder were identified by sequencing of hsp60 (n = 56) and tuf (n = 1). The genotypic identification was considered the gold standard identification. Using a direct transfer protocol and the existing commercial database, MALDI-TOF mass spectrometry showed a typeability of 96.5% (831/861) and an accuracy of 99.2% (824/831). Using a custom reference spectra expanded database, which included an additional 13 in-house created reference spectra, isolates were identified by MALDI-TOF mass spectrometry with 99.2% (854/861) typeability and 99.4% (849/854) accuracy. Overall, MALDI-TOF mass spectrometry using the direct transfer method was shown to be a highly reliable tool for the identification of bovine-associated CNS. Copyright © 2018 American Dairy Science Association. Published by Elsevier Inc. All rights reserved.
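
    The typeability and accuracy figures in this record follow directly from the reported counts: typeability is the share of all isolates that received an identification, and accuracy is the share of those identifications that agreed with the genotypic gold standard. A minimal sketch (the helper name `percent` is hypothetical):

```python
def percent(numerator, denominator):
    """Percentage rounded to one decimal place, as reported in the abstract."""
    return round(100.0 * numerator / denominator, 1)

TOTAL_ISOLATES = 861

# Existing commercial database: 831 isolates typed, 824 of them correct
commercial_typeability = percent(831, TOTAL_ISOLATES)
commercial_accuracy = percent(824, 831)

# Custom expanded database: 854 isolates typed, 849 of them correct
expanded_typeability = percent(854, TOTAL_ISOLATES)
expanded_accuracy = percent(849, 854)

# Isolates that became typeable only with the expanded database
newly_typed = 854 - 831

print(commercial_typeability, commercial_accuracy)
print(expanded_typeability, expanded_accuracy, newly_typed)
```

    Note that accuracy is computed over typed isolates only, so the two databases are compared on different denominators (831 vs. 854).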

  5. Modeling the High Speed Research Cycle 2B Longitudinal Aerodynamic Database Using Multivariate Orthogonal Functions

    NASA Technical Reports Server (NTRS)

    Morelli, E. A.; Proffitt, M. S.

    1999-01-01

    The data for longitudinal non-dimensional, aerodynamic coefficients in the High Speed Research Cycle 2B aerodynamic database were modeled using polynomial expressions identified with an orthogonal function modeling technique. The discrepancy between the tabular aerodynamic data and the polynomial models was tested and shown to be less than 15 percent for drag, lift, and pitching moment coefficients over the entire flight envelope. Most of this discrepancy was traced to smoothing local measurement noise and to the omission of mass case 5 data in the modeling process. A simulation check case showed that the polynomial models provided a compact and accurate representation of the nonlinear aerodynamic dependencies contained in the HSR Cycle 2B tabular aerodynamic database.
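
    The general idea behind orthogonal function modeling can be sketched as follows: candidate polynomial regressors are orthogonalized (here with Gram-Schmidt), after which each model coefficient can be estimated independently of the others. This is a one-variable toy example under stated assumptions, not the authors' multivariate procedure; the data, the variable name `alpha`, and the coefficients are all made up for illustration.

```python
def dot(u, v):
    """Inner product of two equal-length lists."""
    return sum(a * b for a, b in zip(u, v))

def orthogonalize(columns):
    """Gram-Schmidt: convert regressor columns into mutually orthogonal ones."""
    ortho = []
    for col in columns:
        p = list(col)
        for q in ortho:
            g = dot(p, q) / dot(q, q)
            p = [pi - g * qi for pi, qi in zip(p, q)]
        ortho.append(p)
    return ortho

def fit(ortho, y):
    """With orthogonal regressors, each coefficient decouples from the rest."""
    return [dot(p, y) / dot(p, p) for p in ortho]

# Hypothetical tabulated data: a coefficient sampled against one variable
alpha = [-2.0, -1.0, 0.0, 1.0, 2.0, 3.0]
table = [0.1 + 0.5 * a - 0.02 * a * a for a in alpha]

# Candidate regressors: 1, alpha, alpha^2
columns = [[1.0] * len(alpha), alpha, [a * a for a in alpha]]
ortho = orthogonalize(columns)
coeffs = fit(ortho, table)

# Evaluate the model at the table points and measure the worst discrepancy
model = [sum(c * p[i] for c, p in zip(coeffs, ortho)) for i in range(len(alpha))]
max_err = max(abs(m - t) for m, t in zip(model, table))
print(max_err)
```

    Because the regressors are orthogonal, the contribution of each term to the fit can be judged separately, which is what makes this kind of technique convenient for selecting which polynomial terms to retain.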

  6. Database Search Engines: Paradigms, Challenges and Solutions.

    PubMed

    Verheggen, Kenneth; Martens, Lennart; Berven, Frode S; Barsnes, Harald; Vaudel, Marc

    2016-01-01

    The first step in identifying proteins from mass spectrometry based shotgun proteomics data is to infer peptides from tandem mass spectra, a task generally achieved using database search engines. In this chapter, the basic principles of database search engines are introduced with a focus on open source software, and the use of database search engines is demonstrated using the freely available SearchGUI interface. This chapter also discusses how to tackle general issues related to sequence database searching and shows how to minimize their impact.

  7. Tandem mass spectrometry for the detection of plant pathogenic fungi and the effects of database composition on protein inferences.

    PubMed

    Padliya, Neerav D; Garrett, Wesley M; Campbell, Kimberly B; Tabb, David L; Cooper, Bret

    2007-11-01

    LC-MS/MS has demonstrated potential for detecting plant pathogens. Unlike PCR or ELISA, LC-MS/MS does not require pathogen-specific reagents for the detection of pathogen-specific proteins and peptides. However, the MS/MS approach we and others have explored does require a protein sequence reference database and database-search software to interpret tandem mass spectra. To evaluate the limitations of database composition on pathogen identification, we analyzed proteins from cultured Ustilago maydis, Phytophthora sojae, Fusarium graminearum, and Rhizoctonia solani by LC-MS/MS. When the search database did not contain sequences for a target pathogen, or contained sequences to related pathogens, target pathogen spectra were reliably matched to protein sequences from nontarget organisms, giving an illusion that proteins from nontarget organisms were identified. Our analysis demonstrates that when database-search software is used as part of the identification process, a paradox exists whereby additional sequences needed to detect a wide variety of possible organisms may lead to more cross-species protein matches and misidentification of pathogens.

  8. Rapid Classification and Identification of Multiple Microorganisms with Accurate Statistical Significance via High-Resolution Tandem Mass Spectrometry

    NASA Astrophysics Data System (ADS)

    Alves, Gelio; Wang, Guanghui; Ogurtsov, Aleksey Y.; Drake, Steven K.; Gucek, Marjan; Sacks, David B.; Yu, Yi-Kuo

    2018-06-01

    Rapid and accurate identification and classification of microorganisms is of paramount importance to public health and safety. With the advance of mass spectrometry (MS) technology, the speed of identification can be greatly improved. However, the increasing number of microbes sequenced is complicating correct microbial identification even in a simple sample due to the large number of candidates present. To properly untwine candidate microbes in samples containing one or more microbes, one needs to go beyond apparent morphology or simple "fingerprinting"; to correctly prioritize the candidate microbes, one needs to have accurate statistical significance in microbial identification. We meet these challenges by using peptide-centric representations of microbes to better separate them and by augmenting our earlier analysis method that yields accurate statistical significance. Here, we present an updated analysis workflow that uses tandem MS (MS/MS) spectra for microbial identification or classification. We have demonstrated, using 226 MS/MS publicly available data files (each containing from 2500 to nearly 100,000 MS/MS spectra) and 4000 additional MS/MS data files, that the updated workflow can correctly identify multiple microbes at the genus and often the species level for samples containing more than one microbe. We have also shown that the proposed workflow computes accurate statistical significances, i.e., E values for identified peptides and unified E values for identified microbes. Our updated analysis workflow MiCId, a freely available software for Microorganism Classification and Identification, is available for download at https://www.ncbi.nlm.nih.gov/CBBresearch/Yu/downloads.html.

  9. Rapid Classification and Identification of Multiple Microorganisms with Accurate Statistical Significance via High-Resolution Tandem Mass Spectrometry.

    PubMed

    Alves, Gelio; Wang, Guanghui; Ogurtsov, Aleksey Y; Drake, Steven K; Gucek, Marjan; Sacks, David B; Yu, Yi-Kuo

    2018-06-05

    Rapid and accurate identification and classification of microorganisms is of paramount importance to public health and safety. With the advance of mass spectrometry (MS) technology, the speed of identification can be greatly improved. However, the increasing number of microbes sequenced is complicating correct microbial identification even in a simple sample due to the large number of candidates present. To properly untwine candidate microbes in samples containing one or more microbes, one needs to go beyond apparent morphology or simple "fingerprinting"; to correctly prioritize the candidate microbes, one needs to have accurate statistical significance in microbial identification. We meet these challenges by using peptide-centric representations of microbes to better separate them and by augmenting our earlier analysis method that yields accurate statistical significance. Here, we present an updated analysis workflow that uses tandem MS (MS/MS) spectra for microbial identification or classification. We have demonstrated, using 226 MS/MS publicly available data files (each containing from 2500 to nearly 100,000 MS/MS spectra) and 4000 additional MS/MS data files, that the updated workflow can correctly identify multiple microbes at the genus and often the species level for samples containing more than one microbe. We have also shown that the proposed workflow computes accurate statistical significances, i.e., E values for identified peptides and unified E values for identified microbes. Our updated analysis workflow MiCId, a freely available software for Microorganism Classification and Identification, is available for download at https://www.ncbi.nlm.nih.gov/CBBresearch/Yu/downloads.html.

  10. Pivotal role of computers and software in mass spectrometry - SEQUEST and 20 years of tandem MS database searching.

    PubMed

    Yates, John R

    2015-11-01

    Advances in computer technology and software have driven developments in mass spectrometry over the last 50 years. Computers and software have been impactful in three areas: the automation of difficult calculations to aid interpretation, the collection of data and control of instruments, and data interpretation. As the power of computers has grown, so too has the utility and impact on mass spectrometers and their capabilities. This has been particularly evident in the use of tandem mass spectrometry data to search protein and nucleotide sequence databases to identify peptide and protein sequences. This capability has driven the development of many new approaches to study biological systems, including the use of "bottom-up shotgun proteomics" to directly analyze protein mixtures.

  11. Proteogenomic Analysis of Polymorphisms and Gene Annotation Divergences in Prokaryotes using a Clustered Mass Spectrometry-Friendly Database*

    PubMed Central

    de Souza, Gustavo A.; Arntzen, Magnus Ø.; Fortuin, Suereta; Schürch, Anita C.; Målen, Hiwa; McEvoy, Christopher R. E.; van Soolingen, Dick; Thiede, Bernd; Warren, Robin M.; Wiker, Harald G.

    2011-01-01

    Precise annotation of genes or open reading frames is still a difficult task that results in divergence even for data generated from the same genomic sequence. This has an impact in further proteomic studies, and also compromises the characterization of clinical isolates with many specific genetic variations that may not be represented in the selected database. We recently developed software called multistrain mass spectrometry prokaryotic database builder (MSMSpdbb) that can merge protein databases from several sources and be applied on any prokaryotic organism, in a proteomic-friendly approach. We generated a database for the Mycobacterium tuberculosis complex (using three strains of Mycobacterium bovis and five of M. tuberculosis), and analyzed data collected from two laboratory strains and two clinical isolates of M. tuberculosis. We identified 2561 proteins, of which 24 were present in M. tuberculosis H37Rv samples, but not annotated in the M. tuberculosis H37Rv genome. We were also able to identify 280 nonsynonymous single amino acid polymorphisms and confirm 367 translational start sites. As a proof of concept we applied the database to whole-genome DNA sequencing data of one of the clinical isolates, which allowed the validation of 116 predicted single amino acid polymorphisms and the annotation of 131 N-terminal start sites. Moreover we identified regions not present in the original M. tuberculosis H37Rv sequence, indicating strain divergence or errors in the reference sequence. In conclusion, we demonstrated the potential of using a merged database to better characterize laboratory or clinical bacterial strains. PMID:21030493

  12. Discovery of Neuropeptides in the Nematode Ascaris suum by Database Mining and Tandem Mass Spectrometry

    PubMed Central

    Jarecki, Jessica L.; Frey, Brian L.; Smith, Lloyd M.; Stretton, Antony O.

    2011-01-01

    Liquid chromatography coupled with tandem mass spectrometry (LC-MS/MS) was used to discover peptides in extracts of the large parasitic nematode Ascaris suum. This required the assembly of a new database of known and predicted peptides. In addition to those already sequenced, peptides were either previously predicted to be processed from precursor proteins identified in an A. suum library of expressed sequence tags (ESTs), or newly predicted from a library of A. suum genome survey sequences (GSSs). The predicted MS/MS fragmentation patterns of this collection of real and putative peptides were compared with the actual fragmentation patterns found in the MS/MS spectra of peptides fractionated by MS; this enabled individual peptides to be sequenced. Many previously identified peptides were found, and 21 novel peptides were discovered. Thus, this approach is very useful, despite the fact that the available GSS database is still preliminary, having only 1X coverage. PMID:21524146
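
    The comparison of predicted and observed fragmentation patterns described here can be illustrated with a minimal sketch that computes singly charged b- and y-ion m/z values for a candidate peptide and counts how many fall near observed peaks. This is a generic textbook calculation, not the authors' software; the function names, the 0.02 m/z tolerance, the example peptide, and the observed peak list are assumptions made for illustration, and only a few residues are included in the mass table.

```python
# Monoisotopic residue masses (Da) for a small subset of amino acids
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "L": 113.08406, "K": 128.09496, "F": 147.06841,
}
PROTON = 1.007276   # mass of a proton (Da)
WATER = 18.010565   # monoisotopic mass of H2O (Da)

def fragment_ions(peptide):
    """Singly charged b- and y-ion m/z values for every backbone cleavage.

    b ions are listed b1..b(n-1); y ions are listed y(n-1)..y1.
    """
    masses = [RESIDUE_MASS[aa] for aa in peptide]
    b_ions, y_ions = [], []
    for i in range(1, len(masses)):
        b_ions.append(sum(masses[:i]) + PROTON)
        y_ions.append(sum(masses[i:]) + WATER + PROTON)
    return b_ions, y_ions

def count_matches(theoretical, observed, tol=0.02):
    """Number of theoretical ions with an observed peak within tol (m/z units)."""
    return sum(any(abs(t - o) <= tol for o in observed) for t in theoretical)

# Hypothetical candidate peptide and hypothetical observed peak list
b_ions, y_ions = fragment_ions("GAS")
observed_peaks = [58.03, 106.05, 129.07, 250.00]
score = count_matches(b_ions + y_ions, observed_peaks)
print(b_ions, y_ions, score)
```

    Real search engines score matches statistically rather than by a raw count, but the core step of generating theoretical fragments for each database candidate is the same.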

  13. Capsule Typing of Haemophilus influenzae by Matrix-Assisted Laser Desorption/Ionization Time-of-Flight Mass Spectrometry.

    PubMed

    Månsson, Viktor; Gilsdorf, Janet R; Kahlmeter, Gunnar; Kilian, Mogens; Kroll, J Simon; Riesbeck, Kristian; Resman, Fredrik

    2018-03-01

    Encapsulated Haemophilus influenzae strains belong to type-specific genetic lineages. Reliable capsule typing requires PCR, but a more efficient method would be useful. We evaluated capsule typing by using matrix-assisted laser desorption/ionization time-of-flight (MALDI-TOF) mass spectrometry. Isolates of all capsule types (a-f and nontypeable; n = 258) and isogenic capsule transformants (types a-d) were investigated. Principal component and biomarker analyses of mass spectra showed clustering, and mass peaks correlated with capsule type-specific genetic lineages. We used 31 selected isolates to construct a capsule typing database. Validation with the remaining isolates (n = 227) showed 100% sensitivity and 92.2% specificity for encapsulated strains (a-f; n = 61). Blinded validation of a supplemented database (n = 50) using clinical isolates (n = 126) showed 100% sensitivity and 100% specificity for encapsulated strains (b, e, and f; n = 28). MALDI-TOF mass spectrometry is an accurate method for capsule typing of H. influenzae.

  14. Determination of doping peptides via solid-phase microelution and accurate-mass quadrupole time-of-flight LC-MS.

    PubMed

    Cuervo, Darío; Loli, Cynthia; Fernández-Álvarez, María; Muñoz, Gloria; Carreras, Daniel

    2017-10-15

    A complete analytical protocol for the determination of 25 doping-related peptidic drugs and 3 metabolites in urine was developed by means of accurate-mass quadrupole time-of-flight (Q-TOF) LC-MS analysis following solid-phase extraction (SPE) on microplates and conventional SPE pre-treatment for initial testing and confirmation, respectively. These substances included growth hormone releasing factors, gonadotropin releasing factors and anti-diuretic hormones, with molecular weights ranging from 540 to 1320 Da. Optimal experimental conditions were established after investigation of different parameters concerning sample preparation and instrumental analysis. Weak cation exchange SPE followed by C18 HPLC and accurate mass detection provided the required sensitivity and selectivity for all the target peptides under study. 2 mg SPE on 96-well microplates can be used in combination with full scan MS detection for the initial testing, thus providing a fast, cost-effective and high-throughput protocol for the processing of a large batch of samples simultaneously. On the other hand, extraction on 30 mg SPE cartridges and subsequent target MS/MS determination was the protocol of choice for confirmatory purposes. The methodology was validated in terms of selectivity, recovery, matrix effect, precision, sensitivity (limit of detection, LOD), cross contamination, carryover, robustness and stability. Recoveries ranged from 6 to 70% (microplates) and from 17 to 95% (cartridges), with LODs from 0.1 to 1 ng/mL. The suitability of the method was assessed by analyzing different spiked or excreted urines containing some of the target substances. Copyright © 2017 Elsevier B.V. All rights reserved.

  15. Linking high resolution mass spectrometry data with exposure ...

    EPA Pesticide Factsheets

    There is a growing need in the field of exposure science for monitoring methods that rapidly screen environmental media for suspect contaminants. Measurement and analysis platforms, based on high resolution mass spectrometry (HRMS), now exist to meet this need. Here we describe results of a study that links HRMS data with exposure predictions from the U.S. EPA's ExpoCast™ program and in vitro bioassay data from the U.S. interagency Tox21 consortium. Vacuum dust samples were collected from 56 households across the U.S. as part of the American Healthy Homes Survey (AHHS). Sample extracts were analyzed using liquid chromatography time-of-flight mass spectrometry (LC–TOF/MS) with electrospray ionization. On average, approximately 2000 molecular features were identified per sample (based on accurate mass) in negative ion mode, and 3000 in positive ion mode. Exact mass, isotope distribution, and isotope spacing were used to match molecular features with a unique listing of chemical formulas extracted from EPA's Distributed Structure-Searchable Toxicity (DSSTox) database. A total of 978 DSSTox formulas were consistent with the dust LC–TOF/molecular feature data (match score ≥ 90); these formulas mapped to 3228 possible chemicals in the database. Correct assignment of a unique chemical to a given formula required additional validation steps. Each suspect chemical was prioritized for follow-up confirmation using abundance and detection frequency results, along wi

  16. Dereplication of plant phenolics using a mass-spectrometry database independent method.

    PubMed

    Borges, Ricardo M; Taujale, Rahil; de Souza, Juliana Santana; de Andrade Bezerra, Thaís; Silva, Eder Lana E; Herzog, Ronny; Ponce, Francesca V; Wolfender, Jean-Luc; Edison, Arthur S

    2018-05-29

    Dereplication, an approach to sidestep the efforts involved in the isolation of known compounds, is generally accepted as being the first stage of novel discoveries in natural product research. It is based on metabolite profiling analysis of complex natural extracts. The objective was to present the application of LipidXplorer for automatic targeted dereplication of phenolics in plant crude extracts based on direct infusion high-resolution tandem mass spectrometry data. LipidXplorer uses a user-defined molecular fragmentation query language (MFQL) to search for specific characteristic fragmentation patterns in large data sets and highlight the corresponding metabolites. To this end, MFQL files were written to dereplicate common phenolics occurring in plant extracts. Complementary MFQL files were used for validation purposes. New MFQL files with molecular formula restrictions for common classes of phenolic natural products were generated for the metabolite profiling of different representative crude plant extracts. This method was evaluated against an open-source software for mass-spectrometry data processing (MZMine®) and against manual annotation based on published data. The targeted LipidXplorer method, implemented using common phenolic fragmentation patterns, was found to be able to annotate more phenolics than MZMine®, which is based on automated queries on the available databases. Additionally, screening for ascarosides, natural products with unrelated structures to plant phenolics collected from the nematode Caenorhabditis elegans, demonstrated the specificity of this method by cross-testing both groups of chemicals in both plants and nematodes. Copyright © 2018 John Wiley & Sons, Ltd.

  17. An Accurate Mass Determination for Kepler-1655b, a Moderately Irradiated World with a Significant Volatile Envelope

    NASA Astrophysics Data System (ADS)

    Haywood, Raphaëlle D.; Vanderburg, Andrew; Mortier, Annelies; Giles, Helen A. C.; López-Morales, Mercedes; Lopez, Eric D.; Malavolta, Luca; Charbonneau, David; Collier Cameron, Andrew; Coughlin, Jeffrey L.; Dressing, Courtney D.; Nava, Chantanelle; Latham, David W.; Dumusque, Xavier; Lovis, Christophe; Molinari, Emilio; Pepe, Francesco; Sozzetti, Alessandro; Udry, Stéphane; Bouchy, François; Johnson, John A.; Mayor, Michel; Micela, Giusi; Phillips, David; Piotto, Giampaolo; Rice, Ken; Sasselov, Dimitar; Ségransan, Damien; Watson, Chris; Affer, Laura; Bonomo, Aldo S.; Buchhave, Lars A.; Ciardi, David R.; Fiorenzano, Aldo F.; Harutyunyan, Avet

    2018-05-01

    We present the confirmation of a small, moderately irradiated (F = 155 ± 7 F⊕) Neptune with a substantial gas envelope in a P = 11.8728787 ± 0.0000085 day orbit about a quiet, Sun-like G0V star, Kepler-1655. Based on our analysis of the Kepler light curve, we determined Kepler-1655b’s radius to be 2.213 ± 0.082 R⊕. We acquired 95 high-resolution spectra with Telescopio Nazionale Galileo/HARPS-N, enabling us to characterize the host star and determine an accurate mass for Kepler-1655b of 5.0 (+3.1/−2.8) M⊕ via Gaussian-process regression. Our mass determination excludes an Earth-like composition with 98% confidence. Kepler-1655b falls on the upper edge of the evaporation valley, in the relatively sparsely occupied transition region between rocky and gas-rich planets. It is therefore part of a population of planets that we should actively seek to characterize further.

  18. Galaxy And Mass Assembly: accurate panchromatic photometry from optical priors using LAMBDAR

    NASA Astrophysics Data System (ADS)

    Wright, A. H.; Robotham, A. S. G.; Bourne, N.; Driver, S. P.; Dunne, L.; Maddox, S. J.; Alpaslan, M.; Andrews, S. K.; Bauer, A. E.; Bland-Hawthorn, J.; Brough, S.; Brown, M. J. I.; Clarke, C.; Cluver, M.; Davies, L. J. M.; Grootes, M. W.; Holwerda, B. W.; Hopkins, A. M.; Jarrett, T. H.; Kafle, P. R.; Lange, R.; Liske, J.; Loveday, J.; Moffett, A. J.; Norberg, P.; Popescu, C. C.; Smith, M.; Taylor, E. N.; Tuffs, R. J.; Wang, L.; Wilkins, S. M.

    2016-07-01

    We present the Lambda Adaptive Multi-Band Deblending Algorithm in R (LAMBDAR), a novel code for calculating matched aperture photometry across images that are neither pixel- nor PSF-matched, using prior aperture definitions derived from high-resolution optical imaging. The development of this program is motivated by the desire for consistent photometry and uncertainties across large ranges of photometric imaging, for use in calculating spectral energy distributions. We describe the program, specifically key features required for robust determination of panchromatic photometry: propagation of apertures to images with arbitrary resolution, local background estimation, aperture normalization, uncertainty determination and propagation, and object deblending. Using simulated images, we demonstrate that the program is able to recover accurate photometric measurements in both high-resolution, low-confusion, and low-resolution, high-confusion, regimes. We apply the program to the 21-band photometric data set from the Galaxy And Mass Assembly (GAMA) Panchromatic Data Release (PDR; Driver et al. 2016), which contains imaging spanning the far-UV to the far-IR. We compare photometry derived from LAMBDAR with that presented in Driver et al. (2016), finding broad agreement between the data sets. None the less, we demonstrate that the photometry from LAMBDAR is superior to that from the GAMA PDR, as determined by a reduction in the outlier rate and intrinsic scatter of colours in the LAMBDAR data set. We similarly find a decrease in the outlier rate of stellar masses and star formation rates using LAMBDAR photometry. Finally, we note an exceptional increase in the number of UV and mid-IR sources able to be constrained, which is accompanied by a significant increase in the mid-IR colour-colour parameter-space able to be explored.

  19. A completely automated CAD system for mass detection in a large mammographic database.

    PubMed

    Bellotti, R; De Carlo, F; Tangaro, S; Gargano, G; Maggipinto, G; Castellano, M; Massafra, R; Cascio, D; Fauci, F; Magro, R; Raso, G; Lauria, A; Forni, G; Bagnasco, S; Cerello, P; Zanon, E; Cheran, S C; Lopez Torres, E; Bottigli, U; Masala, G L; Oliva, P; Retico, A; Fantacci, M E; Cataldo, R; De Mitri, I; De Nunzio, G

    2006-08-01

    Mass localization plays a crucial role in computer-aided detection (CAD) systems for the classification of suspicious regions in mammograms. In this article we present a completely automated classification system for the detection of masses in digitized mammographic images. The system we discuss consists of three processing levels: (a) Image segmentation for the localization of regions of interest (ROIs). This step relies on an iterative dynamical threshold algorithm able to select iso-intensity closed contours around gray level maxima of the mammogram. (b) ROI characterization by means of textural features computed from the gray tone spatial dependence matrix (GTSDM), containing second-order spatial statistics information on the pixel gray level intensity. As the images under study were recorded in different centers and with different machine settings, eight GTSDM features were selected so as to be invariant under monotonic transformation. In this way, the images do not need to be normalized, as the adopted features depend on the texture only, rather than on the gray tone levels, too. (c) ROI classification by means of a neural network, with supervision provided by the radiologist's diagnosis. The CAD system was evaluated on a large database of 3369 mammographic images [2307 negative, 1062 pathological (or positive), containing at least one confirmed mass, as diagnosed by an expert radiologist]. To assess the performance of the system, receiver operating characteristic (ROC) and free-response ROC analysis were employed. The area under the ROC curve was found to be Az = 0.783 +/- 0.008 for the ROI-based classification. When evaluating the accuracy of the CAD against the radiologist-drawn boundaries, 4.23 false positives per image are found at 80% of mass sensitivity.
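
    The area under the ROC curve quoted in this record has a simple interpretation: it is the probability that a randomly chosen positive ROI receives a higher classifier score than a randomly chosen negative one. A minimal sketch of that pairwise estimate follows; the scores are hypothetical and the function name is an assumption, not part of the described CAD system.

```python
def roc_auc(positive_scores, negative_scores):
    """Area under the ROC curve via pairwise comparison.

    Counts the fraction of (positive, negative) pairs in which the positive
    sample scores higher; ties count as one half.
    """
    wins = 0.0
    for p in positive_scores:
        for n in negative_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positive_scores) * len(negative_scores))

# Hypothetical classifier outputs for ROIs with and without a confirmed mass
scores_with_mass = [0.9, 0.8, 0.4]
scores_without_mass = [0.7, 0.3, 0.2]

auc = roc_auc(scores_with_mass, scores_without_mass)
print(round(auc, 3))
```

    The pairwise form is quadratic in the number of samples; for a database of thousands of images a rank-based computation would normally be preferred, but the result is the same.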

  20. Propellant Mass Gauging: Database of Vehicle Applications and Research and Development Studies

    NASA Technical Reports Server (NTRS)

    Dodge, Franklin T.

    2008-01-01

    Gauging the mass of propellants in a tank in low gravity is not a straightforward task because of the uncertainty of the liquid configuration in the tank and the possibility of there being more than one ullage bubble. Several concepts for such a low-gravity gauging system have been proposed, and breadboard or flight-like versions have been tested in normal gravity or even in low gravity, but at present, a flight-proven reliable gauging system is not available. NASA desired a database of the gauging techniques used in current and past vehicles during ascent or under settled conditions, and during short coasting (unpowered) periods, for both cryogenic and storable propellants. Past and current research and development efforts on gauging systems that are believed to be applicable in low-gravity conditions were also desired. This report documents the results of that survey.

  1. Lipid Identification by Untargeted Tandem Mass Spectrometry Coupled with Ultra-High-Pressure Liquid Chromatography.

    PubMed

    Gugiu, Gabriel B

    2017-01-01

    Lipidomics refers to the large-scale study of lipids in biological systems (Wenk, Nat Rev Drug Discov 4(7):594-610, 2005; Rolim et al., Gene 554(2):131-139, 2015). From a mass spectrometric point of view, by lipidomics we understand targeted or untargeted mass spectrometric analysis of lipids using either liquid chromatography (LC) (Castro-Perez et al., J Proteome Res 9(5):2377-2389, 2010) or shotgun (Han and Gross, Mass Spectrom Rev 24(3):367-412, 2005) approaches coupled with tandem mass spectrometry. This chapter describes the former methodology, which is rapidly becoming the preferred method for lipid identification owing to similarities with established omics workflows, such as proteomics (Washburn et al., Nat Biotechnol 19(3):242-247, 2001) or genomics (Yadav, J Biomol Tech: JBT 18(5):277, 2007). The workflow described consists of lipid extraction using a modified Bligh and Dyer method (Bligh and Dyer, Can J Biochem Physiol 37(8):911-917, 1959), ultra high pressure liquid chromatography fractionation of lipid samples on a reverse phase C18 column, followed by tandem mass spectrometric analysis and in silico database search for lipid identification based on MSMS spectrum matching (Kind et al., Nat Methods 10(8):755-758, 2013; Yamada et al., J Chromatogr A 1292:211-218, 2013; Taguchi and Ishikawa, J Chromatogr A 1217(25):4229-4239, 2010; Peake et al., Thermoscientifices 1-3, 2015) and accurate mass of parent ion (Sud et al., Nucleic Acids Res 35(database issue):D527-D532, 2007; Wishart et al., Nucleic Acids Res 35(database):D521-D526, 2007).
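
    Identification by accurate mass of the parent ion usually amounts to looking up the measured m/z in a database and keeping entries whose relative deviation, expressed in parts per million (ppm), falls inside a tolerance window. The sketch below illustrates that lookup under stated assumptions: the database entries, their m/z values, the 5 ppm window, and the function names are all hypothetical and are not taken from the chapter.

```python
def ppm_error(measured_mz, theoretical_mz):
    """Signed mass deviation in parts per million."""
    return (measured_mz - theoretical_mz) / theoretical_mz * 1e6

def match_by_accurate_mass(measured_mz, database, tol_ppm=5.0):
    """Return (name, ppm error) for entries within tol_ppm, best match first."""
    hits = []
    for name, theoretical_mz in database.items():
        err = ppm_error(measured_mz, theoretical_mz)
        if abs(err) <= tol_ppm:
            hits.append((name, round(err, 2)))
    return sorted(hits, key=lambda hit: abs(hit[1]))

# Hypothetical database of precursor-ion m/z values (illustrative numbers only)
database = {
    "candidate_A": 760.5851,
    "candidate_B": 760.5900,
    "candidate_C": 782.5670,
}

hits = match_by_accurate_mass(760.5860, database)
print(hits)
```

    Accurate mass alone often leaves several isomeric candidates, which is why workflows such as the one described combine it with MS/MS spectrum matching.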

  2. Masses of the components of SB2 binaries observed with Gaia - IV. Accurate SB2 orbits for 14 binaries and masses of three binaries*

    NASA Astrophysics Data System (ADS)

    Kiefer, F.; Halbwachs, J.-L.; Lebreton, Y.; Soubiran, C.; Arenou, F.; Pourbaix, D.; Famaey, B.; Guillout, P.; Ibata, R.; Mazeh, T.

    2018-02-01

    The orbital motion of non-contact double-lined spectroscopic binaries (SB2s), with periods of a few tens of days to several years, holds unique, accurate information on individual stellar masses, which only long-term monitoring can unlock. The combination of radial velocity measurements from high-resolution spectrographs and astrometric measurements from high-precision interferometers allows the derivation of SB2 component masses down to the percent precision. Since 2010, we have observed a large sample of SB2s with the SOPHIE spectrograph at the Observatoire de Haute-Provence, aiming at the derivation of orbital elements with sufficient accuracy to obtain masses of components with relative errors as low as 1 per cent when the astrometric measurements of the Gaia satellite are taken into account. In this paper, we present the results from 6 yr of observations of 14 SB2 systems with periods ranging from 33 to 4185 days. Using the TODMOR algorithm, we computed radial velocities from the spectra and then derived the orbital elements of these binary systems. The minimum masses of the 28 stellar components are then obtained with an average sample accuracy of 1.0 ± 0.2 per cent. Combining the radial velocities with existing interferometric measurements, we derived the masses of the primary and secondary components of HIP 61100, HIP 95995 and HIP 101382 with relative errors for components (A,B) of, respectively, (2.0, 1.7) per cent, (3.7, 3.7) per cent and (0.2, 0.1) per cent. Using the CESAM2K stellar evolution code, we constrained the initial He abundance, age and metallicity for HIP 61100 and HIP 95995.
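
    The minimum masses mentioned in this record are the quantities M sin³ i that a double-lined spectroscopic orbit yields without knowledge of the inclination i. A minimal sketch using the standard relation M1 sin³ i = 1.0361e-7 (1 − e²)^(3/2) (K1 + K2)² K2 P, with velocity semi-amplitudes K in km/s, period P in days, and masses in solar units, is given below. The numerical inputs are hypothetical and the function name is an assumption; this is not the authors' pipeline.

```python
def minimum_masses(period_days, k1_kms, k2_kms, ecc):
    """Minimum masses (M1 sin^3 i, M2 sin^3 i) in solar masses for an SB2.

    period_days : orbital period in days
    k1_kms, k2_kms : radial-velocity semi-amplitudes of components 1 and 2 (km/s)
    ecc : orbital eccentricity
    """
    common = (1.0361e-7 * (1.0 - ecc ** 2) ** 1.5
              * (k1_kms + k2_kms) ** 2 * period_days)
    # Each component's mass scales with the *other* component's semi-amplitude
    return common * k2_kms, common * k1_kms

# Hypothetical orbit: circular, 10-day period, equal semi-amplitudes of 50 km/s
m1_min, m2_min = minimum_masses(10.0, 50.0, 50.0, 0.0)
print(round(m1_min, 4), round(m2_min, 4))
```

    Converting these minimum masses into true masses requires the inclination, which is why the abstract combines radial velocities with interferometric or Gaia astrometric measurements.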

  3. The link between quasar broad-line region and galaxy-scale outflows and accurate CIV-based black hole masses

    NASA Astrophysics Data System (ADS)

    Coatman, Liam; Hewett, Paul C.; Banerji, Manda; Richards, Gordon T.; Hennawi, Joseph F.; Prochaska, Jason X.

    2017-01-01

    Accurate black-hole (BH) mass estimates for high-redshift (z>2) quasars are essential for better understanding the relationship between super-massive BH accretion and star formation. Progress is currently limited by the large systematic errors in virial BH-masses derived from the CIV broad emission line, which is often significantly blueshifted relative to systemic, most likely due to outflowing gas in the quasar broad-line region. We have assembled Balmer-line based BH masses for a large sample of 230 high-luminosity (10^45.5-10^48 erg s^-1), redshift 1.5 [...] masses to be larger than the corresponding Balmer line-based masses by almost an order of magnitude at the most extreme blueshifts (~5000 km s^-1). An empirical correction to the CIV BH-masses is derived, which depends only on the properties of the CIV line itself (i.e. blueshift and FWHM). We show that this new correction now enables the derivation of un-biased CIV-based virial BH masses for the majority of high-luminosity, high-redshift quasars. In the same high-luminosity quasar sample, we find the narrow [OIII] emission to be weaker and more asymmetric than is generally found in lower-luminosity AGN and that a significant fraction of our quasars have exceptionally broad (FWHM > 3000 km s^-1), blueshifted [OIII] emission. We find a strong correlation between the CIV and [OIII] blueshifts. This correlation holds even for quasars at fixed luminosity and suggests that broad line region outflows in quasars are connected to galaxy-scale winds.

  4. View generated database

    NASA Technical Reports Server (NTRS)

    Downward, James G.

    1992-01-01

    This document represents the final report for the View Generated Database (VGD) project, NAS7-1066. It documents the work done on the project up to the point at which all project work was terminated due to lack of project funds. The VGD was to provide the capability to accurately represent any real-world object or scene as a computer model. Such models include both an accurate spatial/geometric representation of surfaces of the object or scene, as well as any surface detail present on the object. Applications of such models are numerous, including acquisition and maintenance of work models for tele-autonomous systems, generation of accurate 3-D geometric/photometric models for various 3-D vision systems, and graphical models for realistic rendering of 3-D scenes via computer graphics.

  5. A comprehensive and scalable database search system for metaproteomics.

    PubMed

    Chatterjee, Sandip; Stupp, Gregory S; Park, Sung Kyu Robin; Ducom, Jean-Christophe; Yates, John R; Su, Andrew I; Wolan, Dennis W

    2016-08-16

    Mass spectrometry-based shotgun proteomics experiments rely on accurate matching of experimental spectra against a database of protein sequences. Existing computational analysis methods are limited in the size of their sequence databases, which severely restricts the proteomic sequencing depth and functional analysis of highly complex samples. The growing amount of public high-throughput sequencing data will only exacerbate this problem. We designed a broadly applicable metaproteomic analysis method (ComPIL) that addresses protein database size limitations. Our approach to overcome this significant limitation in metaproteomics was to design a scalable set of sequence databases assembled for optimal library querying speeds. ComPIL was integrated with a modified version of the search engine ProLuCID (termed "Blazmass") to permit rapid matching of experimental spectra. Proof-of-principle analysis of human HEK293 lysate with a ComPIL database derived from high-quality genomic libraries was able to detect nearly all of the same peptides as a search with a human database (~500x fewer peptides in the database), with a small reduction in sensitivity. We were also able to detect proteins from the adenovirus used to immortalize these cells. We applied our method to a set of healthy human gut microbiome proteomic samples and showed a substantial increase in the number of identified peptides and proteins compared to previous metaproteomic analyses, while retaining a high degree of protein identification accuracy and allowing for a more in-depth characterization of the functional landscape of the samples. The combination of ComPIL with Blazmass allows proteomic searches to be performed with database sizes much larger than previously possible. These large database searches can be applied to complex meta-samples with unknown composition or proteomic samples where unexpected proteins may be identified. 
The protein database, proteomic search engine, and the proteomic data files for

  6. Rhinoplasty perioperative database using a personal digital assistant.

    PubMed

    Kotler, Howard S

    2004-01-01

    To construct a reliable, accurate, and easy-to-use handheld computer database that facilitates the point-of-care acquisition of perioperative text and image data specific to rhinoplasty. A user-modified database (Pendragon Forms [v.3.2]; Pendragon Software Corporation, Libertyville, Ill) and graphic image program (Tealpaint [v.4.87]; Tealpaint Software, San Rafael, Calif) were used to capture text and image data, respectively, on a Palm OS (v.4.11) handheld operating with 8 megabytes of memory. The handheld and desktop databases were maintained secure using PDASecure (v.2.0) and GoldSecure (v.3.0) (Trust Digital LLC, Fairfax, Va). The handheld data were then uploaded to a desktop database of either FileMaker Pro 5.0 (v.1) (FileMaker Inc, Santa Clara, Calif) or Microsoft Access 2000 (Microsoft Corp, Redmond, Wash). Patient data were collected from 15 patients undergoing rhinoplasty in a private practice outpatient ambulatory setting. Data integrity was assessed after 6 months' disk and hard drive storage. The handheld database was able to facilitate data collection and accurately record, transfer, and reliably maintain perioperative rhinoplasty data. Query capability allowed rapid search using a multitude of keyword search terms specific to the operative maneuvers performed in rhinoplasty. Handheld computer technology provides a method of reliably recording and storing perioperative rhinoplasty information. The handheld computer facilitates the reliable and accurate storage and query of perioperative data, assisting the retrospective review of one's own results and enhancement of surgical skills.

  7. Measurement of the Accurate Mass of a 50 MDa Infectious Virus

    PubMed Central

    Keifer, David Z.; Motwani, Tina; Teschke, Carolyn M.; Jarrold, Martin F.

    2016-01-01

    RATIONALE Bacteriophage P22 is believed to contain a total of 521 copies of 9 different proteins and a 41,724 base pair genome. Despite its enormous size and complexity, phage P22 can be electrosprayed, and it remains intact in ultra-high vacuum where its molar mass distribution has been measured. METHODS Phage P22 virions were generated by complementation in Salmonella enterica and purified. They were transferred into 100 mM ammonium acetate and then electrosprayed. The masses of individual virions were determined using charge detection mass spectrometry. RESULTS The stoichiometry of the protein components of phage P22 is sufficiently well-known that the theoretical molar mass can be determined to within a narrow range. The measured average molar mass of phage P22, 52,180±59 kDa, is consistent with the theoretical molar mass and supports the proposed stoichiometry of the components. The intrinsic width of the phage P22 mass distribution can be entirely accounted for by the distribution of DNA packaged by the headful mechanism. CONCLUSIONS At over 50 MDa, phage P22 is the largest object with a well-defined molar mass to be analyzed by mass spectrometry. The narrow measured mass distribution indicates that the virions survive the transition into the gas phase intact. PMID:27501430

  8. Quantitative LC-MS of polymers: determining accurate molecular weight distributions by combined size exclusion chromatography and electrospray mass spectrometry with maximum entropy data processing.

    PubMed

    Gruendling, Till; Guilhaus, Michael; Barner-Kowollik, Christopher

    2008-09-15

    We report on the successful application of size exclusion chromatography (SEC) combined with electrospray ionization mass spectrometry (ESI-MS) and refractive index (RI) detection for the determination of accurate molecular weight distributions of synthetic polymers, corrected for chromatographic band broadening. The presented method makes use of the ability of ESI-MS to accurately depict the peak profiles and retention volumes of individual oligomers eluting from the SEC column, whereas quantitative information on the absolute concentration of oligomers is obtained from the RI-detector only. A sophisticated computational algorithm based on the maximum entropy principle is used to process the data gained by both detectors, yielding an accurate molecular weight distribution, corrected for chromatographic band broadening. Poly(methyl methacrylate) standards with molecular weights up to 10 kDa serve as model compounds. Molecular weight distributions (MWDs) obtained by the maximum entropy procedure are compared to MWDs, which were calculated by a conventional calibration of the SEC-retention time axis with peak retention data obtained from the mass spectrometer. Comparison showed that for the employed chromatographic system, distributions below 7 kDa were only weakly influenced by chromatographic band broadening. However, the maximum entropy algorithm could successfully correct the MWD of a 10 kDa standard for band broadening effects. Molecular weight averages were between 5 and 14% lower than the manufacturer stated data obtained by classical means of calibration. The presented method demonstrates a consistent approach for analyzing data obtained by coupling mass spectrometric detectors and concentration sensitive detectors to polymer liquid chromatography.

  9. Accurate determination of selected pesticides in soya beans by liquid chromatography coupled to isotope dilution mass spectrometry.

    PubMed

    Huertas Pérez, J F; Sejerøe-Olsen, B; Fernández Alba, A R; Schimmel, H; Dabrio, M

    2015-05-01

    A sensitive, accurate and simple liquid chromatography coupled with mass spectrometry method for the determination of 10 selected pesticides in soya beans has been developed and validated. The method is intended for use during the characterization of selected pesticides in a reference material. In this process, high accuracy and appropriate uncertainty levels associated with the analytical measurements are of utmost importance. The analytical procedure is based on sample extraction by the use of a modified QuEChERS (quick, easy, cheap, effective, rugged, safe) extraction and subsequent clean-up of the extract with C18, PSA and Florisil. Analytes were separated on a C18 column using gradient elution with water-methanol/2.5 mM ammonium acetate mobile phase, and finally identified and quantified by triple quadrupole mass spectrometry in the multiple reaction monitoring mode (MRM). Reliable and accurate quantification of the analytes was achieved by means of stable isotope-labelled analogues employed as internal standards (IS) and calibration with pure substance solutions containing both the isotopically labelled and native compounds. Exceptions were made for thiodicarb and malaoxon, where the isotopically labelled congeners were not commercially available at the time of analysis. For the quantification of those compounds, methomyl-(13)C2(15)N and malathion-D10 were used, respectively. The method was validated according to the general principles covered by DG SANCO guidelines. However, validation criteria were set more stringently. Mean recoveries were in the range of 86-103% with RSDs lower than 8.1%. Repeatability and intermediate precision were in the range of 3.9-7.6% and 1.9-8.7%, respectively. LODs were theoretically estimated and experimentally confirmed to be in the range 0.001-0.005 mg kg(-1) in the matrix, while LOQs, established as the lowest spiking mass fraction level, were in the range 0.01-0.05 mg kg(-1).
The method reliably identifies and quantifies the

  10. Fine-structure resolved rotational transitions and database for CN+H2 collisions

    NASA Astrophysics Data System (ADS)

    Burton, Hannah; Mysliwiec, Ryan; Forrey, Robert C.; Yang, B. H.; Stancil, P. C.; Balakrishnan, N.

    2018-06-01

    Cross sections and rate coefficients for CN+H2 collisions are calculated using the coupled states (CS) approximation. The calculations are benchmarked against more accurate close-coupling (CC) calculations for transitions between low-lying rotational states. Comparisons are made between the two formulations for collision energies greater than 10 cm^-1. The CS approximation is used to construct a database which includes highly excited rotational states that are beyond the practical limitations of the CC method. The database includes fine-structure resolved rotational quenching transitions for v = 0 and j ≤ 40, where v and j are the vibrational and rotational quantum numbers of the initial state of the CN molecule. Rate coefficients are computed for both para-H2 and ortho-H2 colliders. The results are shown to be in good agreement with previous calculations; however, the rates are substantially different from mass-scaled CN+He rates that are often used in astrophysical models.

  11. Metabolomic spectral libraries for data-independent SWATH liquid chromatography mass spectrometry acquisition.

    PubMed

    Bruderer, Tobias; Varesio, Emmanuel; Hidasi, Anita O; Duchoslav, Eva; Burton, Lyle; Bonner, Ron; Hopfgartner, Gérard

    2018-03-01

    High-quality mass spectral libraries have become crucial in mass spectrometry-based metabolomics. Here, we investigate a workflow to generate accurate mass discrete and composite spectral libraries for metabolite identification and for SWATH mass spectrometry data processing. Discrete collision energy (5-100 eV) accurate mass spectra were collected for 532 metabolites from the human metabolome database (HMDB) by flow injection analysis and compiled into composite spectra over a large collision energy range (e.g., 10-70 eV). Full scan response factors were also calculated. Software tools based on accurate mass and predictive fragmentation were specially developed and found to be essential for construction and quality control of the spectral library. First, elemental compositions constrained by the elemental composition of the precursor ion were calculated for all fragments. Secondly, all possible fragments were generated from the compound structure and were filtered based on their elemental compositions. From the discrete spectra, it was possible to analyze the specific fragment form at each collision energy and it was found that a relatively large collision energy range (10-70 eV) gives informative MS/MS spectra for library searches. From the composite spectra, it was possible to characterize specific neutral losses as radical losses using in silico fragmentation. Radical losses (generating radical cations) were found to be more prominent than expected. From 532 metabolites, 489 provided a signal in positive mode [M+H]+ and 483 in negative mode [M-H]-. MS/MS spectra were obtained for 399 compounds in positive mode and for 462 in negative mode; 329 metabolites generated suitable spectra in both modes. Using the spectral library, LC retention time, response factors to analyze data-independent LC-SWATH-MS data allowed the identification of 39 (positive mode) and 72 (negative mode) metabolites in a plasma pool sample (total 92 metabolites) where 81 previously

  12. USING AN ACCURATE MASS, TRIPLE QUADRUPOLE MASS SPECTROMETER AND AN ION CORRELATION PROGRAM TO IDENTIFY COMPOUNDS

    EPA Science Inventory

    Most compounds are not found in mass spectral libraries and must be identified by other means. Often, compound identities can be deduced from the compositions of the ions in their mass spectra and review of the chemical literature. Confirmation is provided by mass spectra and r...

  13. The Effect of Starspots on Accurate Radius Determination of the Low-Mass Double-Lined Eclipsing Binary Gu Boo

    NASA Astrophysics Data System (ADS)

    Windmiller, G.; Orosz, J. A.; Etzel, P. B.

    2010-04-01

    GU Boo is one of only a relatively small number of well-studied double-lined eclipsing binaries that contain low-mass stars. López-Morales & Ribas present a comprehensive analysis of multi-color light and radial velocity curves for this system. The GU Boo light curves presented by López-Morales & Ribas had substantial asymmetries, which were attributed to large spots. In spite of the asymmetry, López-Morales & Ribas derived masses and radii accurate to ≈2%. We obtained additional photometry of GU Boo using both a CCD and a single-channel photometer and modeled the light curves with the ELC software to determine if the large spots in the light curves give rise to systematic errors at the few percent level. We also modeled the original light curves from the work of López-Morales & Ribas using models with and without spots. We derived a radius of the primary of 0.6329 ± 0.0026 R sun, 0.6413 ± 0.0049 R sun, and 0.6373 ± 0.0029 R sun from the CCD, photoelectric, and López-Morales & Ribas data, respectively. Each of these measurements agrees with the value reported by López-Morales & Ribas (R 1 = 0.623 ± 0.016 R sun) at the level of ≈2%. In addition, the spread in these values is ≈1%-2% from the mean. For the secondary, we derive radii of 0.6074 ± 0.0035 R sun, 0.5944 ± 0.0069 R sun, and 0.5976 ± 0.0059 R sun from the three respective data sets. The López-Morales & Ribas value is R 2 = 0.620 ± 0.020 R sun, which is ≈2%-3% larger than each of the three values we found. The spread in these values is ≈2% from the mean. The systematic difference between our three determinations of the secondary radius and that of López-Morales & Ribas might be attributed to differences in the modeling process and codes used. Our own fits suggest that, for GU Boo at least, using accurate spot modeling of a single set of multi-color light curves results in radii determinations accurate at the ≈2% level.

  14. THE EFFECT OF STARSPOTS ON ACCURATE RADIUS DETERMINATION OF THE LOW-MASS DOUBLE-LINED ECLIPSING BINARY GU Boo

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Windmiller, G.; Orosz, J. A.; Etzel, P. B., E-mail: windmill@rohan.sdsu.ed, E-mail: orosz@sciences.sdsu.ed, E-mail: etzel@sciences.sdsu.ed

    2010-04-01

    GU Boo is one of only a relatively small number of well-studied double-lined eclipsing binaries that contain low-mass stars. Lopez-Morales and Ribas present a comprehensive analysis of multi-color light and radial velocity curves for this system. The GU Boo light curves presented by Lopez-Morales and Ribas had substantial asymmetries, which were attributed to large spots. In spite of the asymmetry, Lopez-Morales and Ribas derived masses and radii accurate to ≈2%. We obtained additional photometry of GU Boo using both a CCD and a single-channel photometer and modeled the light curves with the ELC software to determine if the large spots in the light curves give rise to systematic errors at the few percent level. We also modeled the original light curves from the work of Lopez-Morales and Ribas using models with and without spots. We derived a radius of the primary of 0.6329 ± 0.0026 R sun, 0.6413 ± 0.0049 R sun, and 0.6373 ± 0.0029 R sun from the CCD, photoelectric, and Lopez-Morales and Ribas data, respectively. Each of these measurements agrees with the value reported by Lopez-Morales and Ribas (R 1 = 0.623 ± 0.016 R sun) at the level of ≈2%. In addition, the spread in these values is ≈1%-2% from the mean. For the secondary, we derive radii of 0.6074 ± 0.0035 R sun, 0.5944 ± 0.0069 R sun, and 0.5976 ± 0.0059 R sun from the three respective data sets. The Lopez-Morales and Ribas value is R 2 = 0.620 ± 0.020 R sun, which is ≈2%-3% larger than each of the three values we found. The spread in these values is ≈2% from the mean. The systematic difference between our three determinations of the secondary radius and that of Lopez-Morales and Ribas might be attributed to differences in the modeling process and codes used. Our own fits suggest that, for GU Boo at least, using accurate spot modeling of a single set of multi-color light curves results in radii

  15. Development and validation of an extended database for yeast identification by MALDI-TOF MS in Argentina.

    PubMed

    Taverna, Constanza Giselle; Mazza, Mariana; Bueno, Nadia Soledad; Alvarez, Christian; Amigot, Susana; Andreani, Mariana; Azula, Natalia; Barrios, Rubén; Fernández, Norma; Fox, Barbara; Guelfand, Liliana; Maldonado, Ivana; Murisengo, Omar Alejandro; Relloso, Silvia; Vivot, Matias; Davel, Graciela

    2018-05-11

    Matrix-assisted laser desorption/ionization-time of flight mass spectrometry (MALDI-TOF MS) has revolutionized the identification of microorganisms in clinical laboratories because it is rapid, relatively simple to use, accurate, and applicable to a wide range of microorganisms. Several studies have demonstrated the utility of this technique in the identification of yeasts; however, its performance is usually improved by the extension of the database. Here we developed an in-house database of 143 strains belonging to 42 yeast species in the MALDI Biotyper platform, and we validated the extended database with 388 regional strains and 15 reference strains belonging to 55 yeast species. We also performed an intra- and interlaboratory study to assess reproducibility and analyzed the use of the cutoff values of 1.700 and 2.000 for correct identification at the species level. The creation of an in-house database that extended the manufacturer's database was successful in that no incorrect identifications were introduced. The best performance was observed by using the extended database and a cutoff value of 1.700, with a sensitivity of 0.94 and specificity of 0.96. The reproducibility study proved useful for detecting deviations and could be used for external quality control. The extended database was able to differentiate closely related species and it has potential in distinguishing the molecular genotypes of Cryptococcus neoformans and Cryptococcus gattii.

  16. Rapid glucosinolate detection and identification using accurate mass MS-MS

    USDA-ARS's Scientific Manuscript database

    Currently, there is a demand for accurate evaluation of Brassica plant species for their glucosinolate content. An optimized method has been developed for detecting and identifying glucosinolates in plant extracts using MS-MS fragmentation with ion trap collision induced dissociation (CID) and higher...

  17. Identification of substances migrating from plastic baby bottles using a combination of low-resolution and high-resolution mass spectrometric analysers coupled to gas and liquid chromatography.

    PubMed

    Onghena, Matthias; Van Hoeck, Els; Van Loco, Joris; Ibáñez, María; Cherta, Laura; Portolés, Tania; Pitarch, Elena; Hernandéz, Félix; Lemière, Filip; Covaci, Adrian

    2015-11-01

    This work presents a strategy for elucidation of unknown migrants from plastic food contact materials (baby bottles) using a combination of analytical techniques in an untargeted approach. First, gas chromatography (GC) coupled to mass spectrometry (MS) in electron ionisation mode was used to identify migrants through spectral library matching. When no acceptable match was obtained, a second analysis by GC-(electron ionisation) high resolution mass spectrometry time of flight (TOF) was applied to obtain accurate mass fragmentation spectra and isotopic patterns. Databases were then searched to find a possible elemental composition for the unknown compounds. Finally, a GC hybrid quadrupole-TOF-MS with an atmospheric pressure chemical ionisation source was used to obtain the molecular ion or the protonated molecule. Accurate mass data also provided additional information on the fragmentation behaviour as two acquisition functions with different collision energies were available (MS(E) approach). In the low-energy function, limited fragmentation took place, whereas for the high-energy function, fragmentation was enhanced. For less volatile unknowns, ultra-high pressure liquid chromatography-quadrupole-TOF-MS was additionally applied. Using a home-made database containing common migrating compounds and plastic additives, tentative identification was made for several positive findings based on accurate mass of the (de)protonated molecule, product ion fragments and characteristic isotopic ions. Six illustrative examples are shown to demonstrate the modus operandi and the difficulties encountered during identification. The combination of these techniques was proven to be a powerful tool for the elucidation of unknown migrating compounds from plastic baby bottles. Copyright © 2015 John Wiley & Sons, Ltd.

  18. ProXL (Protein Cross-Linking Database): A Platform for Analysis, Visualization, and Sharing of Protein Cross-Linking Mass Spectrometry Data

    PubMed Central

    2016-01-01

    ProXL is a Web application and accompanying database designed for sharing, visualizing, and analyzing bottom-up protein cross-linking mass spectrometry data with an emphasis on structural analysis and quality control. ProXL is designed to be independent of any particular software pipeline. The import process is simplified by the use of the ProXL XML data format, which shields developers of data importers from the relative complexity of the relational database schema. The database and Web interfaces function equally well for any software pipeline and allow data from disparate pipelines to be merged and contrasted. ProXL includes robust public and private data sharing capabilities, including a project-based interface designed to ensure security and facilitate collaboration among multiple researchers. ProXL provides multiple interactive and highly dynamic data visualizations that facilitate structural-based analysis of the observed cross-links as well as quality control. ProXL is open-source, well-documented, and freely available at https://github.com/yeastrc/proxl-web-app. PMID:27302480

  19. ProXL (Protein Cross-Linking Database): A Platform for Analysis, Visualization, and Sharing of Protein Cross-Linking Mass Spectrometry Data.

    PubMed

    Riffle, Michael; Jaschob, Daniel; Zelter, Alex; Davis, Trisha N

    2016-08-05

    ProXL is a Web application and accompanying database designed for sharing, visualizing, and analyzing bottom-up protein cross-linking mass spectrometry data with an emphasis on structural analysis and quality control. ProXL is designed to be independent of any particular software pipeline. The import process is simplified by the use of the ProXL XML data format, which shields developers of data importers from the relative complexity of the relational database schema. The database and Web interfaces function equally well for any software pipeline and allow data from disparate pipelines to be merged and contrasted. ProXL includes robust public and private data sharing capabilities, including a project-based interface designed to ensure security and facilitate collaboration among multiple researchers. ProXL provides multiple interactive and highly dynamic data visualizations that facilitate structural-based analysis of the observed cross-links as well as quality control. ProXL is open-source, well-documented, and freely available at https://github.com/yeastrc/proxl-web-app .

  20. Mass measurement errors of Fourier-transform mass spectrometry (FTMS): distribution, recalibration, and application.

    PubMed

    Zhang, Jiyang; Ma, Jie; Dou, Lei; Wu, Songfeng; Qian, Xiaohong; Xie, Hongwei; Zhu, Yunping; He, Fuchu

    2009-02-01

    The hybrid linear trap quadrupole Fourier-transform (LTQ-FT) ion cyclotron resonance mass spectrometer, an instrument with high accuracy and resolution, is widely used in the identification and quantification of peptides and proteins. However, time-dependent errors in the system may lead to deterioration of the accuracy of these instruments, negatively influencing the determination of the mass error tolerance (MET) in database searches. Here, a comprehensive discussion of LTQ/FT precursor ion mass error is provided. On the basis of an investigation of the mass error distribution, we propose an improved recalibration formula and introduce a new tool, FTDR (Fourier-transform data recalibration), that employs a graphic user interface (GUI) for automatic calibration. It was found that the calibration could adjust the mass error distribution to more closely approximate a normal distribution and reduce the standard deviation (SD). Consequently, we present a new strategy, LDSF (Large MET database search and small MET filtration), for database search MET specification and validation of database search results. As the name implies, a large-MET database search is conducted and the search results are then filtered using the statistical MET estimated from high-confidence results. By applying this strategy to a standard protein data set and a complex data set, we demonstrate the LDSF can significantly improve the sensitivity of the result validation procedure.
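
    The LDSF idea described above (search with a wide tolerance, then filter with a narrower, data-derived one) can be sketched as follows. This is an illustrative sketch only, not the FTDR tool: the abstract does not say how the statistical MET is computed, so the mean ± n·SD window used here is an assumption, and all function names are hypothetical.

```python
from statistics import mean, stdev


def ppm_error(observed_mz, theoretical_mz):
    """Signed mass error in parts per million."""
    return (observed_mz - theoretical_mz) / theoretical_mz * 1e6


def estimate_tolerance(high_conf_errors_ppm, n_sd=3.0):
    """Assumed statistical MET: mean +/- n_sd * SD of high-confidence errors."""
    mu = mean(high_conf_errors_ppm)
    sd = stdev(high_conf_errors_ppm)  # sample SD; needs at least two values
    return mu - n_sd * sd, mu + n_sd * sd


def filter_matches(matches, window):
    """Keep (observed_mz, theoretical_mz) pairs whose ppm error is in window."""
    low, high = window
    return [m for m in matches if low <= ppm_error(m[0], m[1]) <= high]
```

    In use, the errors of high-confidence hits from the wide-tolerance search would be passed to estimate_tolerance, and the resulting window applied to all hits with filter_matches. Because the window is centered on the observed mean rather than on zero, a systematic offset in the instrument calibration is absorbed instead of being counted against correct matches.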

  1. MIDAS: a database-searching algorithm for metabolite identification in metabolomics.

    PubMed

    Wang, Yingfeng; Kora, Guruprasad; Bowen, Benjamin P; Pan, Chongle

    2014-10-07

    A database searching approach can be used for metabolite identification in metabolomics by matching measured tandem mass spectra (MS/MS) against the predicted fragments of metabolites in a database. Here, we present the open-source MIDAS algorithm (Metabolite Identification via Database Searching). To evaluate a metabolite-spectrum match (MSM), MIDAS first enumerates possible fragments from a metabolite by systematic bond dissociation, then calculates the plausibility of the fragments based on their fragmentation pathways, and finally scores the MSM to assess how well the experimental MS/MS spectrum from collision-induced dissociation (CID) is explained by the metabolite's predicted CID MS/MS spectrum. MIDAS was designed to search high-resolution tandem mass spectra acquired on time-of-flight or Orbitrap mass spectrometer against a metabolite database in an automated and high-throughput manner. The accuracy of metabolite identification by MIDAS was benchmarked using four sets of standard tandem mass spectra from MassBank. On average, for 77% of original spectra and 84% of composite spectra, MIDAS correctly ranked the true compounds as the first MSMs out of all MetaCyc metabolites as decoys. MIDAS correctly identified 46% more original spectra and 59% more composite spectra at the first MSMs than an existing database-searching algorithm, MetFrag. MIDAS was showcased by searching a published real-world measurement of a metabolome from Synechococcus sp. PCC 7002 against the MetaCyc metabolite database. MIDAS identified many metabolites missed in the previous study. MIDAS identifications should be considered only as candidate metabolites, which need to be confirmed using standard compounds. To facilitate manual validation, MIDAS provides annotated spectra for MSMs and labels observed mass spectral peaks with predicted fragments. The database searching and manual validation can be performed online at http://midas.omicsbio.org.

  2. Comprehensive and accurate tracking of carbon origin of LC-tandem mass spectrometry collisional fragments for 13C-MFA.

    PubMed

    Kappelmann, Jannick; Klein, Bianca; Geilenkirchen, Petra; Noack, Stephan

    2017-03-01

    In recent years the benefit of measuring positionally resolved 13C-labeling enrichment from tandem mass spectrometry (MS/MS) collisional fragments for improved precision of 13C-Metabolic Flux Analysis (13C-MFA) has become evident. However, the usage of positional labeling information for 13C-MFA faces two challenges: (1) The mass spectrometric acquisition of a large number of potentially interfering mass transitions may hamper accuracy and sensitivity. (2) The positional identity of carbon atoms of product ions needs to be known. The present contribution addresses the latter challenge by deducing the maximal positional labeling information contained in LC-ESI-MS/MS spectra of product anions of central metabolism as well as product cations of amino acids. For this purpose, we draw on accurate mass spectrometry, selectively labeled standards, and published fragmentation pathways to structurally annotate all dominant mass peaks of a large collection of metabolites, some with a complete fragmentation pathway. Compiling all available information, we arrive at the most detailed map of carbon atom fate of LC-ESI-MS/MS collisional fragments yet, comprising 170 intense and structurally annotated product ions with unique carbon origin from 76 precursor ions of 72 metabolites. Our 13C data prove that heuristic fragmentation rules often fail to yield correct fragment structures, and we expose common pitfalls in the structural annotation of product ions. We show that the positionally resolved 13C-label information contained in the product ions that we structurally annotated allows inference of the entire isotopomer distribution of several central metabolism intermediates, which is experimentally demonstrated for malate using quadrupole-time-of-flight MS technology. Finally, the inclusion of the label information from a subset of these fragments improves flux precision in a Corynebacterium glutamicum model of the central carbon metabolism.

  3. Accurate and precise determination of isotopic ratios by MC-ICP-MS: a review.

    PubMed

    Yang, Lu

    2009-01-01

    For many decades the accurate and precise determination of isotope ratios has remained a very strong interest to many researchers due to its important applications in earth, environmental, biological, archeological, and medical sciences. Traditionally, thermal ionization mass spectrometry (TIMS) has been the technique of choice for achieving the highest accuracy and precision. However, recent developments in multi-collector inductively coupled plasma mass spectrometry (MC-ICP-MS) have brought a new dimension to this field. In addition to its simple and robust sample introduction, high sample throughput, and high mass resolution, the flat-topped peaks generated by this technique provide for accurate and precise determination of isotope ratios with precision reaching 0.001%, comparable to that achieved with TIMS. These features, in combination with the ability of the ICP source to ionize nearly all elements in the periodic table, have resulted in an increased use of MC-ICP-MS for such measurements in various sample matrices. To determine accurate and precise isotope ratios with MC-ICP-MS, utmost care must be exercised during sample preparation, optimization of the instrument, and mass bias corrections. Unfortunately, there are inconsistencies and errors evident in many MC-ICP-MS publications, including errors in mass bias correction models. This review examines "state-of-the-art" methodologies presented in the literature for achievement of precise and accurate determinations of isotope ratios by MC-ICP-MS. Some general rules for such accurate and precise measurements are suggested, and calculations of combined uncertainty of the data using a few common mass bias correction models are outlined.

  4. Nontargeted Screening Method for Illegal Additives Based on Ultrahigh-Performance Liquid Chromatography-High-Resolution Mass Spectrometry.

    PubMed

    Fu, Yanqing; Zhou, Zhihui; Kong, Hongwei; Lu, Xin; Zhao, Xinjie; Chen, Yihui; Chen, Jia; Wu, Zeming; Xu, Zhiliang; Zhao, Chunxia; Xu, Guowang

    2016-09-06

    Identification of illegal additives in complex matrixes is important in the food safety field. In this study a nontargeted screening strategy was developed to find illegal additives based on ultrahigh-performance liquid chromatography-high-resolution mass spectrometry (UHPLC-HRMS). First, an analytical method for possible illegal additives in complex matrixes was established including fast sample pretreatment, accurate UHPLC separation, and HRMS detection. Second, efficient data processing and differential analysis workflow were suggested and applied to find potential risk compounds. Third, structure elucidation of risk compounds was performed by (1) searching online databases [Metlin and the Human Metabolome Database (HMDB)] and an in-house database which was established at the above-defined conditions of UHPLC-HRMS analysis and contains information on retention time, mass spectra (MS), and tandem mass spectra (MS/MS) of 475 illegal additives, (2) analyzing fragment ions, and (3) referring to fragmentation rules. Fish was taken as an example to show the usefulness of the nontargeted screening strategy, and six additives were found in suspected fish samples. Quantitative analysis was further carried out to determine the contents of these compounds. The satisfactory application of this strategy in fish samples means that it can also be used in the screening of illegal additives in other kinds of food samples.

  5. Metabolomics by Gas Chromatography-Mass Spectrometry: the combination of targeted and untargeted profiling

    PubMed Central

    Fiehn, Oliver

    2016-01-01

    Gas chromatography-mass spectrometry (GC-MS)-based metabolomics is ideal for identifying and quantitating small molecular metabolites (<650 daltons), including small acids, alcohols, hydroxyl acids, amino acids, sugars, fatty acids, sterols, catecholamines, drugs, and toxins, often using chemical derivatization to make these compounds volatile enough for gas chromatography. This unit shows that GC-MS-based metabolomics easily allows integrating targeted assays for absolute quantification of specific metabolites with untargeted metabolomics to discover novel compounds. Complemented by database annotations using large spectral libraries and validated, standardized standard operating procedures, GC-MS can identify and semi-quantify over 200 compounds per study in human body fluid samples (e.g., plasma, urine, or stool). Deconvolution software enables detection of more than 300 additional unidentified signals that can be annotated through accurate mass instruments with appropriate data processing workflows, similar to liquid chromatography-MS untargeted profiling (LC-MS). Hence, GC-MS is a mature technology that not only uses classic detectors (‘quadrupole’) but also target mass spectrometers (‘triple quadrupole’) and accurate mass instruments (‘quadrupole-time of flight’). This unit covers the following aspects of GC-MS-based metabolomics: (i) sample preparation from mammalian samples, (ii) acquisition of data, (iii) quality control, and (iv) data processing. PMID:27038389

  6. AT_CHLORO, a comprehensive chloroplast proteome database with subplastidial localization and curated information on envelope proteins.

    PubMed

    Ferro, Myriam; Brugière, Sabine; Salvi, Daniel; Seigneurin-Berny, Daphné; Court, Magali; Moyet, Lucas; Ramus, Claire; Miras, Stéphane; Mellal, Mourad; Le Gall, Sophie; Kieffer-Jaquinod, Sylvie; Bruley, Christophe; Garin, Jérôme; Joyard, Jacques; Masselon, Christophe; Rolland, Norbert

    2010-06-01

    Recent advances in the proteomics field have allowed a series of high throughput experiments to be conducted on chloroplast samples, and the data are available in several public databases. However, the accurate localization of many chloroplast proteins often remains hypothetical. This is especially true for envelope proteins. We went a step further into the knowledge of the chloroplast proteome by focusing, in the same set of experiments, on the localization of proteins in the stroma, the thylakoids, and envelope membranes. LC-MS/MS-based analyses first allowed building the AT_CHLORO database (http://www.grenoble.prabi.fr/protehome/grenoble-plant-proteomics/), a comprehensive repertoire of the 1323 proteins, identified by 10,654 unique peptide sequences, present in highly purified chloroplasts and their subfractions prepared from Arabidopsis thaliana leaves. This database also provides extensive proteomics information (peptide sequences and molecular weight, chromatographic retention times, MS/MS spectra, and spectral count) for a unique chloroplast protein accurate mass and time tag database gathering identified peptides with their respective and precise analytical coordinates, molecular weight, and retention time. We assessed the partitioning of each protein in the three chloroplast compartments by using a semiquantitative proteomics approach (spectral count). These data together with an in-depth investigation of the literature were compiled to provide accurate subplastidial localization of previously known and newly identified proteins. A unique knowledge base containing extensive information on the proteins identified in envelope fractions was thus obtained, allowing new insights into this membrane system to be revealed. Altogether, the data we obtained provide unexpected information about plastidial or subplastidial localization of some proteins that were not suspected to be associated to this membrane system. 
The spectral counting-based strategy was further

  7. Design and Implementation of an Intelligence Database.

    DTIC Science & Technology

    1984-12-01

    In designing SDM, many database applications were analyzed in order to determine the structures that occur and recur in them...automatically, nor is it even known which relations can be converted to DK/NF. In spite of this, DK/NF can be exceedingly useful for practical database...goal of any design process is to produce an output design, Sout, to accurately represent Sin. Further, all the relations in Sout must satisfy

  8. Practice databases and their uses in clinical research.

    PubMed

    Tierney, W M; McDonald, C J

    1991-04-01

    A few large clinical information databases have been established within larger medical information systems. Although they are smaller than claims databases, these clinical databases offer several advantages: accurate and timely data, rich clinical detail, and continuous parameters (for example, vital signs and laboratory results). However, the nature of the data varies considerably, which affects the kinds of secondary analyses that can be performed. These databases have been used to investigate clinical epidemiology, risk assessment, post-marketing surveillance of drugs, practice variation, resource use, quality assurance, and decision analysis. In addition, practice databases can be used to identify subjects for prospective studies. Further methodologic developments are necessary to deal with the prevalent problems of missing data and various forms of bias if such databases are to grow and contribute valuable clinical information.

  9. Large-scale extraction of accurate drug-disease treatment pairs from biomedical literature for drug repurposing

    PubMed Central

    2013-01-01

    Background A large-scale, highly accurate, machine-understandable drug-disease treatment relationship knowledge base is important for computational approaches to drug repurposing. The large body of published biomedical research articles and clinical case reports available on MEDLINE is a rich source of FDA-approved drug-disease indication as well as drug-repurposing knowledge that is crucial for applying FDA-approved drugs for new diseases. However, much of this information is buried in free text and not captured in any existing databases. The goal of this study is to extract a large number of accurate drug-disease treatment pairs from published literature. Results In this study, we developed a simple but highly accurate pattern-learning approach to extract treatment-specific drug-disease pairs from 20 million biomedical abstracts available on MEDLINE. We extracted a total of 34,305 unique drug-disease treatment pairs, the majority of which are not included in existing structured databases. Our algorithm achieved a precision of 0.904 and a recall of 0.131 in extracting all pairs, and a precision of 0.904 and a recall of 0.842 in extracting frequent pairs. In addition, we have shown that the extracted pairs strongly correlate with both drug target genes and therapeutic classes, therefore may have high potential in drug discovery. Conclusions We demonstrated that our simple pattern-learning relationship extraction algorithm is able to accurately extract many drug-disease pairs from the free text of biomedical literature that are not captured in structured databases. The large-scale, accurate, machine-understandable drug-disease treatment knowledge base that is resultant of our study, in combination with pairs from structured databases, will have high potential in computational drug repurposing tasks. PMID:23742147

  10. Accurate atom-mapping computation for biochemical reactions.

    PubMed

    Latendresse, Mario; Malerich, Jeremiah P; Travers, Mike; Karp, Peter D

    2012-11-26

    The complete atom mapping of a chemical reaction is a bijection of the reactant atoms to the product atoms that specifies the terminus of each reactant atom. Atom mapping of biochemical reactions is useful for many applications of systems biology, in particular for metabolic engineering, where synthesizing new biochemical pathways has to take into account the number of carbon atoms from a source compound that are conserved in the synthesis of a target compound. Rapid, accurate computation of the atom mapping(s) of a biochemical reaction remains elusive despite significant work on this topic. In particular, past researchers did not validate the accuracy of mapping algorithms. We introduce a new method for computing atom mappings called the minimum weighted edit-distance (MWED) metric. The metric is based on bond propensity to react and computes biochemically valid atom mappings for a large percentage of biochemical reactions. MWED models can be formulated efficiently as Mixed-Integer Linear Programs (MILPs). We have demonstrated this approach on 7501 reactions of the MetaCyc database for which 87% of the models could be solved in less than 10 s. For 2.1% of the reactions, we found multiple optimal atom mappings. We show that the error rate is 0.9% (22 reactions) by comparing these atom mappings to 2446 atom mappings of the manually curated Kyoto Encyclopedia of Genes and Genomes (KEGG) RPAIR database. To our knowledge, our computational atom-mapping approach is the most accurate and among the fastest published to date. The atom-mapping data will be available in the MetaCyc database later in 2012; the atom-mapping software will be available within the Pathway Tools software later in 2012.

  11. DNA barcode data accurately assign higher spider taxa

    PubMed Central

    Coddington, Jonathan A.; Agnarsson, Ingi; Cheng, Ren-Chung; Čandek, Klemen; Driskell, Amy; Frick, Holger; Gregorič, Matjaž; Kostanjšek, Rok; Kropf, Christian; Kweskin, Matthew; Lokovšek, Tjaša; Pipan, Miha; Vidergar, Nina

    2016-01-01

    The use of unique DNA sequences as a method for taxonomic identification is no longer fundamentally controversial, even though debate continues on the best markers, methods, and technology to use. Although both existing databanks such as GenBank and BOLD, as well as reference taxonomies, are imperfect, in best case scenarios “barcodes” (whether single or multiple, organelle or nuclear, loci) clearly are an increasingly fast and inexpensive method of identification, especially as compared to manual identification of unknowns by increasingly rare expert taxonomists. Because most species on Earth are undescribed, a complete reference database at the species level is impractical in the near term. The question therefore arises whether unidentified species can, using DNA barcodes, be accurately assigned to more inclusive groups such as genera and families—taxonomic ranks of putatively monophyletic groups for which the global inventory is more complete and stable. We used a carefully chosen test library of CO1 sequences from 49 families, 313 genera, and 816 species of spiders to assess the accuracy of genus and family-level assignment. We used BLAST queries of each sequence against the entire library and got the top ten hits. The percent sequence identity was reported from these hits (PIdent, range 75–100%). Accurate assignment of higher taxa (PIdent above which errors totaled less than 5%) occurred for genera at PIdent values >95 and families at PIdent values ≥ 91, suggesting these as heuristic thresholds for accurate generic and familial identifications in spiders. Accuracy of identification increases with numbers of species/genus and genera/family in the library; above five genera per family and fifteen species per genus all higher taxon assignments were correct. We propose that using percent sequence identity between conventional barcode sequences may be a feasible and reasonably accurate method to identify animals to family/genus. However, the quality of

  12. Optimisation of colour schemes to accurately display mass spectrometry imaging data based on human colour perception.

    PubMed

    Race, Alan M; Bunch, Josephine

    2015-03-01

    The choice of colour scheme used to present data can have a dramatic effect on the perceived structure present within the data. This is of particular significance in mass spectrometry imaging (MSI), where ion images that provide 2D distributions of a wide range of analytes are used to draw conclusions about the observed system. Commonly employed colour schemes are generally suboptimal for providing an accurate representation of the maximum amount of data. Rainbow-based colour schemes are extremely popular within the community, but they introduce well-documented artefacts which can be actively misleading in the interpretation of the data. In this article, we consider the suitability of colour schemes and composite image formation found in MSI literature in the context of human colour perception. We also discuss recommendations of rules for colour scheme selection for ion composites and multivariate analysis techniques such as principal component analysis (PCA).

  13. Sex-specific lean body mass predictive equations are accurate in the obese paediatric population

    PubMed Central

    Jackson, Lanier B.; Henshaw, Melissa H.; Carter, Janet; Chowdhury, Shahryar M.

    2015-01-01

    Background The clinical assessment of lean body mass (LBM) is challenging in obese children. A sex-specific predictive equation for LBM derived from anthropometric data was recently validated in children. Aim The purpose of this study was to independently validate these predictive equations in the obese paediatric population. Subjects and methods Obese subjects aged 4–21 were analysed retrospectively. Predicted LBM (LBMp) was calculated using equations previously developed in children. Measured LBM (LBMm) was derived from dual-energy x-ray absorptiometry. Agreement was expressed as [(LBMm-LBMp)/LBMm] with 95% limits of agreement. Results Of 310 enrolled patients, 195 (63%) were females. The mean age was 11.8 ± 3.4 years and mean BMI Z-score was 2.3 ± 0.4. The average difference between LBMm and LBMp was −0.6% (−17.0%, 15.8%). Pearson’s correlation revealed a strong linear relationship between LBMm and LBMp (r=0.97, p<0.01). Conclusion This study validates the use of these clinically-derived sex-specific LBM predictive equations in the obese paediatric population. Future studies should use these equations to improve the ability to accurately classify LBM in obese children. PMID:26287383
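
    The agreement measure quoted above, (LBMm-LBMp)/LBMm with 95% limits of agreement, can be sketched in Python as follows. This assumes the conventional Bland-Altman form (mean difference plus or minus 1.96 sample standard deviations), which the abstract does not spell out; the function name and the example values are hypothetical and are not data from the study.

```python
from statistics import mean, stdev


def percent_agreement(measured, predicted):
    """Return (mean % difference, lower limit, upper limit).

    Each difference is (measured - predicted) / measured, in percent.
    Limits are mean +/- 1.96 sample standard deviations (assumed convention).
    """
    diffs = [100.0 * (m - p) / m for m, p in zip(measured, predicted)]
    centre = mean(diffs)
    spread = 1.96 * stdev(diffs)
    return centre, centre - spread, centre + spread


# Hypothetical LBM values in kg (not data from the study).
lbm_measured = [30.0, 40.0, 50.0, 25.0]
lbm_predicted = [31.5, 38.0, 50.0, 26.0]
bias, lower, upper = percent_agreement(lbm_measured, lbm_predicted)
```

    Under this convention the limits are symmetric about the mean difference, which is consistent with the reported −0.6% (−17.0%, 15.8%).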

  14. Histamine quantification in human plasma using high resolution accurate mass LC-MS technology.

    PubMed

    Laurichesse, Mathieu; Gicquel, Thomas; Moreau, Caroline; Tribut, Olivier; Tarte, Karin; Morel, Isabelle; Bendavid, Claude; Amé-Thomas, Patricia

    2016-01-01

    Histamine (HA) is a small amine playing an important role in anaphylactic reactions. In order to identify and quantify HA in plasma matrix, different methods have been developed but present several disadvantages. Here, we developed an alternative method using liquid chromatography coupled with an ultra-high resolution and accurate mass instrument, Q Exactive™ (Thermo Fisher) (LCHRMS). The method includes a protein precipitation of plasma samples spiked with HA-d4 as internal standard (IS). LC separation was performed on a C18 Accucore column (100 × 2.1 mm, 2.6 μm) using a mobile phase containing nonafluoropentanoic acid (3nM) and acetonitrile with 0.1% (v/v) formic acid on gradient mode. Separation of analytes was obtained within 10 min. Analysis was performed from full scan mode and targeted MS2 mode using a 5 ppm mass window. Ion transitions monitored for targeted MS2 mode were m/z 112.0869 > 95.0607 for HA and m/z 116.1120 > 99.0855 for HA-d4. Calibration curves were obtained by adding standard calibration dilution at 1 to 180 nM in TrisBSA. Elution of HA and IS occurred at 4.1 min. The method was validated over a range of concentrations from 1 nM to 100 nM. The intra- and inter-run precisions were <15% for quality controls. Human plasma samples from 30 patients were analyzed by LCHRMS, and the results were highly correlated with those obtained using the gold standard radioimmunoassay (RIA) method. Overall, we demonstrate here that LCHRMS is a sensitive method for histamine quantification in biological human plasmas, suitable for routine use in medical laboratories. In addition, LCHRMS is less time-consuming than RIA, avoids the use of radioactivity, and could then be considered as an alternative quantitative method. Copyright © 2015 The Canadian Society of Clinical Chemists. Published by Elsevier Inc. All rights reserved.
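
    To make the 5 ppm mass window concrete, the short Python sketch below converts a ppm tolerance into absolute m/z bounds for the histamine precursor and product ions quoted above. It assumes the window is applied symmetrically (plus or minus 5 ppm around the target), which the abstract does not state; the function name is hypothetical.

```python
def ppm_window(mz, ppm=5.0):
    """Return (low, high) m/z bounds for a symmetric +/- ppm tolerance."""
    delta = mz * ppm * 1e-6
    return mz - delta, mz + delta


# Precursor and product m/z values quoted in the abstract for histamine.
for target in (112.0869, 95.0607):
    low, high = ppm_window(target)
    print(f"{target:.4f}: {low:.5f} to {high:.5f}")
```

    At m/z 112.0869 a 5 ppm tolerance corresponds to roughly 0.0006 m/z units on either side, which shows why sub-millidalton mass accuracy is needed for such a window to be meaningful.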

  15. Content Is King: Databases Preserve the Collective Information of Science.

    PubMed

    Yates, John R

    2018-04-01

    Databases store sequence information experimentally gathered to create resources that further science. In the last 20 years databases have become critical components of fields like proteomics where they provide the basis for large-scale and high-throughput proteomic informatics. Amos Bairoch, winner of the Association of Biomolecular Resource Facilities Frederick Sanger Award, has created some of the important databases proteomic research depends upon for accurate interpretation of data.

  16. Incentives Increase Participation in Mass Dog Rabies Vaccination Clinics and Methods of Coverage Estimation Are Assessed to Be Accurate

    PubMed Central

    Steinmetz, Melissa; Czupryna, Anna; Bigambo, Machunde; Mzimbiri, Imam; Powell, George; Gwakisa, Paul

    2015-01-01

    In this study we show that incentives (dog collars and owner wristbands) are effective at increasing owner participation in mass dog rabies vaccination clinics and we conclude that household questionnaire surveys and the mark-re-sight (transect survey) method for estimating post-vaccination coverage are accurate when all dogs, including puppies, are included. Incentives were distributed during central-point rabies vaccination clinics in northern Tanzania to quantify their effect on owner participation. In villages where incentives were handed out participation increased, with an average of 34 more dogs being vaccinated. Through economies of scale, this represents a reduction in the cost-per-dog of $0.47. This represents the price-threshold under which the cost of the incentive used must fall to be economically viable. Additionally, vaccination coverage levels were determined in ten villages through the gold-standard village-wide census technique, as well as through two cheaper and quicker methods (randomized household questionnaire and the transect survey). Cost data were also collected. Both non-gold standard methods were found to be accurate when puppies were included in the calculations, although the transect survey and the household questionnaire survey over- and under-estimated the coverage respectively. Given that additional demographic data can be collected through the household questionnaire survey, and that its estimate of coverage is more conservative, we recommend this method. Despite the use of incentives the average vaccination coverage was below the 70% threshold for eliminating rabies. We discuss the reasons and suggest solutions to improve coverage. Given recent international targets to eliminate rabies, this study provides valuable and timely data to help improve mass dog vaccination programs in Africa and elsewhere. PMID:26633821

  17. CASMI 2013: Identification of Small Molecules by Tandem Mass Spectrometry Combined with Database and Literature Mining

    PubMed Central

    Newsome, Andrew G.; Nikolic, Dejan

    2014-01-01

    The Critical Assessment of Small Molecule Identification (CASMI) contest was initiated in 2012 to evaluate manual and automated strategies for the identification of small molecules from raw mass spectrometric data. The authors participated in both category 1 (molecular formula determination) and category 2 (molecular structure determination) of the second annual CASMI contest (CASMI 2013) using slow but effective manual methods. The provided high resolution mass spectrometric data were interpreted manually using a combination of molecular formula calculators, fragment and neutral loss analysis, literature consultation, manual database searches, deductive logic, and experience. The authors submitted correct formulas as lead candidates for 16 of 16 challenges and submitted correct structure solutions as lead candidates for 14 of 16 challenges. One structure submission (Challenge 3) was very close but not exact (N2-acetylglutaminylisoleucinamide instead of the correct N2-acetylglutaminylleucinamide). A solution for one (Challenge 13) was not submitted due to an inability to reconcile the provided fragmentation pattern with any known structures with the provided molecular composition. PMID:26819877

  18. Cross-Border Use of Food Databases: Equivalence of US and Australian Databases for Macronutrients

    PubMed Central

    Summer, Suzanne S.; Ollberding, Nicholas J.; Guy, Trish; Setchell, Kenneth D. R.; Brown, Nadine; Kalkwarf, Heidi J.

    2013-01-01

    When estimating dietary intake across multiple countries, the lack of a single comprehensive dietary database may lead researchers to modify one database to analyze intakes for all participants. This approach may yield results different from those using the country-specific database and introduce measurement error. We examined whether nutrient intakes of Australians calculated with a modified US database would be similar to those calculated with an Australian database. We analyzed 3-day food records of 68 Australian adults using the US-based Nutrition Data System for Research, modified to reflect food items consumed in Australia. Modification entailed identifying a substitute food whose energy and macronutrient content were within 10% of the Australian food or by adding a new food to the database. Paired Wilcoxon signed rank tests were used to compare differences in nutrient intakes estimated by both databases, and Pearson and intraclass correlation coefficients measured degree of association and agreement between intake estimates for individuals. Median intakes of energy, carbohydrate, protein, and fiber differed by <5% at the group level. Larger discrepancies were seen for fat (11%; P<0.0001) and most micronutrients. Despite strong correlations, nutrient intakes differed by >10% for an appreciable percentage of participants (35% for energy to 69% for total fat). Adding country-specific food items to an existing database resulted in similar overall macronutrient intake estimates but was insufficient for estimating individual intakes. When analyzing nutrient intakes in multinational studies, greater standardization and modification of databases may be required to more accurately estimate intake of individuals. PMID:23871108

  19. LipidHome: a database of theoretical lipids optimized for high throughput mass spectrometry lipidomics.

    PubMed

    Foster, Joseph M; Moreno, Pablo; Fabregat, Antonio; Hermjakob, Henning; Steinbeck, Christoph; Apweiler, Rolf; Wakelam, Michael J O; Vizcaíno, Juan Antonio

    2013-01-01

    Protein sequence databases are the pillar upon which modern proteomics is supported, representing a stable reference space of predicted and validated proteins. One example of such resources is UniProt, enriched with both expertly curated and automatic annotations. Taken largely for granted, similar mature resources such as UniProt are not available yet in some other "omics" fields, lipidomics being one of them. While having a seasoned community of wet lab scientists, lipidomics lies significantly behind proteomics in the adoption of data standards and other core bioinformatics concepts. This work aims to reduce the gap by developing an equivalent resource to UniProt called 'LipidHome', providing theoretically generated lipid molecules and useful metadata. Using the 'FASTLipid' Java library, a database was populated with theoretical lipids, generated from a set of community agreed upon chemical bounds. In parallel, a web application was developed to present the information and provide computational access via a web service. Designed specifically to accommodate high throughput mass spectrometry based approaches, lipids are organised into a hierarchy that reflects the variety in the structural resolution of lipid identifications. Additionally, cross-references to other lipid related resources and papers that cite specific lipids were used to annotate lipid records. The web application encompasses a browser for viewing lipid records and a 'tools' section where an MS1 search engine is currently implemented. LipidHome can be accessed at http://www.ebi.ac.uk/apweiler-srv/lipidhome.

  20. Rapid and accurate prediction of degradant formation rates in pharmaceutical formulations using high-performance liquid chromatography-mass spectrometry.

    PubMed

    Darrington, Richard T; Jiao, Jim

    2004-04-01

    Rapid and accurate stability prediction is essential to pharmaceutical formulation development. Commonly used stability prediction methods include monitoring parent drug loss at intended storage conditions or initial rate determination of degradants under accelerated conditions. Monitoring parent drug loss at the intended storage condition does not provide a rapid and accurate stability assessment because often <0.5% drug loss is all that can be observed in a realistic time frame, while the accelerated initial rate method in conjunction with extrapolation of rate constants using the Arrhenius or Eyring equations often introduces large errors in shelf-life prediction. In this study, the shelf life prediction of a model pharmaceutical preparation utilizing sensitive high-performance liquid chromatography-mass spectrometry (LC/MS) to directly quantitate degradant formation rates at the intended storage condition is proposed. This method was compared to traditional shelf life prediction approaches in terms of time required to predict shelf life and associated error in shelf life estimation. Results demonstrated that the proposed LC/MS method using initial rates analysis provided significantly improved confidence intervals for the predicted shelf life and required less overall time and effort to obtain the stability estimation compared to the other methods evaluated. Copyright 2004 Wiley-Liss, Inc. and the American Pharmacists Association.

  1. Increasing the productivity of glycopeptides analysis by using higher-energy collision dissociation-accurate mass-product-dependent electron transfer dissociation.

    PubMed

    Saba, Julian; Dutta, Sucharita; Hemenway, Eric; Viner, Rosa

    2012-01-01

    Currently, glycans are attracting attention from the scientific community as potential biomarkers or as posttranslational modifications (PTMs) of therapeutic proteins. However, structural characterization of glycoproteins and glycopeptides remains analytically challenging. Here, we report on the implementation of a novel acquisition strategy termed higher-energy collision dissociation-accurate mass-product-dependent electron transfer dissociation (HCD-PD-ETD) on a hybrid linear ion trap-orbitrap mass spectrometer. This acquisition strategy uses the complementary fragmentations of ETD and HCD for glycopeptide analysis in an intelligent fashion. Furthermore, the approach minimizes user input for optimizing instrumental parameters and enables straightforward detection of glycopeptides. ETD spectra are only acquired when glycan oxonium ions from MS/MS HCD are detected. The advantage of this approach is that it streamlines data analysis and improves dynamic range and duty cycle. Here, we present the benefits of HCD-PD-ETD relative to the traditional alternating HCD/ETD for a trainer set consisting of a twelve-protein mixture that contains two glycoproteins (human serotransferrin and ovalbumin) plus two contaminating glycoproteins: bovine alpha 1 acid glycoprotein (bAGP) and bovine fetuin.

  2. Implications of (Less) Accurate Mass-Radius-Measurements for the Habitability of Extrasolar Terrestrial Planets: Why Do We Need PLATO?

    NASA Astrophysics Data System (ADS)

    Noack, L.; Wagner, F. W.; Plesa, A.-C.; Höning, D.; Sohl, F.; Breuer, D.; Rauer, H.

    2012-04-01

    Several space missions (CoRoT, Kepler and others) have already provided promising candidates for terrestrial exoplanets (i.e. with masses less than about 10 Earth masses) and thereby triggered an exciting new research branch of planetary modelling to investigate the possible habitability of such planets. Earth analogues (low-mass planets with an Earth-like structure and composition) are likely to be found in the near future with new missions such as the proposed M3 mission PLATO. Planets may be more diverse in the universe than they are in the solar system. Our neighbouring planets in the habitable zone are all terrestrial in the sense of being differentiated into an iron core, a silicate mantle and a crust. To reliably determine the interior structure of an exoplanet, measurements of mass and radius have to be sufficiently accurate (around +/-2% error allowed for the radius and +/-5% for the mass). An Earth-size planet with an Earth-like mass but an expected error of ~15% in mass, for example, may have either a Mercury-like, an Earth-like or a Moon-like (i.e. small iron core) structure [1,2]. Even though the atmospheric escape is not strongly influenced by the interior structure, the outgassing of volatiles and the likelihood of plate tectonics and an ongoing carbon cycle may be very different. Our investigations show that a planet with a small silicate mantle is less likely to shift into the plate-tectonics regime, cools faster (which may lead to the loss of a magnetic field after a short time) and outgasses fewer volatiles than a planet with the same mass but a large silicate mantle and small iron core. To be able to address the habitability of exoplanets, space missions such as PLATO, which can achieve up to 2% accuracy in radius [3], are extremely important. Moreover, information about the occurrence of different planetary types helps us to better understand the formation of planetary systems and to further constrain the Drake equation, which gives an estimate of the

  3. Methods to achieve accurate projection of regional and global raster databases

    USGS Publications Warehouse

    Usery, E. Lynn; Seong, Jeong Chang; Steinwand, Dan

    2002-01-01

    Modeling regional and global activities of climatic and human-induced change requires accurate geographic data from which we can develop mathematical and statistical tabulations of attributes and properties of the environment. Many of these models depend on data formatted as raster cells or matrices of pixel values. Recently, it has been demonstrated that regional and global raster datasets are subject to significant error from mathematical projection and that these errors are of such magnitude that model results may be jeopardized (Steinwand, et al., 1995; Yang, et al., 1996; Usery and Seong, 2001; Seong and Usery, 2001). There is a need to develop methods of projection that maintain the accuracy of these datasets to support regional and global analyses and modeling

  4. A Database of Reaction Monitoring Mass Spectrometry Assays for Elucidating Therapeutic Response in Cancer

    PubMed Central

    Remily-Wood, Elizabeth R.; Liu, Richard Z.; Xiang, Yun; Chen, Yi; Thomas, C. Eric; Rajyaguru, Neal; Kaufman, Laura M.; Ochoa, Joana E.; Hazlehurst, Lori; Pinilla-Ibarz, Javier; Lancet, Jeffrey; Zhang, Guolin; Haura, Eric; Shibata, David; Yeatman, Timothy; Smalley, Keiran S.M.; Dalton, William S.; Huang, Emina; Scott, Ed; Bloom, Gregory C.; Eschrich, Steven A.; Koomen, John M.

    2012-01-01

    Purpose: The Quantitative Assay Database (QuAD), http://proteome.moffitt.org/QUAD/, facilitates widespread implementation of quantitative mass spectrometry in cancer biology and clinical research through sharing of methods and reagents for monitoring protein expression and modification. Experimental Design: Liquid chromatography coupled to multiple reaction monitoring mass spectrometry (LC-MRM) assays are developed using SDS-PAGE fractionated lysates from cancer cell lines. Pathway maps created using GeneGO Metacore provide the biological relationships between proteins and illustrate concepts for multiplexed analysis; each protein can be selected to examine assay development at the protein and peptide level. Results: The coupling of SDS-PAGE and LC-MRM screening has been used to detect 876 peptides from 218 cancer-related proteins in model systems including colon, lung, melanoma, leukemias, and myeloma, which has led to the development of 95 quantitative assays including stable-isotope labeled peptide standards. Methods are published online and peptide standards are made available to the research community. Protein expression measurements for heat shock proteins, including a comparison with ELISA and monitoring response to the HSP90 inhibitor, 17-DMAG, are used to illustrate the components of the QuAD and its potential utility. Conclusions and Clinical Relevance: This resource enables quantitative assessment of protein components of signaling pathways and biological processes and holds promise for systematic investigation of treatment responses in cancer. PMID:21656910

  5. Application of fast Fourier transform cross-correlation and mass spectrometry data for accurate alignment of chromatograms.

    PubMed

    Zheng, Yi-Bao; Zhang, Zhi-Min; Liang, Yi-Zeng; Zhan, De-Jian; Huang, Jian-Hua; Yun, Yong-Huan; Xie, Hua-Lin

    2013-04-19

    Chromatography has been established as one of the most important analytical methods in the modern analytical laboratory. However, preprocessing of the chromatograms, especially peak alignment, is usually a time-consuming task prior to extracting useful information from the datasets because of the small unavoidable differences in the experimental conditions caused by minor changes and drift. Most of the alignment algorithms are performed on reduced datasets using only the detected peaks in the chromatograms, which means a loss of data and introduces the problem of extraction of peak data from the chromatographic profiles. These disadvantages can be overcome by using the full chromatographic information that is generated from hyphenated chromatographic instruments. A new alignment algorithm called CAMS (Chromatogram Alignment via Mass Spectra) is presented here to correct the retention time shifts among chromatograms accurately and rapidly. In this report, peaks of each chromatogram were detected based on Continuous Wavelet Transform (CWT) with Haar wavelet and were aligned against the reference chromatogram via the correlation of mass spectra. The aligning procedure was accelerated by Fast Fourier Transform cross correlation (FFT cross correlation). This approach has been compared with several well-known alignment methods on real chromatographic datasets, which demonstrates that CAMS can preserve the shape of peaks and achieve a high quality alignment result. Furthermore, the CAMS method was implemented in the Matlab language and is available as an open source package at http://www.github.com/matchcoder/CAMS. Copyright © 2013. Published by Elsevier B.V.
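
    The FFT cross-correlation step mentioned in this abstract can be illustrated on its own. The sketch below is not the CAMS implementation (which also uses CWT peak detection and mass-spectral correlation); it only shows how a retention-time shift between two equal-length traces can be estimated with NumPy. The function name `estimate_shift` and the synthetic Gaussian peak are assumptions made for illustration.

```python
import numpy as np

def estimate_shift(reference, signal):
    """Estimate how many points `signal` is delayed relative to `reference`.

    Uses zero-padded FFT cross-correlation; a positive result means the
    signal elutes later than the reference. Assumes equal-length 1-D traces.
    """
    n = len(reference) + len(signal) - 1
    nfft = 1 << (n - 1).bit_length()          # next power of two >= n
    f_ref = np.fft.rfft(reference, nfft)
    f_sig = np.fft.rfft(signal, nfft)
    xcorr = np.fft.irfft(f_sig * np.conj(f_ref), nfft)
    lag = int(np.argmax(xcorr))
    if lag > nfft // 2:                       # wrapped-around negative lags
        lag -= nfft
    return lag

# Synthetic chromatographic peak and two shifted copies of it.
x = np.arange(512)
ref = np.exp(-0.5 * ((x - 100) / 5.0) ** 2)
late = np.roll(ref, 7)     # peak moved 7 points later
early = np.roll(ref, -5)   # peak moved 5 points earlier
shift_late = estimate_shift(ref, late)
shift_early = estimate_shift(ref, early)
```

    The estimated lag can then be undone, for example with `np.roll(signal, -lag)`, before comparing peak shapes.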

  6. Development and Validation of a Qualitative Method for Target Screening of 448 Pesticide Residues in Fruits and Vegetables Using UHPLC/ESI Q-Orbitrap Based on Data-Independent Acquisition and Compound Database.

    PubMed

    Wang, Jian; Chow, Willis; Chang, James; Wong, Jon W

    2017-01-18

    A semiautomated qualitative method for target screening of 448 pesticide residues in fruits and vegetables was developed and validated using ultrahigh-performance liquid chromatography coupled with electrospray ionization quadrupole Orbitrap high-resolution mass spectrometry (UHPLC/ESI Q-Orbitrap). The Q-Orbitrap Full MS/dd-MS2 (data dependent acquisition) was used to acquire product-ion spectra of individual pesticides to build a compound database or an MS library, while its Full MS/DIA (data independent acquisition) was utilized for sample data acquisition from fruit and vegetable matrices fortified with pesticides at 10 and 100 μg/kg for target screening purposes. Accurate mass, retention time and response threshold were three key parameters in a compound database that were used to detect incurred pesticide residues in samples. The concepts and practical aspects of in-spectrum mass correction or solvent background lock-mass correction, retention time alignment and response threshold adjustment are discussed while building a functional and working compound database for target screening. The validated target screening method is capable of screening at least 94% and 99% of 448 pesticides at 10 and 100 μg/kg, respectively, in fruits and vegetables without having to evaluate every compound manually during data processing, which significantly reduced the workload in routine practice.
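
    The three database parameters named in this abstract (accurate mass, retention time, response threshold) suggest a simple matching rule. The following is a minimal sketch under stated assumptions, not the published workflow: the 5 ppm and 0.5 min tolerances, the class and function names, and every compound entry and peak are hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class DbEntry:
    name: str
    mz: float            # theoretical m/z of the target ion
    rt_min: float        # expected retention time, minutes
    min_response: float  # response threshold for a positive finding

@dataclass(frozen=True)
class Peak:
    mz: float
    rt_min: float
    response: float

def ppm_error(measured_mz, theoretical_mz):
    """Relative mass error in parts per million."""
    return (measured_mz - theoretical_mz) / theoretical_mz * 1e6

def screen(peaks, database, ppm_tol=5.0, rt_tol_min=0.5):
    """Return (entry name, peak) for each entry matched by some peak."""
    hits = []
    for entry in database:
        for peak in peaks:
            if (abs(ppm_error(peak.mz, entry.mz)) <= ppm_tol
                    and abs(peak.rt_min - entry.rt_min) <= rt_tol_min
                    and peak.response >= entry.min_response):
                hits.append((entry.name, peak))
                break  # one confirming peak per entry is enough here
    return hits

# Hypothetical database and measured peaks.
database = [
    DbEntry("compound_A", 300.0000, 5.0, 1e4),
    DbEntry("compound_B", 250.0000, 3.0, 1e4),
]
peaks = [
    Peak(300.0009, 5.1, 5e4),   # 3 ppm off, within all tolerances
    Peak(250.0050, 3.0, 5e4),   # 20 ppm off, rejected on mass
]
found = [name for name, _ in screen(peaks, database)]
```

    In practice the tolerances would come from method validation rather than fixed defaults.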

  7. Metabolite identification of triptolide by data-dependent accurate mass spectrometric analysis in combination with online hydrogen/deuterium exchange and multiple data-mining techniques.

    PubMed

    Du, Fuying; Liu, Ting; Liu, Tian; Wang, Yongwei; Wan, Yakun; Xing, Jie

    2011-10-30

    Triptolide (TP), the primary active component of the herbal medicine Tripterygium wilfordii Hook F, has shown promising antileukemic and anti-inflammatory activity. The pharmacokinetic profile of TP indicates an extensive metabolic elimination in vivo; however, its metabolic data is rarely available partly because of the difficulty in identifying it due to the absence of appropriate ultraviolet chromophores in the structure and the presence of endogenous interferences in biological samples. In the present study, the biotransformation of TP was investigated by improved data-dependent accurate mass spectrometric analysis, using an LTQ/Orbitrap hybrid mass spectrometer in conjunction with the online hydrogen (H)/deuterium (D) exchange technique for rapid structural characterization. Accurate full-scan MS and MS/MS data were processed with multiple post-acquisition data-mining techniques, which were complementary and effective in detecting both common and uncommon metabolites from biological matrices. As a result, 38 phase I, 9 phase II and 8 N-acetylcysteine (NAC) metabolites of TP were found in rat urine. Accurate MS/MS data were used to support assignments of metabolite structures, and online H/D exchange experiments provided additional evidence for exchangeable hydrogen atoms in the structure. The results showed the main phase I metabolic pathways of TP are hydroxylation, hydrolysis and desaturation, and the resulting metabolites subsequently undergo phase II processes. The presence of NAC conjugates indicated the capability of TP to form reactive intermediate species. This study also demonstrated the effectiveness of LC/HR-MS(n) in combination with multiple post-acquisition data-mining methods and the online H/D exchange technique for the rapid identification of drug metabolites. Copyright © 2011 John Wiley & Sons, Ltd.

  8. NATIONAL URBAN DATABASE AND ACCESS PORTAL TOOL

    EPA Science Inventory

    Current mesoscale weather prediction and microscale dispersion models are limited in their ability to perform accurate assessments in urban areas. A project called the National Urban Database with Access Portal Tool (NUDAPT) is beginning to provide urban data and improve the para...

  9. Methods to achieve accurate projection of regional and global raster databases

    USGS Publications Warehouse

    Usery, E.L.; Seong, J.C.; Steinwand, D.R.; Finn, M.P.

    2002-01-01

    This research aims at building a decision support system (DSS) for selecting an optimum projection considering various factors, such as pixel size, areal extent, number of categories, spatial pattern of categories, resampling methods, and error correction methods. Specifically, this research will investigate three goals theoretically and empirically and, using the already developed empirical base of knowledge with these results, develop an expert system for map projection of raster data for regional and global database modeling. The three theoretical goals are as follows: (1) The development of a dynamic projection that adjusts projection formulas for latitude on the basis of raster cell size to maintain equal-sized cells. (2) The investigation of the relationships between the raster representation and the distortion of features, number of categories, and spatial pattern. (3) The development of an error correction and resampling procedure that is based on error analysis of raster projection.

  10. New glycoproteomics software, GlycoPep Evaluator, generates decoy glycopeptides de novo and enables accurate false discovery rate analysis for small data sets.

    PubMed

    Zhu, Zhikai; Su, Xiaomeng; Go, Eden P; Desaire, Heather

    2014-09-16

    Glycoproteins are biologically significant large molecules that participate in numerous cellular activities. In order to obtain site-specific protein glycosylation information, intact glycopeptides, with the glycan attached to the peptide sequence, are characterized by tandem mass spectrometry (MS/MS) methods such as collision-induced dissociation (CID) and electron transfer dissociation (ETD). While several emerging automated tools are developed, no consensus is present in the field about the best way to determine the reliability of the tools and/or provide the false discovery rate (FDR). A common approach to calculate FDRs for glycopeptide analysis, adopted from the target-decoy strategy in proteomics, employs a decoy database that is created based on the target protein sequence database. Nonetheless, this approach is not optimal in measuring the confidence of N-linked glycopeptide matches, because the glycopeptide data set is considerably smaller compared to that of peptides, and the requirement of a consensus sequence for N-glycosylation further limits the number of possible decoy glycopeptides tested in a database search. To address the need to accurately determine FDRs for automated glycopeptide assignments, we developed GlycoPep Evaluator (GPE), a tool that helps to measure FDRs in identifying glycopeptides without using a decoy database. GPE generates decoy glycopeptides de novo for every target glycopeptide, in a 1:20 target-to-decoy ratio. The decoys, along with target glycopeptides, are scored against the ETD data, from which FDRs can be calculated accurately based on the number of decoy matches and the ratio of the number of targets to decoys, for small data sets. GPE is freely accessible for download and can work with any search engine that interprets ETD data of N-linked glycopeptides. The software is provided at https://desairegroup.ku.edu/research.
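
    The FDR arithmetic implied by a 1:20 target-to-decoy ratio can be written out briefly. This is a sketch of one common way to scale decoy counts, assumed here for illustration; the abstract does not give GPE's exact formula, and the function name and example counts are hypothetical.

```python
def estimate_fdr(n_target_hits, n_decoy_hits, decoys_per_target=20):
    """Estimate the false discovery rate from target and decoy matches.

    Decoy matches are divided by the number of decoys generated per target
    so that they estimate the number of false target matches. The result is
    capped at 1.0.
    """
    if n_target_hits <= 0:
        raise ValueError("need at least one target match")
    expected_false = n_decoy_hits / decoys_per_target
    return min(1.0, expected_false / n_target_hits)

# Hypothetical counts above some score threshold.
fdr = estimate_fdr(n_target_hits=50, n_decoy_hits=20)
```

    Generating many decoys per target gives a finer-grained estimate of the expected false matches, which matters when the target set is small.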

  11. Creating databases for biological information: an introduction.

    PubMed

    Stein, Lincoln

    2013-06-01

    The essence of bioinformatics is dealing with large quantities of information. Whether it be sequencing data, microarray data files, mass spectrometric data (e.g., fingerprints), the catalog of strains arising from an insertional mutagenesis project, or even large numbers of PDF files, there inevitably comes a time when the information can simply no longer be managed with files and directories. This is where databases come into play. This unit briefly reviews the characteristics of several database management systems, including flat file, indexed file, relational databases, and NoSQL databases. It compares their strengths and weaknesses and offers some general guidelines for selecting an appropriate database management system. Copyright 2013 by John Wiley & Sons, Inc.

  12. Database citation in full text biomedical articles.

    PubMed

    Kafkas, Şenay; Kim, Jee-Hyub; McEntyre, Johanna R

    2013-01-01

    Molecular biology and literature databases represent essential infrastructure for life science research. Effective integration of these data resources requires that there are structured cross-references at the level of individual articles and biological records. Here, we describe the current patterns of how database entries are cited in research articles, based on analysis of the full text Open Access articles available from Europe PMC. Focusing on citation of entries in the European Nucleotide Archive (ENA), UniProt and Protein Data Bank, Europe (PDBe), we demonstrate that text mining doubles the number of structured annotations of database record citations supplied in journal articles by publishers. Many thousands of new literature-database relationships are found by text mining, since these relationships are also not present in the set of articles cited by database records. We recommend that structured annotation of database records in articles is extended to other databases, such as ArrayExpress and Pfam, entries from which are also cited widely in the literature. The very high precision and high-throughput of this text-mining pipeline makes this activity possible both accurately and at low cost, which will allow the development of new integrated data services.

  13. Database Citation in Full Text Biomedical Articles

    PubMed Central

    Kafkas, Şenay; Kim, Jee-Hyub; McEntyre, Johanna R.

    2013-01-01

    Molecular biology and literature databases represent essential infrastructure for life science research. Effective integration of these data resources requires that there are structured cross-references at the level of individual articles and biological records. Here, we describe the current patterns of how database entries are cited in research articles, based on analysis of the full text Open Access articles available from Europe PMC. Focusing on citation of entries in the European Nucleotide Archive (ENA), UniProt and Protein Data Bank, Europe (PDBe), we demonstrate that text mining doubles the number of structured annotations of database record citations supplied in journal articles by publishers. Many thousands of new literature-database relationships are found by text mining, since these relationships are also not present in the set of articles cited by database records. We recommend that structured annotation of database records in articles is extended to other databases, such as ArrayExpress and Pfam, entries from which are also cited widely in the literature. The very high precision and high-throughput of this text-mining pipeline makes this activity possible both accurately and at low cost, which will allow the development of new integrated data services. PMID:23734176

  14. Computerized decision support system for mass identification in breast using digital mammogram: a study on GA-based neuro-fuzzy approaches.

    PubMed

    Das, Arpita; Bhattacharya, Mahua

    2011-01-01

    In the present work, the authors have developed a treatment planning system implementing genetic-algorithm-based neuro-fuzzy approaches for accurate analysis of the shape and margin of tumor masses appearing in the breast, using digital mammograms. A complicated structure invites the problems of overlearning and misclassification. In the proposed methodology, a genetic algorithm (GA) is used to search for effective input feature vectors and is combined with an adaptive neuro-fuzzy model for the final classification of different tumor mass boundaries. The study involves 200 digitized mammograms from MIAS and other databases and shows an 86% correct classification rate.

  15. Characterization of photo-transformation products of the antibiotic drug Ciprofloxacin with liquid chromatography-tandem mass spectrometry in combination with accurate mass determination using an LTQ-Orbitrap.

    PubMed

    Haddad, Tarek; Kümmerer, Klaus

    2014-11-01

    The presence of pharmaceuticals, especially antibiotics, in the aquatic environment is of growing concern. Several studies have been carried out on the occurrence and environmental risk of these compounds. Ciprofloxacin (CIP), a broad-spectrum anti-microbial second-generation fluoroquinolone, is widely used in human and veterinary medicine. In this work, photo-degradation of CIP in aqueous solution using UV and xenon lamps was studied. The transformation products (TPs), created from CIP, were initially analyzed by an ion trap in the MS, MS/MS and MS(3) modes. These data were used to clarify the structures of the degradation products. Furthermore, the proposed products were confirmed by accurate mass measurement and empirical formula calculation for the molecular ions of TPs using LTQ-Orbitrap XL mass spectrometer. The degree of mineralization, the abundance of detected TPs and degradation pathways were determined. Eleven TPs were detected in the present study. TP1, which was never detected before, was structurally characterized in this work. All TPs still retained the core quinolone structure, which is responsible for the biological activity. As mineralization of CIP and its transformation products did not happen, the formation of stable TPs can be expected in waste water treatment and in surface water with further follow-up problems. Copyright © 2014 Elsevier Ltd. All rights reserved.

  16. Comprehensive two-dimensional liquid chromatography tandem diode array detector (DAD) and accurate mass QTOF-MS for the analysis of flavonoids and iridoid glycosides in Hedyotis diffusa.

    PubMed

    Li, Duxin; Schmitz, Oliver J

    2015-01-01

    The analysis of chemical constituents in Chinese herbal medicines (CHMs) is a challenge because of numerous compounds with various polarities and functional groups. Liquid chromatography coupled with quadrupole time-of-flight (QTOF) mass spectrometry (LC/MS) is of particular interest in the analysis of herbal components. One of the main attributes of QTOF that makes it an attractive analytical technique is its accurate mass measurement for both precursor and product ions. For the separation of CHMs, comprehensive two-dimensional chromatography (LCxLC) provides much higher resolving power than traditional one-dimensional separation. Therefore, a LCxLC-QTOF-MS system was developed and applied to the analysis of flavonoids and iridoid glycosides in aqueous extracts of Hedyotis diffusa (Rubiaceae). Shift gradient was applied in the two-dimensional separation in the LCxLC system to increase the orthogonality and effective peak distribution area of the analysis. Tentative identification of compounds was done by accurate mass interpretation and validation by UV spectrum. A clear classification of flavonol glycosides (FGs), acylated FGs, and iridoid glycosides (IGs) was shown in different regions of the LCxLC contour plot. In total, five FGs, four acylated FGs, and three IGs were tentatively identified. In addition, several novel flavonoids were found, which demonstrates that LCxLC-QTOF-MS detection also has great potential in herbal medicine analysis.

  17. High-throughput Database Search and Large-scale Negative Polarity Liquid Chromatography–Tandem Mass Spectrometry with Ultraviolet Photodissociation for Complex Proteomic Samples*

    PubMed Central

    Madsen, James A.; Xu, Hua; Robinson, Michelle R.; Horton, Andrew P.; Shaw, Jared B.; Giles, David K.; Kaoud, Tamer S.; Dalby, Kevin N.; Trent, M. Stephen; Brodbelt, Jennifer S.

    2013-01-01

    The use of ultraviolet photodissociation (UVPD) for the activation and dissociation of peptide anions is evaluated for broader coverage of the proteome. To facilitate interpretation and assignment of the resulting UVPD mass spectra of peptide anions, the MassMatrix database search algorithm was modified to allow automated analysis of negative polarity MS/MS spectra. The new UVPD algorithms were developed based on the MassMatrix database search engine by adding specific fragmentation pathways for UVPD. The new UVPD fragmentation pathways in MassMatrix were rigorously and statistically optimized using two large data sets with high mass accuracy and high mass resolution for both MS1 and MS2 data acquired on an Orbitrap mass spectrometer for complex Halobacterium and HeLa proteome samples. Negative mode UVPD led to the identification of 3663 and 2350 peptides for the Halo and HeLa tryptic digests, respectively, corresponding to 655 and 645 peptides that were unique when compared with electron transfer dissociation (ETD), higher energy collision-induced dissociation, and collision-induced dissociation results for the same digests analyzed in the positive mode. In sum, 805 and 619 proteins were identified via UVPD for the Halobacterium and HeLa samples, respectively, with 49 and 50 unique proteins identified in contrast to the more conventional MS/MS methods. The algorithm also features automated charge determination for low mass accuracy data, precursor filtering (including intact charge-reduced peaks), and the ability to combine both positive and negative MS/MS spectra into a single search, and it is freely open to the public. The accuracy and specificity of the MassMatrix UVPD search algorithm was also assessed for low resolution, low mass accuracy data on a linear ion trap. Analysis of a known mixture of three mitogen-activated kinases yielded similar sequence coverage percentages for UVPD of peptide anions versus conventional collision-induced dissociation of

  18. Computational analyses of spectral trees from electrospray multi-stage mass spectrometry to aid metabolite identification.

    PubMed

    Cao, Mingshu; Fraser, Karl; Rasmussen, Susanne

    2013-10-31

    Mass spectrometry coupled with chromatography has become the major technical platform in metabolomics. Aided by peak detection algorithms, the detected signals are characterized by mass-over-charge ratio (m/z) and retention time. Chemical identities often remain elusive for the majority of the signals. Multi-stage mass spectrometry based on electrospray ionization (ESI) allows collision-induced dissociation (CID) fragmentation of selected precursor ions. These fragment ions can assist in structural inference for metabolites of low molecular weight. Computational investigations of fragmentation spectra have increasingly received attention in metabolomics, and various public databases house such data. We have developed an R package "iontree" that can capture, store and analyze MS2 and MS3 mass spectral data from high throughput metabolomics experiments. The package includes functions for ion tree construction, an algorithm (distMS2) for MS2 spectral comparison, and tools for building platform-independent ion tree (MS2/MS3) libraries. We have demonstrated the utilization of the package for the systematic analysis and annotation of fragmentation spectra collected in various metabolomics platforms, including direct infusion mass spectrometry, and liquid chromatography coupled with either low resolution or high resolution mass spectrometry. Assisted by the developed computational tools, we have demonstrated that spectral trees can provide informative evidence complementary to retention time and accurate mass to aid with annotating unknown peaks. These experimental spectral trees, once subjected to a quality control process, can be used for querying public MS2 databases or for de novo interpretation. The putatively annotated spectral trees can be readily incorporated into reference libraries for routine identification of metabolites.

  19. Nonparametric Bayesian Modeling for Automated Database Schema Matching

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Ferragut, Erik M; Laska, Jason A

    2015-01-01

    The problem of merging databases arises in many government and commercial applications. Schema matching, a common first step, identifies equivalent fields between databases. We introduce a schema matching framework that builds nonparametric Bayesian models for each field and compares them by computing the probability that a single model could have generated both fields. Our experiments show that our method is more accurate and faster than the existing instance-based matching algorithms in part because of the use of nonparametric Bayesian models.

  20. Creating databases for biological information: an introduction.

    PubMed

    Stein, Lincoln

    2002-08-01

    The essence of bioinformatics is dealing with large quantities of information. Whether it be sequencing data, microarray data files, mass spectrometric data (e.g., fingerprints), the catalog of strains arising from an insertional mutagenesis project, or even large numbers of PDF files, there inevitably comes a time when the information can simply no longer be managed with files and directories. This is where databases come into play. This unit briefly reviews the characteristics of several database management systems, including flat file, indexed file, and relational databases, as well as ACeDB. It compares their strengths and weaknesses and offers some general guidelines for selecting an appropriate database management system.

  1. Knowledge representation in metabolic pathway databases.

    PubMed

    Stobbe, Miranda D; Jansen, Gerbert A; Moerland, Perry D; van Kampen, Antoine H C

    2014-05-01

    The accurate representation of all aspects of a metabolic network in a structured format, such that it can be used for a wide variety of computational analyses, is a challenge faced by a growing number of researchers. Analysis of five major metabolic pathway databases reveals that each database has made widely different choices to address this challenge, including how to deal with knowledge that is uncertain or missing. In concise overviews, we show how concepts such as compartments, enzymatic complexes and the direction of reactions are represented in each database. Importantly, also concepts which a database does not represent are described. Which aspects of the metabolic network need to be available in a structured format and to what detail differs per application. For example, for in silico phenotype prediction, a detailed representation of gene-protein-reaction relations and the compartmentalization of the network is essential. Our analysis also shows that current databases are still limited in capturing all details of the biology of the metabolic network, further illustrated with a detailed analysis of three metabolic processes. Finally, we conclude that the conceptual differences between the databases, which make knowledge exchange and integration a challenge, have not been resolved, so far, by the exchange formats in which knowledge representation is standardized.

  2. Microfluidic nitrogen-assisted nanoelectrospray emitter: A monolithic interface for accurate mass measurements based on a single nozzle.

    PubMed

    Wang, Lingling; Wang, Yujiao; Jiang, Shichang; Ye, Mingyue; Su, Ping; Xiong, Bo

    2016-10-28

    Nitrogen-assisted nanoelectrospray emitter (NANE) was developed to achieve accurate mass-to-charge ratio (m/z) measurements with a single monolithic nozzle. Deposition patterns of generated electrosprays from NANE confirmed their wrapped configurations. Additionally, the intensity of the sample ion and its ratio relative to a reference ion was inclined to focus on the central region of the spray; this trend further supported the existence of wrapped configurations. Further, the proposed NANE was fabricated from poly-(dimethylsiloxane) (PDMS) with octadecyltrichlorosilane modification to restrain the dissolution of PDMS monomers. Assist nitrogen flows were introduced to improve the ionization of reference ions. Moreover, the NANE could regulate the distribution of reference ions by microfluidic three-dimensional hydrodynamic focusing. By regulating the distribution of reference ions, the ionization depression was reduced to some degree, and an improved sensitivity was accomplished compared with the mixing of sample and reference solutions. Achieved relative errors of m/z were between 0.2-4.5 ppm and 5.2-9.2 ppm for ten organic molecules and four biological macromolecules, respectively. Acceptable linear ranges were obtained in quantifications for rhodamine B and emamectin benzoate. Finally, the NANE was compatible with broad infusion rates (from 50 nL min⁻¹ to 15 μL min⁻¹) and solutions of different compositions (from 100% methanol to 100% water). Considering the comprehensive application of PDMS in microfluidics, the proposed NANE could be used as a compact and monolithic interface to achieve accurate m/z measurements. Copyright © 2016 Elsevier B.V. All rights reserved.
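
    The relative m/z errors quoted in ppm in the record above follow from a simple ratio. The sketch below shows that arithmetic only; the function name and the numeric values are hypothetical illustrations and are not taken from the paper.

```python
def ppm_error(measured_mz: float, theoretical_mz: float) -> float:
    """Signed relative m/z error in parts per million (ppm)."""
    return (measured_mz - theoretical_mz) / theoretical_mz * 1e6


if __name__ == "__main__":
    # Hypothetical values chosen for illustration only:
    # a 0.001 m/z deviation at m/z 500 is about 2 ppm.
    theoretical = 500.0000
    measured = 500.0010
    print(f"{ppm_error(measured, theoretical):.2f} ppm")
```

    The same absolute deviation corresponds to a larger ppm error at lower m/z, which is why mass accuracy is usually reported as a relative value.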

  3. Organizing a breast cancer database: data management.

    PubMed

    Yi, Min; Hunt, Kelly K

    2016-06-01

    Developing and organizing a breast cancer database can provide data and serve as valuable research tools for those interested in the etiology, diagnosis, and treatment of cancer. Depending on the research setting, the quality of the data can be a major issue. Assuring that the data collection process does not contribute inaccuracies can help to assure the overall quality of subsequent analyses. Data management is work that involves the planning, development, implementation, and administration of systems for the acquisition, storage, and retrieval of data while protecting it by implementing high security levels. A properly designed database provides you with access to up-to-date, accurate information. Database design is an important component of application design. If you take the time to design your databases properly, you'll be rewarded with a solid application foundation on which you can build the rest of your application.

  4. Rapid Identification of Sequences for Orphan Enzymes to Power Accurate Protein Annotation

    PubMed Central

    Ojha, Sunil; Watson, Douglas S.; Bomar, Martha G.; Galande, Amit K.; Shearer, Alexander G.

    2013-01-01

    The power of genome sequencing depends on the ability to understand what those genes and their protein products actually do. The automated methods used to assign functions to putative proteins in newly sequenced organisms are limited by the size of our library of proteins with both known function and sequence. Unfortunately this library grows slowly, lagging well behind the rapid increase in novel protein sequences produced by modern genome sequencing methods. One potential source for rapidly expanding this functional library is the “back catalog” of enzymology – “orphan enzymes,” those enzymes that have been characterized and yet lack any associated sequence. There are hundreds of orphan enzymes in the Enzyme Commission (EC) database alone. In this study, we demonstrate how this orphan enzyme “back catalog” is a fertile source for rapidly advancing the state of protein annotation. Starting from three orphan enzyme samples, we applied mass-spectrometry based analysis and computational methods (including sequence similarity networks, sequence and structural alignments, and operon context analysis) to rapidly identify the specific sequence for each orphan while avoiding the most time- and labor-intensive aspects of typical sequence identifications. We then used these three new sequences to more accurately predict the catalytic function of 385 previously uncharacterized or misannotated proteins. We expect that this kind of rapid sequence identification could be efficiently applied on a larger scale to make enzymology’s “back catalog” another powerful tool to drive accurate genome annotation. PMID:24386392

  5. Rapid identification of sequences for orphan enzymes to power accurate protein annotation.

    PubMed

    Ramkissoon, Kevin R; Miller, Jennifer K; Ojha, Sunil; Watson, Douglas S; Bomar, Martha G; Galande, Amit K; Shearer, Alexander G

    2013-01-01

    The power of genome sequencing depends on the ability to understand what those genes and their protein products actually do. The automated methods used to assign functions to putative proteins in newly sequenced organisms are limited by the size of our library of proteins with both known function and sequence. Unfortunately this library grows slowly, lagging well behind the rapid increase in novel protein sequences produced by modern genome sequencing methods. One potential source for rapidly expanding this functional library is the "back catalog" of enzymology--"orphan enzymes," those enzymes that have been characterized and yet lack any associated sequence. There are hundreds of orphan enzymes in the Enzyme Commission (EC) database alone. In this study, we demonstrate how this orphan enzyme "back catalog" is a fertile source for rapidly advancing the state of protein annotation. Starting from three orphan enzyme samples, we applied mass-spectrometry based analysis and computational methods (including sequence similarity networks, sequence and structural alignments, and operon context analysis) to rapidly identify the specific sequence for each orphan while avoiding the most time- and labor-intensive aspects of typical sequence identifications. We then used these three new sequences to more accurately predict the catalytic function of 385 previously uncharacterized or misannotated proteins. We expect that this kind of rapid sequence identification could be efficiently applied on a larger scale to make enzymology's "back catalog" another powerful tool to drive accurate genome annotation.

  6. Opportune Landing Site CBR and Low-Density Laboratory Database

    DTIC Science & Technology

    2008-05-01

    Program Opportune Landing Site CBR and Low-Density Laboratory Database Larry S. Danyluk, Sally A. Shoop, Rosa T. Affleck, and Wendy L. Wieder...Opportune Landing Site Program ERDC/CRREL TR-08-9 May 2008 Opportune Landing Site CBR and Low-Density Laboratory Database Larry S. Danyluk, Sally A...reproduce in-situ density, moisture, and CBR values and therefore do not accurately represent the complete range of these values measured in the field

  7. Accurate palm vein recognition based on wavelet scattering and spectral regression kernel discriminant analysis

    NASA Astrophysics Data System (ADS)

    Elnasir, Selma; Shamsuddin, Siti Mariyam; Farokhi, Sajad

    2015-01-01

    Palm vein recognition (PVR) is a promising new biometric that has been applied successfully as a method of access control by many organizations, which has even further potential in the field of forensics. The palm vein pattern has highly discriminative features that are difficult to forge because of its subcutaneous position in the palm. Despite considerable progress and a few practical issues, providing accurate palm vein readings has remained an unsolved issue in biometrics. We propose a robust and more accurate PVR method based on the combination of wavelet scattering (WS) with spectral regression kernel discriminant analysis (SRKDA). As the dimension of WS generated features is quite large, SRKDA is required to reduce the extracted features to enhance the discrimination. The results based on two public databases (the PolyU Hyper Spectral Palmprint public database and PolyU Multi Spectral Palmprint) show the high performance of the proposed scheme in comparison with state-of-the-art methods. The proposed approach scored a 99.44% identification rate and a 99.90% verification rate [equal error rate (EER)=0.1%] for the hyperspectral database and a 99.97% identification rate and a 99.98% verification rate (EER=0.019%) for the multispectral database.

  8. Accurate determination of non-metallic impurities in high purity tetramethylammonium hydroxide using inductively coupled plasma tandem mass spectrometry

    NASA Astrophysics Data System (ADS)

    Fu, Liang; Xie, Hualin; Shi, Shuyun; Chen, Xiaoqing

    2018-06-01

    The content of non-metallic impurities in high-purity tetramethylammonium hydroxide (HPTMAH) aqueous solution has an important influence on the yield, electrical properties and reliability of the integrated circuit during the process of chip etching and cleaning. Therefore, an efficient analytical method to directly quantify the content of non-metallic impurities in HPTMAH aqueous solutions is necessary. The present study was aimed to develop a novel method that can accurately determine seven non-metallic impurities (B, Si, P, S, Cl, As, and Se) in an aqueous solution of HPTMAH by inductively coupled plasma tandem mass spectrometry (ICP-MS/MS). The samples were measured using a direct injection method. In the MS/MS mode, oxygen and hydrogen were used as reaction gases in the octopole reaction system (ORS) to eliminate mass spectral interferences during the analytical process. The detection limits of B, Si, P, S, Cl, As, and Se were 0.31, 0.48, 0.051, 0.27, 3.10, 0.008, and 0.005 μg L⁻¹, respectively. The samples were analyzed by the developed method and the sector field inductively coupled plasma mass spectrometry (SF-ICP-MS) was used for contrastive analysis. The values of these seven elements measured using ICP-MS/MS were consistent with those measured by SF-ICP-MS. The proposed method can be utilized to analyze non-metallic impurities in HPTMAH aqueous solution. Table S2 Multiple potential interferences on the analytes. Table S3 Parameters of calibration curve and the detection limit (DL). Table S4 Results obtained for 25% concentration high-purity grade TMAH aqueous solution samples (μg L⁻¹, mean ± standard deviation, n = 10).

  9. A Circular Dichroism Reference Database for Membrane Proteins

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Wallace, B.; Wien, F.; Stone, T.

    2006-01-01

    Membrane proteins are a major product of most genomes and the target of a large number of current pharmaceuticals, yet little information exists on their structures because of the difficulty of crystallising them; hence for the most part they have been excluded from structural genomics programme targets. Furthermore, even methods such as circular dichroism (CD) spectroscopy which seek to define secondary structure have not been fully exploited because of technical limitations to their interpretation for membrane embedded proteins. Empirical analyses of circular dichroism (CD) spectra are valuable for providing information on secondary structures of proteins. However, the accuracy of the results depends on the appropriateness of the reference databases used in the analyses. Membrane proteins have different spectral characteristics than do soluble proteins as a result of the low dielectric constants of membrane bilayers relative to those of aqueous solutions (Chen & Wallace (1997) Biophys. Chem. 65:65-74). To date, no CD reference database exists exclusively for the analysis of membrane proteins, and hence empirical analyses based on current reference databases derived from soluble proteins are not adequate for accurate analyses of membrane protein secondary structures (Wallace et al (2003) Prot. Sci. 12:875-884). We have therefore created a new reference database of CD spectra of integral membrane proteins whose crystal structures have been determined. To date it contains more than 20 proteins, and spans the range of secondary structures from mostly helical to mostly sheet proteins. This reference database should enable more accurate secondary structure determinations of membrane embedded proteins and will become one of the reference database options in the CD calculation server DICHROWEB (Whitmore & Wallace (2004) NAR 32:W668-673).

  10. Advances in computational metabolomics and databases deepen the understanding of metabolisms.

    PubMed

    Tsugawa, Hiroshi

    2018-01-29

    Mass spectrometry (MS)-based metabolomics is the popular platform for metabolome analyses. Computational techniques for the processing of MS raw data, for example, feature detection, peak alignment, and the exclusion of false-positive peaks, have been established. The next stage of untargeted metabolomics would be to decipher the mass fragmentation of small molecules for the global identification of human-, animal-, plant-, and microbiota metabolomes, resulting in a deeper understanding of metabolisms. This review is an update on the latest computational metabolomics including known/expected structure databases, chemical ontology classifications, and mass spectrometry cheminformatics for the interpretation of mass fragmentations and for the elucidation of unknown metabolites. The importance of metabolome 'databases' and 'repositories' is also discussed because novel biological discoveries are often attributable to the accumulation of data, to relational databases, and to their statistics. Lastly, a practical guide for metabolite annotations is presented as the summary of this review. Copyright © 2018 Elsevier Ltd. All rights reserved.

  11. Use of MALDI-TOF Mass Spectrometry and a Custom Database to Characterize Bacteria Indigenous to a Unique Cave Environment (Kartchner Caverns, AZ, USA)

    PubMed Central

    Zhang, Lin; Vranckx, Katleen; Janssens, Koen; Sandrin, Todd R.

    2015-01-01

    MALDI-TOF mass spectrometry has been shown to be a rapid and reliable tool for identification of bacteria at the genus and species, and in some cases, strain levels. Commercially available and open source software tools have been developed to facilitate identification; however, no universal/standardized data analysis pipeline has been described in the literature. Here, we provide a comprehensive and detailed demonstration of bacterial identification procedures using a MALDI-TOF mass spectrometer. Mass spectra were collected from 15 diverse bacteria isolated from Kartchner Caverns, AZ, USA, and identified by 16S rDNA sequencing. Databases were constructed in BioNumerics 7.1. Follow-up analyses of mass spectra were performed, including cluster analyses, peak matching, and statistical analyses. Identification was performed using blind-coded samples randomly selected from these 15 bacteria. Two identification methods are presented: similarity coefficient-based and biomarker-based methods. Results show that both identification methods can identify the bacteria to the species level. PMID:25590854

  12. Use of MALDI-TOF mass spectrometry and a custom database to characterize bacteria indigenous to a unique cave environment (Kartchner Caverns, AZ, USA).

    PubMed

    Zhang, Lin; Vranckx, Katleen; Janssens, Koen; Sandrin, Todd R

    2015-01-02

    MALDI-TOF mass spectrometry has been shown to be a rapid and reliable tool for identification of bacteria at the genus and species, and in some cases, strain levels. Commercially available and open source software tools have been developed to facilitate identification; however, no universal/standardized data analysis pipeline has been described in the literature. Here, we provide a comprehensive and detailed demonstration of bacterial identification procedures using a MALDI-TOF mass spectrometer. Mass spectra were collected from 15 diverse bacteria isolated from Kartchner Caverns, AZ, USA, and identified by 16S rDNA sequencing. Databases were constructed in BioNumerics 7.1. Follow-up analyses of mass spectra were performed, including cluster analyses, peak matching, and statistical analyses. Identification was performed using blind-coded samples randomly selected from these 15 bacteria. Two identification methods are presented: similarity coefficient-based and biomarker-based methods. Results show that both identification methods can identify the bacteria to the species level.

  13. Domain Regeneration for Cross-Database Micro-Expression Recognition

    NASA Astrophysics Data System (ADS)

    Zong, Yuan; Zheng, Wenming; Huang, Xiaohua; Shi, Jingang; Cui, Zhen; Zhao, Guoying

    2018-05-01

    In this paper, we investigate the cross-database micro-expression recognition problem, where the training and testing samples are from two different micro-expression databases. Under this setting, the training and testing samples would have different feature distributions and hence the performance of most existing micro-expression recognition methods may decrease greatly. To solve this problem, we propose a simple yet effective method called Target Sample Re-Generator (TSRG). By using TSRG, we are able to re-generate the samples from the target micro-expression database so that the re-generated target samples share the same or similar feature distributions with the original source samples. For this reason, we can then use the classifier learned from the labeled source samples to accurately predict the micro-expression categories of the unlabeled target samples. To evaluate the performance of the proposed TSRG method, extensive cross-database micro-expression recognition experiments based on the SMIC and CASME II databases are conducted. Compared with recent state-of-the-art cross-database emotion recognition methods, the proposed TSRG achieves more promising results.

  14. Conversational high resolution mass spectrographic data reduction

    NASA Technical Reports Server (NTRS)

    Romiez, M. P.

    1973-01-01

    A FORTRAN 4 program is described which reduces the data obtained from a high resolution mass spectrograph. The program (1) calculates an accurate mass for each line on the photoplate, and (2) assigns elemental compositions to each accurate mass. The program is intended for use in a time-shared computing environment and makes use of the conversational aspects of time-sharing operating systems.
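
    The second step described in the record above, assigning elemental compositions to an accurate mass, can be illustrated with a brute-force search. This is a minimal sketch in Python, not the FORTRAN program from the report: the function name, the restriction to C, H, N and O, the atom-count limits and the 5 ppm tolerance are all assumptions made for illustration, and no chemical plausibility rules are applied.

```python
from itertools import product

# Monoisotopic masses (in u) of the most abundant isotopes.
MONO = {"C": 12.0, "H": 1.00782503207, "N": 14.0030740048, "O": 15.99491461956}


def candidate_formulas(target_mass, tol_ppm=5.0, max_counts=(12, 24, 4, 6)):
    """Return ((C, H, N, O), mass) tuples within tol_ppm of target_mass.

    max_counts caps the number of C, H, N and O atoms searched
    (hypothetical limits; a real tool would derive them from the mass).
    """
    hits = []
    for c, h, n, o in product(*(range(m + 1) for m in max_counts)):
        mass = (c * MONO["C"] + h * MONO["H"]
                + n * MONO["N"] + o * MONO["O"])
        if abs(mass - target_mass) / target_mass * 1e6 <= tol_ppm:
            hits.append(((c, h, n, o), mass))
    return hits


if __name__ == "__main__":
    # Neutral monoisotopic mass close to that of caffeine (C8H10N4O2).
    for (c, h, n, o), mass in candidate_formulas(194.0804):
        print(f"C{c}H{h}N{n}O{o}  {mass:.5f}")
```

    Exhaustive enumeration is only practical for small molecules and few elements; it is used here because it makes the matching criterion explicit.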

  15. A new retrospective, multi-evidence veterinary drug screening method using drift tube ion mobility mass spectrometry.

    PubMed

    Xu, Zhenzhen; Li, Jianzhong; Chen, Ailiang; Ma, Xin; Yang, Shuming

    2018-05-03

    Retrospectivity (the ability to look back for a previously unknown compound in raw data) is very meaningful for food safety and risk assessment when facing newly emerging drugs. Screening based on accurate mass and retention time may lead to false-positive and false-negative results, so a new retrospective, reliable platform is desirable. Different concentration levels of standards with and without matrix were analyzed using ion mobility (IM)-quadrupole-time-of-flight (Q-TOF) for collecting retrospective accurate mass, retention time, drift time and tandem MS evidence for identification in a single experiment. The isomer separation ability of IM and the four-dimensional (4D) feature abundance quantification abilities were evaluated for veterinary drugs for the first time. The sensitivity of the IM-Q-TOF workflow was obviously higher than that of the traditional database searching algorithm [find by formula (FbF) function] for Q-TOF. In addition, the IM-Q-TOF workflow contained most of the results from FbF and removed the false positive results. Some isomers were separated by IM and the 4D feature abundance quantitation removed interference with similar accurate mass and showed good linearity. A new retrospective, multi-evidence platform was built for veterinary drug screening in a single experiment. The sensitivity was significantly improved and the data can be used for quantification. The platform showed its potential to be used for food safety and risk assessment. This article is protected by copyright. All rights reserved.

  16. A global, open-source database of flood protection standards

    NASA Astrophysics Data System (ADS)

    Scussolini, Paolo; Aerts, Jeroen; Jongman, Brenden; Bouwer, Laurens; Winsemius, Hessel; de Moel, Hans; Ward, Philip

    2016-04-01

    Accurate flood risk estimation is pivotal in that it enables risk-informed policies in disaster risk reduction, as emphasized in the recent Sendai framework for Disaster Risk Reduction. To improve our understanding of flood risk, models are now capable of providing actionable risk information on the (sub)global scale. Still, the accuracy of their results is greatly limited by the lack of information on the flood protection standards that are actually in place, and researchers thus make large assumptions about the extent of protection. With our work we propose a first global, open-source database of FLOod PROtection Standards, FLOPROS, covering a range of spatial scales. FLOPROS is structured in three layers of information, and merges them into one consistent database: 1) the Design layer contains empirical information about the standard of protection presently in place; 2) the Policy layer contains intended protection standards from normative documents; 3) the Model layer uses a validated numerical approach to calculate protection standards for areas not covered in the other layers. The FLOPROS database can be used for more accurate risk assessment exercises across scales. As the database should be continually updated to reflect new interventions, we invite researchers and practitioners to contribute information. Further, we look for partners within the risk community to participate in additional strategies to implement the amount and accuracy of information contained in this first version of FLOPROS.

  17. A Knowledge Database on Thermal Control in Manufacturing Processes

    NASA Astrophysics Data System (ADS)

    Hirasawa, Shigeki; Satoh, Isao

    A prototype version of a knowledge database on thermal control in manufacturing processes, specifically, molding, semiconductor manufacturing, and micro-scale manufacturing has been developed. The knowledge database has search functions for technical data, evaluated benchmark data, academic papers, and patents. The database also displays trends and future roadmaps for research topics. It has quick-calculation functions for basic design. This paper summarizes present research topics and future research on thermal control in manufacturing engineering to collate the information to the knowledge database. In the molding process, the initial mold and melt temperatures are very important parameters. In addition, thermal control is related to many semiconductor processes, and the main parameter is temperature variation in wafers. Accurate in-situ temperature measurement of wafers is important. And many technologies are being developed to manufacture micro-structures. Accordingly, the knowledge database will help further advance these technologies.

  18. Development and applications of the EntomopathogenID MLSA database for use in agricultural systems

    USDA-ARS's Scientific Manuscript database

    The current study reports the development and application of a publicly accessible, curated database of Hypocrealean entomopathogenic fungi sequence data. The goal was to provide a platform for users to easily access sequence data from reference strains. The database can be used to accurately identi...

  19. Recent advances in thermal desorption-gas chromatography-mass spectrometery method to eliminate the matrix effect between air and water samples: application to the accurate determination of Henry's law constant.

    PubMed

    Kim, Yong-Hyun; Kim, Ki-Hyun

    2014-05-16

    Accurate values for the Henry's law constants are essential to describe the environmental dynamics of a solute, but substantial errors are recognized in many reported data due to practical difficulties in measuring solubility and/or vapor pressure. Despite such awareness, validation of experimental approaches has scarcely been made. An experimental approach based on thermal desorption-gas chromatography-mass spectrometry (TD-GC-MS) method was developed to concurrently allow the accurate determination of target compounds from the headspace and aqueous samples in closed equilibrated system. The analysis of six aromatics and eight non-aromatic oxygenates was then carried out in a static headspace mode. An estimation of the potential bias and mass balance (i.e., sum of mass measured individually from gas and liquid phases vs. the mass initially added to the system) demonstrates compound-specific phase dependency so that the best results are obtained by aqueous (less soluble aromatics) and headspace analysis (more soluble non-aromatics). Accordingly, we were able to point to the possible sources of biases in previous studies and provide the best estimates for the Henry's constants (M atm⁻¹): benzene (0.17), toluene (0.15), p-xylene (0.13), m-xylene (0.13), o-xylene (0.19), styrene (0.27); propionaldehyde (9.26), butyraldehyde (6.19), isovaleraldehyde (2.14), n-valeraldehyde (3.98), methyl ethyl ketone (10.5), methyl isobutyl ketone (3.93), n-butyl acetate (2.41), and isobutyl alcohol (22.2). Copyright © 2014 Elsevier B.V. All rights reserved.

  20. A prototypic small molecule database for bronchoalveolar lavage-based metabolomics

    NASA Astrophysics Data System (ADS)

    Walmsley, Scott; Cruickshank-Quinn, Charmion; Quinn, Kevin; Zhang, Xing; Petrache, Irina; Bowler, Russell P.; Reisdorph, Richard; Reisdorph, Nichole

    2018-04-01

    The analysis of bronchoalveolar lavage fluid (BALF) using mass spectrometry-based metabolomics can provide insight into lung diseases, such as asthma. However, the important step of compound identification is hindered by the lack of a small molecule database that is specific for BALF. Here we describe prototypic, small molecule databases derived from human BALF samples (n=117). Human BALF was extracted into lipid and aqueous fractions and analyzed using liquid chromatography mass spectrometry. Following filtering to reduce contaminants and artifacts, the resulting BALF databases (BALF-DBs) contain 11,736 lipid and 658 aqueous compounds. Over 10% of these were found in 100% of samples. Testing the BALF-DBs using nested test sets produced a 99% match rate for lipids and 47% match rate for aqueous molecules. Searching an independent dataset resulted in 45% matching to the lipid BALF-DB compared to <25% when general databases are searched. The BALF-DBs are available for download from MetaboLights. Overall, the BALF-DBs can reduce false positives and improve confidence in compound identification compared to when general databases are used.

  1. Accurate analysis of parabens in human urine using isotope-dilution ultrahigh-performance liquid chromatography-high resolution mass spectrometry.

    PubMed

    Zhou, Hui-Ting; Chen, Hsin-Chang; Ding, Wang-Hsien

    2018-02-20

    An analytical method that utilizes isotope-dilution ultrahigh-performance liquid chromatography coupled with hybrid quadrupole time-of-flight mass spectrometry (UHPLC-QTOF-MS, also called UHPLC-HRMS) was developed and validated to be highly precise and accurate for the detection of nine parabens (methyl-, ethyl-, propyl-, isopropyl-, butyl-, isobutyl-, pentyl-, hexyl-, and benzyl-parabens) in human urine samples. After sample preparation by ultrasound-assisted emulsification microextraction (USAEME), the extract was directly injected into UHPLC-HRMS. By using negative electrospray ionization in the multiple reaction monitoring (MRM) mode and measuring the peak area ratios of both the natural and the labeled-analogues in the samples and calibration standards, the target analytes could be accurately identified and quantified. Another use for the labeled-analogues was to correct for systematic errors associated with the analysis, such as the matrix effect and other variations. The limits of quantitation (LOQs) ranged from 0.3 to 0.6 ng/mL. Precision for both repeatability and reproducibility was high, ranging from 1 to 8%. Trueness (mean extraction recovery, also called accuracy) was high, ranging from 93 to 107% at two concentration levels. According to preliminary results, the total concentrations of the four most frequently detected parabens (methyl-, ethyl-, propyl- and butyl-) ranged from 0.5 to 79.1 ng/mL in male urine samples, and from 17 to 237 ng/mL in female urine samples. Interestingly, two infrequently detected parabens, pentyl- and hexyl-paraben, were found in one of the male samples in this study. Copyright © 2017 Elsevier B.V. All rights reserved.

  2. Sub-millimeter detected z ~ 2 radio-quiet QSOs. Accurate redshifts, black hole masses, and inflow/outflow velocities

    NASA Astrophysics Data System (ADS)

    Orellana, G.; Nagar, N. M.; Isaak, K. G.; Priddey, R.; Maiolino, R.; McMahon, R.; Marconi, A.; Oliva, E.

    2011-07-01

    Context. We present near-IR spectroscopy of a sample of luminous (M_B ~ -27.5; L_bol > 10¹⁴ L⊙), sub-millimeter-detected, dusty (M_d ~ 10⁹ M⊙), radio-quiet quasi-stellar objects (QSOs) at z ~ 2. Aims: A primary aim is to provide a more accurate QSO redshift determination in order to trace kinematics and inflows/outflows in these sub-mm bright QSOs. Additionally, the Hα and continuum properties allow an estimation of the black hole mass and accretion rate, offering insights into the starburst-AGN connection in sub-mm bright QSOs. Methods: We measure the redshift, width, and luminosity of the Hα line, and the continuum luminosity near Hα. Relative velocity differences between Hα and rest-frame UV emission lines are used to study the presence and strength of outflows/inflows. Luminosities and line widths are used to estimate the black hole masses, bolometric luminosities, Eddington fractions, and accretion rates; these are compared to the star-formation-rate (SFR), estimated from the sub-mm derived far-infrared (FIR) luminosity. Finally our sub-mm-bright QSO sample is compared with other QSO samples at similar redshifts. Results: The Hα emission line was strongly detected in all sources. Two components - a very broad (≳5000 km s⁻¹) Gaussian and an intermediate-width (≳1500 km s⁻¹) Gaussian, were required to fit the Hα profile of all observed QSOs. Narrow (≲1000 km s⁻¹) lines were not detected in the sample QSOs. The rest-frame UV emission lines in these sub-mm bright QSOs show larger than average blue-shifted velocities, potentially tracing strong - up to 3000 km s⁻¹ - outflows in the broad line region. With the exception of the one QSO which shows exceptionally broad Hα lines, the black hole masses of the QSO sample are in the range log M_BH = 9.0-9.7 and the Eddington fractions are between 0.5 and ~1. In black hole mass and accretion rate, this sub-mm bright QSO sample is indistinguishable from the Shemmer et al. (2004, ApJ, 614, 547) optically

  3. Linking high resolution mass spectrometry data with exposure and toxicity forecasts to advance high-throughput environmental monitoring

    EPA Pesticide Factsheets

    There is a growing need in the field of exposure science for monitoring methods that rapidly screen environmental media for suspect contaminants. Measurement and analysis platforms, based on high resolution mass spectrometry (HRMS), now exist to meet this need. Here we describe results of a study that links HRMS data with exposure predictions from the U.S. EPA's ExpoCast program and in vitro bioassay data from the U.S. interagency Tox21 consortium. Vacuum dust samples were collected from 56 households across the U.S. as part of the American Healthy Homes Survey (AHHS). Sample extracts were analyzed using liquid chromatography time-of-flight mass spectrometry (LC-TOF/MS) with electrospray ionization. On average, approximately 2000 molecular features were identified per sample (based on accurate mass) in negative ion mode, and 3000 in positive ion mode. Exact mass, isotope distribution, and isotope spacing were used to match molecular features with a unique listing of chemical formulas extracted from EPA's Distributed Structure-Searchable Toxicity (DSSTox) database. A total of 978 DSSTox formulas were consistent with the dust LC-TOF/molecular feature data (match score ≥ 90); these formulas mapped to 3228 possible chemicals in the database. Correct assignment of a unique chemical to a given formula required additional validation steps. Each suspect chemical was prioritized for follow-up confirmation using abundance and detection frequency results, along with exp

  4. Comparing surgical infections in National Surgical Quality Improvement Project and an Institutional Database.

    PubMed

    Selby, Luke V; Sjoberg, Daniel D; Cassella, Danielle; Sovel, Mindy; Weiser, Martin R; Sepkowitz, Kent; Jones, David R; Strong, Vivian E

    2015-06-15

    Surgical quality improvement requires accurate tracking and benchmarking of postoperative adverse events. We track surgical site infections (SSIs) with two systems: our in-house surgical secondary events (SSE) database and the National Surgical Quality Improvement Project (NSQIP). The SSE database, a modification of the Clavien-Dindo classification, categorizes SSIs by their anatomic site, whereas NSQIP categorizes by their level. Our aim was to directly compare these different definitions. NSQIP and the SSE database entries for all surgeries performed in 2011 and 2012 were compared. To match NSQIP definitions, and while blinded to NSQIP results, entries in the SSE database were categorized as either incisional (superficial or deep) or organ space infections. These categorizations were compared with NSQIP records; agreement was assessed with Cohen kappa. The 5028 patients in our cohort had a 6.5% SSI rate in the SSE database and a 4% rate in NSQIP, with an overall agreement of 95% (kappa = 0.48, P < 0.0001). The rates of categorized infections were similarly well matched: incisional rates of 4.1% and 2.7% for the SSE database and NSQIP and organ space rates of 2.6% and 1.5%. Overall agreements were 96% (kappa = 0.36, P < 0.0001) and 98% (kappa = 0.55, P < 0.0001), respectively. Over 80% of cases recorded by the SSE database but not NSQIP did not meet NSQIP criteria. The SSE database is an accurate, real-time record of postoperative SSIs. Institutional databases that capture all surgical cases can be used in conjunction with NSQIP with excellent concordance. Copyright © 2015 Elsevier Inc. All rights reserved.
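
    The agreement statistic used above, Cohen's kappa, can be computed from a 2x2 table of paired classifications. The sketch below assumes two raters (for instance, two databases) each recording a yes/no outcome per case; the function name and the counts are illustrative and are not taken from the study.

```python
def cohen_kappa(both_yes, only_a, only_b, both_no):
    """Cohen's kappa for two raters making yes/no calls on the same cases.

    both_yes: cases both raters flagged
    only_a:   cases only rater A flagged
    only_b:   cases only rater B flagged
    both_no:  cases neither rater flagged
    """
    n = both_yes + only_a + only_b + both_no
    observed = (both_yes + both_no) / n      # raw agreement
    a_yes = (both_yes + only_a) / n          # rater A's "yes" rate
    b_yes = (both_yes + only_b) / n          # rater B's "yes" rate
    # Agreement expected by chance if the raters were independent.
    expected = a_yes * b_yes + (1 - a_yes) * (1 - b_yes)
    return (observed - expected) / (1 - expected)


# Illustrative counts only: 70% raw agreement, 50% expected by chance.
kappa = cohen_kappa(both_yes=20, only_a=5, only_b=10, both_no=15)
```

    Because kappa discounts chance agreement, a rare outcome can produce high raw agreement alongside a moderate kappa, which is the pattern reported in this record (95% agreement, kappa = 0.48).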

  5. The use of in vitro technologies coupled with high resolution accurate mass LC-MS for studying drug metabolism in equine drug surveillance.

    PubMed

    Scarth, James P; Spencer, Holly A; Timbers, Sarah E; Hudson, Simon C; Hillyer, Lynn L

    2010-01-01

    The detection of drug abuse in horseracing often requires knowledge of drug metabolism, especially if urine is the matrix of choice. In this study, equine liver/lung microsomes/S9 tissue fractions were used to study the phase I metabolism of eight drugs of relevance to equine drug surveillance (acepromazine, azaperone, celecoxib, fentanyl, fluphenazine, mepivacaine, methylphenidate and tripelennamine). In vitro samples were analyzed qualitatively alongside samples originating from in vivo administrations using LC-MS on a high resolution accurate mass Thermo Orbitrap Discovery instrument and by LC-MS/MS on an Applied Biosystems Sciex 5500 Q Trap. Using high resolution accurate mass full-scan analysis on the Orbitrap, the in vitro systems were found to generate at least the two most abundant phase I metabolites observed in vivo for all eight drugs studied. In the majority of cases, in vitro experiments were also able to generate the minor in vivo metabolites and sometimes metabolites that were only observed in vitro. More detailed analyses of fentanyl incubates using LC-MS/MS showed that it was possible to generate good quality spectra from the metabolites generated in vitro. These data support the suggestion of using in vitro incubates as metabolite reference material in place of in vivo post-administration samples in accordance with new qualitative identification guidelines in the 2009 International Laboratory Accreditation Cooperation-G7 (ILAC-G7) document. In summary, the in vitro and in vivo phase I metabolism results reported herein compare well and demonstrate the potential of in vitro studies to complement, refine and reduce the existing equine in vivo paradigm. © 2010 John Wiley & Sons, Ltd.

  6. CLASH-VLT: INSIGHTS ON THE MASS SUBSTRUCTURES IN THE FRONTIER FIELDS CLUSTER MACS J0416.1–2403 THROUGH ACCURATE STRONG LENS MODELING

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Grillo, C.; Suyu, S. H.; Umetsu, K.

    2015-02-10

    We present a detailed mass reconstruction and a novel study on the substructure properties in the core of the Cluster Lensing And Supernova survey with Hubble (CLASH) and Frontier Fields galaxy cluster MACS J0416.1–2403. We show and employ our extensive spectroscopic data set taken with the VIsible Multi-Object Spectrograph instrument as part of our CLASH-VLT program, to confirm spectroscopically 10 strong lensing systems and to select a sample of 175 plausible cluster members to a limiting stellar mass of log(M_*/M_☉) ≅ 8.6. We reproduce the measured positions of a set of 30 multiple images with a remarkable median offset of only 0.3 arcsec by means of a comprehensive strong lensing model comprised of two cluster dark-matter halos, represented by cored elliptical pseudo-isothermal mass distributions, and the cluster member components, parameterized with dual pseudo-isothermal total mass profiles. The latter have total mass-to-light ratios increasing with the galaxy HST/WFC3 near-IR (F160W) luminosities. The measurement of the total enclosed mass within the Einstein radius is accurate to ∼5%, including the systematic uncertainties estimated from six distinct mass models. We emphasize that the use of multiple-image systems with spectroscopic redshifts and knowledge of cluster membership based on extensive spectroscopic information is key to constructing robust high-resolution mass maps. We also produce magnification maps over the central area that is covered with HST observations. We investigate the galaxy contribution, both in terms of total and stellar mass, to the total mass budget of the cluster. When compared with the outcomes of cosmological N-body simulations, our results point to a lack of massive subhalos in the inner regions of simulated clusters with total masses similar to that of MACS J0416.1–2403. Our findings of the location and shape of the cluster dark-matter halo density profiles and on the cluster substructures provide

  7. Assessing the accuracy of body mass estimation equations from pelvic and femoral variables among modern British women of known mass.

    PubMed

    Young, Mariel; Johannesdottir, Fjola; Poole, Ken; Shaw, Colin; Stock, J T

    2018-02-01

    Femoral head diameter is commonly used to estimate body mass from the skeleton. The three most frequently employed methods, designed by Ruff, Grine, and McHenry, were developed using different populations to address different research questions. They were not specifically designed for application to female remains, and their accuracy for this purpose has rarely been assessed or compared in living populations. This study analyzes the accuracy of these methods using a sample of modern British women through the use of pelvic CT scans (n = 97) and corresponding information about the individuals' known height and weight. Results showed that all methods provided reasonably accurate body mass estimates (average percent prediction errors under 20%) for the normal weight and overweight subsamples, but were inaccurate for the obese and underweight subsamples (average percent prediction errors over 20%). When women of all body mass categories were combined, the methods provided reasonable estimates (average percent prediction errors between 16 and 18%). The results demonstrate that different methods provide more accurate results within specific body mass index (BMI) ranges. The McHenry Equation provided the most accurate estimation for women of small body size, while the original Ruff Equation is most likely to be accurate if the individual was obese or severely obese. The refined Ruff Equation was the most accurate predictor of body mass on average for the entire sample, indicating that it should be utilized when there is no knowledge of the individual's body size or if the individual is assumed to be of a normal body size. The study also revealed a correlation between pubis length and body mass, and an equation for body mass estimation using pubis length was accurate in a dummy sample, suggesting that pubis length can also be used to acquire reliable body mass estimates. This has implications for how we interpret body mass in fossil hominins and has particular relevance

  8. Tiered Human Integrated Sequence Search Databases for Shotgun Proteomics.

    PubMed

    Deutsch, Eric W; Sun, Zhi; Campbell, David S; Binz, Pierre-Alain; Farrah, Terry; Shteynberg, David; Mendoza, Luis; Omenn, Gilbert S; Moritz, Robert L

    2016-11-04

    The results of analysis of shotgun proteomics mass spectrometry data can be greatly affected by the selection of the reference protein sequence database against which the spectra are matched. For many species there are multiple sources from which somewhat different sequence sets can be obtained. This can lead to confusion about which database is best in which circumstances-a problem especially acute in human sample analysis. All sequence databases are genome-based, with sequences for the predicted gene and their protein translation products compiled. Our goal is to create a set of primary sequence databases that comprise the union of sequences from many of the different available sources and make the result easily available to the community. We have compiled a set of four sequence databases of varying sizes, from a small database consisting of only the ∼20,000 primary isoforms plus contaminants to a very large database that includes almost all nonredundant protein sequences from several sources. This set of tiered, increasingly complete human protein sequence databases suitable for mass spectrometry proteomics sequence database searching is called the Tiered Human Integrated Search Proteome set. In order to evaluate the utility of these databases, we have analyzed two different data sets, one from the HeLa cell line and the other from normal human liver tissue, with each of the four tiers of database complexity. The result is that approximately 0.8%, 1.1%, and 1.5% additional peptides can be identified for Tiers 2, 3, and 4, respectively, as compared with the Tier 1 database, at substantially increasing computational cost. This increase in computational cost may be worth bearing if the identification of sequence variants or the discovery of sequences that are not present in the reviewed knowledge base entries is an important goal of the study. We find that it is useful to search a data set against a simpler database, and then check the uniqueness of the

  9. Tiered Human Integrated Sequence Search Databases for Shotgun Proteomics

    PubMed Central

    Deutsch, Eric W.; Sun, Zhi; Campbell, David S.; Binz, Pierre-Alain; Farrah, Terry; Shteynberg, David; Mendoza, Luis; Omenn, Gilbert S.; Moritz, Robert L.

    2016-01-01

    The results of analysis of shotgun proteomics mass spectrometry data can be greatly affected by the selection of the reference protein sequence database against which the spectra are matched. For many species there are multiple sources from which somewhat different sequence sets can be obtained. This can lead to confusion about which database is best in which circumstances – a problem especially acute in human sample analysis. All sequence databases are genome-based, with sequences for the predicted gene and their protein translation products compiled. Our goal is to create a set of primary sequence databases that comprise the union of sequences from many of the different available sources and make the result easily available to the community. We have compiled a set of four sequence databases of varying sizes, from a small database consisting of only the ~20,000 primary isoforms plus contaminants to a very large database that includes almost all non-redundant protein sequences from several sources. This set of tiered, increasingly complete human protein sequence databases suitable for mass spectrometry proteomics sequence database searching is called the Tiered Human Integrated Search Proteome set. In order to evaluate the utility of these databases, we have analyzed two different data sets, one from the HeLa cell line and the other from normal human liver tissue, with each of the four tiers of database complexity. The result is that approximately 0.8%, 1.1%, and 1.5% additional peptides can be identified for Tiers 2, 3, and 4, respectively, as compared with the Tier 1 database, at substantially increasing computational cost. This increase in computational cost may be worth bearing if the identification of sequence variants or the discovery of sequences that are not present in the reviewed knowledge base entries is an important goal of the study. We find that it is useful to search a data set against a simpler database, and then check the uniqueness of the

  10. Using the Proteomics Identifications Database (PRIDE).

    PubMed

    Martens, Lennart; Jones, Phil; Côté, Richard

    2008-03-01

    The Proteomics Identifications Database (PRIDE) is a public data repository designed to store, disseminate, and analyze mass spectrometry based proteomics datasets. The PRIDE database can accommodate any level of detailed metadata about the submitted results, which can be queried, explored, viewed, or downloaded via the PRIDE Web interface. The PRIDE database also provides a simple yet powerful access control mechanism that fully supports confidential peer-reviewing of data related to a manuscript, ensuring that these results remain invisible to the general public while allowing referees and journal editors anonymized access to the data. This unit describes in detail the functionality that PRIDE provides with regard to searching, viewing, and comparing the available data, as well as different options for submitting data to PRIDE.

  11. Identification of Aeromonas isolates by matrix-assisted laser desorption ionization time-of-flight mass spectrometry.

    PubMed

    Lamy, Brigitte; Kodjo, Angeli; Laurent, Frédéric

    2011-09-01

    We evaluated the accuracy of matrix-assisted laser desorption/ionization time-of-flight mass spectrometry for identifying aeromonads with an extraction procedure. Genus-level accuracy was 100%. Compared to rpoB gene sequencing, species-level accuracy was 90.6% (29/32) for type and reference strains and 91.4% for a collection of 139 clinical and environmental isolates, making this system one of the most accurate and rapid methods for phenotypic identification. The reliability of this technique was very promising, although some improvements in database composition, taxonomy, and discriminatory power are needed. Copyright © 2011 Elsevier Inc. All rights reserved.

  12. LMSD: LIPID MAPS structure database

    PubMed Central

    Sud, Manish; Fahy, Eoin; Cotter, Dawn; Brown, Alex; Dennis, Edward A.; Glass, Christopher K.; Merrill, Alfred H.; Murphy, Robert C.; Raetz, Christian R. H.; Russell, David W.; Subramaniam, Shankar

    2007-01-01

    The LIPID MAPS Structure Database (LMSD) is a relational database encompassing structures and annotations of biologically relevant lipids. Structures of lipids in the database come from four sources: (i) LIPID MAPS Consortium's core laboratories and partners; (ii) lipids identified by LIPID MAPS experiments; (iii) computationally generated structures for appropriate lipid classes; (iv) biologically relevant lipids manually curated from LIPID BANK, LIPIDAT and other public sources. All the lipid structures in LMSD are drawn in a consistent fashion. In addition to a classification-based retrieval of lipids, users can search LMSD using either text-based or structure-based search options. The text-based search implementation supports data retrieval by any combination of these data fields: LIPID MAPS ID, systematic or common name, mass, formula, category, main class, and subclass data fields. The structure-based search, in conjunction with optional data fields, provides the capability to perform a substructure search or exact match for the structure drawn by the user. Search results, in addition to structure and annotations, also include relevant links to external databases. The LMSD is publicly available at PMID:17098933

  13. Evaluation of consumer drug information databases.

    PubMed

    Choi, J A; Sullivan, J; Pankaskie, M; Brufsky, J

    1999-01-01

    To evaluate prescription drug information contained in six consumer drug information databases available on CD-ROM, and to make health care professionals aware of the information provided, so that they may appropriately recommend these databases for use by their patients. Observational study of six consumer drug information databases: The Corner Drug Store, Home Medical Advisor, Mayo Clinic Family Pharmacist, Medical Drug Reference, Mosby's Medical Encyclopedia, and PharmAssist. Not applicable. Not applicable. Information on 20 frequently prescribed drugs was evaluated in each database. The databases were ranked using a point-scale system based on primary and secondary assessment criteria. For the primary assessment, 20 categories of information based on those included in the 1998 edition of the USP DI Volume II, Advice for the Patient: Drug Information in Lay Language were evaluated for each of the 20 drugs, and each database could earn up to 400 points (for example, 1 point was awarded if the database mentioned a drug's mechanism of action). For the secondary assessment, the inclusion of 8 additional features that could enhance the utility of the databases was evaluated (for example, 1 point was awarded if the database contained a picture of the drug), and each database could earn up to 8 points. The results of the primary and secondary assessments, listed in order of highest to lowest number of points earned, are as follows: Primary assessment--Mayo Clinic Family Pharmacist (379), Medical Drug Reference (251), PharmAssist (176), Home Medical Advisor (113.5), The Corner Drug Store (98), and Mosby's Medical Encyclopedia (18.5); secondary assessment--The Mayo Clinic Family Pharmacist (8), The Corner Drug Store (5), Mosby's Medical Encyclopedia (5), Home Medical Advisor (4), Medical Drug Reference (4), and PharmAssist (3). The Mayo Clinic Family Pharmacist was the most accurate and complete source of prescription drug information based on the USP DI Volume II and

  14. Determining accurate distances to nearby galaxies

    NASA Astrophysics Data System (ADS)

    Bonanos, Alceste Zoe

    2005-11-01

    , which confirmed that the system consists of two extremely massive stars and refined the values of the masses. It is the most massive binary known with an accurate mass determination.

  15. Identification of the chemical components of Saussurea involucrata by high-resolution mass spectrometry and the mass spectral trees similarity filter technique.

    PubMed

    Jia, Zhixin; Wu, Caisheng; Jin, Hongtao; Zhang, Jinlan

    2014-11-15

    Saussurea involucrata is a rare traditional Chinese medicine (TCM) that displays anti-fatigue, anti-inflammatory and anti-tumor effects. In this paper, the different chemical components of Saussurea involucrata were characterized and identified over a wide dynamic range by high-performance liquid chromatography coupled with high-resolution hybrid mass spectrometry (HPLC/HRMS/MS(n)) and the mass spectral trees similarity filter (MTSF) technique. The aerial parts of Saussurea involucrata were extracted with 75% ethanol. The partial extract was separated on a chromatography column to concentrate the low-concentration compounds. Mass data were acquired using full-scan mass analysis (resolving power 50,000) with data-dependent incorporation of dynamic exclusion analysis. The identified compounds were used as templates to construct a database of mass spectral trees. Data for the unknown compounds were matched with those templates and matching candidate structures were obtained. The detected compounds were characterized based on matching to candidate structures by the MTSF technique and were further identified by their accurate mass weight, multiple-stage analysis and fragmentation patterns and through comparison with literature data. A total of 38 compounds were identified including 19 flavones, 11 phenylpropanoids and 8 sphingolipids. Among them, 7 flavonoids, 8 phenylpropanoids and 8 sphingolipids were identified for the first time in Saussurea involucrata. HPLC/HRMS/MS(n) combined with MTSF was successfully used to discover and identify the chemical compounds in Saussurea involucrata. The results indicated that this combined technique was extremely useful for the rapid detection and identification of the chemical components in TCMs. Copyright © 2014 John Wiley & Sons, Ltd.

  16. Device for accurately measuring mass flow of gases

    DOEpatents

    Hylton, James O.; Remenyik, Carl J.

    1994-01-01

    A device for measuring mass flow of gases which utilizes a substantially buoyant pressure vessel suspended within a fluid/liquid in an enclosure. The pressure vessel is connected to a weighing device for continuously determining weight change of the vessel as a function of the amount of gas within the pressure vessel. In the preferred embodiment, this pressure vessel is formed from inner and outer right circular cylindrical hulls, with a volume between the hulls being vented to the atmosphere external the enclosure. The fluid/liquid, normally in the form of water typically with an added detergent, is contained within an enclosure with the fluid/liquid being at a level such that the pressure vessel is suspended beneath this level but above a bottom of the enclosure. The buoyant pressure vessel can be interconnected with selected valves to an auxiliary pressure vessel so that initial flow can be established to or from the auxiliary pressure vessel prior to flow to or from the buoyant pressure vessel.

  17. Validating abortion procedure coding in Canadian administrative databases.

    PubMed

    Samiedaluie, Saied; Peterson, Sandra; Brant, Rollin; Kaczorowski, Janusz; Norman, Wendy V

    2016-07-12

    The British Columbia (BC) Ministry of Health collects abortion procedure data in the Medical Services Plan (MSP) physician billings database and in the hospital information Discharge Abstracts Database (DAD). Our study seeks to validate abortion procedure coding in these databases. Two randomized controlled trials enrolled a cohort of 1031 women undergoing abortion. The researcher-collected database includes both enrollment and follow-up chart review data. The study cohort was linked to MSP and DAD data to identify all abortion events captured in the administrative databases. We compared clinical chart data on abortion procedures with health administrative data. We considered a match to occur if an abortion-related code was found in administrative data within 30 days of the date of the same event documented in a clinical chart. Among 1158 abortion events performed during the enrollment and follow-up period, 99.1% were found in at least one of the administrative data sources. The sensitivities for the two databases, evaluated using a gold standard, were 97.7% (95% confidence interval (CI): 96.6-98.5) for the MSP database and 91.9% (95% CI: 90.0-93.4) for the DAD. Abortion events coded in the BC health administrative databases are highly accurate. Single-payer health administrative databases at the provincial level in Canada have the potential to offer valid data reflecting abortion events. ClinicalTrials.gov Identifier NCT01174225, Current Controlled Trials ISRCTN19506752.
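
    A sensitivity estimate with a 95% confidence interval, of the kind reported above, can be sketched as follows. The abstract does not say which interval method was used; the Wilson score interval here is an assumption, and the function name and counts are illustrative only.

```python
import math


def sensitivity_with_ci(true_pos, false_neg, z=1.96):
    """Sensitivity and an approximate 95% Wilson score interval.

    true_pos:  gold-standard events that the database captured
    false_neg: gold-standard events that the database missed
    z:         normal quantile (1.96 gives roughly 95% coverage)
    """
    n = true_pos + false_neg
    p = true_pos / n
    denom = 1 + z**2 / n
    centre = (p + z**2 / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return p, centre - half_width, centre + half_width


# Illustrative counts: 90 of 100 gold-standard events found.
sens, low, high = sensitivity_with_ci(true_pos=90, false_neg=10)
```

    Unlike the simple normal approximation, the Wilson interval stays inside [0, 1] even when the sensitivity is close to 100%, which is the relevant regime for the figures quoted in this record.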

  18. Evaluation of the performance of a tandem mass spectral library with mass spectral data extracted from literature.

    PubMed

    Würtinger, Philipp; Oberacher, Herbert

    2012-01-01

    MSforID represents a database of tandem mass spectral data obtained from (quasi-)molecular ions produced by atmospheric pressure ionization methods. At the current stage of development the library contains 12 122 spectra of 1208 small (bio-)organic molecules. The present work was aimed to evaluate the performance of the MSforID library in terms of accuracy and transferability with a collection of fragment ion mass spectra from various compounds acquired on multiple instruments. A literature survey was conducted to collect the set of sample spectra. A total number of 554 spectra covering 291 compounds were extracted from 109 publications. The majority of spectra originated from publications on applications of LC/MS/MS in drug monitoring, pharmacokinetics, environmental analysis, forensic analysis as well as food analysis. Almost all types of tandem mass spectrometric instruments distributed by the five most important instrument vendors were included in the study. The overall sensitivity of library search was found to be 96.4%, which clearly proves that the MSforID library can successfully handle data from a huge variety of mass spectrometric instruments to allow accurate compound identification. Only for spectra containing three or more fragment ions, however, the rate of classified matches (= matches with a relative average match probability (ramp) score > 40.0) was 95%. Ambiguous or unclassified results were mainly obtained for searches with single precursor-to-fragment ion transitions due to the insufficient specificity of such a low amount of structural information to unequivocally define a single compound. Copyright © 2011 John Wiley & Sons, Ltd.

  19. What Is New in Clinical Microbiology—Microbial Identification by MALDI-TOF Mass Spectrometry

    PubMed Central

    Murray, Patrick R.

    2012-01-01

    Matrix-assisted laser desorption/ionization time-of-flight (MALDI-TOF) mass spectrometry (MS) offers the possibility of accurate, rapid, inexpensive identification of bacteria, fungi, and mycobacteria isolated in clinical microbiology laboratories. The procedures for preanalytic processing of organisms and analysis by MALDI-TOF MS are technically simple and reproducible, and commercial databases and interpretive algorithms are available for the identification of a wide spectrum of clinically significant organisms. Although only limited work has been reported on the use of this technique to identify molds, perform strain typing, or determine antibiotic susceptibility results, these are fruitful areas of promising research. As experience is gained with MALDI-TOF MS, it is expected that the databases will be expanded to resolve many of the current inadequate identifications (eg, no identification, genus-level identification) and algorithms for potential misidentification will be developed. The current lack of Food and Drug Administration approval of any MALDI-TOF MS system for organism identification limits widespread use in the United States. PMID:22795961

  20. Computational Prediction of Electron Ionization Mass Spectra to Assist in GC/MS Compound Identification.

    PubMed

    Allen, Felicity; Pon, Allison; Greiner, Russ; Wishart, David

    2016-08-02

    We describe a tool, competitive fragmentation modeling for electron ionization (CFM-EI) that, given a chemical structure (e.g., in SMILES or InChI format), computationally predicts an electron ionization mass spectrum (EI-MS) (i.e., the type of mass spectrum commonly generated by gas chromatography mass spectrometry). The predicted spectra produced by this tool can be used for putative compound identification, complementing measured spectra in reference databases by expanding the range of compounds able to be considered when availability of measured spectra is limited. The tool extends CFM-ESI, a recently developed method for computational prediction of electrospray tandem mass spectra (ESI-MS/MS), but unlike CFM-ESI, CFM-EI can handle odd-electron ions and isotopes and incorporates an artificial neural network. Tests on EI-MS data from the NIST database demonstrate that CFM-EI is able to model fragmentation likelihoods in low-resolution EI-MS data, producing predicted spectra whose dot product scores are significantly better than full enumeration "bar-code" spectra. CFM-EI also outperformed previously reported results for MetFrag, MOLGEN-MS, and Mass Frontier on one compound identification task. It also outperformed MetFrag in a range of other compound identification tasks involving a much larger data set, containing both derivatized and nonderivatized compounds. While replicate EI-MS measurements of chemical standards are still a more accurate point of comparison, CFM-EI's predictions provide a much-needed alternative when no reference standard is available for measurement. CFM-EI is available at https://sourceforge.net/projects/cfm-id/ for download and http://cfmid.wishartlab.com as a web service.
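
    The dot product score used above to compare predicted and measured spectra can be illustrated with a plain cosine similarity over shared peaks. This is a simplified sketch: the published evaluation may use a weighted variant, and the function name, the nominal-mass binning, and the example peaks are assumptions made for illustration.

```python
import math


def dot_product_score(spec_a, spec_b):
    """Cosine similarity between two spectra given as {nominal m/z: intensity}.

    Returns 1.0 for identical relative intensities and 0.0 when the spectra
    share no peaks or when either spectrum is empty.
    """
    shared = set(spec_a) & set(spec_b)
    dot = sum(spec_a[mz] * spec_b[mz] for mz in shared)
    norm_a = math.sqrt(sum(i * i for i in spec_a.values()))
    norm_b = math.sqrt(sum(i * i for i in spec_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical predicted and measured spectra (intensities in arbitrary units).
predicted = {43: 3.0, 58: 4.0}
measured = {43: 3.0}
score = dot_product_score(predicted, measured)
```

    In a library-style search, each candidate structure's predicted spectrum would be scored against the measured spectrum and the candidates ranked by score.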

  1. The usefulness of administrative databases for identifying disease cohorts is increased with a multivariate model.

    PubMed

    van Walraven, Carl; Austin, Peter C; Manuel, Douglas; Knoll, Greg; Jennings, Allison; Forster, Alan J

    2010-12-01

    Administrative databases commonly use codes to indicate diagnoses. These codes alone are often inadequate to accurately identify patients with particular conditions. In this study, we determined whether we could quantify the probability that a person has a particular disease-in this case renal failure-using other routinely collected information available in an administrative data set. This would allow the accurate identification of a disease cohort in an administrative database. We determined whether patients in a randomly selected 100,000 hospitalizations had kidney disease (defined as two or more sequential serum creatinines or the single admission creatinine indicating a calculated glomerular filtration rate less than 60 mL/min/1.73 m²). The independent association of patient- and hospitalization-level variables with renal failure was measured using a multivariate logistic regression model in a random 50% sample of the patients. The model was validated in the remaining patients. Twenty thousand seven hundred thirteen patients had kidney disease (20.7%). A diagnostic code of kidney disease was strongly associated with kidney disease (relative risk: 34.4), but the accuracy of the code was poor (sensitivity: 37.9%; specificity: 98.9%). Twenty-nine patient- and hospitalization-level variables entered the kidney disease model. This model had excellent discrimination (c-statistic: 90.1%) and accurately predicted the probability of true renal failure. The probability threshold that maximized sensitivity and specificity for the identification of true kidney disease was 21.3% (sensitivity: 80.0%; specificity: 82.2%). Multiple variables available in administrative databases can be combined to quantify the probability that a person has a particular disease. This process permits accurate identification of a disease cohort in an administrative database. These methods may be extended to other diagnoses or procedures and could both facilitate and clarify the use of

  2. [Genetic mutation databases: stakes and perspectives for orphan genetic diseases].

    PubMed

    Humbertclaude, V; Tuffery-Giraud, S; Bareil, C; Thèze, C; Paulet, D; Desmet, F-O; Hamroun, D; Baux, D; Girardet, A; Collod-Béroud, G; Khau Van Kien, P; Roux, A-F; des Georges, M; Béroud, C; Claustres, M

    2010-10-01

    New technologies, which constantly become available for mutation detection and gene analysis, have contributed to an exponential rate of discovery of disease genes and variation in the human genome. The task of collecting and documenting this enormous amount of data in genetic databases represents a major challenge for the future of biological and medical science. The Locus Specific Databases (LSDBs) are so far the most efficient mutation databases. This review presents the main types of databases available for the analysis of mutations responsible for genetic disorders, as well as open perspectives for new therapeutic research or challenges for future medicine. Accurate and exhaustive collection of variations in human genomes will be crucial for research and personalized delivery of healthcare. Copyright © 2009 Elsevier Masson SAS. All rights reserved.

  3. Providing accurate near real-time fire alerts for Protected Areas through NASA FIRMS: Opportunities and Challenges

    NASA Astrophysics Data System (ADS)

    Ilavajhala, S.; Davies, D.; Schmaltz, J. E.; Wong, M.; Murphy, K. J.

    2013-12-01

    The NASA Fire Information for Resource Management System (FIRMS) is at the forefront of providing global near real-time (NRT) MODIS thermal anomalies / hotspot location data to end-users. FIRMS serves the data via an interactive Web GIS named Web Fire Mapper, downloads of NRT active fire, archive data downloads for MODIS hotspots dating back to 1999 and a hotspot email alert system. The FIRMS Email Alerts system has been successfully alerting users of fires in their area of interest in near real-time and/or via daily and weekly email summaries, with an option to receive MODIS hotspot data as a text file (CSV) attachment. Currently, there are more than 7000 email alert subscriptions from more than 100 countries. Specifically, the email alerts system is designed to generate and send an email alert for any region or area on the globe, with a special focus on providing alerts for protected areas worldwide. For many protected areas, email alerts are particularly useful for early fire detection, monitoring ongoing fires, as well as allocating resources to protect wildlife and natural resources of particular value. For protected areas, FIRMS uses the World Database on Protected Areas (WDPA) supplied by United Nations Environment Program - World Conservation Monitoring Centre (UNEP-WCMC). Maintaining the most up-to-date, accurate boundary geometry for the protected areas for the email alerts is a challenge as the WDPA is continuously updated due to changing boundaries, merging or delisting of certain protected areas. Because of this dynamic nature of the protected areas database, the FIRMS protected areas database is frequently out-of-date with the most current version of WDPA database. To maintain the most up-to-date boundary information for protected areas and to be in compliance with the WDPA terms and conditions, FIRMS needs to constantly update its database of protected areas. Currently, FIRMS strives to keep its database up to date by downloading the most recent

  4. STRAPS v1.0: Evaluating a methodology for predicting electron impact ionisation mass spectra for the aerosol mass spectrometer

    NASA Astrophysics Data System (ADS)

    Topping, David; Allan, James; Alfarra, Rami; Aumont, Bernard

    2017-04-01

    Our ability to model the chemical and thermodynamic processes that lead to secondary organic aerosol (SOA) formation is thought to be hampered by the complexity of the system. While there are fundamental models now available that can simulate the tens of thousands of reactions thought to take place, validation against experiments is highly challenging. Techniques capable of identifying individual molecules such as chromatography are generally only capable of quantifying a subset of the material present, making them unsuitable for a carbon budget analysis. Integrative analytical methods such as the Aerosol Mass Spectrometer (AMS) are capable of quantifying all mass, but because of their inability to isolate individual molecules, comparisons have been limited to simple data products such as total organic mass and O:C ratio. More detailed comparisons could be made if more of the mass spectral information could be used, but because a discrete inversion of AMS data is not possible, this activity requires a system of predicting mass spectra based on molecular composition. In this proof of concept study, the ability to train supervised methods to predict electron impact ionisation (EI) mass spectra for the AMS is evaluated. Supervised Training Regression for the Arbitrary Prediction of Spectra (STRAPS) is not built from first principles. A methodology is constructed whereby the presence of specific mass-to-charge ratio (m/z) channels is fit as a function of molecular structure before the relative peak height for each channel is similarly fit using a range of regression methods. The widely-used AMS mass spectral database is used as a basis for this, using unit mass resolution spectra of laboratory standards. Key to the fitting process is choice of structural information, or molecular fingerprint. Initial results suggest the generic public 'MACCS' fingerprints provide the most accurate trained model when combined with both decision trees and random forests with median

  5. Rapid identification of moulds and arthroconidial yeasts from positive blood cultures by MALDI-TOF mass spectrometry.

    PubMed

    de Almeida, João N; Sztajnbok, Jaques; da Silva, Afonso Rafael; Vieira, Vinicius Adriano; Galastri, Anne Layze; Bissoli, Leandro; Litvinov, Nadia; Del Negro, Gilda Maria Barbaro; Motta, Adriana Lopes; Rossi, Flávia; Benard, Gil

    2016-11-01

    Moulds and arthroconidial yeasts are potential life-threatening agents of fungemia in immunocompromised patients. Fast and accurate identification (ID) of these pathogens hastens initiation of targeted antifungal therapy, thereby improving the patients' prognosis. We describe a new strategy that enabled the identification of moulds and arthroconidial yeasts directly from positive blood cultures by MALDI-TOF mass spectrometry (MS). Positive blood cultures (BCs) with Gram staining showing hyphae and/or arthroconidia were prospectively selected and submitted to an in-house protein extraction protocol. Mass spectra were obtained by Vitek MS™ system, and identifications were carried out in the research use only (RUO) mode with an extended database (SARAMIS™ [v.4.12] plus in-house database). Fusarium solani, Fusarium verticillioides, Exophiala dermatitidis, Saprochaete clavata, and Trichosporon asahii had correct species ID by MALDI-TOF MS analysis of positive BCs. All cases were related to critically ill patients with high mortality fungemia and direct ID from positive BCs was helpful for rapid administration of targeted antifungal therapy. © The Author 2016. Published by Oxford University Press on behalf of The International Society for Human and Animal Mycology. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.

  6. Antarctic mass balance changes from GRACE

    NASA Astrophysics Data System (ADS)

    Kallenberg, B.; Tregoning, P.

    2012-04-01

    The Antarctic ice sheet contains ~30 million km³ of ice and constitutes a significant component of the global water balance with enough freshwater to raise global sea level by ~60 m. Altimetry measurements and climate models suggest variable behaviour across the Antarctic ice sheet, with thickening occurring in a vast area of East Antarctica and substantial thinning in West Antarctica caused by increased temperature gradients in the surrounding ocean. However, the rate at which the polar ice cap is melting is still poorly constrained. To calculate the mass loss of an ice sheet it is necessary to separate present day mass balance changes from glacial isostatic adjustment (GIA), the response of the Earth's crust to mass loss, wherefore it is essential to undertake sufficient geological and geomorphological sampling. As there is only a limited possibility for this in Antarctica, all models (i.e. geological, hydrological as well as atmospheric) are very poorly constrained. Therefore, space-geodetic observations play an important role in detecting changes in mass and spatial variations in the Earth's gravity field. The Gravity Recovery And Climate Experiment (GRACE) observed spatial variations in the Earth's gravity field over the past ten years. The satellite detects mass variations in the Earth system including geophysical, hydrological and atmospheric shifts. GRACE itself is not able to separate the GIA from mass balance changes and, due to the insufficient geological and geomorphological database, it is not possible to model the GIA effect accurately for Antarctica. However, the results from GRACE can be compared with other scientific results, coming from other geodetic observations such as satellite altimetry and GPS or by the use of geological observations. In our contribution we compare the GRACE data with recorded precipitation patterns and mass anomalies over East Antarctica to separate the observed GRACE signal into its two components: GIA as a result of mass

  7. Application of China's National Forest Continuous Inventory database.

    PubMed

    Xie, Xiaokui; Wang, Qingli; Dai, Limin; Su, Dongkai; Wang, Xinchuang; Qi, Guang; Ye, Yujing

    2011-12-01

    The maintenance of a timely, reliable and accurate spatial database on current forest ecosystem conditions and changes is essential to characterize and assess forest resources and support sustainable forest management. Information for such a database can be obtained only through a continuous forest inventory. The National Forest Continuous Inventory (NFCI) is the first level of China's three-tiered inventory system. The NFCI is administered by the State Forestry Administration; data are acquired by five inventory institutions around the country. Several important components of the database include land type, forest classification and age-class/age-group. The NFCI database in China is constructed based on 5-year inventory periods, resulting in some of the data not being timely when reports are issued. To address this problem, a forest growth simulation model has been developed to update the database for years between the periodic inventories. In order to aid in forest plan design and management, a three-dimensional virtual reality system of forest landscapes for selected units in the database (compartment or sub-compartment) has also been developed based on Virtual Reality Modeling Language. In addition, a transparent internet publishing system for a spatial database based on open source WebGIS (UMN Map Server) has been designed and utilized to enhance public understanding and encourage free participation of interested parties in the development, implementation, and planning of sustainable forest management.

  8. Accurate LC Peak Boundary Detection for 16O/18O Labeled LC-MS Data

    PubMed Central

    Cui, Jian; Petritis, Konstantinos; Tegeler, Tony; Petritis, Brianne; Ma, Xuepo; Jin, Yufang; Gao, Shou-Jiang (SJ); Zhang, Jianqiu (Michelle)

    2013-01-01

    In liquid chromatography-mass spectrometry (LC-MS), parts of LC peaks are often corrupted by their co-eluting peptides, which results in increased quantification variance. In this paper, we propose to apply accurate LC peak boundary detection to remove the corrupted part of LC peaks. Accurate LC peak boundary detection is achieved by checking the consistency of intensity patterns within peptide elution time ranges. In addition, we remove peptides with erroneous mass assignment through model fitness check, which compares observed intensity patterns to theoretically constructed ones. The proposed algorithm can significantly improve the accuracy and precision of peptide ratio measurements. PMID:24115998

  9. Device for accurately measuring mass flow of gases

    DOEpatents

    Hylton, J.O.; Remenyik, C.J.

    1994-08-09

    A device for measuring mass flow of gases which utilizes a substantially buoyant pressure vessel suspended within a fluid/liquid in an enclosure is disclosed. The pressure vessel is connected to a weighing device for continuously determining weight change of the vessel as a function of the amount of gas within the pressure vessel. In the preferred embodiment, this pressure vessel is formed from inner and outer right circular cylindrical hulls, with a volume between the hulls being vented to the atmosphere external the enclosure. The fluid/liquid, normally in the form of water typically with an added detergent, is contained within an enclosure with the fluid/liquid being at a level such that the pressure vessel is suspended beneath this level but above a bottom of the enclosure. The buoyant pressure vessel can be interconnected with selected valves to an auxiliary pressure vessel so that initial flow can be established to or from the auxiliary pressure vessel prior to flow to or from the buoyant pressure vessel. 5 figs.

  10. Image database for digital hand atlas

    NASA Astrophysics Data System (ADS)

    Cao, Fei; Huang, H. K.; Pietka, Ewa; Gilsanz, Vicente; Dey, Partha S.; Gertych, Arkadiusz; Pospiech-Kurkowska, Sywia

    2003-05-01

    Bone age assessment is a procedure frequently performed in pediatric patients to evaluate their growth disorder. A commonly used method is atlas matching by a visual comparison of a hand radiograph with a small reference set of old Greulich-Pyle atlas. We have developed a new digital hand atlas with a large set of clinically normal hand images of diverse ethnic groups. In this paper, we will present our system design and implementation of the digital atlas database to support the computer-aided atlas matching for bone age assessment. The system consists of a hand atlas image database, a computer-aided diagnostic (CAD) software module for image processing and atlas matching, and a Web user interface. Users can use a Web browser to push DICOM images, directly or indirectly from PACS, to the CAD server for a bone age assessment. Quantitative features on the examined image, which reflect the skeletal maturity, are then extracted and compared with patterns from the atlas image database to assess the bone age. The digital atlas method built on a large image database and current Internet technology provides an alternative to supplement or replace the traditional one for a quantitative, accurate and cost-effective assessment of bone age.

  11. Matrix-assisted laser desorption ionization-time of flight mass spectrometry for fast and accurate identification of Pseudallescheria/Scedosporium species.

    PubMed

    Sitterlé, E; Giraud, S; Leto, J; Bouchara, J P; Rougeron, A; Morio, F; Dauphin, B; Angebault, C; Quesne, G; Beretti, J L; Hassouni, N; Nassif, X; Bougnoux, M E

    2014-09-01

    An increasing number of infections due to Pseudallescheria/Scedosporium species has been reported during the past decades, both in immunocompromised and immunocompetent patients. Additionally, these fungi are now recognized worldwide as common agents of fungal colonization of the airways in cystic fibrosis patients, which represents a risk factor for disseminated infections after lung transplantation. Currently six species are described within the Pseudallescheria/Scedosporium genus, including Scedosporium prolificans and species of the Pseudallescheria/Scedosporium apiospermum complex (i.e. S. apiospermum sensu stricto, Pseudallescheria boydii, Scedosporium aurantiacum, Pseudallescheria minutispora and Scedosporium dehoogii). Precise identification of clinical isolates at the species level is required because these species differ in their antifungal drug susceptibility patterns. Matrix-assisted laser desorption ionization (MALDI)-time of flight (TOF)/mass spectrometry (MS) is a powerful tool to rapidly identify moulds at the species level. We investigated the potential of this technology to discriminate Pseudallescheria/Scedosporium species. Forty-seven reference strains were used to build a reference database library. Profiles from 3-, 5- and 7-day-old cultures of each reference strain were analysed to identify species-specific discriminating profiles. The database was tested for accuracy using a set of 64 clinical or environmental isolates previously identified by multilocus sequencing. All isolates were unequivocally identified at the species level by MALDI-TOF/MS. Our results, obtained using a simple protocol, without prior protein extraction or standardization of the culture, demonstrate that MALDI-TOF/MS is a powerful tool for rapid identification of Pseudallescheria/Scedosporium species that cannot be currently identified by morphological examination in the clinical setting. © 2014 The Authors Clinical Microbiology and Infection © 2014 European Society

  12. Identification of triacylglycerol using automated annotation of high resolution multistage mass spectral trees.

    PubMed

    Wang, Xiupin; Peng, Qingzhi; Li, Peiwu; Zhang, Qi; Ding, Xiaoxia; Zhang, Wen; Zhang, Liangxiao

    2016-10-12

    High complexity of identification for non-target triacylglycerols (TAGs) is a major challenge in lipidomics analysis. To identify non-target TAGs, a powerful tool named accurate MS(n) spectrometry generating so-called ion trees is used. In this paper, we presented a technique for efficient structural elucidation of TAGs on MS(n) spectral trees produced by LTQ Orbitrap MS(n), which was implemented as an open source software package, or TIT. The TIT software was used to support automatic annotation of non-target TAGs on MS(n) ion trees from a self-built fragment ion database. This database includes 19108 simulated TAG molecules from a random combination of fatty acids and corresponding 500582 self-built multistage fragment ions (MS ≤ 3). Our software can identify TAGs using a "stage-by-stage elimination" strategy. By utilizing the MS(1) accurate mass and referenced RKMD, the TIT software can discriminate unique elemental composition candidates. The regiospecific isomers of fatty acyl chains will be distinguished using MS(2) and MS(3) fragment spectra. We applied the algorithm to the selection of 45 TAG standards and demonstrated that the molecular ions could be 100% correctly assigned. Therefore, the TIT software could be applied to TAG identification in complex biological samples such as mouse plasma extracts. Copyright © 2016 Elsevier B.V. All rights reserved.

  13. NoSQL technologies for the CMS Conditions Database

    NASA Astrophysics Data System (ADS)

    Sipos, Roland

    2015-12-01

    With the restart of the LHC in 2015, the growth of the CMS Conditions dataset will continue, therefore the need of consistent and highly available access to the Conditions makes a great cause to revisit different aspects of the current data storage solutions. We present a study of alternative data storage backends for the Conditions Databases, by evaluating some of the most popular NoSQL databases to support a key-value representation of the CMS Conditions. The definition of the database infrastructure is based on the need of storing the conditions as BLOBs. Because of this, each condition can reach the size that may require special treatment (splitting) in these NoSQL databases. As big binary objects may be problematic in several database systems, and also to give an accurate baseline, a testing framework extension was implemented to measure the characteristics of the handling of arbitrary binary data in these databases. Based on the evaluation, prototypes of a document store, using a column-oriented and plain key-value store, are deployed. An adaption layer to access the backends in the CMS Offline software was developed to provide transparent support for these NoSQL databases in the CMS context. Additional data modelling approaches and considerations in the software layer, deployment and automatization of the databases are also covered in the research. In this paper we present the results of the evaluation as well as a performance comparison of the prototypes studied.

  14. Empirical cost models for estimating power and energy consumption in database servers

    NASA Astrophysics Data System (ADS)

    Valdivia Garcia, Harold Dwight

    The explosive growth in the size of data centers, coupled with the widespread use of virtualization technology, has made power and energy consumption major concerns for data center administrators. Provisioning decisions must take into consideration not only target application performance but also the power demands and total energy consumption incurred by the hardware and software to be deployed at the data center. Failure to do so will result in damaged equipment, power outages, and inefficient operation. Since database servers comprise one of the most popular and important server applications deployed in such facilities, it becomes necessary to have accurate cost models that can predict the power and energy demands that each database workload will impose on the system. In this work we present an empirical methodology to estimate the power and energy cost of database operations. Our methodology uses multiple-linear regression to derive accurate cost models that depend only on readily available statistics such as selectivity factors, tuple size, number of columns and relational cardinality. Moreover, our method does not need measurement of individual hardware components, but rather total power and energy consumption measured at a server. We have implemented our methodology, and ran experiments with several server configurations. Our experiments indicate that we can predict power and energy more accurately than alternative methods found in the literature.
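
    As a rough illustration of the multiple-linear-regression idea in this abstract, the sketch below fits an ordinary least-squares model by solving the normal equations in plain Python. The two predictors, the function names, and the synthetic numbers are all assumptions made for the example; the abstract does not give the authors' actual model form, features, or measurements.

```python
from typing import List, Sequence


def fit_linear_model(rows: Sequence[Sequence[float]], targets: Sequence[float]) -> List[float]:
    """Ordinary least squares via the normal equations (X^T X) c = X^T y.

    Returns [intercept, coef_1, coef_2, ...].
    """
    X = [[1.0] + [float(v) for v in r] for r in rows]  # prepend intercept column
    n = len(X[0])
    A = [[sum(x[i] * x[j] for x in X) for j in range(n)] for i in range(n)]
    b = [sum(x[i] * y for x, y in zip(X, targets)) for i in range(n)]

    # Gaussian elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(A[r][col]))
        if abs(A[pivot][col]) < 1e-12:
            raise ValueError("predictors are collinear; model is not identifiable")
        A[col], A[pivot] = A[pivot], A[col]
        b[col], b[pivot] = b[pivot], b[col]
        for r in range(col + 1, n):
            factor = A[r][col] / A[col][col]
            for c in range(col, n):
                A[r][c] -= factor * A[col][c]
            b[r] -= factor * b[col]

    # Back substitution.
    coef = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(A[i][j] * coef[j] for j in range(i + 1, n))
        coef[i] = (b[i] - tail) / A[i][i]
    return coef


def predict(coef: Sequence[float], row: Sequence[float]) -> float:
    """Evaluate the fitted model for one set of predictor values."""
    return coef[0] + sum(c * v for c, v in zip(coef[1:], row))


# Synthetic placeholder data generated from energy = 5 + 2*a + 0.5*b.
# The columns stand in for two workload statistics; they are not measurements.
features = [(1, 2), (2, 1), (3, 5), (4, 3), (0, 0)]
energy = [8.0, 9.5, 13.5, 14.5, 5.0]
model = fit_linear_model(features, energy)
print(model, predict(model, (2, 2)))
```

    With real data, each row would hold the readily available statistics for one query (for example selectivity and tuple size) and the target would be the power or energy measured at the server.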

  15. Remote Sensing Estimates of Glacier Mass Balance Changes in the Himalayas of Nepal

    NASA Astrophysics Data System (ADS)

    Ambinakudige, S.; Joshi, K.

    2011-12-01

    Mass balance changes of glaciers are important indicators of climate change. There are only 30 'reference' glaciers in the world that have continuous mass balance data with the World Glacier Monitoring Service since 1976. In particular, Himalayan glaciers are conspicuously absent from global mass balance records. This shows the urgent need for mass balance data for glaciers throughout the world. In this study, we estimated mass balance of some major glaciers in the Sagarmatha National Park (SNP) in Nepal using remote sensing applications. The SNP is one of the densest glaciated regions in the Himalayan range, containing approximately 296 glacial lakes. The region has experienced several glacial lake outburst floods (GLOFs) in recent years, causing extensive damage to local infrastructure and loss of human life. In general, mass balance is determined at seasonal or yearly intervals. Because of the rugged and difficult terrain of the Himalayan region, there are only a few field based measurements of mass balance available. Moreover, there are only a few cases where remote sensing methods were used to calculate mass balance of the Himalayan glaciers, due to the lack of accurate elevation data. Studies have shown that estimations of mass balance using remote sensing applications were within the range of field-based mass balance measurements from the same period. This study used ASTER VNIR, 3N (nadir view) and 3B (backward view) bands to generate Digital Elevation Models (DEMs) for the SNP area. 3N and 3B bands generate an along track stereo pair with a base-to-height (B/H) ratio of about 0.6. Accurate measurement of ground control points (GCPs), their numbers and distribution are important inputs in creating accurate DEMs. Because of the availability of topographic maps for this area, we were able to provide very accurate GCPs, in sufficient numbers and distribution. We created DEMs for the years 2002, 2003, 2004 and 2005 using ENVI DEM extraction tool. Bands

  16. Assigning statistical significance to proteotypic peptides via database searches

    PubMed Central

    Alves, Gelio; Ogurtsov, Aleksey Y.; Yu, Yi-Kuo

    2011-01-01

    Querying MS/MS spectra against a database containing only proteotypic peptides reduces data analysis time due to reduction of database size. Despite the speed advantage, this search strategy is challenged by issues of statistical significance and coverage. The former requires separating systematically significant identifications from less confident identifications, while the latter arises when the underlying peptide is not present, due to single amino acid polymorphisms (SAPs) or post-translational modifications (PTMs), in the proteotypic peptide libraries searched. To address both issues simultaneously, we have extended RAId’s knowledge database to include proteotypic information, utilized RAId’s statistical strategy to assign statistical significance to proteotypic peptides, and modified RAId’s programs to allow for consideration of proteotypic information during database searches. The extended database alleviates the coverage problem since all annotated modifications, even those occurring within proteotypic peptides, may be considered. Taking into account the likelihoods of observation, the statistical strategy of RAId provides accurate E-value assignments regardless of whether a candidate peptide is proteotypic or not. The advantage of including proteotypic information is evidenced by its superior retrieval performance when compared to regular database searches. PMID:21055489

  17. Clustering and Filtering Tandem Mass Spectra Acquired in Data-Independent Mode

    NASA Astrophysics Data System (ADS)

    Pak, Huisong; Nikitin, Frederic; Gluck, Florent; Lisacek, Frederique; Scherl, Alexander; Muller, Markus

    2013-12-01

    Data-independent mass spectrometry activates all ion species isolated within a given mass-to-charge window (m/z) regardless of their abundance. This acquisition strategy overcomes the traditional data-dependent ion selection, boosting data reproducibility and sensitivity. However, several tandem mass (MS/MS) spectra of the same precursor ion are acquired during chromatographic elution, resulting in large data redundancy. Also, the significant number of chimeric spectra and the absence of accurate precursor ion masses hamper peptide identification. Here, we describe an algorithm to preprocess data-independent MS/MS spectra by filtering out noise peaks and clustering the spectra according to both the chromatographic elution profiles and the spectral similarity. In addition, we developed an approach to estimate the m/z value of precursor ions from clustered MS/MS spectra in order to improve database search performance. Data acquired using a small 3 m/z units precursor mass window and multiple injections to cover an m/z range of 400-1400 was processed with our algorithm. It showed an improvement in the number of both peptide and protein identifications by 8% while reducing the number of submitted spectra by 18% and the number of peaks by 55%. We conclude that our clustering method is a valid approach for data analysis of these data-independent fragmentation spectra. The software including the source code is available for the scientific community.
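
    The abstract does not say which spectral similarity measure the authors used for clustering. Purely as an illustration, the sketch below computes a cosine similarity between two MS/MS peak lists after binning them on the m/z axis, which is one common way such a comparison can be made. The function names, the 1.0 m/z bin width, and the example peaks are assumptions, not details from the paper.

```python
import math
from collections import defaultdict
from typing import Dict, Iterable, Tuple

Peak = Tuple[float, float]  # (m/z, intensity)


def bin_spectrum(peaks: Iterable[Peak], bin_width: float = 1.0) -> Dict[int, float]:
    """Sum peak intensities into fixed-width m/z bins keyed by bin index."""
    binned: Dict[int, float] = defaultdict(float)
    for mz, intensity in peaks:
        binned[int(mz // bin_width)] += intensity
    return dict(binned)


def cosine_similarity(a: Dict[int, float], b: Dict[int, float]) -> float:
    """Cosine of the angle between two binned spectra; 0.0 if either is empty."""
    dot = sum(v * b.get(k, 0.0) for k, v in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


# Made-up peak lists: the first two share fragment bins, the third does not.
spec_a = bin_spectrum([(100.2, 10.0), (200.5, 5.0)])
spec_b = bin_spectrum([(100.4, 20.0), (200.1, 10.0)])
spec_c = bin_spectrum([(300.0, 1.0)])
print(cosine_similarity(spec_a, spec_b), cosine_similarity(spec_a, spec_c))
```

    A clustering step along the lines described would group spectra only when such a similarity score is high and their chromatographic elution profiles also overlap; the elution-profile comparison is not shown here.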

  18. Mammography status using patient self-reports and computerized radiology database.

    PubMed

    Thompson, B; Taylor, V; Goldberg, H; Mullen, M

    1999-10-01

    This study sought to compare self-reported mammography use of low-income women utilizing an inner-city public hospital with a computerized hospital database for tracking mammography use. A survey of all age-eligible women using the hospital's internal medicine clinic was done; responses were matched with the radiology database. We examined concordance among the two data sources. Concordance between self-report and the database was high (82%) when using "ever had a mammogram at the hospital," but low (58%) when comparing self-reported last mammogram with the information contained in the database. Disagreements existed between self-reports and the database. Because we sought to ensure that women would know exactly what a mammogram entailed by including a picture of a woman having a mammogram, it is possible that women's responses were accurate, leading to concerns that discrepancies might be present in the database. Physicians and staff must ensure that they understand the full history of a woman's experience with mammography before recommending for or against the procedure.

  19. Evolution, Nucleosynthesis, and Yields of Low-mass Asymptotic Giant Branch Stars at Different Metallicities. II. The FRUITY Database

    NASA Astrophysics Data System (ADS)

    Cristallo, S.; Piersanti, L.; Straniero, O.; Gallino, R.; Domínguez, I.; Abia, C.; Di Rico, G.; Quintini, M.; Bisterzo, S.

    2011-12-01

    Using updated low-mass stellar models, we systematically investigate the nucleosynthesis processes occurring in asymptotic giant branch (AGB) stars. In this paper, we present a database dedicated to the nucleosynthesis of AGB stars: FRANEC Repository of Updated Isotopic Tables & Yields (FRUITY). An interactive Web-based interface allows users to freely download the full (from H to Bi) isotopic composition, as it changes after each third dredge-up (TDU) episode and the stellar yields the models produce. A first set of AGB models, having masses in the range 1.5 ≤ M/M⊙ ≤ 3.0 and metallicities 1 × 10⁻³ ≤ Z ≤ 2 × 10⁻², is discussed. For each model, a detailed description of the physical and the chemical evolution is provided. In particular, we illustrate the details of the s-process and we evaluate the theoretical uncertainties due to the parameterization adopted to model convection and mass loss. The resulting nucleosynthesis scenario is checked by comparing the theoretical [hs/ls] and [Pb/hs] ratios to those obtained from the available abundance analysis of s-enhanced stars. On the average, the variation with the metallicity of these spectroscopic indexes is well reproduced by theoretical models, although the predicted spread at a given metallicity is substantially smaller than the observed one. Possible explanations for such a difference are briefly discussed. An independent check of the TDU efficiency is provided by the C-stars luminosity function. Consequently, theoretical C-stars luminosity functions for the Galactic disk and the Magellanic Clouds have been derived. We generally find good agreement with observations.

  20. X-Ray Transition Energies Database

    National Institute of Standards and Technology Data Gateway

    SRD 128 NIST X-Ray Transition Energies Database (Web, free access)   This X-ray transition table provides the energies and wavelengths for the K and L transitions connecting energy levels having principal quantum numbers n = 1, 2, 3, and 4. The elements covered include Z = 10, neon to Z = 100, fermium. There are two unique features of this data base: (1) a serious attempt to have all experimental values on a scale consistent with the International System of measurement (the SI) and (2) inclusion of accurate theoretical estimates for all transitions.

  1. Design and Establishment of Quality Model of Fundamental Geographic Information Database

    NASA Astrophysics Data System (ADS)

    Ma, W.; Zhang, J.; Zhao, Y.; Zhang, P.; Dang, Y.; Zhao, T.

    2018-04-01

    In order to make the quality evaluation for the Fundamental Geographic Information Databases (FGIDB) more comprehensive, objective and accurate, this paper studies and establishes a quality model of FGIDB, which is formed by the standardization of database construction and quality control, the conformity of data set quality and the functionality of the database management system. It also designs the overall principles, contents and methods of the quality evaluation for FGIDB, providing the basis and reference for carrying out quality control and quality evaluation for FGIDB. This paper designs the quality elements, evaluation items and properties of the Fundamental Geographic Information Database step by step, based on the quality model framework. Connected organically, these quality elements and evaluation items constitute the quality model of the Fundamental Geographic Information Database. This model is the foundation for the quality demand stipulation and quality evaluation of the Fundamental Geographic Information Database, and is of great significance for quality assurance in the design and development stage, demand formulation in the testing evaluation stage, and the construction of a standard system for quality evaluation technology of the Fundamental Geographic Information Database.

  2. Proteomics: Protein Identification Using Online Databases

    ERIC Educational Resources Information Center

    Eurich, Chris; Fields, Peter A.; Rice, Elizabeth

    2012-01-01

    Proteomics is an emerging area of systems biology that allows simultaneous study of thousands of proteins expressed in cells, tissues, or whole organisms. We have developed this activity to enable high school or college students to explore proteomic databases using mass spectrometry data files generated from yeast proteins in a college laboratory…

  3. DtaRefinery: a software tool for elimination of systematic errors from parent ion mass measurements in tandem mass spectra datasets

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Petyuk, Vladislav A.; Mayampurath, Anoop M.; Monroe, Matthew E.

    2009-12-16

    Hybrid two-stage mass spectrometers capable of both highly accurate mass measurement and MS/MS fragmentation have become widely available in recent years and have allowed for significantly better discrimination between true and false MS/MS peptide identifications by applying relatively narrow windows for maximum allowable deviations for parent ion mass measurements. To fully gain the advantage of highly accurate parent ion mass measurements, it is important to limit systematic mass measurement errors. The DtaRefinery software tool can correct systematic errors in parent ion masses by reading a set of fragmentation spectra, searching for MS/MS peptide identifications, then fitting a model that can estimate systematic errors, and removing them. This results in a new fragmentation spectrum file with updated parent ion masses.
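
    The correction step described in this abstract can be illustrated with a minimal sketch. It assumes the simplest possible error model, a constant offset in parts per million estimated as the median error over confidently identified peptides; the form of the model that DtaRefinery actually fits is not given in this abstract. All function names and the example masses are hypothetical.

```python
from statistics import median

def ppm_error(observed_mz, theoretical_mz):
    """Relative mass measurement error in parts per million."""
    return (observed_mz - theoretical_mz) / theoretical_mz * 1e6

def estimate_systematic_shift(identified):
    """Median ppm error over (observed, theoretical) m/z pairs.

    Assumption: the systematic error is a single constant ppm offset.
    """
    return median(ppm_error(obs, theo) for obs, theo in identified)

def correct_mz(observed_mz, shift_ppm):
    """Remove the estimated systematic offset from a measured m/z."""
    return observed_mz / (1 + shift_ppm * 1e-6)

# Hypothetical identifications, all measured 5 ppm too high.
theoretical = [500.0, 800.0, 1200.0]
identified = [(t * (1 + 5e-6), t) for t in theoretical]

shift = estimate_systematic_shift(identified)
corrected = [correct_mz(obs, shift) for obs, _ in identified]
print(f"estimated shift: {shift:.3f} ppm")
```

    A median is used here because it is robust to a few false identifications; a real tool would likely let the error depend on m/z, elution time, or signal intensity.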

  4. Profiling modifications for glioblastoma proteome using ultra-tolerant database search: Are the peptide mass shifts biologically relevant or chemically induced?

    PubMed

    Tarasova, Irina A; Chumakov, Peter M; Moshkovskii, Sergei A; Gorshkov, Mikhail V

    2018-05-17

    Peptide mass shifts were profiled using ultra-tolerant database search strategy for shotgun proteomics data sets of human glioblastoma cell lines demonstrating strong response to the type I interferon (IFNα-2b) treatment. The main objective of this profiling was revealing the cell response to IFN treatment at the level of protein modifications. To achieve this objective, statistically significant changes in peptide mass shift profiles between IFN treated and untreated glioblastoma samples were analyzed. Detailed analysis of MS/MS spectra allowed further interpretation of the observed mass shifts and differentiation between post-translational and artifact modifications. Malignant cells typically acquire increased sensitivity to viruses due to the deregulated antiviral mechanisms. Therefore, a viral therapy is considered as one of the promising approaches to treat cancer. However, recent studies have demonstrated that malignant cells can preserve intact antiviral mechanisms, e.g. interferon signaling, and develop resistance to virus infection in response to interferon treatment. Post translational modifications, e.g. tyrosine phosphorylation, are the interferon signaling drivers. Thus, comprehensive characterization of modifications is crucially important, yet, most challenging problem in cancer proteomics. Here, we report on the application of the recently introduced ultra-tolerant search strategy for profiling peptide modifications in the human glioblastoma cell lines demonstrating strong response to the type I interferon (IFNα-2b) treatment. The specific aim of the study was identification of statistically significant changes in peptide mass shift profiles between IFN treated and untreated glioblastoma samples, as well as determination of whether these shifts represent the biologically relevant modification. Copyright © 2018 Elsevier B.V. All rights reserved.

  5. MEASURING THE MASS OF SOLAR SYSTEM PLANETS USING PULSAR TIMING

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Champion, D. J.; Hobbs, G. B.; Manchester, R. N.

    High-precision pulsar timing relies on a solar system ephemeris in order to convert times of arrival (TOAs) of pulses measured at an observatory to the solar system barycenter. Any error in the conversion to the barycentric TOAs leads to a systematic variation in the observed timing residuals; specifically, an incorrect planetary mass leads to a predominantly sinusoidal variation having a period and phase associated with the planet's orbital motion about the Sun. By using an array of pulsars (PSRs J0437-4715, J1744-1134, J1857+0943, J1909-3744), the masses of the planetary systems from Mercury to Saturn have been determined. These masses are consistent with the best-known masses determined by spacecraft observations, with the mass of the Jovian system, 9.547921(2) × 10⁻⁴ M_sun, being significantly more accurate than the mass determined from the Pioneer and Voyager spacecraft, and consistent with but less accurate than the value from the Galileo spacecraft. While spacecraft are likely to produce the most accurate measurements for individual solar system bodies, the pulsar technique is sensitive to planetary system masses and has the potential to provide the most accurate values of these masses for some planets.

  6. Accurate Identification of Unknown and Known Metabolic Mixture Components by Combining 3D NMR with Fourier Transform Ion Cyclotron Resonance Tandem Mass Spectrometry.

    PubMed

    Wang, Cheng; He, Lidong; Li, Da-Wei; Bruschweiler-Li, Lei; Marshall, Alan G; Brüschweiler, Rafael

    2017-10-06

    Metabolite identification in metabolomics samples is a key step that critically impacts downstream analysis. We recently introduced the SUMMIT NMR/mass spectrometry (MS) hybrid approach for the identification of the molecular structure of unknown metabolites based on the combination of NMR, MS, and combinatorial cheminformatics. Here, we demonstrate the feasibility of the approach for an untargeted analysis of both a model mixture and E. coli cell lysate based on 2D/3D NMR experiments in combination with Fourier transform ion cyclotron resonance MS and MS/MS data. For 19 of the 25 model metabolites, SUMMIT yielded complete structures that matched those in the mixture independent of database information. Of those, seven top-ranked structures matched those in the mixture, and four of those were further validated by positive ion MS/MS. For five metabolites, not part of the 19 metabolites, correct molecular structural motifs could be identified. For E. coli, SUMMIT MS/NMR identified 20 previously known metabolites with three or more ¹H spins independent of database information. Moreover, for 15 unknown metabolites, molecular structural fragments were determined consistent with their spin systems and chemical shifts. By providing structural information for entire metabolites or molecular fragments, SUMMIT MS/NMR greatly assists the targeted or untargeted analysis of complex mixtures of unknown compounds.

  7. Detection and Identification of Heme c-Modified Peptides by Histidine Affinity Chromatography, High-Performance Liquid Chromatography-Mass Spectrometry, and Database Searching

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Merkley, Eric D.; Anderson, Brian J.; Park, Jea H.

    2012-12-07

    Multiheme c-type cytochromes (proteins with covalently attached heme c moieties) play important roles in extracellular metal respiration in dissimilatory metal-reducing bacteria. Liquid chromatography-tandem mass spectrometry (LC-MS/MS) characterization of c-type cytochromes is hindered by the presence of multiple heme groups, since the heme c modified peptides are typically not observed, or if observed, not identified. Using a recently reported histidine affinity chromatography (HAC) procedure, we enriched heme c tryptic peptides from purified bovine heart cytochrome c, a bacterial decaheme cytochrome, and subjected these samples to LC-MS/MS analysis. Enriched bovine cytochrome c samples yielded three- to six-fold more confident peptide-spectrum matches to heme-c containing peptides than unenriched digests. In unenriched digests of the decaheme cytochrome MtoA from Sideroxydans lithotrophicus ES-1, heme c peptides for four of the ten expected sites were observed by LC-MS/MS; following HAC fractionation, peptides covering nine out of ten sites were obtained. Heme c peptide spiked into E. coli lysates at mass ratios as low as 10⁻⁴ was detected with good signal-to-noise after HAC and LC-MS/MS analysis. In addition to HAC, we have developed a proteomics database search strategy that takes into account the unique physicochemical properties of heme c peptides. The results suggest that accounting for the double thioether link between heme c and peptide, and the use of the labile heme fragment as a reporter ion, can improve database searching results. The combination of affinity chromatography and heme-specific informatics yielded increases in the number of peptide-spectrum matches of 20-100-fold for bovine cytochrome c.

  8. Basophile: Accurate Fragment Charge State Prediction Improves Peptide Identification Rates

    DOE PAGES

    Wang, Dong; Dasari, Surendra; Chambers, Matthew C.; ...

    2013-03-07

    In shotgun proteomics, database search algorithms rely on fragmentation models to predict fragment ions that should be observed for a given peptide sequence. The most widely used strategy (Naive model) is oversimplified, cleaving all peptide bonds with equal probability to produce fragments of all charges below that of the precursor ion. More accurate models, based on fragmentation simulation, are too computationally intensive for on-the-fly use in database search algorithms. We have created an ordinal-regression-based model called Basophile that takes fragment size and basic residue distribution into account when determining the charge retention during CID/higher-energy collision induced dissociation (HCD) of charged peptides. This model improves the accuracy of predictions by reducing the number of unnecessary fragments that are routinely predicted for highly-charged precursors. Basophile increased the identification rates by 26% (on average) over the Naive model, when analyzing triply-charged precursors from ion trap data. Basophile achieves simplicity and speed by solving the prediction problem with an ordinal regression equation, which can be incorporated into any database search software for shotgun proteomic identification.
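
    To make the Naive model concrete, the sketch below enumerates the b and y ions it would predict: every peptide bond is cleaved and each fragment is listed at every charge below the precursor charge. This is an illustration, not code from Basophile or any search engine. The function name is hypothetical, the residue masses are standard monoisotopic values rounded to five decimals, and allowing charge 1 for singly charged precursors is an assumption made here.

```python
# Monoisotopic residue masses in Da (standard values, rounded).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
PROTON = 1.007276   # Da
WATER = 18.010565   # Da

def naive_fragments(peptide, precursor_charge):
    """List (ion name, charge, m/z) for every b/y ion of the Naive model."""
    # Assumption: a singly charged precursor still yields 1+ fragments.
    max_charge = max(1, precursor_charge - 1)
    fragments = []
    for i in range(1, len(peptide)):          # one cleavage per peptide bond
        b_mass = sum(RESIDUE_MASS[aa] for aa in peptide[:i])
        y_mass = sum(RESIDUE_MASS[aa] for aa in peptide[i:]) + WATER
        for z in range(1, max_charge + 1):    # every charge below the precursor
            fragments.append((f"b{i}", z, (b_mass + z * PROTON) / z))
            fragments.append((f"y{len(peptide) - i}", z, (y_mass + z * PROTON) / z))
    return fragments

frags = naive_fragments("PEPTIDE", 3)
print(len(frags), "predicted fragments")
```

    For a triply charged 7-residue peptide this yields 6 bonds × 2 ion types × 2 charges; the count grows with precursor charge, which is the over-prediction that a charge-retention model aims to reduce.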

  9. An optimized method for the accurate determination of patulin in apple products by isotope dilution-liquid chromatography/mass spectrometry.

    PubMed

    Seo, Miyeong; Kim, Byungjoo; Baek, Song-Yee

    2015-07-01

    Patulin, a mycotoxin produced by several molds in fruits, has been frequently detected in apple products. Therefore, regulatory bodies have established recommended maximum permitted patulin concentrations for each type of apple product. Although several analytical methods have been adopted to determine patulin in food, quality control of patulin analysis is not easy, as reliable certified reference materials (CRMs) are not available. In this study, as a part of a project for developing CRMs for patulin analysis, we developed isotope dilution liquid chromatography-tandem mass spectrometry (ID-LC/MS/MS) as a higher-order reference method for the accurate value-assignment of CRMs. (13)C7-patulin was used as internal standard. Samples were extracted with ethyl acetate to improve recovery. For further sample cleanup with solid-phase extraction (SPE), the HLB SPE cartridge was chosen after comparing with several other types of SPE cartridges. High-performance liquid chromatography was performed on a multimode column for proper retention and separation of highly polar and water-soluble patulin from sample interferences. Sample extracts were analyzed by LC/MS/MS with electrospray ionization in negative ion mode with selected reaction monitoring of patulin and (13)C7-patulin at m/z 153→m/z 109 and m/z 160→m/z 115, respectively. The validity of the method was tested by measuring gravimetrically fortified samples of various apple products. In addition, the repeatability and the reproducibility of the method were tested to evaluate the performance of the method. The method was shown to provide accurate measurements in the 3-40 μg/kg range with a relative expanded uncertainty of around 1%.

  10. Method for calibrating mass spectrometers

    DOEpatents

    Anderson, Gordon A [Benton City, WA; Brands, Michael D [Richland, WA; Bruce, James E [Schwenksville, PA; Pasa-Tolic, Ljiljana [Richland, WA; Smith, Richard D [Richland, WA

    2002-12-24

    A method whereby a mass spectrum generated by a mass spectrometer is calibrated by shifting the parameters used by the spectrometer to assign masses to the spectrum in a manner which reconciles the signals of ions within the spectrum having equal mass but differing charge states, or by reconciling ions having known differences in mass to relative values consistent with those known differences. In this manner, the mass spectrometer is calibrated without the need for standards while allowing the generation of highly accurate mass spectra by the instrument.
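
    The idea of reconciling charge states without standards can be shown with a toy calculation. The sketch assumes the only calibration error is a constant additive m/z offset, which is far simpler than the instrument parameters the patent refers to; the function names and the example mass are hypothetical. Under that assumption, the neutral mass computed from charge state z is biased by z times the offset, so two charge states of the same molecule determine the offset.

```python
PROTON = 1.007276  # Da

def neutral_mass(mz, z):
    """Neutral mass implied by an [M + zH]z+ ion observed at m/z."""
    return z * (mz - PROTON)

def estimate_offset(mz_a, z_a, mz_b, z_b):
    """Additive m/z offset that reconciles two charge states of one species.

    With a constant offset d, each charge state implies a mass M + z*d,
    so the difference of the two implied masses equals (z_a - z_b) * d.
    """
    return (neutral_mass(mz_a, z_a) - neutral_mass(mz_b, z_b)) / (z_a - z_b)

# Hypothetical molecule observed at 5+ and 10+ with a 0.01 m/z miscalibration.
true_mass, true_offset = 8564.63, 0.01
mz_5 = true_mass / 5 + PROTON + true_offset
mz_10 = true_mass / 10 + PROTON + true_offset

offset = estimate_offset(mz_5, 5, mz_10, 10)
calibrated = [neutral_mass(mz_5 - offset, 5), neutral_mass(mz_10 - offset, 10)]
print(f"estimated offset: {offset:.4f}")
```

    After subtracting the estimated offset, both charge states give the same neutral mass, which is the consistency condition the abstract describes.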

  11. Food Composition Database Format and Structure: A User Focused Approach

    PubMed Central

    Clancy, Annabel K.; Woods, Kaitlyn; McMahon, Anne; Probst, Yasmine

    2015-01-01

    This study aimed to investigate the needs of Australian food composition database users regarding database format and relate this to the format of databases available globally. Three semi structured synchronous online focus groups (M = 3, F = 11) and n = 6 female key informant interviews were recorded. Beliefs surrounding the use, training, understanding, benefits and limitations of food composition data and databases were explored. Verbatim transcriptions underwent preliminary coding followed by thematic analysis with NVivo qualitative analysis software to extract the final themes. Schematic analysis was applied to the final themes related to database format. Desktop analysis also examined the format of six key globally available databases. 24 dominant themes were established, of which five related to format; database use, food classification, framework, accessibility and availability, and data derivation. Desktop analysis revealed that food classification systems varied considerably between databases. Microsoft Excel was a common file format used in all databases, and available software varied between countries. Users also recognised that food composition database format should ideally be designed specifically for the intended use, have a user-friendly food classification system, incorporate accurate data with clear explanation of data derivation and feature user input. However, such databases are limited by data availability and resources. Further exploration of data sharing options should be considered. Furthermore, users’ understanding of food composition data and database limitations is inherent to the correct application of non-specific databases. Therefore, further exploration of user FCDB training should also be considered. PMID:26554836

  12. Searching molecular structure databases with tandem mass spectra using CSI:FingerID

    PubMed Central

    Dührkop, Kai; Shen, Huibin; Meusel, Marvin; Rousu, Juho; Böcker, Sebastian

    2015-01-01

    Metabolites provide a direct functional signature of cellular state. Untargeted metabolomics experiments usually rely on tandem MS to identify the thousands of compounds in a biological sample. Today, the vast majority of metabolites remain unknown. We present a method for searching molecular structure databases using tandem MS data of small molecules. Our method computes a fragmentation tree that best explains the fragmentation spectrum of an unknown molecule. We use the fragmentation tree to predict the molecular structure fingerprint of the unknown compound using machine learning. This fingerprint is then used to search a molecular structure database such as PubChem. Our method is shown to improve on the competing methods for computational metabolite identification by a considerable margin. PMID:26392543
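
    The final step of this pipeline, searching a structure database with a predicted fingerprint, can be sketched as a ranking problem. The example below ranks candidates by Tanimoto similarity between bit sets; this is a simple stand-in chosen for illustration, not the scoring used by CSI:FingerID, and the candidate names and fingerprint bits are made up.

```python
def tanimoto(a, b):
    """Tanimoto (Jaccard) similarity between two sets of 'on' fingerprint bits."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0

def rank_candidates(predicted, database):
    """Return candidate names sorted from most to least similar."""
    return sorted(database,
                  key=lambda name: tanimoto(predicted, database[name]),
                  reverse=True)

# Hypothetical predicted fingerprint and a tiny hypothetical structure database.
predicted_bits = {1, 4, 7, 9, 12}
structure_db = {
    "candidate_A": {1, 4, 7, 9, 12, 15},  # 5 shared bits of 6 in the union
    "candidate_B": {1, 4, 20, 21},        # 2 shared bits of 7 in the union
    "candidate_C": {30, 31},              # no shared bits
}

ranking = rank_candidates(predicted_bits, structure_db)
print(ranking)
```

    In practice the candidate set would first be restricted, for example to structures with the molecular formula inferred for the unknown, before any fingerprint comparison.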

  13. Mass spectrometry applied to the identification of Mycobacterium tuberculosis and biomarker discovery.

    PubMed

    López-Hernández, Y; Patiño-Rodríguez, O; García-Orta, S T; Pinos-Rodríguez, J M

    2016-12-01

    An adequate and effective tuberculosis (TB) diagnosis system has been identified by the World Health Organization as a priority in the fight against this disease. Over the years, several methods have been developed to identify the bacillus, but bacterial culture remains one of the most affordable methods for most countries. For rapid and accurate identification, however, it is more feasible to implement molecular techniques, taking advantage of the availability of public databases containing protein sequences. Mass spectrometry (MS) has become an interesting technique for the identification of TB. Here, we review some of the most widely employed methods for identifying Mycobacterium tuberculosis and present an update on MS applied for the identification of mycobacterial species. © 2016 The Society for Applied Microbiology.

  14. Rapid and accurate bacterial identification in probiotics and yoghurts by MALDI-TOF mass spectrometry.

    PubMed

    Angelakis, Emmanouil; Million, Matthieu; Henry, Mireille; Raoult, Didier

    2011-10-01

    Probiotic food is manufactured by adding probiotic strains simultaneously with starter cultures in fermentation tanks. Here, we investigate the accuracy and feasibility of matrix-assisted laser desorption/ionisation time-of-flight mass spectrometry (MALDI-TOF MS) for bacterial identification at the species level in probiotic food and yoghurts. Probiotic food and yoghurts were cultured in Columbia and Lactobacillus specific agar and tested by quantitative real-time PCR (qPCR) for the detection and quantification of Lactobacillus sp. Bacterial identification was performed by MALDI-TOF analysis and by amplification and sequencing of tuf and 16S rDNA genes. We tested 13 probiotic food and yoghurts and we identified by qPCR that they presented 10⁶ to 10⁷ copies of Lactobacillus spp. DNA/g. All products contained very large numbers of living bacteria varying from 10⁶ to 10⁹ colony forming units/g. These bacteria were identified as Lactobacillus casei, Lactococcus lactis, Bifidobacterium animalis, Lactobacillus delbrueckii, and Streptococcus thermophilus. MALDI-TOF MS presented 92% specificity compared to the molecular assays. In one product we found L. lactis, instead of Bifidus spp. which was mentioned on the label and for another L. delbrueckii and S. thermophilus instead of Bifidus spp. MALDI-TOF MS allows a rapid and accurate bacterial identification at the species level in probiotic food and yoghurts. Although the safety and functionality of probiotics are species and strain dependent, we found a discrepancy between the bacterial strain announced on the label and the strain identified. Practical Application: MALDI-TOF MS is rapid and specific for the identification of bacteria in probiotic food and yoghurts. Although the safety and functionality of probiotics are species and strain dependent, we found a discrepancy between the bacterial strain announced on the label and the strain identified. © 2011 Institute of Food Technologists®

  15. Accurate quantification of creatinine in serum by coupling a measurement standard to extractive electrospray ionization mass spectrometry

    NASA Astrophysics Data System (ADS)

    Huang, Keke; Li, Ming; Li, Hongmei; Li, Mengwan; Jiang, You; Fang, Xiang

    2016-01-01

    Ambient ionization (AI) techniques have been widely used in chemistry, medicine, material science, environmental science, and forensic science. AI takes advantage of direct desorption/ionization of chemicals in raw samples under ambient environmental conditions with minimal or no sample preparation. However, its quantitative accuracy is restricted by matrix effects during the ionization process. To improve the quantitative accuracy of AI, a matrix reference material, which is a particular form of measurement standard, was coupled to an AI technique in this study. Consequently, the analyte concentration in a complex matrix can be easily quantified with high accuracy. As a demonstration, this novel method was applied for the accurate quantification of creatinine in serum by using extractive electrospray ionization (EESI) mass spectrometry. Over the concentration range investigated (0.166 ~ 1.617 μg/mL), a calibration curve was obtained with a satisfactory linearity (R² = 0.994), and acceptable relative standard deviations (RSD) of 4.6 ~ 8.0% (n = 6). Finally, the creatinine concentration value of a serum sample was determined to be 36.18 ± 1.08 μg/mL, which is in excellent agreement with the certified value of 35.16 ± 0.39 μg/mL.

  16. Investigation of mass dependence effects for the accurate determination of molybdenum isotope amount ratios by MC-ICP-MS using synthetic isotope mixtures.

    PubMed

    Malinovsky, Dmitry; Dunn, Philip J H; Petrov, Panayot; Goenaga-Infante, Heidi

    2015-01-01

    Methodology for absolute Mo isotope amount ratio measurements by multicollector inductively coupled plasma-mass spectrometry (MC-ICP-MS) using calibration with synthetic isotope mixtures (SIMs) is presented. For the first time, synthetic isotope mixtures prepared from seven commercially available isotopically enriched molybdenum metal powders ((92)Mo, (94)Mo, (95)Mo, (96)Mo, (97)Mo, (98)Mo, and (100)Mo) are used to investigate whether instrumental mass discrimination of Mo isotopes in MC-ICP-MS is consistent with mass-dependent isotope distribution. The parent materials were dissolved and mixed as solutions to obtain mixtures with accurately known isotope amount ratios. The level of elemental impurities in the isotopically enriched molybdenum metal powders was quantified by ICP-MS by using both high-resolution and reaction cell instruments to completely resolve spectral interferences. The Mo isotope amount ratio values with expanded uncertainty (k = 2), determined by MC-ICP-MS for a high-purity Mo rod from Johnson Matthey, were as follows: (92)Mo/(95)Mo = 0.9235(9), (94)Mo/(95)Mo = 0.5785(8), (96)Mo/(95)Mo = 1.0503(9), (97)Mo/(95)Mo = 0.6033(6), (98)Mo/(95)Mo = 1.5291(20), and (100)Mo/(95)Mo = 0.6130(7). A full uncertainty budget for the measurements is presented which shows that the largest contribution to the uncertainty budget comes from correction for elemental impurities (∼51%), followed by the contribution from weighing operations (∼26 %). The atomic weight of molybdenum was calculated to be 95.947(2); the uncertainty in parentheses is expanded uncertainty with the coverage factor of 2. A particular advantage of the developed method is that calibration factors for all six Mo isotope amount ratios, involving the (95)Mo isotope, were experimentally determined. This allows avoiding any assumption on mass-dependent isotope fractions in MC-ICP-MS, inherent to the method of double spike previously used for Mo isotope amount ratio
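
    The atomic weight quoted in this abstract can be checked from the six reported isotope amount ratios, since the atomic weight is the abundance-weighted mean of the isotope masses. The sketch below uses the ratios exactly as given above, with (95)Mo/(95)Mo = 1 by definition. The isotope masses are not in the abstract; they are approximate values from standard atomic mass tables, rounded to five decimals, and should be treated as an assumption.

```python
# Isotope amount ratios n(iMo)/n(95Mo) as reported in the abstract.
RATIO_TO_95MO = {
    92: 0.9235, 94: 0.5785, 95: 1.0, 96: 1.0503,
    97: 0.6033, 98: 1.5291, 100: 0.6130,
}

# Approximate isotope masses in Da (assumed, rounded; not from the abstract).
ISOTOPE_MASS = {
    92: 91.90681, 94: 93.90508, 95: 94.90584, 96: 95.90468,
    97: 96.90602, 98: 97.90540, 100: 99.90747,
}

def atomic_weight(ratios, masses):
    """Abundance-weighted mean mass: sum(R_i * M_i) / sum(R_i)."""
    total = sum(ratios.values())
    return sum(ratios[i] * masses[i] for i in ratios) / total

def abundances(ratios):
    """Isotope amount fractions obtained by normalising the ratios."""
    total = sum(ratios.values())
    return {i: r / total for i, r in ratios.items()}

aw = atomic_weight(RATIO_TO_95MO, ISOTOPE_MASS)
fractions = abundances(RATIO_TO_95MO)
print(f"atomic weight of Mo: {aw:.4f}")
```

    The result agrees with the reported 95.947(2) within the stated expanded uncertainty; the sketch does not propagate the uncertainties of the individual ratios.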

  17. Evaluation of the Vitek MS Matrix-Assisted Laser Desorption Ionization-Time of Flight Mass Spectrometry System for Identification of Clinically Relevant Filamentous Fungi.

    PubMed

    McMullen, Allison R; Wallace, Meghan A; Pincus, David H; Wilkey, Kathy; Burnham, C A

    2016-08-01

    Invasive fungal infections have a high rate of morbidity and mortality, and accurate identification is necessary to guide appropriate antifungal therapy. With the increasing incidence of invasive disease attributed to filamentous fungi, rapid and accurate species-level identification of these pathogens is necessary. Traditional methods for identification of filamentous fungi can be slow and may lack resolution. Matrix-assisted laser desorption ionization-time of flight mass spectrometry (MALDI-TOF MS) has emerged as a rapid and accurate method for identification of bacteria and yeasts, but a paucity of data exists on the performance characteristics of this method for identification of filamentous fungi. The objective of our study was to evaluate the accuracy of the Vitek MS for mold identification. A total of 319 mold isolates representing 43 genera recovered from clinical specimens were evaluated. Of these isolates, 213 (66.8%) were correctly identified using the Vitek MS Knowledge Base, version 3.0 database. When a modified SARAMIS (Spectral Archive and Microbial Identification System) database was used to augment the version 3.0 Knowledge Base, 245 (76.8%) isolates were correctly identified. Unidentified isolates were subcultured for repeat testing; 71/319 (22.3%) remained unidentified. Of the unidentified isolates, 69 were not in the database. Only 3 (0.9%) isolates were misidentified by MALDI-TOF MS (including Aspergillus amoenus [n = 2] and Aspergillus calidoustus [n = 1]) although 10 (3.1%) of the original phenotypic identifications were not correct. In addition, this methodology was able to accurately identify 133/144 (93.6%) Aspergillus sp. isolates to the species level. MALDI-TOF MS has the potential to expedite mold identification, and misidentifications are rare. Copyright © 2016, American Society for Microbiology. All Rights Reserved.

  18. Differential equation based method for accurate approximations in optimization

    NASA Technical Reports Server (NTRS)

    Pritchard, Jocelyn I.; Adelman, Howard M.

    1990-01-01

    This paper describes a method to efficiently and accurately approximate the effect of design changes on structural response. The key to this new method is to interpret sensitivity equations as differential equations that may be solved explicitly for closed form approximations, hence, the method is denoted the Differential Equation Based (DEB) method. Approximations were developed for vibration frequencies, mode shapes and static displacements. The DEB approximation method was applied to a cantilever beam and results compared with the commonly-used linear Taylor series approximations and exact solutions. The test calculations involved perturbing the height, width, cross-sectional area, tip mass, and bending inertia of the beam. The DEB method proved to be very accurate, and in most cases, was more accurate than the linear Taylor series approximation. The method is applicable to simultaneous perturbation of several design variables. Also, the approximations may be used to calculate other system response quantities. For example, the approximations for displacement are used to approximate bending stresses.

  19. Differential equation based method for accurate approximations in optimization

    NASA Technical Reports Server (NTRS)

    Pritchard, Jocelyn I.; Adelman, Howard M.

    1990-01-01

    A method to efficiently and accurately approximate the effect of design changes on structural response is described. The key to this method is to interpret sensitivity equations as differential equations that may be solved explicitly for closed form approximations, hence, the method is denoted the Differential Equation Based (DEB) method. Approximations were developed for vibration frequencies, mode shapes and static displacements. The DEB approximation method was applied to a cantilever beam and results compared with the commonly-used linear Taylor series approximations and exact solutions. The test calculations involved perturbing the height, width, cross-sectional area, tip mass, and bending inertia of the beam. The DEB method proved to be very accurate, and in most cases, was more accurate than the linear Taylor series approximation. The method is applicable to simultaneous perturbation of several design variables. Also, the approximations may be used to calculate other system response quantities. For example, the approximations for displacements are used to approximate bending stresses.

  20. RADS Version 4: An Efficient Way to Analyse the Multi-Mission Altimeter Database

    NASA Astrophysics Data System (ADS)

    Scharroo, Remko; Leuliette, Eric; Naeije, Marc; Martin-Puig, Cristina; Pires, Nelson

    2016-08-01

    The Radar Altimeter Database System (RADS) has grown to become a mature altimeter database. Over the last 18 years it has been continuously developed, first at Delft University of Technology, now also at the National Oceanic and Atmospheric Administration (NOAA) and the European Organisation for the Exploitation of Meteorological Satellites (EUMETSAT). RADS now serves as a fundamental Climate Data Record for sea level. Because of the multiple users involved in vetting the data and the regular updates to the database, RADS is one of the most accurate and complete databases of satellite altimeter data around. RADS version 4 is a major change from the previous version. While the database is compatible with both software versions, the new software provides new tools, allows easier expansion, and has a better and more standardised interface.

  1. Comprehensive identification of bioactive compounds of avocado peel by liquid chromatography coupled to ultra-high-definition accurate-mass Q-TOF.

    PubMed

    Figueroa, Jorge G; Borrás-Linares, Isabel; Lozano-Sánchez, Jesús; Segura-Carretero, Antonio

    2018-04-15

    Industrially, avocado pulp is exploited principally as oil and paste, generating a huge quantity of peel and seed as by-products. Avocado peel is a promising, inexpensive candidate for the recovery of phenolic compounds. The aim of this work was to identify the bioactive compounds present in an extract of avocado peel obtained by a green extraction technique. Accelerated solvent extraction was performed using water and ethanol as extraction solvents. Liquid chromatography coupled to ultra-high-definition accurate-mass spectrometry was used in order to identify the bioactive compounds. A total of sixty-one compounds belonging to eleven families were identified. Procyanidins, flavonols, hydroxybenzoic and hydroxycinnamic acids were the most common compounds. Thirty-five of these compounds were identified here for the first time in avocado peel. These results confirm the potential of avocado peel as a source of bioactive ingredients for use in the food, cosmetic or pharmaceutical sector. Copyright © 2017 Elsevier Ltd. All rights reserved.

  2. Measuring masses of large biomolecules and bioparticles using mass spectrometric techniques.

    PubMed

    Peng, Wen-Ping; Chou, Szu-Wei; Patil, Avinash A

    2014-07-21

    Large biomolecules and bioparticles play a vital role in biology, chemistry, biomedical science and physics. Mass is a critical parameter for the characterization of large biomolecules and bioparticles. To achieve mass analysis, choosing a suitable ion source is the first step and the instruments for detecting ions, mass analyzers and detectors should also be considered. Abundant mass spectrometric techniques have been proposed to determine the masses of large biomolecules and bioparticles and these techniques can be divided into two categories. The first category measures the mass (or size) of intact particles, including single particle quadrupole ion trap mass spectrometry, cell mass spectrometry, charge detection mass spectrometry and differential mobility mass analysis; the second category aims to measure the mass and tandem mass of biomolecular ions, including quadrupole ion trap mass spectrometry, time-of-flight mass spectrometry, quadrupole orthogonal time-of-flight mass spectrometry and orbitrap mass spectrometry. Moreover, algorithms for the mass and stoichiometry assignment of electrospray mass spectra are developed to obtain accurate structure information and subunit combinations.

  3. The Transporter Classification Database: recent advances.

    PubMed

    Saier, Milton H; Yen, Ming Ren; Noto, Keith; Tamang, Dorjee G; Elkan, Charles

    2009-01-01

    The Transporter Classification Database (TCDB), freely accessible at http://www.tcdb.org, is a relational database containing sequence, structural, functional and evolutionary information about transport systems from a variety of living organisms, based on the International Union of Biochemistry and Molecular Biology-approved transporter classification (TC) system. It is a curated repository for factual information compiled largely from published references. It uses a functional/phylogenetic system of classification, and currently encompasses about 5000 representative transporters and putative transporters in more than 500 families. We here describe novel software designed to support and extend the usefulness of TCDB. Our recent efforts render it more user friendly, incorporate machine learning to input novel data in a semiautomatic fashion, and allow analyses that are more accurate and less time consuming. The availability of these tools has resulted in recognition of distant phylogenetic relationships and tremendous expansion of the information available to TCDB users.

  4. A web based relational database management system for filariasis control

    PubMed Central

    Murty, Upadhyayula Suryanarayana; Kumar, Duvvuri Venkata Rama Satya; Sriram, Kumaraswamy; Rao, Kadiri Madhusudhan; Bhattacharyulu, Chakravarthula Hayageeva Narasimha Venakata; Praveen, Bhoopathi; Krishna, Amirapu Radha

    2005-01-01

    The present study describes a RDBMS (relational database management system) for the effective management of Filariasis, a vector borne disease. Filariasis infects 120 million people from 83 countries. The possible re-emergence of the disease and the complexity of existing control programs warrant the development of new strategies. A database containing comprehensive data associated with filariasis finds utility in disease control. We have developed a database containing information on the socio-economic status of patients, mosquito collection procedures, mosquito dissection data, filariasis survey report and mass blood data. The database can be searched using a user friendly web interface. Availability http://www.webfil.org (login and password can be obtained from the authors) PMID:17597846

  5. A kinetics database and scripts for PHREEQC

    NASA Astrophysics Data System (ADS)

    Hu, B.; Zhang, Y.; Teng, Y.; Zhu, C.

    2017-12-01

    Kinetics of geochemical reactions has been increasingly used in numerical models to simulate coupled flow, mass transport, and chemical reactions. However, the kinetic data are scattered in the literature. To assemble a kinetic dataset for a modeling project is an intimidating task for most. In order to facilitate the application of kinetics in geochemical modeling, we assembled kinetics parameters into a database for the geochemical simulation program, PHREEQC (version 3.0). Kinetics data were collected from the literature. Our database includes kinetic data for over 70 minerals. The rate equations are also programmed into scripts with the Basic language. Using the new kinetic database, we simulated the reaction path during the albite dissolution process using various rate equations in the literature. The simulation results with three different rate equations gave different reaction paths at different time scales. Another application involves a coupled reactive transport model simulating the advancement of an acid plume in an acid mine drainage site associated with the Bear Creek Uranium tailings pond. Geochemical reactions including calcite, gypsum, and illite were simulated with PHREEQC using the new kinetic database. The simulation results successfully demonstrated the utility of the new kinetic database.

  6. Advanced Neuropsychological Diagnostics Infrastructure (ANDI): A Normative Database Created from Control Datasets.

    PubMed

    de Vent, Nathalie R; Agelink van Rentergem, Joost A; Schmand, Ben A; Murre, Jaap M J; Huizenga, Hilde M

    2016-01-01

    In the Advanced Neuropsychological Diagnostics Infrastructure (ANDI), datasets of several research groups are combined into a single database, containing scores on neuropsychological tests from healthy participants. For most popular neuropsychological tests the quantity, and range of these data surpasses that of traditional normative data, thereby enabling more accurate neuropsychological assessment. Because of the unique structure of the database, it facilitates normative comparison methods that were not feasible before, in particular those in which entire profiles of scores are evaluated. In this article, we describe the steps that were necessary to combine the separate datasets into a single database. These steps involve matching variables from multiple datasets, removing outlying values, determining the influence of demographic variables, and finding appropriate transformations to normality. Also, a brief description of the current contents of the ANDI database is given.

  7. Global distribution of soil organic carbon, based on the Harmonized World Soil Database - Part 1: Masses and frequency distribution of SOC stocks for the tropics, permafrost regions, wetlands, and the world

    NASA Astrophysics Data System (ADS)

    Köchy, M.; Hiederer, R.; Freibauer, A.

    2014-09-01

    The global soil organic carbon (SOC) mass is relevant for the carbon cycle budget. We review current estimates of soil organic carbon stocks (mass/area) and mass (stock × area) in wetlands, permafrost and tropical regions and the world in the upper 1 m of soil. The Harmonized World Soil Database (HWSD) v.1.2 provides one of the most recent and coherent global data sets of SOC, giving a total mass of 2476 Pg. Correcting the HWSD's bulk density of organic soils, especially Histosols, results in a mass of 1062 Pg. The uncertainty of bulk density of Histosols alone introduces a range of -56 to +180 Pg for the estimate of global SOC in the top 1 m, larger than estimates of global soil respiration. We report the spatial distribution of SOC stocks per 0.5 arc minutes, the areal masses of SOC and the quantiles of SOC stocks by continents, wetland types, and permafrost types. Depending on the definition of "wetland", wetland soils contain between 82 and 158 Pg SOC. Incorporating more detailed estimates for permafrost from the Northern Circumpolar Soil Carbon Data Base (496 Pg SOC) and tropical peatland carbon, global soils contain 1324 Pg SOC in the upper 1 m including 421 Pg in tropical soils, whereof 40 Pg occur in tropical wetlands. Global SOC amounts to just under 3000 Pg when estimates for deeper soil layers are included. Variability in estimates is due to variation in definitions of soil units, differences in soil property databases, scarcity of information about soil carbon at depths > 1 m in peatlands, and variation in definitions of "peatland".
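
    The relation used in the abstract above, mass = stock × area, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name, the unit choices (kg C per m², m²) and the two example cells are hypothetical and are not taken from the HWSD.

```python
PG_IN_KG = 1e12  # 1 Pg = 10^15 g = 10^12 kg


def soc_mass_pg(stocks_kg_per_m2, areas_m2):
    """Sum stock x area over map cells and return the total mass in Pg."""
    if len(stocks_kg_per_m2) != len(areas_m2):
        raise ValueError("stocks and areas must have the same length")
    total_kg = sum(s * a for s, a in zip(stocks_kg_per_m2, areas_m2))
    return total_kg / PG_IN_KG


# Hypothetical cells: illustrative values only, not HWSD data.
stocks = [10.0, 5.0]      # SOC stock per cell, kg C per m^2
areas = [2.0e12, 4.0e12]  # cell area, m^2
total = soc_mass_pg(stocks, areas)
print(f"total SOC mass: {total:.1f} Pg")
```

    Because the mass is a sum of products, any correction to the stock of one soil class (for example a revised bulk density) changes the total in proportion to the area that class covers.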

  8. The identification of complete domains within protein sequences using accurate E-values for semi-global alignment

    PubMed Central

    Kann, Maricel G.; Sheetlin, Sergey L.; Park, Yonil; Bryant, Stephen H.; Spouge, John L.

    2007-01-01

    The sequencing of complete genomes has created a pressing need for automated annotation of gene function. Because domains are the basic units of protein function and evolution, a gene can be annotated from a domain database by aligning domains to the corresponding protein sequence. Ideally, complete domains are aligned to protein subsequences, in a ‘semi-global alignment’. Local alignment, which aligns pieces of domains to subsequences, is common in high-throughput annotation applications, however. It is a mature technique, with the heuristics and accurate E-values required for screening large databases and evaluating the screening results. Hidden Markov models (HMMs) provide an alternative theoretical framework for semi-global alignment, but their use is limited because they lack heuristic acceleration and accurate E-values. Our new tool, GLOBAL, overcomes some limitations of previous semi-global HMMs: it has accurate E-values and the possibility of the heuristic acceleration required for high-throughput applications. Moreover, according to a standard of truth based on protein structure, two semi-global HMM alignment tools (GLOBAL and HMMer) had comparable performance in identifying complete domains, but distinctly outperformed two tools based on local alignment. When searching for complete protein domains, therefore, GLOBAL avoids disadvantages commonly associated with HMMs, yet maintains their superior retrieval performance. PMID:17596268

  9. Neighborhood Structural Similarity Mapping for the Classification of Masses in Mammograms.

    PubMed

    Rabidas, Rinku; Midya, Abhishek; Chakraborty, Jayasree

    2018-05-01

    In this paper, two novel feature extraction methods, using neighborhood structural similarity (NSS), are proposed for the characterization of mammographic masses as benign or malignant. Since gray-level distribution of pixels is different in benign and malignant masses, more regular and homogeneous patterns are visible in benign masses compared to malignant masses; the proposed method exploits the similarity between neighboring regions of masses by designing two new features, namely, NSS-I and NSS-II, which capture global similarity at different scales. Complementary to these global features, uniform local binary patterns are computed to enhance the classification efficiency by combining with the proposed features. The performance of the features is evaluated using the images from the mini-mammographic image analysis society (mini-MIAS) and digital database for screening mammography (DDSM) databases, where a tenfold cross-validation technique is incorporated with Fisher linear discriminant analysis, after selecting the optimal set of features using a stepwise logistic regression method. The best area under the receiver operating characteristic curve of 0.98 with an accuracy of is achieved with the mini-MIAS database, while the same for the DDSM database is 0.93 with accuracy .

  10. Accurate LC peak boundary detection for ¹⁶O/¹⁸O labeled LC-MS data.

    PubMed

    Cui, Jian; Petritis, Konstantinos; Tegeler, Tony; Petritis, Brianne; Ma, Xuepo; Jin, Yufang; Gao, Shou-Jiang S J; Zhang, Jianqiu Michelle

    2013-01-01

    In liquid chromatography-mass spectrometry (LC-MS), parts of LC peaks are often corrupted by their co-eluting peptides, which results in increased quantification variance. In this paper, we propose to apply accurate LC peak boundary detection to remove the corrupted part of LC peaks. Accurate LC peak boundary detection is achieved by checking the consistency of intensity patterns within peptide elution time ranges. In addition, we remove peptides with erroneous mass assignment through model fitness check, which compares observed intensity patterns to theoretically constructed ones. The proposed algorithm can significantly improve the accuracy and precision of peptide ratio measurements.

  11. Identification of RNA molecules by specific enzyme digestion and mass spectrometry: software for and implementation of RNA mass mapping

    PubMed Central

    Matthiesen, Rune; Kirpekar, Finn

    2009-01-01

    The idea of identifying or characterizing an RNA molecule based on a mass spectrum of specifically generated RNA fragments has been used in various forms for well over a decade. We have developed software—named RRM for ‘RNA mass mapping’—which can search whole prokaryotic genomes or RNA FASTA sequence databases to identify the origin of a given RNA based on a mass spectrum of RNA fragments. As input, the program uses the masses of specific RNase cleavage products of the RNA under investigation. RNase T1 digestion is used here as a demonstration of the usability of the method for RNA identification. The concept for identification is that the masses of the digestion products constitute a specific fingerprint, which characterizes the given RNA. The search algorithm is based on the same principles as those used in peptide mass fingerprinting, but has here been extended to work for both RNA sequence databases and for genome searches. A simple and powerful probability model for ranking RNA matches is proposed. We demonstrate viability of the entire setup by identifying the DNA template of a series of RNAs of biological and of in vitro transcriptional origin in complete microbial genomes and by identifying authentic 16S ribosomal RNAs in a ‘small ribosomal subunit RNA’ database. Thus, we present a new tool for a rapid identification of unknown RNAs using only a few picomoles of starting material. PMID:19264806
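
    The fingerprint idea in the abstract above starts from an in silico digest. The Python sketch below illustrates only that first step, using the fact that RNase T1 cleaves on the 3' side of guanosine residues. It is not the RRM software: the function name and the example sequence are hypothetical, and the fragment masses and the probability model for ranking matches are deliberately left out.

```python
def rnase_t1_digest(rna):
    """Split an RNA sequence after every G (cleavage 3' of guanosine)."""
    rna = rna.upper()
    fragments = []
    start = 0
    for i, base in enumerate(rna):
        if base == "G":
            fragments.append(rna[start:i + 1])
            start = i + 1
    # Keep the 3'-terminal piece if the sequence does not end in G.
    if start < len(rna):
        fragments.append(rna[start:])
    return fragments


# Hypothetical example sequence, for illustration only.
fragments = rnase_t1_digest("AUGCCGUAAGCU")
print(fragments)
```

    In a mass-mapping search, each candidate sequence would be digested this way, the fragment masses calculated, and the resulting list compared with the measured peak list.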

  12. Validated MicroRNA Target Databases: An Evaluation.

    PubMed

    Lee, Yun Ji Diana; Kim, Veronica; Muth, Dillon C; Witwer, Kenneth W

    2015-11-01

    Positive findings from preclinical and clinical studies involving depletion or supplementation of microRNA (miRNA) engender optimism about miRNA-based therapeutics. However, off-target effects must be considered. Predicting these effects is complicated. Each miRNA may target many gene transcripts, and the rules governing imperfectly complementary miRNA:target interactions are incompletely understood. Several databases provide lists of the relatively small number of experimentally confirmed miRNA:target pairs. Although incomplete, this information might allow assessment of at least some of the off-target effects. We evaluated the performance of four databases of experimentally validated miRNA:target interactions (miRWalk 2.0, miRTarBase, miRecords, and TarBase 7.0) using a list of 50 alphabetically consecutive genes. We examined the provided citations to determine the degree to which each interaction was experimentally supported. To assess stability, we tested at the beginning and end of a five-month period. Results varied widely by database. Two of the databases changed significantly over the course of 5 months. Most reported evidence for miRNA:target interactions was indirect or otherwise weak, and relatively few interactions were supported by more than one publication. Some returned results appear to arise from simplistic text searches that offer no insight into the relationship of the search terms, may not even include the reported gene or miRNA, and may thus be invalid. We conclude that validation databases provide important information, but not all information in all extant databases is up-to-date or accurate. Nevertheless, the more comprehensive validation databases may provide useful starting points for investigation of off-target effects of proposed small RNA therapies. © 2015 Wiley Periodicals, Inc.

  13. Data management and database framework for the MICE experiment

    NASA Astrophysics Data System (ADS)

    Martyniak, J.; Nebrensky, J. J.; Rajaram, D.; MICE Collaboration

    2017-10-01

    The international Muon Ionization Cooling Experiment (MICE) currently operating at the Rutherford Appleton Laboratory in the UK, is designed to demonstrate the principle of muon ionization cooling for application to a future Neutrino Factory or Muon Collider. We present the status of the framework for the movement and curation of both raw and reconstructed data. A raw data-mover has been designed to safely upload data files onto permanent tape storage as soon as they have been written out. The process has been automated, and checks have been built in to ensure the integrity of data at every stage of the transfer. The data processing framework has been recently redesigned in order to provide fast turnaround of reconstructed data for analysis. The automated reconstruction is performed on a dedicated machine in the MICE control room and any reprocessing is done at Tier-2 Grid sites. In conjunction with this redesign, a new reconstructed-data-mover has been designed and implemented. We also review the implementation of a robust database system that has been designed for MICE. The processing of data, whether raw or Monte Carlo, requires accurate knowledge of the experimental conditions. MICE has several complex elements ranging from beamline magnets to particle identification detectors to superconducting magnets. A Configuration Database, which contains information about the experimental conditions (magnet currents, absorber material, detector calibrations, etc.) at any given time has been developed to ensure accurate and reproducible simulation and reconstruction. A fully replicated, hot-standby database system has been implemented with a firewall-protected read-write master running in the control room, and a read-only slave running at a different location. The actual database is hidden from end users by a Web Service layer, which provides platform and programming language-independent access to the data.

  14. Decision peptide-driven: a free software tool for accurate protein quantification using gel electrophoresis and matrix assisted laser desorption ionization time of flight mass spectrometry.

    PubMed

    Santos, Hugo M; Reboiro-Jato, Miguel; Glez-Peña, Daniel; Nunes-Miranda, J D; Fdez-Riverola, Florentino; Carvallo, R; Capelo, J L

    2010-09-15

    The decision peptide-driven tool implements a software application for assisting the user in a protocol for accurate protein quantification based on the following steps: (1) protein separation through gel electrophoresis; (2) in-gel protein digestion; (3) direct and inverse (18)O-labeling and (4) matrix assisted laser desorption ionization time of flight mass spectrometry (MALDI) analysis. The DPD software compares the MALDI results of the direct and inverse (18)O-labeling experiments and quickly identifies those peptides with paralleled losses in different sets of a typical proteomic workflow. Those peptides are used for subsequent accurate protein quantification. The interpretation of the MALDI data from direct and inverse labeling experiments is time-consuming when all comparisons are done manually. The DPD software shortens the search for the peptides that must be used for quantification from a week to a few minutes. To do so, it takes as input several MALDI spectra and aids the researcher in an automatic mode (i) to compare data from direct and inverse (18)O-labeling experiments, calculating the corresponding ratios to determine those peptides with paralleled losses throughout different sets of experiments; and (ii) to use those peptides as internal standards for subsequent accurate protein quantification using (18)O-labeling. In this work the DPD software is presented and explained with the quantification of protein carbonic anhydrase. Copyright (c) 2010 Elsevier B.V. All rights reserved.

  15. An open access thyroid ultrasound image database

    NASA Astrophysics Data System (ADS)

    Pedraza, Lina; Vargas, Carlos; Narváez, Fabián.; Durán, Oscar; Muñoz, Emma; Romero, Eduardo

    2015-01-01

    Computer aided diagnosis systems (CAD) have been developed to assist radiologists in the detection and diagnosis of abnormalities and a large number of pattern recognition techniques have been proposed to obtain a second opinion. Most of these strategies have been evaluated using different datasets, making their performance incomparable. In this work, an open access database of thyroid ultrasound images is presented. The dataset consists of a set of B-mode Ultrasound images, including a complete annotation and diagnostic description of suspicious thyroid lesions by expert radiologists. Several types of lesions such as thyroiditis, cystic nodules, adenomas and thyroid cancers were included, and an accurate lesion delineation is provided in XML format. The diagnostic description of malignant lesions was confirmed by biopsy. The proposed new database is expected to be a resource for the community to assess different CAD systems.

  16. Anomaly detection applied to a materials control and accounting database

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Whiteson, R.; Spanks, L.; Yarbro, T.

    An important component of the national mission of reducing the nuclear danger includes accurate recording of the processing and transportation of nuclear materials. Nuclear material storage facilities, nuclear chemical processing plants, and nuclear fuel fabrication facilities collect and store large amounts of data describing transactions that involve nuclear materials. To maintain confidence in the integrity of these data, it is essential to identify anomalies in the databases. Anomalous data could indicate error, theft, or diversion of material. Yet, because of the complex and diverse nature of the data, analysis and evaluation are extremely tedious. This paper describes the authors' work in the development of analysis tools to automate the anomaly detection process for the Material Accountability and Safeguards System (MASS) that tracks and records the activities associated with accountable quantities of nuclear material at Los Alamos National Laboratory. Using existing guidelines that describe valid transactions, the authors have created an expert system that identifies transactions that do not conform to the guidelines. Thus, this expert system can be used to focus the attention of the expert or inspector directly on significant phenomena.

  17. Verification of Data Accuracy in Japan Congenital Cardiovascular Surgery Database Including Its Postprocedural Complication Reports.

    PubMed

    Takahashi, Arata; Kumamaru, Hiraku; Tomotaki, Ai; Matsumura, Goki; Fukuchi, Eriko; Hirata, Yasutaka; Murakami, Arata; Hashimoto, Hideki; Ono, Minoru; Miyata, Hiroaki

    2018-03-01

    Japan Congenital Cardiovascular Surgical Database (JCCVSD) is a nationwide registry whose data are used for health quality assessment and clinical research in Japan. We evaluated the completeness of case registration and the accuracy of recorded data components including postprocedural mortality and complications in the database via on-site data adjudication. We validated the records from JCCVSD 2010 to 2012 containing congenital cardiovascular surgery data performed in 111 facilities throughout Japan. We randomly chose nine facilities for site visit by the auditor team and conducted on-site data adjudication. We assessed whether the records in JCCVSD matched the data in the source materials. We identified 1,928 cases of eligible surgeries performed at the facilities, of which 1,910 were registered (99.1% completeness), with 6 cases of duplication and 1 inappropriate case registration. Data components including gender, age, and surgery time (hours) were highly accurate with 98% to 100% concordance. Mortality at discharge and at 30 and 90 postoperative days was 100% accurate. Among the five complications studied, reoperation was the most frequently observed, with 16 and 21 cases recorded in the database and source materials, respectively, having a sensitivity of 0.67 and a specificity of 0.99. Validation of JCCVSD database showed high registration completeness and high accuracy especially in the categorical data components. Adjudicated mortality was 100% accurate. While limited in numbers, the recorded cases of postoperative complications all had high specificities but had lower sensitivity (0.67-1.00). Continued activities for data quality improvement and assessment are necessary for optimizing the utility of these registries.
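
    The sensitivity and specificity figures quoted above follow the usual definitions, with the source materials treated as the reference standard. The Python sketch below shows the arithmetic only. The counts are illustrative assumptions chosen so that the results round to 0.67 and 0.99; the abstract does not report the underlying true-positive, false-positive or true-negative counts.

```python
def sensitivity(true_pos, false_neg):
    """Fraction of events in the reference standard that the database captured."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg, false_pos):
    """Fraction of non-events in the reference standard that the database left unflagged."""
    return true_neg / (true_neg + false_pos)


# Illustrative counts only (assumed, not reported in the abstract).
sens = sensitivity(true_pos=14, false_neg=7)
spec = specificity(true_neg=990, false_pos=10)
print(f"sensitivity = {sens:.2f}, specificity = {spec:.2f}")
```

    A registry can therefore show near-perfect specificity while still missing a sizeable share of true complications, which is the pattern the authors describe.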

  18. PAMDB: a comprehensive Pseudomonas aeruginosa metabolome database.

    PubMed

    Huang, Weiliang; Brewer, Luke K; Jones, Jace W; Nguyen, Angela T; Marcu, Ana; Wishart, David S; Oglesby-Sherrouse, Amanda G; Kane, Maureen A; Wilks, Angela

    2018-01-04

    The Pseudomonas aeruginosa Metabolome Database (PAMDB, http://pseudomonas.umaryland.edu) is a searchable, richly annotated metabolite database specific to P. aeruginosa. P. aeruginosa is a soil organism and significant opportunistic pathogen that adapts to its environment through a versatile energy metabolism network. Furthermore, P. aeruginosa is a model organism for the study of biofilm formation, quorum sensing, and bioremediation processes, each of which is dependent on unique pathways and metabolites. The PAMDB is modelled on the Escherichia coli (ECMDB), yeast (YMDB) and human (HMDB) metabolome databases and contains >4370 metabolites and 938 pathways with links to over 1260 genes and proteins. The database information was compiled from electronic databases, journal articles and mass spectrometry (MS) metabolomic data obtained in our laboratories. For each metabolite entered, we provide detailed compound descriptions, names and synonyms, structural and physiochemical information, nuclear magnetic resonance (NMR) and MS spectra, enzymes and pathway information, as well as gene and protein sequences. The database allows extensive searching via chemical names, structure and molecular weight, together with gene, protein and pathway relationships. The PAMDB and its future iterations will provide a valuable resource to biologists, natural product chemists and clinicians in identifying active compounds, potential biomarkers and clinical diagnostics. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.

  19. Pattern database applications from design to manufacturing

    NASA Astrophysics Data System (ADS)

    Zhuang, Linda; Zhu, Annie; Zhang, Yifan; Sweis, Jason; Lai, Ya-Chieh

    2017-03-01

    Pattern-based approaches are becoming more common and popular as the industry moves to advanced technology nodes. At the beginning of a new technology node, a library of process weak point patterns for physical and electrical verification starts to build up and is used to prevent known hotspots from re-occurring on new designs. Then the pattern set is expanded to create test keys for process development in order to verify the manufacturing capability and precheck new tape-out designs for any potential yield detractors. With the database growing, the adoption of pattern-based approaches has expanded from design flows to technology development and then to mass production. This paper presents the complete downstream working flows of a design pattern database (PDB). This pattern-based data analysis flow covers different applications across different functional teams, from generating enhancement kits to improving design manufacturability, populating new testing design data based on previous learning, generating analysis data to improve mass-production efficiency, and manufacturing equipment in-line control to check machine status consistency across different fab sites.

  20. Advanced Neuropsychological Diagnostics Infrastructure (ANDI): A Normative Database Created from Control Datasets

    PubMed Central

    de Vent, Nathalie R.; Agelink van Rentergem, Joost A.; Schmand, Ben A.; Murre, Jaap M. J.; Huizenga, Hilde M.

    2016-01-01

    In the Advanced Neuropsychological Diagnostics Infrastructure (ANDI), datasets of several research groups are combined into a single database, containing scores on neuropsychological tests from healthy participants. For most popular neuropsychological tests the quantity, and range of these data surpasses that of traditional normative data, thereby enabling more accurate neuropsychological assessment. Because of the unique structure of the database, it facilitates normative comparison methods that were not feasible before, in particular those in which entire profiles of scores are evaluated. In this article, we describe the steps that were necessary to combine the separate datasets into a single database. These steps involve matching variables from multiple datasets, removing outlying values, determining the influence of demographic variables, and finding appropriate transformations to normality. Also, a brief description of the current contents of the ANDI database is given. PMID:27812340

  1. Breast mass segmentation in mammography using plane fitting and dynamic programming.

    PubMed

    Song, Enmin; Jiang, Luan; Jin, Renchao; Zhang, Lin; Yuan, Yuan; Li, Qiang

    2009-07-01

    Segmentation is an important and challenging task in a computer-aided diagnosis (CAD) system. Accurate segmentation could improve the accuracy in lesion detection and characterization. The objective of this study is to develop and test a new segmentation method that aims at improving the performance level of breast mass segmentation in mammography, which could be used to provide accurate features for classification. This automated segmentation method consists of two main steps and combines the edge gradient, the pixel intensity, as well as the shape characteristics of the lesions to achieve good segmentation results. First, a plane fitting method was applied to a background-trend corrected region-of-interest (ROI) of a mass to obtain the edge candidate points. Second, dynamic programming technique was used to find the "optimal" contour of the mass from the edge candidate points. Area-based similarity measures based on the radiologist's manually marked annotation and the segmented region were employed as criteria to evaluate the performance level of the segmentation method. With the evaluation criteria, the new method was compared with 1) the dynamic programming method developed by Timp and Karssemeijer, and 2) the normalized cut segmentation method, based on 337 ROIs extracted from a publicly available image database. The experimental results indicate that our segmentation method can achieve a higher performance level than the other two methods, and the improvements in segmentation performance level were statistically significant. For instance, the mean overlap percentage for the new algorithm was 0.71, whereas those for Timp's dynamic programming method and the normalized cut segmentation method were 0.63 (P < .001) and 0.61 (P < .001), respectively. We developed a new segmentation method by use of plane fitting and dynamic programming, which achieved a relatively high performance level. The new segmentation method would be useful for improving the accuracy of

  2. Performance optimisation of a new-generation orthogonal-acceleration quadrupole-time-of-flight mass spectrometer.

    PubMed

    Bristow, Tony; Constantine, Jill; Harrison, Mark; Cavoit, Fabien

    2008-04-01

    Orthogonal-acceleration quadrupole time-of-flight (oa-QTOF) mass spectrometers, employed for accurate mass measurement, have been commercially available for well over a decade. A limitation of the early instruments of this type was the narrow ion abundance range over which accurate mass measurements could be made with a high degree of certainty. Recently, a new generation of oa-QTOF mass spectrometers has been developed and these allow accurate mass measurements to be recorded over a much greater range of ion abundances. This development has resulted from new ion detection technology and improved electronic stability or by accurate control of the number of ions reaching the detector. In this report we describe the results from experiments performed to evaluate the mass measurement performance of the Bruker micrOTOF-Q, a member of the new-generation oa-QTOFs. The relationship between mass accuracy and ion abundance has been extensively evaluated and mass measurement accuracy remained stable (+/-1.5 m m/z units) over approximately 3-4 orders of magnitude of ion abundance. The second feature of the Bruker micrOTOF-Q that was evaluated was the SigmaFit function of the software. This isotope pattern-matching algorithm provides an exact numerical comparison of the theoretical and measured isotope patterns as an additional identification tool to accurate mass measurement. The smaller the value, the closer the match between theoretical and measured isotope patterns. This information is then employed to reduce the number of potential elemental formulae produced from the mass measurements. A relationship between the SigmaFit value and ion abundance has been established. The results from the study for both mass accuracy and SigmaFit were employed to define the performance criteria for the micrOTOF-Q. This provided increased confidence in the selection of elemental formulae resulting from accurate mass measurements.
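
    Mass measurement accuracy of the kind discussed above is usually reported either as an absolute error in thousandths of an m/z unit or as a relative error in parts per million. The Python sketch below shows both calculations. The function names and the example m/z values are hypothetical, and reading the abstract's "m m/z units" as thousandths of an m/z unit is an assumption.

```python
def mass_error_milli(measured_mz, theoretical_mz):
    """Absolute error in thousandths of an m/z unit."""
    return (measured_mz - theoretical_mz) * 1000.0


def mass_error_ppm(measured_mz, theoretical_mz):
    """Relative error in parts per million of the theoretical m/z."""
    return (measured_mz - theoretical_mz) / theoretical_mz * 1e6


# Hypothetical ion: illustrative values, not data from the study.
theoretical = 500.0000
measured = 500.0010
err_milli = mass_error_milli(measured, theoretical)
err_ppm = mass_error_ppm(measured, theoretical)
print(f"{err_milli:.2f} milli m/z units, {err_ppm:.2f} ppm")
```

    The same absolute error corresponds to a larger ppm value at low m/z and a smaller one at high m/z, so tolerances stated in the two units are not interchangeable.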

  3. High-integrity databases for helicopter operations

    NASA Astrophysics Data System (ADS)

    Pschierer, Christian; Schiefele, Jens; Lüthy, Juerg

    2009-05-01

    Helicopter Emergency Medical Service missions (HEMS) impose a high workload on pilots due to short preparation time, operations in low level flight, and landings in unknown areas. The research project PILAS, a cooperation between Eurocopter, Diehl Avionics, DLR, EADS, Euro Telematik, ESG, Jeppesen, the Universities of Darmstadt and Munich, and funded by the German government, approached this problem by researching a pilot assistance system which supports the pilots during all phases of flight. The databases required for the specified helicopter missions include different types of topological and cultural data for graphical display on the SVS system, AMDB data for operations at airports and helipads, and navigation data for IFR segments. The most critical databases for the PILAS system however are highly accurate terrain and obstacle data. While RTCA DO-276 specifies high accuracies and integrities only for the areas around airports, HEMS helicopters typically operate outside of these controlled areas and thus require highly reliable terrain and obstacle data for their designated response areas. This data has been generated by a LIDAR scan of the specified test region. Obstacles have been extracted into a vector format. This paper includes a short overview of the complete PILAS system and then focuses on the generation of the required high quality databases.

  4. The new IAGOS Database Portal

    NASA Astrophysics Data System (ADS)

    Boulanger, Damien; Gautron, Benoit; Thouret, Valérie; Fontaine, Alain

    2016-04-01

    IAGOS (In-service Aircraft for a Global Observing System) is a European Research Infrastructure which aims at the provision of long-term, regular and spatially resolved in situ observations of the atmospheric composition. IAGOS observation systems are deployed on a fleet of commercial aircraft. The IAGOS database is an essential part of the global atmospheric monitoring network. It contains IAGOS-core data and IAGOS-CARIBIC (Civil Aircraft for the Regular Investigation of the Atmosphere Based on an Instrument Container) data. The IAGOS Database Portal (http://www.iagos.fr, damien.boulanger@obs-mip.fr) is part of the French atmospheric chemistry data center AERIS (http://www.aeris-data.fr). The new IAGOS Database Portal was released in December 2015. The main improvement is the implementation of interoperability with international portals and other databases in order to improve IAGOS data discovery. In the frame of the IGAS project (IAGOS for the Copernicus Atmospheric Service), a data network has been set up. It is composed of three data centers: the IAGOS database in Toulouse; the HALO research aircraft database at DLR (https://halo-db.pa.op.dlr.de); and the CAMS data center in Jülich (http://join.iek.fz-juelich.de). The CAMS (Copernicus Atmospheric Monitoring Service) project is a prominent user of the IGAS data network. The new portal provides improved and new services such as download in NetCDF or NASA Ames formats, plotting tools (maps, time series, vertical profiles, etc.) and user management. Added value products are available on the portal: back trajectories, origin of air masses, co-location with satellite data, etc. The link with the CAMS data center, through JOIN (Jülich OWS Interface), allows model outputs to be combined with IAGOS data for inter-comparison. Finally, IAGOS metadata has been standardized (ISO 19115) and now provides complete information about data traceability and quality.

  5. Geothermal NEPA Database on OpenEI (Poster)

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Young, K. R.; Levine, A.

    2014-09-01

    The National Renewable Energy Laboratory (NREL) developed the Geothermal National Environmental Policy Act (NEPA) Database as a platform for government agencies and industry to access and maintain information related to geothermal NEPA documents. The data were collected to inform analyses of NEPA timelines, and the collected data were made publicly available via this tool in case others might find the data useful. NREL staff and contractors collected documents from agency websites, during visits to the two busiest Bureau of Land Management (BLM) field offices for geothermal development, and through email and phone call requests from other BLM field offices. They then entered the information into the database, hosted by Open Energy Information (http://en.openei.org/wiki/RAPID/NEPA). The long-term success of the project will depend on the willingness of federal agencies, industry, and others to populate the database with NEPA and related documents, and to use the data for their own analyses. As the information and capabilities of the database expand, developers and agencies can save time on new NEPA reports by accessing a single location to research related activities, their potential impacts, and previously proposed and imposed mitigation measures. NREL used a wiki platform to allow industry and agencies to maintain the content in the future so that it continues to provide relevant and accurate information to users.

  6. Tandem mass spectrometry of human tryptic blood peptides calculated by a statistical algorithm and captured by a relational database with exploration by a general statistical analysis system.

    PubMed

    Bowden, Peter; Beavis, Ron; Marshall, John

    2009-11-02

    A goodness of fit test may be used to assign tandem mass spectra of peptides to amino acid sequences and to directly calculate the expected probability of mis-identification. The product of the peptide expectation values directly yields the probability that the parent protein has been mis-identified. A relational database could capture the mass spectral data, the best fit results, and permit subsequent calculations by a general statistical analysis system. The many files of the Hupo blood protein data correlated by X!TANDEM against the proteins of ENSEMBL were collected into a relational database. A redundant set of 247,077 proteins and peptides were correlated by X!TANDEM, and that was collapsed to a set of 34,956 peptides from 13,379 distinct proteins. About 6875 distinct proteins were only represented by a single distinct peptide, 2866 proteins showed 2 distinct peptides, and 3454 proteins showed at least three distinct peptides by X!TANDEM. More than 99% of the peptides were associated with proteins that had cumulative expectation values, i.e. probability of false positive identification, of one in one hundred or less. The distribution of peptides per protein from X!TANDEM was significantly different than those expected from random assignment of peptides.
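    The statement that the product of the peptide expectation values yields the protein-level mis-identification probability can be illustrated with a minimal Python sketch. The function name and the example values are hypothetical and are not taken from X!TANDEM or from the study; the sum is accumulated in log10 space only to avoid floating-point underflow when many small values are multiplied.

```python
import math

def protein_log10_expectation(peptide_expectations):
    """Return log10 of the product of peptide expectation values.

    Summing logarithms is equivalent to multiplying the values
    and avoids underflow for proteins with many peptides.
    """
    if not peptide_expectations:
        raise ValueError("at least one peptide expectation value is required")
    if any(e <= 0 for e in peptide_expectations):
        raise ValueError("expectation values must be positive")
    return sum(math.log10(e) for e in peptide_expectations)

# Hypothetical protein supported by two peptides.
log_e = protein_log10_expectation([1e-2, 1e-3])
print(f"protein expectation ~ 1e{log_e:.0f}")
```

    Under this reading, a protein supported by several independent peptides accumulates a much smaller expectation value than any single peptide assignment on its own.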

  7. An Optimized Informatics Pipeline for Mass Spectrometry-Based Peptidomics

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Wu, Chaochao; Monroe, Matthew E.; Xu, Zhe

    2015-12-26

    Comprehensive MS analysis of the peptidome, the intracellular and intercellular products of protein degradation, has the potential to provide novel insights on endogenous proteolytic processing and their utility in disease diagnosis and prognosis. Along with the advances in MS instrumentation, a plethora of proteomics data analysis tools have been applied for direct use in peptidomics; however an evaluation of the currently available informatics pipelines for peptidomics data analysis has yet to be reported. In this study, we set off by evaluating the results of several popular MS/MS database search engines including MS-GF+, SEQUEST and MS-Align+ for peptidomics data analysis, followed by identification and label-free quantification using the well-established accurate mass and time (AMT) tag and newly developed informed quantification (IQ) approaches, both based on direct LC-MS analysis. Our results demonstrated that MS-GF+ outperformed both SEQUEST and MS-Align+ in identifying peptidome peptides. Using a database established from the MS-GF+ peptide identifications, both the AMT tag and IQ approaches provided significantly deeper peptidome coverage and fewer missing values for each individual data set than the MS/MS methods, while achieving robust label-free quantification. Besides having an excellent correlation with the AMT tag quantification results, IQ also provided slightly higher peptidome coverage than AMT. Taken together, we propose an optimal informatics pipeline combining MS-GF+ for initial database searching with IQ (or AMT) for identification and label-free quantification for high-throughput, comprehensive and quantitative peptidomics analysis.

  8. A rotorcraft flight database for validation of vision-based ranging algorithms

    NASA Technical Reports Server (NTRS)

    Smith, Phillip N.

    1992-01-01

    A helicopter flight test experiment was conducted at the NASA Ames Research Center to obtain a database consisting of video imagery and accurate measurements of camera motion, camera calibration parameters, and true range information. The database was developed to allow verification of monocular passive range estimation algorithms for use in the autonomous navigation of rotorcraft during low altitude flight. The helicopter flight experiment is briefly described. Four data sets representative of the different helicopter maneuvers and the visual scenery encountered during the flight test are presented. These data sets will be made available to researchers in the computer vision community.

  9. MassTRIX: mass translator into pathways.

    PubMed

    Suhre, Karsten; Schmitt-Kopplin, Philippe

    2008-07-01

    Recent technical advances in mass spectrometry (MS) have brought the field of metabolomics to a point where large numbers of metabolites from numerous prokaryotic and eukaryotic organisms can now be easily and precisely detected. The challenge today lies in the correct annotation of these metabolites on the basis of their accurate measured masses. Assignment of bulk chemical formula is generally possible, but without consideration of the biological and genomic context, concrete metabolite annotations remain difficult and uncertain. MassTRIX responds to this challenge by providing a hypothesis-driven approach to high precision MS data annotation. It presents the identified chemical compounds in their genomic context as differentially colored objects on KEGG pathway maps. Information on gene transcription or differences in the gene complement (e.g. samples from different bacterial strains) can be easily added. The user can thus interpret the metabolic state of the organism in the context of its potential and, in the case of submitted transcriptomics data, real enzymatic capacities. The MassTRIX web server is freely accessible at http://masstrix.org.

  10. Protein structure database search and evolutionary classification.

    PubMed

    Yang, Jinn-Moon; Tung, Chi-Hua

    2006-01-01

    As more protein structures become available and structural genomics efforts provide structural models in a genome-wide strategy, there is a growing need for fast and accurate methods for discovering homologous proteins and evolutionary classifications of newly determined structures. We have developed 3D-BLAST, in part, to address these issues. 3D-BLAST is as fast as BLAST and calculates the statistical significance (E-value) of an alignment to indicate the reliability of the prediction. Using this method, we first identified 23 states of the structural alphabet that represent pattern profiles of the backbone fragments and then used them to represent protein structure databases as structural alphabet sequence databases (SADB). Our method enhanced BLAST as a search method, using a new structural alphabet substitution matrix (SASM) to find the longest common substructures with high-scoring structured segment pairs from an SADB database. Using personal computers with Intel Pentium4 (2.8 GHz) processors, our method searched more than 10 000 protein structures in 1.3 s and achieved a good agreement with search results from detailed structure alignment methods. [3D-BLAST is available at http://3d-blast.life.nctu.edu.tw].

  11. Scheduled Civil Aircraft Emission Inventories for 1999: Database Development and Analysis

    NASA Technical Reports Server (NTRS)

    Sutkus, Donald J., Jr.; Baughcum, Steven L.; DuBois, Douglas P.

    2001-01-01

    This report describes the development of a three-dimensional database of aircraft fuel burn and emissions (NO(x), CO, and hydrocarbons) for the scheduled commercial aircraft fleet for each month of 1999. Global totals of emissions and fuel burn for 1999 are compared to global totals from 1992 and 2015 databases. 1999 fuel burn, departure and distance totals for selected airlines are compared to data reported on DOT Form 41 to evaluate the accuracy of the calculations. DOT Form T-100 data were used to determine typical payloads for freighter aircraft and this information was used to model freighter aircraft more accurately by using more realistic payloads. Differences in the calculation methodology used to create the 1999 fuel burn and emissions database from the methodology used in previous work are described and evaluated.

  12. MitoFish and MitoAnnotator: A Mitochondrial Genome Database of Fish with an Accurate and Automatic Annotation Pipeline

    PubMed Central

    Iwasaki, Wataru; Fukunaga, Tsukasa; Isagozawa, Ryota; Yamada, Koichiro; Maeda, Yasunobu; Satoh, Takashi P.; Sado, Tetsuya; Mabuchi, Kohji; Takeshima, Hirohiko; Miya, Masaki; Nishida, Mutsumi

    2013-01-01

    MitoFish is a database of fish mitochondrial genomes (mitogenomes) that includes powerful and precise de novo annotations for mitogenome sequences. Fish occupy an important position in the evolution of vertebrates and the ecology of the hydrosphere, and mitogenomic sequence data have served as a rich source of information for resolving fish phylogenies and identifying new fish species. The importance of a mitogenomic database continues to grow at a rapid pace as massive amounts of mitogenomic data are generated with the advent of new sequencing technologies. A severe bottleneck seems likely to occur with regard to mitogenome annotation because of the overwhelming pace of data accumulation and the intrinsic difficulties in annotating sequences with degenerating transfer RNA structures, divergent start/stop codons of the coding elements, and the overlapping of adjacent elements. To ease this data backlog, we developed an annotation pipeline named MitoAnnotator. MitoAnnotator automatically annotates a fish mitogenome with a high degree of accuracy in approximately 5 min; thus, it is readily applicable to data sets of dozens of sequences. MitoFish also contains re-annotations of previously sequenced fish mitogenomes, enabling researchers to refer to them when they find annotations that are likely to be erroneous or while conducting comparative mitogenomic analyses. For users who need more information on the taxonomy, habitats, phenotypes, or life cycles of fish, MitoFish provides links to related databases. MitoFish and MitoAnnotator are freely available at http://mitofish.aori.u-tokyo.ac.jp/ (last accessed August 28, 2013); all of the data can be batch downloaded, and the annotation pipeline can be used via a web interface. PMID:23955518

  13. Identification of Highly Pathogenic Microorganisms by Matrix-Assisted Laser Desorption Ionization–Time of Flight Mass Spectrometry: Results of an Interlaboratory Ring Trial

    PubMed Central

    Lasch, Peter; Wahab, Tara; Weil, Sandra; Pályi, Bernadett; Tomaso, Herbert; Zange, Sabine; Kiland Granerud, Beathe; Drevinek, Michal; Kokotovic, Branko; Wittwer, Matthias; Pflüger, Valentin; Di Caro, Antonino; Stämmler, Maren; Grunow, Roland

    2015-01-01

    In the case of a release of highly pathogenic bacteria (HPB), there is an urgent need for rapid, accurate, and reliable diagnostics. MALDI-TOF mass spectrometry is a rapid, accurate, and relatively inexpensive technique that is becoming increasingly important in microbiological diagnostics to complement classical microbiology, PCR, and genotyping of HPB. In the present study, the results of a joint exercise with 11 partner institutions from nine European countries are presented. In this exercise, 10 distinct microbial samples, among them five HPB, Bacillus anthracis, Brucella canis, Burkholderia mallei, Burkholderia pseudomallei, and Yersinia pestis, were characterized under blinded conditions. Microbial strains were inactivated by high-dose gamma irradiation before shipment. Preparatory investigations ensured that this type of inactivation induced only subtle spectral changes with negligible influence on the quality of the diagnosis. Furthermore, pilot tests on nonpathogenic strains were systematically conducted to ensure the suitability of sample preparation and to optimize and standardize the workflow for microbial identification. The analysis of the microbial mass spectra was carried out by the individual laboratories on the basis of spectral libraries available on site. All mass spectra were also tested against an in-house HPB library at the Robert Koch Institute (RKI). The averaged identification accuracy was 77% in the first case and improved to >93% when the spectral diagnoses were obtained on the basis of the RKI library. The compilation of complete and comprehensive databases with spectra from a broad strain collection is therefore considered of paramount importance for accurate microbial identification. PMID:26063856

  14. Accurate determination of 3-alkyl-2-methoxypyrazines in wines by gas chromatography quadrupole time-of-flight tandem mass spectrometry following solid-phase extraction and dispersive liquid-liquid microextraction.

    PubMed

    Fontana, Ariel; Rodríguez, Isaac; Cela, Rafael

    2017-09-15

    A new reliable method for the determination of 3-alkyl-2-methoxypyrazines (MPs) in wine samples based on the sequential combination of solid-phase extraction (SPE), dispersive liquid-liquid microextraction (DLLME) and gas chromatography (GC) quadrupole time-of-flight accurate tandem mass spectrometry (QTOF-MS/MS) is presented. Primary extraction of target analytes was carried out by using a reversed-phase Oasis HLB (200 mg) SPE cartridge combined with acetonitrile as elution solvent. Afterwards, the SPE extract was submitted to DLLME concentration using 0.06 mL carbon tetrachloride (CCl4) as extractant. Under final working conditions, sample concentration factors above 379 times and limits of quantification (LOQs) between 0.3 and 2.1 ng L⁻¹ were achieved. Moreover, the overall extraction efficiency of the method was unaffected by the particular characteristics of each wine; thus, accurate results (relative recoveries from 84 to 108% for samples spiked at concentrations from 5 to 25 ng L⁻¹) were obtained using matrix-matched standards, without using standard additions over every sample. Highly selective chromatographic records were achieved considering a mass window of 5 mDa, centered on the quantification product ion corresponding to each compound. Twelve commercial wines, elaborated with grapes from different varieties and geographical origins, were processed with the optimized method. The 2-isobutyl-3-methoxypyrazine (IBMP) was determined at levels above the LOQs of the method in half of the samples. Copyright © 2017 Elsevier B.V. All rights reserved.
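    The narrow mass window mentioned above can be sketched in Python as a simple extracted-ion trace. This is an illustration only: the scan layout (retention time plus a list of m/z and intensity pairs), the function name, and the target m/z are hypothetical, and the 5 mDa figure is assumed here to be the full window width (±2.5 mDa around the target ion).

```python
def extracted_ion_chromatogram(scans, target_mz, window_mda=5.0):
    """Sum intensities inside a mass window centred on target_mz.

    scans: iterable of (retention_time, [(mz, intensity), ...]).
    window_mda: full window width in millidaltons (assumption).
    """
    half_width = window_mda / 1000.0 / 2.0  # mDa -> Da, then half the window
    low, high = target_mz - half_width, target_mz + half_width
    trace = []
    for retention_time, peaks in scans:
        intensity = sum(i for mz, i in peaks if low <= mz <= high)
        trace.append((retention_time, intensity))
    return trace

# Toy data: two scans, hypothetical product ion at m/z 124.0631.
scans = [
    (1.0, [(124.0640, 100.0), (124.0700, 50.0)]),
    (2.0, [(124.0620, 10.0), (124.0635, 5.0)]),
]
trace = extracted_ion_chromatogram(scans, target_mz=124.0631)
print(trace)
```

    Ions outside the window, such as the peak at m/z 124.0700 in the first toy scan, do not contribute to the trace, which is what makes a narrow window selective against isobaric interferences.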

  15. Comprehensive T-Matrix Reference Database: A 2012 - 2013 Update

    NASA Technical Reports Server (NTRS)

    Mishchenko, Michael I.; Videen, Gorden; Khlebtsov, Nikolai G.; Wriedt, Thomas

    2013-01-01

    The T-matrix method is one of the most versatile, efficient, and accurate theoretical techniques widely used for numerically exact computer calculations of electromagnetic scattering by single and composite particles, discrete random media, and particles imbedded in complex environments. This paper presents the fifth update to the comprehensive database of peer-reviewed T-matrix publications initiated by us in 2004 and includes relevant publications that have appeared since 2012. It also lists several earlier publications not incorporated in the original database, including Peter Waterman's reports from the 1960s illustrating the history of the T-matrix approach and demonstrating that John Fikioris and Peter Waterman were the true pioneers of the multi-sphere method otherwise known as the generalized Lorenz - Mie theory.

  16. Accessing the SEED genome databases via Web services API: tools for programmers.

    PubMed

    Disz, Terry; Akhter, Sajia; Cuevas, Daniel; Olson, Robert; Overbeek, Ross; Vonstein, Veronika; Stevens, Rick; Edwards, Robert A

    2010-06-14

    The SEED integrates many publicly available genome sequences into a single resource. The database contains accurate and up-to-date annotations based on the subsystems concept that leverages clustering between genomes and other clues to accurately and efficiently annotate microbial genomes. The backend is used as the foundation for many genome annotation tools, such as the Rapid Annotation using Subsystems Technology (RAST) server for whole genome annotation, the metagenomics RAST server for random community genome annotations, and the annotation clearinghouse for exchanging annotations from different resources. In addition to a web user interface, the SEED also provides a Web services-based API for programmatic access to the data in the SEED, allowing the development of third-party tools and mash-ups. The currently exposed Web services encompass over forty different methods for accessing data related to microbial genome annotations. The Web services provide comprehensive access to the database back end, allowing any programmer access to the most consistent and accurate genome annotations available. The Web services are deployed using a platform independent service-oriented approach that allows the user to choose the most suitable programming platform for their application. Example code demonstrates that Web services can be used to access the SEED using common bioinformatics programming languages such as Perl, Python, and Java. We present a novel approach to access the SEED database. Using Web services, a robust API for access to genomics data is provided, without requiring large volume downloads all at once. The API ensures timely access to the most current datasets available, including the new genomes as soon as they come online.

  17. Concepts and data model for a co-operative neurovascular database.

    PubMed

    Mansmann, U; Taylor, W; Porter, P; Bernarding, J; Jäger, H R; Lasjaunias, P; Terbrugge, K; Meisel, J

    2001-08-01

    Problems of clinical management of neurovascular diseases are very complex. This is caused by the chronic character of the diseases, a long history of symptoms and diverse treatments. If patients are to benefit from treatment, then treatment decisions have to rely on reliable and accurate knowledge of the natural history of the disease and the various treatments. Recent developments in statistical methodology and experience from electronic patient records are used to establish an information infrastructure based on a centralized register. A protocol to collect data on neurovascular diseases and the technical and logistical aspects of implementing a database for these diseases are described. The database is designed as a co-operative tool of audit and research available to co-operating centres. When a database is linked to a systematic patient follow-up, it can be used to study prognosis. Careful analysis of patient outcome is valuable for decision-making.

  18. Compilation of the data-base of the star catalogue by ADABAS.

    NASA Astrophysics Data System (ADS)

    Ishikawa, T.

    A data-base of the FK4 Star Catalogue is compiled by using HITAC M-280H in the Computer Center of Tokyo University and a commercial data-base management system (DBMS) ADABAS. The purpose of this attempt is to examine whether ADABAS, which can be regarded as representative of the currently available DBMSs developed mainly for business and information retrieval purposes, proves itself useful for handling mass numerical data like the star catalogue data. It is concluded that the data-base could really be a convenient way for storing and utilizing the star catalogue data.

  19. Distance restraints from crosslinking mass spectrometry: mining a molecular dynamics simulation database to evaluate lysine-lysine distances.

    PubMed

    Merkley, Eric D; Rysavy, Steven; Kahraman, Abdullah; Hafen, Ryan P; Daggett, Valerie; Adkins, Joshua N

    2014-06-01

    Integrative structural biology attempts to model the structures of protein complexes that are challenging or intractable by classical structural methods (due to size, dynamics, or heterogeneity) by combining computational structural modeling with data from experimental methods. One such experimental method is chemical crosslinking mass spectrometry (XL-MS), in which protein complexes are crosslinked and characterized using liquid chromatography-mass spectrometry to pinpoint specific amino acid residues in close structural proximity. The commonly used lysine-reactive N-hydroxysuccinimide ester reagents disuccinimidylsuberate (DSS) and bis(sulfosuccinimidyl)suberate (BS(3)) have a linker arm that is 11.4 Å long when fully extended, allowing Cα (alpha carbon of protein backbone) atoms of crosslinked lysine residues to be up to ∼24 Å apart. However, XL-MS studies on proteins of known structure frequently report crosslinks that exceed this distance. Typically, a tolerance of ∼3 Å is added to the theoretical maximum to account for this observation, with limited justification for the chosen value. We used the Dynameomics database, a repository of high-quality molecular dynamics simulations of 807 proteins representative of diverse protein folds, to investigate the relationship between lysine-lysine distances in experimental starting structures and in simulation ensembles. We conclude that for DSS/BS(3), a distance constraint of 26-30 Å between Cα atoms is appropriate. This analysis provides a theoretical basis for the widespread practice of adding a tolerance to the crosslinker length when comparing XL-MS results to structures or in modeling. We also discuss the comparison of XL-MS results to MD simulations and known structures as a means to test and validate experimental XL-MS methods. © 2014 The Protein Society.
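    As an illustration of how such a distance constraint might be applied when comparing crosslinks to a known structure, the following Python sketch flags lysine pairs whose Cα-Cα distance exceeds a cutoff. The residue numbers, coordinates, and function name are hypothetical; the default cutoff of 30 Å is the upper end of the 26-30 Å range stated in the abstract, and the sketch is not the authors' analysis code.

```python
import math

def check_crosslinks(ca_coords, crosslinks, max_distance=30.0):
    """Compare Cα-Cα distances of crosslinked residues against a cutoff.

    ca_coords: dict mapping residue number -> (x, y, z) in Å.
    crosslinks: iterable of (residue_i, residue_j) pairs.
    Returns {pair: (distance, satisfied)}.
    """
    results = {}
    for res_i, res_j in crosslinks:
        distance = math.dist(ca_coords[res_i], ca_coords[res_j])
        results[(res_i, res_j)] = (distance, distance <= max_distance)
    return results

# Hypothetical Cα coordinates for three lysines.
ca_coords = {10: (0.0, 0.0, 0.0), 50: (0.0, 0.0, 20.0), 90: (0.0, 40.0, 0.0)}
report = check_crosslinks(ca_coords, [(10, 50), (10, 90)])
for pair, (distance, ok) in report.items():
    print(pair, f"{distance:.1f} Å", "satisfied" if ok else "violated")
```

    A crosslink reported as violated against a single static structure is not necessarily wrong; the abstract's point is that conformational dynamics can bring residues within reach, which is why a tolerance is added to the linker length.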

  20. Distance restraints from crosslinking mass spectrometry: Mining a molecular dynamics simulation database to evaluate lysine–lysine distances

    PubMed Central

    Merkley, Eric D; Rysavy, Steven; Kahraman, Abdullah; Hafen, Ryan P; Daggett, Valerie; Adkins, Joshua N

    2014-01-01

    Integrative structural biology attempts to model the structures of protein complexes that are challenging or intractable by classical structural methods (due to size, dynamics, or heterogeneity) by combining computational structural modeling with data from experimental methods. One such experimental method is chemical crosslinking mass spectrometry (XL-MS), in which protein complexes are crosslinked and characterized using liquid chromatography-mass spectrometry to pinpoint specific amino acid residues in close structural proximity. The commonly used lysine-reactive N-hydroxysuccinimide ester reagents disuccinimidylsuberate (DSS) and bis(sulfosuccinimidyl)suberate (BS3) have a linker arm that is 11.4 Å long when fully extended, allowing Cα (alpha carbon of protein backbone) atoms of crosslinked lysine residues to be up to ∼24 Å apart. However, XL-MS studies on proteins of known structure frequently report crosslinks that exceed this distance. Typically, a tolerance of ∼3 Å is added to the theoretical maximum to account for this observation, with limited justification for the chosen value. We used the Dynameomics database, a repository of high-quality molecular dynamics simulations of 807 proteins representative of diverse protein folds, to investigate the relationship between lysine–lysine distances in experimental starting structures and in simulation ensembles. We conclude that for DSS/BS3, a distance constraint of 26–30 Å between Cα atoms is appropriate. This analysis provides a theoretical basis for the widespread practice of adding a tolerance to the crosslinker length when comparing XL-MS results to structures or in modeling. We also discuss the comparison of XL-MS results to MD simulations and known structures as a means to test and validate experimental XL-MS methods. PMID:24639379

  1. Design and development of a multimedia database for emergency telemedicine.

    PubMed

    Pavlopoulos, S; Berler, A; Kyriacou, E; Koutsouris, D

    1998-09-01

    Recent studies conclude that early and specialised pre-hospital patient management contributes to survival in emergency cases. Recent developments in telecommunication and medical informatics by means of telemedicine can be extremely useful to accomplish such tasks in a cost-effective manner. Along that direction, we have designed a portable device for emergency telemedicine. This device is able to telematically "bring" the expert doctor to the emergency site, have him perform an accurate diagnosis, and subsequently direct the Emergency Medical Technicians on how to treat the patient until he arrives at the hospital. The need for storing and archiving all data interchanged during the telemedicine sessions is crucial for clinical, legal and administrative purposes. For this, we have developed a multimedia database able to store and manage the data collected by the AMBULANCE system. The database was equipped with a user-friendly graphical interface to enable use by computer-naive users. Furthermore, the database can display, in a standard way, ECGs and X-ray, CT and MRI images. The application is password protected with a three-level hierarchy access for users with different privileges. The scope of this application is to enhance the capabilities of the doctor on duty for a more precise and prompt diagnosis. The application has the ability to store audio files related to each emergency case and still images of the scene. Finally, this database can become a useful multimedia tool which will work together with the AMBULANCE portable device, the HIS and the PACS of the hospital. The system has been validated in selected non-critical cases and proved to be functional and successful in enhancing the ability of the doctor on duty for prompt and accurate diagnosis and specialised pre-hospital treatment.

  2. Approximating the near-edge mass absorption coefficients for Ni using an ultra-thin bimetal foil

    DOE PAGES

    Alkire, Randall W.

    2016-11-01

    In an effort to improve the characteristics of a fluorescing metal-foil-based beam position monitor, a new bimetal ultra-thin (0.98/0.67 µm) Ti–Ni foil was introduced to replace an existing single-element ultra-thin 0.5 µm thick Cr foil. During characterization it was determined that absorption measurements on the bimetal foil could be used to fit the Ni mass absorption coefficients accurately in the vicinity of the Ni K edge. Comparison with experimental results from the literature demonstrated that the fitting procedure produced coefficients with uncertainties of the order of ±1%. Once determined, these fit coefficients allowed the thickness of an independently mounted 8 µm thick Ni foil to be computed from absorption measurements instead of relying on a tool-based measurement of the foil thickness. Using the 8 µm thick foil, a continuous map of Ni mass absorption coefficients was produced at 1 eV resolution throughout the near-edge region. Lastly, this high-resolution map marks a significant improvement over the existing NIST XCOM or FFAST database mass absorption coefficients, which have estimated errors of 10–20% for the near-edge region.

  3. NIST Libraries of Peptide Fragmentation Mass Spectra Database

    National Institute of Standards and Technology Data Gateway

    SRD 4 NIST Libraries of Peptide Fragmentation Mass Spectra Database (PC database for purchase)   Interactive computer program for predicting thermodynamic and transport properties of pure fluids and fluid mixtures containing up to 20 components. The components are selected from a database of 196 components, mostly hydrocarbons.

  4. Interactive Scene Analysis Module - A sensor-database fusion system for telerobotic environments

    NASA Technical Reports Server (NTRS)

    Cooper, Eric G.; Vazquez, Sixto L.; Goode, Plesent W.

    1992-01-01

    Accomplishing a task with telerobotics typically involves a combination of operator control/supervision and a 'script' of preprogrammed commands. These commands usually assume that the locations of various objects in the task space conform to some internal representation (database) of that task space. The ability to quickly and accurately verify the task environment against the internal database would improve the robustness of these preprogrammed commands. In addition, the on-line initialization and maintenance of a task space database is difficult for operators using Cartesian coordinates alone. This paper describes the Interactive Scene Analysis Module (ISAM) developed to provide taskspace database initialization and verification utilizing 3-D graphic overlay modelling, video imaging, and laser radar based range imaging. Through the fusion of taskspace database information and image sensor data, a verifiable taskspace model is generated providing location and orientation data for objects in a task space. This paper also describes applications of the ISAM in the Intelligent Systems Research Laboratory (ISRL) at NASA Langley Research Center, and discusses its performance relative to representation accuracy and operator interface efficiency.

  5. Ukrainian Database and Atlas of Light Curves of Artificial Space Objects

    NASA Astrophysics Data System (ADS)

    Koshkin, N.; Savanevich, V.; Pohorelov, A.; Shakun, L.; Zhukov, V.; Korobeynikova, E.; Strakhova, S.; Moskalenko, S.; Kashuba, V.; Krasnoshchokov, A.

    This paper describes the Ukrainian database of long-term photometric observations of resident space objects (RSOs). An open internet resource has been developed so that this database can be used for outer space monitoring and space situational awareness (SSA). The paper shows examples of using the Atlas of light curves of RSOs to analyze the state of rotation around the center of mass of several active and non-functioning satellites in orbit.

  6. Confidence assignment for mass spectrometry based peptide identifications via the extreme value distribution.

    PubMed

    Alves, Gelio; Yu, Yi-Kuo

    2016-09-01

    There is a growing trend for biomedical researchers to extract evidence and draw conclusions from mass spectrometry based proteomics experiments, the cornerstone of which is peptide identification. Inaccurate assignments of peptide identification confidence thus may have far-reaching and adverse consequences. Although some peptide identification methods report accurate statistics, they have been limited to certain types of scoring function. The extreme value statistics based method, while more general in the scoring functions it allows, demands accurate parameter estimates and requires, at least in its original design, excessive computational resources. Improving the parameter estimate accuracy and reducing the computational cost for this method has two advantages: it provides another feasible route to accurate significance assessment, and it could provide reliable statistics for scoring functions yet to be developed. We have formulated and implemented an efficient algorithm for calculating the extreme value statistics for peptide identification applicable to various scoring functions, bypassing the need for searching large random databases. The source code, implemented in C++ on a Linux system, is available for download at ftp://ftp.ncbi.nlm.nih.gov/pub/qmbp/qmbp_ms/RAId/RAId_Linux_64Bit. Contact: yyu@ncbi.nlm.nih.gov. Supplementary data are available at Bioinformatics online. Published by Oxford University Press 2016. This work is written by US Government employees and is in the public domain in the US.

  7. Cadastral Database Positional Accuracy Improvement

    NASA Astrophysics Data System (ADS)

    Hashim, N. M.; Omar, A. H.; Ramli, S. N. M.; Omar, K. M.; Din, N.

    2017-10-01

    Positional Accuracy Improvement (PAI) is the process of refining the geometry features in a geospatial dataset to improve their actual position. This actual position relates to the absolute position in a specific coordinate system and the relation to the neighborhood features. With the growth of spatial based technology, especially Geographical Information System (GIS) and Global Navigation Satellite System (GNSS), the PAI campaign is inevitable, especially for the legacy cadastral database. Integration of a legacy dataset and a higher accuracy dataset such as GNSS observations is a potential solution for improving the legacy dataset. However, merely integrating both datasets will lead to a distortion of the relative geometry. The improved dataset should be further treated to minimize inherent errors and fit it to the new accurate dataset. The main focus of this study is to describe a method of angular based Least Square Adjustment (LSA) for the PAI process of a legacy dataset. The existing high accuracy dataset known as the National Digital Cadastral Database (NDCDB) is then used as a benchmark to validate the results. It was found that the proposed technique is highly feasible for positional accuracy improvement of legacy spatial datasets.

  8. Extracting Time-Accurate Acceleration Vectors From Nontrivial Accelerometer Arrangements.

    PubMed

    Franck, Jennifer A; Blume, Janet; Crisco, Joseph J; Franck, Christian

    2015-09-01

    Sports-related concussions are of significant concern in many impact sports, and their detection relies on accurate measurements of the head kinematics during impact. Among the most prevalent recording technologies are videography, and more recently, the use of single-axis accelerometers mounted in a helmet, such as the HIT system. Successful extraction of the linear and angular impact accelerations depends on an accurate analysis methodology governed by the equations of motion. Current algorithms are able to estimate the magnitude of acceleration and hit location, but make assumptions about the hit orientation and are often limited in the position and/or orientation of the accelerometers. The newly formulated algorithm presented in this manuscript accurately extracts the full linear and rotational acceleration vectors from a broad arrangement of six single-axis accelerometers directly from the governing set of kinematic equations. The new formulation linearizes the nonlinear centripetal acceleration term with a finite-difference approximation and provides a fast and accurate solution for all six components of acceleration over long time periods (>250 ms). The approximation of the nonlinear centripetal acceleration term provides an accurate computation of the rotational velocity as a function of time and allows for reconstruction of a multiple-impact signal. Furthermore, the algorithm determines the impact location and orientation and can distinguish between glancing, high rotational velocity impacts, or direct impacts through the center of mass. Results are shown for ten simulated impact locations on a headform geometry computed with three different accelerometer configurations in varying degrees of signal noise. Since the algorithm does not require simplifications of the actual impacted geometry, the impact vector, or a specific arrangement of accelerometer orientations, it can be easily applied to many impact investigations in which accurate kinematics need

  9. Buoyancy contribution to uncertainty of mass, conventional mass and force

    NASA Astrophysics Data System (ADS)

    Malengo, Andrea; Bich, Walter

    2016-04-01

    The conventional mass is a useful concept introduced to reduce the impact of the buoyancy correction in everyday mass measurements, thus avoiding in most cases its accurate determination, necessary in measurements of ‘true’ mass. Although usage of conventional mass is universal and standardized, the concept is considered as a sort of second-choice tool, to be avoided in high-accuracy applications. In this paper we show that this is a false belief, by elucidating the role played by covariances between volume and mass and between volume and conventional mass at the various stages of the dissemination chain and in the relationship between the uncertainties of mass and conventional mass. We arrive at somewhat counter-intuitive results: the volume of the transfer standard plays a comparatively minor role in the uncertainty budget of the standard under calibration. In addition, conventional mass is preferable to mass in normal, in-air operation, as its uncertainty is smaller than that of mass, if covariance terms are properly taken into account, and the uncertainty over-stating (typically) resulting from neglecting them is less severe than that (always) occurring with mass. The same considerations hold for force. In this respect, we show that the associated uncertainty is the same using mass or conventional mass, and, again, that the latter is preferable if covariance terms are neglected.

  10. Reducing process delays for real-time earthquake parameter estimation - An application of KD tree to large databases for Earthquake Early Warning

    NASA Astrophysics Data System (ADS)

    Yin, Lucy; Andrews, Jennifer; Heaton, Thomas

    2018-05-01

    Earthquake parameter estimations using nearest neighbor searching among a large database of observations can lead to reliable prediction results. However, in the real-time application of Earthquake Early Warning (EEW) systems, accurate prediction using a large database is penalized by a significant delay in the processing time. We propose to use a multidimensional binary search tree (KD tree) data structure to organize large seismic databases to reduce the processing time in nearest neighbor search for predictions. We evaluated the performance of KD tree on the Gutenberg Algorithm, a database-searching algorithm for EEW. We constructed an offline test to predict peak ground motions using a database with feature sets of waveform filter-bank characteristics, and compared the results with the observed seismic parameters. We concluded that a large database provides more accurate predictions of the ground motion information, such as peak ground acceleration, velocity, and displacement (PGA, PGV, PGD), than source parameters, such as hypocenter distance. Application of the KD tree search to organize the database reduced the average search time by 85% relative to the exhaustive method, making the method feasible for real-time implementation. The algorithm is straightforward and the results will reduce the overall time of warning delivery for EEW.
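
    The KD-tree idea in this record can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation: the function names (build_kd_tree, nearest_neighbor, brute_force_nearest) and the random 3-D feature vectors are assumptions made for illustration, and a real EEW database would hold filter-bank features rather than random points. The sketch shows how splitting on one coordinate per level lets the search skip branches that cannot contain a closer point, and it checks the result against an exhaustive scan.

```python
import math
import random
from typing import List, Optional, Sequence, Tuple

Point = Sequence[float]


class Node:
    """One KD-tree node: a stored point, its database index, and the split axis."""

    def __init__(self, point: Point, index: int, axis: int,
                 left: "Optional[Node]", right: "Optional[Node]") -> None:
        self.point, self.index, self.axis = point, index, axis
        self.left, self.right = left, right


def build_kd_tree(points: List[Point], indices: Optional[List[int]] = None,
                  depth: int = 0) -> Optional[Node]:
    """Recursively split on the median, cycling through the axes by depth."""
    if indices is None:
        indices = list(range(len(points)))
    if not indices:
        return None
    axis = depth % len(points[0])
    indices = sorted(indices, key=lambda i: points[i][axis])
    mid = len(indices) // 2
    return Node(points[indices[mid]], indices[mid], axis,
                build_kd_tree(points, indices[:mid], depth + 1),
                build_kd_tree(points, indices[mid + 1:], depth + 1))


def nearest_neighbor(node: Optional[Node], query: Point,
                     best: Optional[Tuple[float, int]] = None
                     ) -> Optional[Tuple[float, int]]:
    """Return (distance, index) of the closest stored point to the query."""
    if node is None:
        return best
    d = math.dist(node.point, query)
    if best is None or d < best[0]:
        best = (d, node.index)
    diff = query[node.axis] - node.point[node.axis]
    near, far = (node.left, node.right) if diff < 0 else (node.right, node.left)
    best = nearest_neighbor(near, query, best)
    # Visit the far side only if the splitting plane is closer than the best hit.
    if abs(diff) < best[0]:
        best = nearest_neighbor(far, query, best)
    return best


def brute_force_nearest(points: List[Point], query: Point) -> Tuple[float, int]:
    """Exhaustive search, used here only as a reference."""
    return min((math.dist(p, query), i) for i, p in enumerate(points))


if __name__ == "__main__":
    rng = random.Random(0)
    database = [(rng.random(), rng.random(), rng.random()) for _ in range(1000)]
    tree = build_kd_tree(database)
    query = (0.5, 0.5, 0.5)
    print(nearest_neighbor(tree, query), brute_force_nearest(database, query))
```

    Both searches return the same nearest distance; the tree search typically inspects far fewer points, which is the source of the speed-up the abstract reports.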

  11. Advanced Mass Spectrometric Methods for the Rapid and Quantitative Characterization of Proteomes

    DOE PAGES

    Smith, Richard D.

    2002-01-01

    Progress is reviewed towards the development of a global strategy that aims to extend the sensitivity, dynamic range, comprehensiveness and throughput of proteomic measurements based upon the use of high performance separations and mass spectrometry. The approach uses high accuracy mass measurements from Fourier transform ion cyclotron resonance mass spectrometry (FTICR) to validate peptide ‘accurate mass tags’ (AMTs) produced by global protein enzymatic digestions for a specific organism, tissue or cell type from ‘potential mass tags’ tentatively identified using conventional tandem mass spectrometry (MS/MS). This provides the basis for subsequent measurements without the need for MS/MS. High resolution capillary liquid chromatography separations combined with high sensitivity, and high resolution accurate FTICR measurements are shown to be capable of characterizing peptide mixtures of more than 10^5 components. The strategy has been initially demonstrated using the microorganisms Saccharomyces cerevisiae and Deinococcus radiodurans. Advantages of the approach include the high confidence of protein identification, its broad proteome coverage, high sensitivity, and the capability for stable-isotope labeling methods for precise relative protein abundance measurements. Abbreviations: LC, liquid chromatography; FTICR, Fourier transform ion cyclotron resonance; AMT, accurate mass tag; PMT, potential mass tag; MMA, mass measurement accuracy; MS, mass spectrometry; MS/MS, tandem mass spectrometry; ppm, parts per million.
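
    Matching a measured mass against a database of accurate mass tags comes down to a parts-per-million comparison. The sketch below is only an illustration under stated assumptions: the tag names, the masses and the 2 ppm tolerance are invented for the example and do not come from this record, and the functions ppm_error and match_mass_tags are hypothetical helpers.

```python
from typing import Dict, List


def ppm_error(measured: float, theoretical: float) -> float:
    """Signed mass measurement error in parts per million."""
    return (measured - theoretical) / theoretical * 1e6


def match_mass_tags(measured: float, database: Dict[str, float],
                    tol_ppm: float = 2.0) -> List[str]:
    """Return the names of all database entries within tol_ppm of the measurement."""
    return [name for name, mass in database.items()
            if abs(ppm_error(measured, mass)) <= tol_ppm]


if __name__ == "__main__":
    # Hypothetical tag masses, chosen only to make the arithmetic easy to follow.
    tags = {"tag_A": 500.0000, "tag_B": 500.0020, "tag_C": 750.1234}
    measured = 500.0005
    for name, mass in tags.items():
        print(name, round(ppm_error(measured, mass), 2))
    print(match_mass_tags(measured, tags))
```

    With these made-up numbers the measurement lies about 1 ppm from tag_A and about 3 ppm from tag_B, so only tag_A passes a 2 ppm window. A tighter mass tolerance therefore translates directly into fewer ambiguous matches.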

  12. Accurate interatomic force fields via machine learning with covariant kernels

    NASA Astrophysics Data System (ADS)

    Glielmo, Aldo; Sollich, Peter; De Vita, Alessandro

    2017-06-01

    We present a novel scheme to accurately predict atomic forces as vector quantities, rather than sets of scalar components, by Gaussian process (GP) regression. This is based on matrix-valued kernel functions, on which we impose the requirements that the predicted force rotates with the target configuration and is independent of any rotations applied to the configuration database entries. We show that such covariant GP kernels can be obtained by integration over the elements of the rotation group SO(d) for the relevant dimensionality d. Remarkably, in specific cases the integration can be carried out analytically and yields a conservative force field that can be recast into a pair interaction form. Finally, we show that restricting the integration to a summation over the elements of a finite point group relevant to the target system is sufficient to recover an accurate GP. The accuracy of our kernels in predicting quantum-mechanical forces in real materials is investigated by tests on pure and defective Ni, Fe, and Si crystalline systems.
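
    The covariance requirement described in this record can be written out explicitly. The notation below is an assumption made for illustration and is not taken from the abstract: ρ and ρ' denote atomic configurations, 𝓡 a rotation acting on a configuration with matrix R, and k^b a scalar base kernel assumed invariant when both arguments are rotated together. One construction consistent with "integration over the elements of the rotation group" is sketched, followed by a short check that it has the required property.

```latex
% Covariance property required of the matrix-valued kernel K
\mathbf{K}(\mathcal{R}\rho,\ \mathcal{R}'\rho') \;=\; \mathbf{R}\,\mathbf{K}(\rho,\rho')\,\mathbf{R}'^{\mathsf{T}}
\qquad \text{for all } \mathcal{R},\mathcal{R}' \in SO(d).

% A construction by integration over the group (normalized Haar measure d\mathcal{R})
\mathbf{K}(\rho,\rho') \;=\; \int_{SO(d)} d\mathcal{R}\;\mathbf{R}\;k^{b}(\rho,\ \mathcal{R}\rho'),
\qquad k^{b}(\mathcal{S}\rho,\ \mathcal{S}\rho') = k^{b}(\rho,\rho').

% Check: use invariance of k^b, then substitute \tilde{\mathcal{R}} = \mathcal{S}^{-1}\mathcal{R}\mathcal{S}'
\mathbf{K}(\mathcal{S}\rho,\ \mathcal{S}'\rho')
 = \int d\mathcal{R}\;\mathbf{R}\;k^{b}(\rho,\ \mathcal{S}^{-1}\mathcal{R}\mathcal{S}'\rho')
 = \int d\tilde{\mathcal{R}}\;\mathbf{S}\,\tilde{\mathbf{R}}\,\mathbf{S}'^{\mathsf{T}}\;k^{b}(\rho,\ \tilde{\mathcal{R}}\rho')
 = \mathbf{S}\,\mathbf{K}(\rho,\rho')\,\mathbf{S}'^{\mathsf{T}}.
```

    Replacing the integral by a sum over a finite point group gives the restricted form mentioned at the end of the abstract, with covariance then holding for the rotations in that group.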

  13. Accurate atomistic potentials and training sets for boron-nitride nanostructures

    NASA Astrophysics Data System (ADS)

    Tamblyn, Isaac

    Boron nitride nanotubes exhibit exceptional structural, mechanical, and thermal properties. They are optically transparent and have high thermal stability, suggesting a wide range of opportunities for structural reinforcement of materials. Modeling can play an important role in determining the optimal approach to integrating nanotubes into a supporting matrix. Developing accurate, atomistic scale models of such nanoscale interfaces embedded within composites is challenging, however, due to the mismatch of length scales involved. Typical nanotube diameters range from 5-50 nm, with a length as large as a micron (i.e. a relevant length-scale for structural reinforcement). Unlike their carbon-based counterparts, well tested and transferable interatomic force fields are not common for BNNT. In light of this, we have developed an extensive training database of BN rich materials, under conditions relevant for BNNT synthesis and composites based on extensive first principles molecular dynamics simulations. Using this data, we have produced an artificial neural network potential capable of reproducing the accuracy of first principles data at significantly reduced computational cost, allowing for accurate simulation at the much larger length scales needed for composite design.

  14. A Database of Supercooled Large Droplet Ice Accretions

    NASA Technical Reports Server (NTRS)

    VanZante, Judith Foss

    2007-01-01

    A unique, publicly available database regarding supercooled large droplet ice accretions has been developed in NASA Glenn's Icing Research Tunnel. Identical cloud and flight conditions were generated for five different airfoil models. The models chosen represent a variety of aircraft types from the horizontal stabilizer of a large transport aircraft to the wings of regional, business, and general aviation aircraft. In addition to the standard documentation methods of 2D ice shape tracing and imagery, ice mass measurements were also taken. This database will also be used to validate and verify the extension of the ice accretion code, LEWICE, into the SLD realm.

  15. Collision Cross Section (CCS) Database: An Additional Measure to Characterize Steroids.

    PubMed

    Hernández-Mesa, Maykel; Le Bizec, Bruno; Monteau, Fabrice; García-Campaña, Ana M; Dervilly-Pinel, Gaud

    2018-04-03

    Ion mobility spectrometry enhances the performance characteristics of liquid chromatography-mass spectrometry workflows intended for steroid profiling by providing a new separation dimension and a novel characterization parameter, the so-called collision cross section (CCS). This work proposes the first CCS database for 300 steroids (i.e., endogenous, including phase I and phase II metabolites, and exogenous synthetic compounds), which involves 1080 ions and covers the CCS of 127 androgens, 84 estrogens, 50 corticosteroids, and 39 progestagens. This large database provides information related to all the ionized species identified for each steroid in positive electrospray ionization mode as well as for estrogens in negative ionization mode. CCS values have been measured using nitrogen as drift gas in the ion mobility cell. Generally, direct correlation exists between mass-to-charge ratio (m/z) and CCS because both are related parameters. However, several steroids, mainly steroid glucuronides and steroid esters, have been characterized as more compact or elongated molecules than expected. In such cases, CCS provides additional relevant information beyond retention time and mass spectral data for the identification of steroids. Moreover, several isomeric steroid pairs (e.g., 5β-androstane-3,17-dione and 5α-androstane-3,17-dione) have been separated based on their CCS differences. These results indicate that adding the CCS to databases in analytical workflows increases selectivity, thus improving the confidence in steroids analysis. Consequences in terms of identification and quantification are discussed. Quality criteria and a construction of an interlaboratory reproducibility approach are also reported for the obtained CCS values. The CCS database described here is made publicly available.

  16. Using PHP/MySQL to Manage Potential Mass Impacts

    NASA Technical Reports Server (NTRS)

    Hager, Benjamin I.

    2010-01-01

    This paper presents a new application using commercially available software to manage mass properties for spaceflight vehicles. PHP/MySQL (PHP: Hypertext Preprocessor and My Structured Query Language) are a web scripting language and a database language commonly used in concert with each other. They open up new opportunities to develop cutting edge mass properties tools, and in particular, tools for the management of potential mass impacts (threats and opportunities). The paper begins by providing an overview of the functions and capabilities of PHP/MySQL. The focus of this paper is on how PHP/MySQL are being used to develop an advanced "web accessible" database system for identifying and managing mass impacts on NASA's Ares I Upper Stage program, managed by the Marshall Space Flight Center. To fully describe this application, examples of the data, search functions, and views are provided to promote not only the function but also the security, ease of use, simplicity, and eye-appeal of this new application. This paper concludes with an overview of the other potential mass properties applications and tools that could be developed using PHP/MySQL. The premise behind this paper is that PHP/MySQL are software tools that are easy to use and readily available for the development of cutting edge mass properties applications. These tools are capable of providing "real-time" searching and status of an active database, automated report generation, and other capabilities to streamline and enhance mass properties management applications. By using PHP/MySQL, proven existing methods for managing mass properties can be adapted to present-day information technology to accelerate mass properties data gathering, analysis, and reporting, allowing mass property management to keep pace with today's fast-paced design and development processes.

  17. JICST Factual Database JICST DNA Database

    NASA Astrophysics Data System (ADS)

    Shirokizawa, Yoshiko; Abe, Atsushi

    The Japan Information Center of Science and Technology (JICST) started an on-line DNA database service in October 1988. This database is composed of the EMBL Nucleotide Sequence Library and the Genetic Sequence Data Bank. The authors outline the database system, data items and search commands. Examples of retrieval sessions are presented.

  18. An Internet-Accessible DNA Sequence Database for Identifying Fusaria from Human and Animal Infections

    USDA-ARS's Scientific Manuscript database

    Because less than one-third of clinically relevant fusaria can be accurately identified to species level using phenotypic data (i.e., morphological species recognition), we constructed a three-locus DNA sequence database to facilitate molecular identification of the 69 Fusarium species associated wi...

  19. Measuring Intermediate-Mass Black-Hole Binaries with Advanced Gravitational Wave Detectors.

    PubMed

    Veitch, John; Pürrer, Michael; Mandel, Ilya

    2015-10-02

    We perform a systematic study to explore the accuracy with which the parameters of intermediate-mass black-hole binary systems can be measured from their gravitational wave (GW) signatures using second-generation GW detectors. We make use of the most recent reduced-order models containing inspiral, merger, and ringdown signals of aligned-spin effective-one-body waveforms to significantly speed up the calculations. We explore the phenomenology of the measurement accuracies for binaries with total masses between 50 M⊙ and 500 M⊙ and mass ratios between 0.1 and 1. We find that (i) at total masses below ∼200 M⊙, where the signal-to-noise ratio is dominated by the inspiral portion of the signal, the chirp mass parameter can be accurately measured; (ii) at higher masses, the information content is dominated by the ringdown, and total mass is measured more accurately; (iii) the mass of the lower-mass companion is poorly estimated, especially at high total mass and more extreme mass ratios; and (iv) spin cannot be accurately measured for our injection set with nonspinning components. Most importantly, we find that for binaries with nonspinning components at all values of the mass ratio in the considered range and at a network signal-to-noise ratio of 15, analyzed with spin-aligned templates, the presence of an intermediate-mass black hole with mass >100 M⊙ can be confirmed with 95% confidence in any binary that includes a component with a mass of 130 M⊙ or greater.
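
    The chirp mass referred to in this record is a standard combination of the two component masses, (m1·m2)^(3/5) / (m1+m2)^(1/5). The short sketch below converts a total mass and a mass ratio into component masses and a chirp mass. It is an illustration only: the function names are hypothetical, the convention q = m2/m1 ≤ 1 is an assumption, and the sample values are simply picked from the range quoted in the abstract.

```python
from typing import Tuple


def component_masses(total_mass: float, mass_ratio: float) -> Tuple[float, float]:
    """Split a total mass into (m1, m2) with mass_ratio = m2 / m1 <= 1 (assumed convention)."""
    m1 = total_mass / (1.0 + mass_ratio)
    return m1, mass_ratio * m1


def chirp_mass(m1: float, m2: float) -> float:
    """Chirp mass in the same units as the inputs."""
    return (m1 * m2) ** 0.6 / (m1 + m2) ** 0.2


if __name__ == "__main__":
    # Sample points inside the 50-500 solar mass, q = 0.1-1 range of the study.
    for total, q in [(50.0, 1.0), (200.0, 0.5), (500.0, 0.1)]:
        m1, m2 = component_masses(total, q)
        print(total, q, round(m1, 2), round(m2, 2), round(chirp_mass(m1, m2), 2))
```

    For a fixed total mass the chirp mass is largest for equal components and falls as the mass ratio becomes more extreme.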

  20. PhosSA: Fast and accurate phosphorylation site assignment algorithm for mass spectrometry data.

    PubMed

    Saeed, Fahad; Pisitkun, Trairak; Hoffert, Jason D; Rashidian, Sara; Wang, Guanghui; Gucek, Marjan; Knepper, Mark A

    2013-11-07

    Phosphorylation site assignment of high throughput tandem mass spectrometry (LC-MS/MS) data is one of the most common and critical aspects of phosphoproteomics. Correctly assigning phosphorylated residues helps us understand their biological significance. The design of common search algorithms (such as Sequest, Mascot etc.) does not incorporate site assignment; therefore additional algorithms are essential to assign phosphorylation sites for mass spectrometry data. The main contribution of this study is the design and implementation of a linear time and space dynamic programming strategy for phosphorylation site assignment referred to as PhosSA. The proposed algorithm uses summation of peak intensities associated with theoretical spectra as an objective function. Quality control of the assigned sites is achieved using a post-processing redundancy criterion that indicates the signal-to-noise ratio properties of the fragmented spectra. The quality of the algorithm was assessed on experimentally generated data sets using synthetic peptides for which phosphorylation sites were known. We report that PhosSA was able to achieve a high degree of accuracy and sensitivity with all the experimentally generated mass spectrometry data sets. The implemented algorithm is shown to be extremely fast and scalable with increasing number of spectra (we report up to 0.5 million spectra/hour on a moderate workstation). The algorithm is designed to accept results from both Sequest and Mascot search engines. An executable is freely available at http://helixweb.nih.gov/ESBL/PhosSA/ for academic research purposes.

  1. Carbene footprinting accurately maps binding sites in protein-ligand and protein-protein interactions

    NASA Astrophysics Data System (ADS)

    Manzi, Lucio; Barrow, Andrew S.; Scott, Daniel; Layfield, Robert; Wright, Timothy G.; Moses, John E.; Oldham, Neil J.

    2016-11-01

    Specific interactions between proteins and their binding partners are fundamental to life processes. The ability to detect protein complexes, and map their sites of binding, is crucial to understanding basic biology at the molecular level. Methods that employ sensitive analytical techniques such as mass spectrometry have the potential to provide valuable insights with very little material and on short time scales. Here we present a differential protein footprinting technique employing an efficient photo-activated probe for use with mass spectrometry. Using this methodology the location of a carbohydrate substrate was accurately mapped to the binding cleft of lysozyme, and in a more complex example, the interactions between a 100 kDa, multi-domain deubiquitinating enzyme, USP5 and a diubiquitin substrate were located to different functional domains. The much improved properties of this probe make carbene footprinting a viable method for rapid and accurate identification of protein binding sites utilizing benign, near-UV photoactivation.

  2. Application of materials database (MAT.DB.) to materials education

    NASA Technical Reports Server (NTRS)

    Liu, Ping; Waskom, Tommy L.

    1994-01-01

    Finding the right material for the job is an important aspect of engineering. Sometimes the choice is as fundamental as selecting between steel and aluminum. Other times, the choice may be between different compositions in an alloy. Discovering and compiling materials data is a demanding task, but it leads to accurate models for analysis and successful materials application. Mat.DB is a database management system designed for maintaining information on the properties and processing of engineered materials, including metals, plastics, composites, and ceramics. It was developed by the Center for Materials Data of American Society for Metals (ASM) International. The ASM Center for Materials Data collects and reviews material property data for publication in books, reports, and electronic databases. Mat.DB was developed to aid data management and material applications.

  3. The Mass Function of Abell Clusters

    NASA Astrophysics Data System (ADS)

    Chen, J.; Huchra, J. P.; McNamara, B. R.; Mader, J.

    1998-12-01

    The velocity dispersion and mass functions for rich clusters of galaxies provide important constraints on models of the formation of Large-Scale Structure (e.g., Frenk et al. 1990). However, prior estimates of the velocity dispersion or mass function for galaxy clusters have been based on either very small samples of clusters (Bahcall and Cen 1993; Zabludoff et al. 1994) or large but incomplete samples (e.g., the Girardi et al. (1998) determination from a sample of clusters with more than 30 measured galaxy redshifts). In contrast, we approach the problem by constructing a volume-limited sample of Abell clusters. We collected individual galaxy redshifts for our sample from two major galaxy velocity databases, the NASA Extragalactic Database, NED, maintained at IPAC, and ZCAT, maintained at SAO. We assembled a database with velocity information for possible cluster members and then selected cluster members based on both spatial and velocity data. Cluster velocity dispersions and masses were calculated following the procedures of Danese, De Zotti, and di Tullio (1980) and Heisler, Tremaine, and Bahcall (1985), respectively. The final velocity dispersion and mass functions were analyzed in order to constrain cosmological parameters by comparison to the results of N-body simulations. Our data for the cluster sample as a whole and for the individual clusters (spatial maps and velocity histograms) in our sample is available on-line at http://cfa-www.harvard.edu/ huchra/clusters. This website will be updated as more data becomes available in the master redshift compilations, and will be expanded to include more clusters and large groups of galaxies.

  4. Digital release of the Alaska Quaternary fault and fold database

    NASA Astrophysics Data System (ADS)

    Koehler, R. D.; Farrell, R.; Burns, P.; Combellick, R. A.; Weakland, J. R.

    2011-12-01

    The Alaska Division of Geological & Geophysical Surveys (DGGS) has designed a Quaternary fault and fold database for Alaska in conformance with standards defined by the U.S. Geological Survey for the National Quaternary fault and fold database. Alaska is the most seismically active region of the United States; however, little information exists on the location, style of deformation, and slip rates of Quaternary faults. Thus, to provide an accurate, user-friendly, reference-based fault inventory to the public, we are producing a digital GIS shapefile of Quaternary fault traces and compiling summary information on each fault. Here, we present relevant information pertaining to the digital GIS shapefile and online access and availability of the Alaska database. This database will be useful for engineering geologic studies, geologic, geodetic, and seismic research, and policy planning. The data will also contribute to the fault source database being constructed by the Global Earthquake Model (GEM), Faulted Earth project, which is developing tools to better assess earthquake risk. We derived the initial list of Quaternary active structures from The Neotectonic Map of Alaska (Plafker et al., 1994) and supplemented it with more recent data where available. Due to the limited level of knowledge on Quaternary faults in Alaska, pre-Quaternary fault traces from the Plafker map are shown as a layer in our digital database so users may view a more accurate distribution of mapped faults and to suggest the possibility that some older traces may be active yet un-studied. The database will be updated as new information is developed. We selected each fault by reviewing the literature and georegistered the faults from 1:250,000-scale paper maps contained in 1970s-vintage and earlier bedrock maps. However, paper map scales range from 1:20,000 to 1:500,000. Fault parameters in our GIS fault attribute tables include fault name, age, slip rate, slip sense, dip direction, fault line type

  5. Matrix-assisted Laser Desorption Ionization-Time of Flight Mass Spectrometry (MALDI-TOF MS) Can Precisely Discriminate the Lineages of Listeria monocytogenes and Species of Listeria.

    PubMed

    Ojima-Kato, Teruyo; Yamamoto, Naomi; Takahashi, Hajime; Tamura, Hiroto

    2016-01-01

    The genetic lineages of Listeria monocytogenes and other species of the genus Listeria are correlated with pathogenesis in humans. Although matrix-assisted laser desorption ionization-time of flight mass spectrometry (MALDI-TOF MS) has become a prevailing tool for rapid and reliable microbial identification, the precise discrimination of Listeria species and lineages remains a crucial issue in clinical settings and for food safety. In this study, we constructed an accurate and reliable MS database to discriminate the lineages of L. monocytogenes and the species of Listeria (L. monocytogenes, L. innocua, L. welshimeri, L. seeligeri, L. ivanovii, L. grayi, and L. rocourtiae) based on the S10-spc-alpha operon gene encoded ribosomal protein mass spectrum (S10-GERMS) proteotyping method, which relies on both genetic information (genomics) and observed MS peaks in MALDI-TOF MS (proteomics). The specific set of eight biomarkers (ribosomal proteins L24, L6, L18, L15, S11, S9, L31 type B, and S16) yielded characteristic MS patterns for the lineages of L. monocytogenes and the different species of Listeria, and led to the construction of an MS database that was successful in discriminating between these organisms in MALDI-TOF MS fingerprinting analysis followed by analysis with the advanced proteotyping software Strain Solution. We also confirmed the constructed database on the proteotyping software Strain Solution by using 23 Listeria strains collected from natural sources.

  6. Fixed-Wing Micro Aerial Vehicle for Accurate Corridor Mapping

    NASA Astrophysics Data System (ADS)

    Rehak, M.; Skaloud, J.

    2015-08-01

    In this study we present a Micro Aerial Vehicle (MAV) equipped with precise position and attitude sensors that, together with a pre-calibrated camera, enables accurate corridor mapping. The design of the platform is based on widely available model components into which we integrate an open-source autopilot, a customized mass-market camera, and navigation sensors. We adapt the concepts of system calibration from larger mapping platforms to the MAV and evaluate them practically for their achievable accuracy. We present case studies for accurate mapping without ground control points: first for a block configuration, later for a narrow corridor. We evaluate the mapping accuracy with respect to checkpoints and a digital terrain model. We show that while it is possible to achieve pixel-level (3-5 cm) mapping accuracy in both cases, precise aerial position control is sufficient for the block configuration, whereas precise position and attitude control is required for corridor mapping.

  7. Normal Databases for the Relative Quantification of Myocardial Perfusion

    PubMed Central

    Rubeaux, Mathieu; Xu, Yuan; Germano, Guido; Berman, Daniel S.; Slomka, Piotr J.

    2016-01-01

    Purpose of review: Myocardial perfusion imaging (MPI) with SPECT is performed clinically worldwide to detect and monitor coronary artery disease (CAD). MPI allows an objective quantification of myocardial perfusion at stress and rest. This established technique relies on normal databases to compare patient scans against reference normal limits. In this review, we aim to introduce the process of MPI quantification with normal databases and describe the associated perfusion quantitative measures that are used. Recent findings: New equipment and new software reconstruction algorithms have been introduced which require the development of new normal limits. The appearance and regional count variations of a normal MPI scan may differ between these new scanners and standard Anger cameras. Therefore, these new systems may require the determination of new normal limits to achieve optimal accuracy in relative myocardial perfusion quantification. Accurate diagnostic and prognostic results rivaling those obtained by expert readers can be obtained by this widely used technique. Summary: Throughout this review, we emphasize the importance of the different normal databases and the need for specific databases relative to distinct imaging procedures. Use of appropriate normal limits allows optimal quantification of MPI by taking into account subtle image differences due to the hardware and software used, and the population studied. PMID: 28138354

  8. An Introduction to Database Structure and Database Machines.

    ERIC Educational Resources Information Center

    Detweiler, Karen

    1984-01-01

    Enumerates principal management objectives of database management systems (data independence, quality, security, multiuser access, central control) and criteria for comparison (response time, size, flexibility, other features). Conventional database management systems, relational databases, and database machines used for backend processing are…

  9. Improving mass measurement accuracy in mass spectrometry based proteomics by combining open source tools for chromatographic alignment and internal calibration.

    PubMed

    Palmblad, Magnus; van der Burgt, Yuri E M; Dalebout, Hans; Derks, Rico J E; Schoenmaker, Bart; Deelder, André M

    2009-05-02

    Accurate mass determination enhances peptide identification in mass spectrometry based proteomics. We here describe the combination of two previously published open source software tools to improve mass measurement accuracy in Fourier transform ion cyclotron resonance mass spectrometry (FTICRMS). The first program, msalign, aligns one MS/MS dataset with one FTICRMS dataset. The second software, recal2, uses peptides identified from the MS/MS data for automated internal calibration of the FTICR spectra, resulting in sub-ppm mass measurement errors.
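    The "sub-ppm" figure above refers to the relative mass error in parts per million. As a minimal sketch (not the msalign or recal2 code; the function names are hypothetical and chosen only for illustration), the error and a tolerance check can be computed as follows:

```python
def ppm_error(measured_mz: float, theoretical_mz: float) -> float:
    """Signed mass measurement error in parts per million (ppm)."""
    return (measured_mz - theoretical_mz) / theoretical_mz * 1e6


def within_tolerance(measured_mz: float, theoretical_mz: float,
                     tol_ppm: float = 1.0) -> bool:
    """True if the absolute error does not exceed tol_ppm (default 1 ppm)."""
    return abs(ppm_error(measured_mz, theoretical_mz)) <= tol_ppm


if __name__ == "__main__":
    # Illustrative m/z values only, not data from the paper.
    print(ppm_error(1000.001, 1000.0))
    print(within_tolerance(500.0002, 500.0))
```

    Because the error is relative, the same absolute deviation corresponds to a larger ppm value at low m/z than at high m/z.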

  10. Realistic simulated MRI and SPECT databases. Application to SPECT/MRI registration evaluation.

    PubMed

    Aubert-Broche, Berengere; Grova, Christophe; Reilhac, Anthonin; Evans, Alan C; Collins, D Louis

    2006-01-01

    This paper describes the construction of simulated SPECT and MRI databases that account for realistic anatomical and functional variability. The data is used as a gold-standard to evaluate four SPECT/MRI similarity-based registration methods. Simulation realism was accounted for using accurate physical models of data generation and acquisition. MRI and SPECT simulations were generated from three subjects to take into account inter-subject anatomical variability. Functional SPECT data were computed from six functional models of brain perfusion. Previous models of normal perfusion and ictal perfusion observed in Mesial Temporal Lobe Epilepsy (MTLE) were considered to generate functional variability. We studied the impact noise and intensity non-uniformity in MRI simulations and SPECT scatter correction may have on registration accuracy. We quantified the amount of registration error caused by anatomical and functional variability. Registration involving ictal data was less accurate than registration involving normal data. MR intensity nonuniformity was the main factor decreasing registration accuracy. The proposed simulated database is promising to evaluate many functional neuroimaging methods, involving MRI and SPECT data.

  11. Ultra-precise single-ion atomic mass measurements on deuterium and helium-3

    NASA Astrophysics Data System (ADS)

    Zafonte, S. L.; Van Dyck, R. S., Jr.

    2015-04-01

    The former University of Washington Penning Trap Mass Spectrometer (UW-PTMS), now located at the Max Planck Institute for Nuclear Physics in Heidelberg, Germany, was used in the decade before the move to determine new values for the deuteron atomic mass, M (2H+) = 2.013 553 212 745(40) u, and the deuterium atomic mass, M (2H) = 2.014 101 778 052(40) u, both of which are now more than an order-of-magnitude more accurate than the previous best 1994-MIT measurements of these quantities. The new value for the deuteron’s mass can then be used with the accepted 2010-CODATA proton mass and the most recent 1999-measurement of the 2.2 MeV gamma-ray binding energy of the deuteron to refine the neutron’s mass to mn = 1.008 664 916 018(435) u which has about half the uncertainty relative to the value computed using that previous 1994-MIT deuterium measurement. As a result, further improvements of mn must now come from a more accurate determination of the wavelength of this gamma ray. In this same period of time, this spectrometer has also been used to determine new values for the helion atomic mass, M (3He2+) = 3.014 932 246 668(43) u, and the neutral helium-3 atomic mass, M (3He) = 3.016 029 321 675(43) u, which are both about 60 times more accurate than the 2006-SMILETRAP measurements, but disagree with the 4.4-times less-accurate 2015-Florida-State measurements by 0.76 nu. It is expected that these helium-3 results will be used in the future 3H/3He mass ratio (to be determined by the Heidelberg, Germany version of the old UW-PTMS) in order to generate a more accurate value for the tritium atomic mass.
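    The ion and neutral-atom masses quoted above are related through the electron mass and the electronic binding energy. The sketch below illustrates that arithmetic only. The electron mass, the eV-per-u conversion, and the ionization energies are assumed reference values that do not appear in the abstract, and the function name is hypothetical.

```python
# Assumed reference constants (NOT taken from the abstract).
ELECTRON_MASS_U = 5.4857990946e-4   # electron mass in u (CODATA 2010 value, assumed)
U_IN_EV = 931.494061e6              # energy equivalent of 1 u in eV (CODATA 2010, assumed)


def neutral_mass(ion_mass_u: float, n_electrons: int,
                 total_binding_ev: float) -> float:
    """Neutral atomic mass = bare-ion mass + electron masses - binding energy / c^2."""
    return ion_mass_u + n_electrons * ELECTRON_MASS_U - total_binding_ev / U_IN_EV


# Ion masses are the values quoted in the abstract.
# Ionization energies (eV) are rounded textbook values, assumed here.
deuterium = neutral_mass(2.013553212745, 1, 13.602)
helium3 = neutral_mass(3.014932246668, 2, 24.587 + 54.418)

if __name__ == "__main__":
    print(f"M(2H)  = {deuterium:.12f} u")
    print(f"M(3He) = {helium3:.12f} u")
```

    With these assumed constants, both results fall within about 1e-11 u of the neutral masses quoted in the abstract, which is below the stated uncertainties.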

  12. Structure Identification Using High Resolution Mass ...

    EPA Pesticide Factsheets

    The iCSS CompTox Dashboard is a publicly accessible dashboard provided by the National Center for Computational Toxicology at the US-EPA. It serves a number of purposes, including providing a chemistry database underpinning many of our public-facing projects (e.g. ToxCast and ExpoCast). The available data and searches provide a valuable path to structure identification using mass spectrometry as the source data. With an underlying database of over 720,000 chemicals, the dashboard has already been used to assist in identifying chemicals present in house dust. This poster reviews the benefits of the EPA’s platform and underlying algorithms used for the purpose of compound identification using high-resolution mass spectrometry data. Standard approaches for both mass and formula lookup are available, but the dashboard delivers a novel approach for hit ranking based on functional use of the chemicals. The focus on high-quality data, novel ranking approaches, and integration with other resources of value to mass spectrometrists makes the CompTox Dashboard a valuable resource for the identification of environmental chemicals. This abstract does not reflect U.S. EPA policy. Poster presented at the Eastern Analytical Symposium (EAS) held in Somerset, NJ.

  13. [Exploration and construction of the full-text database of acupuncture literature in the Republic of China].

    PubMed

    Fei, Lin; Zhao, Jing; Leng, Jiahao; Zhang, Shujian

    2017-10-12

    The ALIPORC full-text database is a dedicated full-text database of acupuncture literature from the Republic of China. Construction began in 2015 and is still under way, focusing on acupuncture-related books, articles, and advertising documents completed or published in the Republic of China. The database aims to enable sharing of acupuncture medical literature from the Republic of China through diverse retrieval approaches and accurate content presentation, to contribute to exchange among scholars, to reduce the paper damage caused by paging, and to simplify the retrieval of rare literature. The authors describe the database in terms of its sources, characteristics, and current state of construction, and discuss how to improve its efficiency and integrity and how to deepen the development of acupuncture literature in the Republic of China.

  14. CYP450 phenotyping and accurate mass identification of metabolites of the 8-aminoquinoline, anti-malarial drug primaquine.

    PubMed

    Pybus, Brandon S; Sousa, Jason C; Jin, Xiannu; Ferguson, James A; Christian, Robert E; Barnhart, Rebecca; Vuong, Chau; Sciotti, Richard J; Reichard, Gregory A; Kozar, Michael P; Walker, Larry A; Ohrt, Colin; Melendez, Victor

    2012-08-02

    The 8-aminoquinoline (8AQ) drug primaquine (PQ) is currently the only approved drug effective against the persistent liver stage of the hypnozoite-forming strains Plasmodium vivax and Plasmodium ovale as well as Stage V gametocytes of Plasmodium falciparum. To date, several groups have investigated the toxicity observed in the 8AQ class; however, the exact mechanisms and/or metabolic species responsible for PQ's haemotoxic and anti-malarial properties are not fully understood. In the present study, the metabolism of PQ was evaluated using in vitro recombinant metabolic enzymes from the cytochrome P450 (CYP) and mono-amine oxidase (MAO) families. Based on this information, metabolite identification experiments were performed using nominal and accurate mass measurements. Relative activity factor (RAF)-weighted intrinsic clearance values show the relative role of each enzyme to be MAO-A, 2C19, 3A4, and 2D6, with 76.1, 17.0, 5.2, and 1.7% contributions to PQ metabolism, respectively. CYP 2D6 was shown to produce at least six different oxidative metabolites along with demethylations, while MAO-A products derived from the PQ aldehyde, a precursor to carboxy PQ. CYPs 2C19 and 3A4 produced only trace levels of hydroxylated species. As a result of this work, CYP 2D6 and MAO-A have been implicated as the key enzymes associated with PQ metabolism, and metabolites previously identified as potentially playing a role in efficacy and haemolytic toxicity have been attributed to production via CYP 2D6 mediated pathways.

  15. Astronomical Surveys, Catalogs, Databases, and Archives

    NASA Astrophysics Data System (ADS)

    Mickaelian, A. M.

    2016-06-01

    All-sky and large-area astronomical surveys and their cataloged data over the whole range of the electromagnetic spectrum are reviewed, from γ-ray to radio, such as Fermi-GLAST and INTEGRAL in γ-ray, ROSAT, XMM and Chandra in X-ray, GALEX in UV, SDSS and several POSS I and II based catalogues (APM, MAPS, USNO, GSC) in the optical range, 2MASS in NIR, WISE and AKARI IRC in MIR, IRAS and AKARI FIS in FIR, NVSS and FIRST in radio and many others, as well as the most important surveys giving optical images (DSS I and II, SDSS, etc.), proper motions (Tycho, USNO, Gaia), variability (GCVS, NSVS, ASAS, Catalina, Pan-STARRS) and spectroscopic data (FBS, SBS, Case, HQS, HES, SDSS, CALIFA, GAMA). The most important astronomical databases and archives are reviewed as well, including the Wide-Field Plate DataBase (WFPDB), ESO, HEASARC, IRSA and MAST archives, CDS SIMBAD, VizieR and Aladin, NED and HyperLEDA extragalactic databases, ADS and astro-ph services. They are powerful sources for many-sided efficient research using Virtual Observatory tools. The use and analysis of Big Data accumulated in astronomy lead to many new discoveries.

  16. Simultaneous detection and classification of breast masses in digital mammograms via a deep learning YOLO-based CAD system.

    PubMed

    Al-Masni, Mohammed A; Al-Antari, Mugahed A; Park, Jeong-Min; Gi, Geon; Kim, Tae-Yeon; Rivera, Patricio; Valarezo, Edwin; Choi, Mun-Taek; Han, Seung-Moo; Kim, Tae-Seong

    2018-04-01

    Automatic detection and classification of masses in mammograms remain a major challenge and play a crucial role in assisting radiologists with accurate diagnosis. In this paper, we propose a novel Computer-Aided Diagnosis (CAD) system based on one of the regional deep learning techniques, a ROI-based Convolutional Neural Network (CNN) called You Only Look Once (YOLO). Although most previous studies only deal with classification of masses, our proposed YOLO-based CAD system can handle detection and classification simultaneously in one framework. The proposed CAD system contains four main stages: preprocessing of mammograms, feature extraction utilizing deep convolutional networks, mass detection with confidence, and finally mass classification using Fully Connected Neural Networks (FC-NNs). In this study, we utilized 600 original mammograms from the Digital Database for Screening Mammography (DDSM) and 2,400 augmented mammograms derived from them, with information on the masses and their types, for training and testing our CAD. The trained YOLO-based CAD system detects the masses and then classifies their types as benign or malignant. Our results with five-fold cross validation tests show that the proposed CAD system detects the mass location with an overall accuracy of 99.7%. The system also distinguishes between benign and malignant lesions with an overall accuracy of 97%. Our proposed system even works on some challenging breast cancer cases where the masses exist over the pectoral muscles or dense regions. Copyright © 2018 Elsevier B.V. All rights reserved.

  17. The ChArMEx database

    NASA Astrophysics Data System (ADS)

    Ferré, Hélène; Belmahfoud, Nizar; Boichard, Jean-Luc; Brissebrat, Guillaume; Cloché, Sophie; Descloitres, Jacques; Fleury, Laurence; Focsa, Loredana; Henriot, Nicolas; Mière, Arnaud; Ramage, Karim; Vermeulen, Anne; Boulanger, Damien

    2015-04-01

    have been inserted in a relational database, in order to enable more accurate selection and download of different datasets in a shared format. Many dedicated satellite products (SEVIRI, TRIMM, PARASOL...) are processed and will soon be accessible through the database website. In order to meet the operational needs of the airborne and ground based observational teams during the ChArMEx campaigns, a day-to-day chart display website has been developed and operated: http://choc.sedoo.org. It offers a convenient way to browse weather conditions and chemical composition during the campaign periods. Every scientist is invited to visit the ChArMEx websites, to register and to request data. Feel free to contact charmex-database@sedoo.fr for any question.

  18. The ChArMEx database

    NASA Astrophysics Data System (ADS)

    Ferré, Hélène; Descloitres, Jacques; Fleury, Laurence; Boichard, Jean-Luc; Brissebrat, Guillaume; Focsa, Loredana; Henriot, Nicolas; Mastrorillo, Laurence; Mière, Arnaud; Vermeulen, Anne

    2013-04-01

    (SEVIRI, TRIMM, PARASOL...) stored in the ICARE data archive using the OPeNDAP protocol. The website will soon offer new facilities. In particular, many in situ datasets will be homogenized and inserted in a relational database, in order to enable more accurate data selection and download of different datasets in a shared format. In order to meet the operational needs of the airborne and ground based observational teams during the ChArMEx 2012 pre-campaign and the 2013 experiment, a day-to-day quick look and report display website has been developed too: http://choc.sedoo.org. It offers a convenient way to browse weather conditions and chemical composition during the campaign periods.

  19. A Newly Updated Database of Elevation-changes of the Greenland Ice Sheet to Study Surface Processes and Ice Dynamics

    NASA Astrophysics Data System (ADS)

    Schenk, A. F.; Csatho, B. M.; van den Broeke, M.; Kuipers Munneke, P.

    2015-12-01

    This paper reports about important upgrades of the Greenland Ice Sheet (GrIS) surface elevation and elevation-change database obtained with our Surface Elevation And Change detection (SERAC) software suite. We have developed SERAC to derive information from laser altimetry data, particularly time series of elevation changes and their partitioning into changes caused by ice dynamics. This allows direct investigation of ice dynamic processes that is much needed for improving the predictive power of ice sheet models. SERAC is different from most other change detection methods. It is based on detecting changes of surface patches, about 1 km by 1 km in size, rather than deriving elevation changes from individual laser points. The current database consists of ~100,000 time series with satellite laser altimetry data from ICESat, airborne laser observations obtained by NASA's Airborne Topographic Mapper (ATM) and the Land, Vegetation and Ice Sensor (LVIS). The upgrade is significant, because not only new observations from 2013 and 2014 have been added but also a number of improvements lead to a more comprehensive and consistent record of elevation-changes. First, we used the model that gives in addition to ice sheet also information about ice caps and glaciers (Rastner et al., 2012) for deciding if a laser point is on the ice sheet or ice cap. Then we added small gaps that exist in the ICESat GLA12 data set because the ice sheet mask is not wide enough. The new database is now more complete and will facilitate more accurate comparisons of mass balance studies obtained from the Gravity Recovery and Climate Experiment system (GRACE). For determining the part of a time series caused by ice dynamics we used the new firn compaction model and Surface Mass Balance (SMB) estimates from RACMO2.3. The new database spans the time period from 1993 to 2014. 
Adding new observations amounts to a spatial densification of the old record and at the same time extends the time domain by two

  20. The ChArMEx database

    NASA Astrophysics Data System (ADS)

    Ferré, Helene; Belmahfoud, Nizar; Boichard, Jean-Luc; Brissebrat, Guillaume; Descloitres, Jacques; Fleury, Laurence; Focsa, Loredana; Henriot, Nicolas; Mastrorillo, Laurence; Mière, Arnaud; Vermeulen, Anne

    2014-05-01

    The Chemistry-Aerosol Mediterranean Experiment (ChArMEx, http://charmex.lsce.ipsl.fr/) aims at a scientific assessment of the present and future state of the atmospheric environment in the Mediterranean Basin, and of its impacts on the regional climate, air quality, and marine biogeochemistry. The project includes long term monitoring of environmental parameters, intensive field campaigns, use of satellite data and modelling studies. Therefore ChArMEx scientists produce and need to access a wide diversity of data. In this context, the objective of the database task is to organize data management, distribution system and services, such as facilitating the exchange of information and stimulating the collaboration between researchers within the ChArMEx community, and beyond. The database relies on a strong collaboration between OMP and ICARE data centres and has been set up in the framework of the Mediterranean Integrated Studies at Regional And Local Scales (MISTRALS) program data portal. All the data produced by or of interest for the ChArMEx community will be documented in the data catalogue and accessible through the database website: http://mistrals.sedoo.fr/ChArMEx. At present, the ChArMEx database contains about 75 datasets, including 50 in situ datasets (2012 and 2013 campaigns, Ersa background monitoring station), 25 model outputs (dust model intercomparison, MEDCORDEX scenarios), and a high resolution emission inventory over the Mediterranean. Many in situ datasets have been inserted in a relational database, in order to enable more accurate data selection and download of different datasets in a shared format. The database website offers different tools: - A registration procedure which enables any scientist to accept the data policy and apply for a user database account. - A data catalogue that complies with metadata international standards (ISO 19115-19139; INSPIRE European Directive; Global Change Master Directory Thesaurus). 
- Metadata forms to document

  1. The HyMeX database

    NASA Astrophysics Data System (ADS)

    Brissebrat, Guillaume; Mastrorillo, Laurence; Ramage, Karim; Boichard, Jean-Luc; Cloché, Sophie; Fleury, Laurence; Klenov, Ludmila; Labatut, Laurent; Mière, Arnaud

    2013-04-01

    measured parameters, by instruments or by platform type. - Forms to document observations or products that will be provided to the database. - A shopping-cart web interface to order in situ data files. - FTP facilities to access gridded data. The website will soon offer new facilities. Many in situ datasets have already been homogenized and inserted in a relational database, in order to enable more accurate data selection and download of different datasets in a shared format. Interoperability between the two data centres will be enhanced by the OpenDAP communication protocol associated with the Thredds catalogue software, which may also be implemented in other data centres that manage data of interest for the HyMeX project. In order to meet the operational needs for the HyMeX 2012 campaigns, a day-to-day quick look and report display website has been developed too: http://sop.hymex.org. It offers a convenient way to browse meteorological conditions and data during the campaign periods.

  2. THE ART OF DATA MINING THE MINEFIELDS OF TOXICITY DATABASES TO LINK CHEMISTRY TO BIOLOGY

    EPA Science Inventory

    Toxicity databases have a special role in predictive toxicology, providing ready access to historical information throughout the workflow of discovery, development, and product safety processes in drug development as well as in review by regulatory agencies. To provide accurate i...

  3. Masses of Black Holes in Active Galactic Nuclei

    NASA Technical Reports Server (NTRS)

    Peterson, Bradley M.

    2003-01-01

    We present a progress report on a project whose goal is to improve both the precision and accuracy of reverberation-based black-hole masses. Reverberation masses appear to be accurate to a factor of about three, and the black-hole mass/bulge velocity dispersion (M-sigma) relationship appears to be the same in active and quiescent galaxies.

  4. Kalium: a database of potassium channel toxins from scorpion venom.

    PubMed

    Kuzmenkov, Alexey I; Krylov, Nikolay A; Chugunov, Anton O; Grishin, Eugene V; Vassilevski, Alexander A

    2016-01-01

    Kalium (http://kaliumdb.org/) is a manually curated database that accumulates data on potassium channel toxins purified from scorpion venom (KTx). This database is an open-access resource, and provides easy access to pages of other databases of interest, such as UniProt, PDB, NCBI Taxonomy Browser, and PubMed. General achievements of Kalium are a strict and easy regulation of KTx classification based on the unified nomenclature supported by researchers in the field, removal of peptides with partial sequence and entries supported by transcriptomic information only, classification of β-family toxins, and addition of a novel λ-family. Molecules presented in the database can be processed by the Clustal Omega server using a one-click option. Molecular masses of mature peptides are calculated and available activity data are compiled for all KTx. We believe that Kalium is not only of high interest to professional toxinologists, but also of general utility to the scientific community. Database URL: http://kaliumdb.org/. © The Author(s) 2016. Published by Oxford University Press.

  5. Development of a Searchable Database of Cryoablation Simulations for Use in Treatment Planning.

    PubMed

    Boas, F Edward; Srimathveeravalli, Govindarajan; Durack, Jeremy C; Kaye, Elena A; Erinjeri, Joseph P; Ziv, Etay; Maybody, Majid; Yarmohammadi, Hooman; Solomon, Stephen B

    2017-05-01

    To create and validate a planning tool for multiple-probe cryoablation, using simulations of ice ball size and shape for various ablation probe configurations, ablation times, and types of tissue ablated. Ice ball size and shape was simulated using the Pennes bioheat equation. Five thousand six hundred and seventy different cryoablation procedures were simulated, using 1-6 cryoablation probes and 1-2 cm spacing between probes. The resulting ice ball was measured along three perpendicular axes and recorded in a database. Simulated ice ball sizes were compared to gel experiments (26 measurements) and clinical cryoablation cases (42 measurements). The clinical cryoablation measurements were obtained from a HIPAA-compliant retrospective review of kidney and liver cryoablation procedures between January 2015 and February 2016. Finally, we created a web-based cryoablation planning tool, which uses the cryoablation simulation database to look up the probe spacing and ablation time that produces the desired ice ball shape and dimensions. Average absolute error between the simulated and experimentally measured ice balls was 1 mm in gel experiments and 4 mm in clinical cryoablation cases. The simulations accurately predicted the degree of synergy in multiple-probe ablations. The cryoablation simulation database covers a wide range of ice ball sizes and shapes up to 9.8 cm. Cryoablation simulations accurately predict the ice ball size in multiple-probe ablations. The cryoablation database can be used to plan ablation procedures: given the desired ice ball size and shape, it will find the number and type of probes, probe configuration and spacing, and ablation time required.

  6. High-Resolution Mass Spectrometers

    NASA Astrophysics Data System (ADS)

    Marshall, Alan G.; Hendrickson, Christopher L.

    2008-07-01

    Over the past decade, mass spectrometry has been revolutionized by access to instruments of increasingly high mass-resolving power. For small molecules up to ~400 Da (e.g., drugs, metabolites, and various natural organic mixtures ranging from foods to petroleum), it is possible to determine elemental compositions (CcHhNnOoSsPp…) of thousands of chemical components simultaneously from accurate mass measurements (the same can be done up to 1000 Da if additional information is included). At higher mass, it becomes possible to identify proteins (including posttranslational modifications) from proteolytic peptides, as well as lipids, glycoconjugates, and other biological components. At even higher mass (~100,000 Da or higher), it is possible to characterize posttranslational modifications of intact proteins and to map the binding surfaces of large biomolecule complexes. Here we review the principles and techniques of the highest-resolution analytical mass spectrometers (time-of-flight and Fourier transform ion cyclotron resonance and orbitrap mass analyzers) and describe some representative high-resolution applications.
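    Determining an elemental composition from an accurate mass amounts to searching for integer atom counts whose summed monoisotopic mass falls within a ppm window of the measurement. The following is a minimal brute-force sketch restricted to C, H, N, and O. It is not the authors' method: the function name, the atom-count limits, and the example target mass are assumptions for illustration, and no chemical-valence filtering is applied.

```python
# Monoisotopic masses in u (standard reference values, not from the abstract).
MONO = {"C": 12.0, "H": 1.00782503207, "N": 14.0030740048, "O": 15.99491461956}


def candidate_formulas(target_mass, tol_ppm=5.0, max_counts=(20, 40, 6, 6)):
    """List (c, h, n, o) tuples and their masses that match target_mass
    (neutral monoisotopic mass, in u) within tol_ppm."""
    max_c, max_h, max_n, max_o = max_counts
    tol = target_mass * tol_ppm * 1e-6
    hits = []
    for c in range(max_c + 1):
        for n in range(max_n + 1):
            for o in range(max_o + 1):
                partial = c * MONO["C"] + n * MONO["N"] + o * MONO["O"]
                if partial > target_mass + tol:
                    continue
                # The tolerance is far smaller than one hydrogen mass, so at
                # most one hydrogen count can match: the nearest integer.
                h = round((target_mass - partial) / MONO["H"])
                if 0 <= h <= max_h:
                    mass = partial + h * MONO["H"]
                    if abs(mass - target_mass) <= tol:
                        hits.append(((c, h, n, o), mass))
    return hits


if __name__ == "__main__":
    # 194.08038 u is close to the monoisotopic mass of caffeine (C8H10N4O2).
    for formula, mass in candidate_formulas(194.08038):
        print(formula, round(mass, 5))
```

    Widening the tolerance or raising the target mass increases the number of candidate compositions, which is consistent with the abstract's remark that additional information is needed for masses up to 1000 Da.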

  7. Recent Developments of the GLIMS Glacier Database

    NASA Astrophysics Data System (ADS)

    Raup, B. H.; Berthier, E.; Bolch, T.; Kargel, J. S.; Paul, F.; Racoviteanu, A.

    2017-12-01

    Earth's glaciers are shrinking almost without exception, leading to changes in water resources, timing of runoff, sea level, and hazard potential. Repeat mapping of glacier outlines, lakes, and glacier topography, along with glacial processes, is critically needed to understand how glaciers will react to a changing climate, and how those changes will impact humans. To understand the impacts and processes behind the observed changes, it is crucial to monitor glaciers through time by mapping their areal extent, snow lines, ice flow velocities, associated water bodies, and thickness changes. The glacier database of the Global Land Ice Measurements from Space (GLIMS) initiative is the only multi-temporal glacier database capable of tracking all these glacier measurements and providing them to the scientific community and broader public. Recent developments in GLIMS include improvements in the database and web applications and new activities in the international GLIMS community. The coverage of the GLIMS database has recently grown geographically and temporally by drawing on the Randolph Glacier Inventory (RGI) and other new data sets. The GLIMS database is globally complete, and approximately one third of glaciers have outlines from more than one time. New tools for visualizing and downloading GLIMS data in a choice of formats and data models have been developed, and a new data model for handling multiple glacier records through time while avoiding double-counting of glacier number or area is nearing completion. A GLIMS workshop was held in Boulder, Colorado this year to facilitate two-way communication with the greater community on future needs. The result of this work is a more complete and accurate glacier data repository that shows both the current state of glaciers on Earth and how they have changed in recent decades. Needs for future scientific and technical developments were identified and prioritized at the GLIMS Workshop, and are reported here.

  8. A new natural hazards data-base for volcanic ash and SO2 from global satellite remote sensing measurements

    NASA Astrophysics Data System (ADS)

    Prata, F.; Stebel, K.

    2013-12-01

    Over the last few years there has been a recognition of the utility of satellite measurements to identify and track volcanic emissions that present a natural hazard to human populations. Mitigation of the volcanic hazard to life and the environment requires understanding of the properties of volcanic emissions, identifying the hazard in near real-time and being able to provide timely and accurate forecasts to affected areas. Amongst the many ways to measure volcanic emissions, satellite remote sensing is capable of providing global quantitative retrievals of important microphysical parameters such as ash mass loading, ash particle effective radius, infrared optical depth, SO2 partial and total column abundance, plume altitude, aerosol optical depth and aerosol absorbing index. The eruption of Eyjafjallajokull in April-May, 2010 led to increased research and measurement programs to better characterize properties of volcanic ash, and the need to establish a data-base in which to store and access these data was confirmed. The European Space Agency (ESA) has recognized the importance of having a quality controlled data-base of satellite retrievals and has funded an activity (VAST) to develop novel remote sensing retrieval schemes and a data-base, initially focused on several recent hazardous volcanic eruptions. As a first step, satellite retrievals for the eruptions of Eyjafjallajokull, Grimsvotn, Puyehue-Cordon Caulle, Nabro, Merapi, Okmok, Kasatochi and Sarychev Peak are being considered. Here we describe the data, retrievals and methods being developed for the data-base. Three important applications of the data-base are illustrated, related to the ash/aviation problem, to the impact of the Merapi volcanic eruption on the local population, and to the estimation of SO2 fluxes from active volcanoes as a means to diagnose future unrest. Dispersion model simulations are also being included in the data-base. 
In time, data from conventional in situ sampling instruments, airborne and

  9. An ab initio electronic transport database for inorganic materials.

    PubMed

    Ricci, Francesco; Chen, Wei; Aydemir, Umut; Snyder, G Jeffrey; Rignanese, Gian-Marco; Jain, Anubhav; Hautier, Geoffroy

    2017-07-04

    Electronic transport in materials is governed by a series of tensorial properties such as conductivity, Seebeck coefficient, and effective mass. These quantities are paramount to the understanding of materials in many fields from thermoelectrics to electronics and photovoltaics. Transport properties can be calculated from a material's band structure using the Boltzmann transport theory framework. We present here the largest computational database of electronic transport properties based on a large set of 48,000 materials originating from the Materials Project database. Our results were obtained through the interpolation approach developed in the BoltzTraP software, assuming a constant relaxation time. We present the workflow to generate the data, the data validation procedure, and the database structure. Our aim is to target the large community of scientists developing materials selection strategies and performing studies involving transport properties.

  10. A comprehensive global genotype-phenotype database for rare diseases.

    PubMed

    Trujillano, Daniel; Oprea, Gabriela-Elena; Schmitz, Yvonne; Bertoli-Avella, Aida M; Abou Jamra, Rami; Rolfs, Arndt

    2017-01-01

    The ability to discover genetic variants in a patient runs far ahead of the ability to interpret them. Databases with accurate descriptions of the causal relationship between the variants and the phenotype are valuable since these are critical tools in clinical genetic diagnostics. Here, we introduce a comprehensive and global genotype-phenotype database focusing on rare diseases. This database (CentoMD®) is a browser-based tool that enables access to a comprehensive, independently curated system utilizing stringent high-quality criteria and a quickly growing repository of genetic and human phenotype ontology (HPO)-based clinical information. Its main goals are to aid the evaluation of genetic variants, to enhance the validity of the genetic analytical workflow, to increase the quality of genetic diagnoses, and to improve evaluation of treatment options for patients with hereditary diseases. The database software correlates clinical information from consented patients and probands of different geographical backgrounds with a large dataset of genetic variants and, when available, biomarker information. An automated follow-up tool is incorporated that informs all users whenever a variant classification has changed. These unique features, fully embedded in a CLIA/CAP-accredited quality management system, allow appropriate data quality and enhanced patient safety. More than 100,000 genetically screened individuals are documented in the database, resulting in more than 470 million variant detections. Approximately 57% of the clinically relevant and uncertain variants in the database are novel. Notably, 3% of the genetic variants identified and previously reported in the literature as being associated with a particular rare disease were reclassified, based on internal evidence, as clinically irrelevant. The database offers a comprehensive summary of the clinical validity and causality of detected gene variants with their associated phenotypes, and is a valuable tool

  11. Database of episode-integrated solar energetic proton fluences

    NASA Astrophysics Data System (ADS)

    Robinson, Zachary D.; Adams, James H.; Xapsos, Michael A.; Stauffer, Craig A.

    2018-04-01

    A new database of proton episode-integrated fluences is described. This database contains data from two different instruments on multiple satellites. The data are from instruments on the Interplanetary Monitoring Platform-8 (IMP8) and the Geostationary Operational Environmental Satellites (GOES) series. A method to normalize one set of data to the other is presented to create a seamless database spanning 1973 to 2016. A discussion of some of the characteristics that episodes exhibit is presented, including episode duration and number of peaks. As an example of what can be understood about episodes, the July 4, 2012 episode is examined in detail. The coronal mass ejections and solar flares that caused many of the fluctuations of the proton flux seen at Earth are associated with peaks in the proton flux during this episode. The reasoning for each choice is laid out to provide a reference for how CME and solar flare associations are made.

  12. Zero-Point Calibration for AGN Black-Hole Mass Estimates

    NASA Technical Reports Server (NTRS)

    Peterson, B. M.; Onken, C. A.

    2004-01-01

    We discuss the measurement and associated uncertainties of AGN reverberation-based black-hole masses, since these provide the zero-point calibration for scaling relationships that allow black-hole mass estimates for quasars. We find that reverberation-based mass estimates appear to be accurate to within a factor of about 3.

  13. Analysis of Glycosaminoglycans Using Mass Spectrometry

    PubMed Central

    Staples, Gregory O.; Zaia, Joseph

    2015-01-01

    The glycosaminoglycans (GAGs) are linear polysaccharides expressed on animal cell surfaces and in extracellular matrices. Their biosynthesis is under complex control and confers a domain structure that is essential to their ability to bind to protein partners. Key to understanding the functions of GAGs are methods to determine accurately and rapidly patterns of sulfation, acetylation and uronic acid epimerization that correlate with protein binding or other biological activities. Mass spectrometry (MS) is particularly suitable for the analysis of GAGs for biomedical purposes. Using modern ionization techniques it is possible to accurately determine molecular weights of GAG oligosaccharides and their distributions within a mixture. Methods for direct interfacing with liquid chromatography have been developed to permit online mass spectrometric analysis of GAGs. New tandem mass spectrometric methods for fine structure determination of GAGs are emerging. This review summarizes MS-based approaches for analysis of GAGs, including tissue extraction and chromatographic methods compatible with LC/MS and tandem MS. PMID:25705143

  14. Probabilistic Mass Growth Uncertainties

    NASA Technical Reports Server (NTRS)

    Plumer, Eric; Elliott, Darren

    2013-01-01

    Mass has been widely used as a variable input parameter for Cost Estimating Relationships (CER) for space systems. As these space systems progress from early concept studies and drawing boards to the launch pad, their masses tend to grow substantially, hence adversely affecting a primary input to most modeling CERs. Modeling and predicting mass uncertainty, based on historical and analogous data, is therefore critical and is an integral part of modeling cost risk. This paper presents the results of an ongoing NASA effort to publish mass growth datasheets for adjusting single-point Technical Baseline Estimates (TBE) of masses of space instruments as well as spacecraft, for both earth orbiting and deep space missions at various stages of a project's lifecycle. This paper also discusses the long term strategy of NASA Headquarters in publishing similar results, using a variety of cost driving metrics, on an annual basis. This paper provides quantitative results that show decreasing mass growth uncertainties as mass estimate maturity increases. This paper's analysis is based on historical data obtained from the NASA Cost Analysis Data Requirements (CADRe) database.

  15. SASD: the Synthetic Alternative Splicing Database for identifying novel isoform from proteomics

    PubMed Central

    2013-01-01

    Background Alternative splicing is an important and widespread mechanism for generating protein diversity and regulating protein expression. High-throughput identification and analysis of alternative splicing at the protein level has more advantages than at the mRNA level. The combination of an alternative splicing database and tandem mass spectrometry provides a powerful technique for identification, analysis and characterization of potential novel alternative splicing protein isoforms from proteomics. Therefore, based on the peptidomic database of human protein isoforms for proteomics experiments, our objective is to design a new alternative splicing database to 1) provide more coverage of genes, transcripts and alternative splicing, 2) exclusively focus on the alternative splicing, and 3) perform context-specific alternative splicing analysis. Results We used a three-step pipeline to create a synthetic alternative splicing database (SASD) to identify novel alternative splicing isoforms and interpret them in the context of pathway, disease, drug and organ specificity or custom gene set with maximum coverage and exclusive focus on alternative splicing. First, we extracted information on gene structures of all genes in the Ensembl Genes 71 database and incorporated the Integrated Pathway Analysis Database. Then, we compiled artificial splicing transcripts. Lastly, we translated the artificial transcripts into alternative splicing peptides. The SASD is a comprehensive database containing 56,630 genes (Ensembl gene IDs), 95,260 transcripts (Ensembl transcript IDs), and 11,919,779 Alternative Splicing peptides, and also covering about 1,956 pathways, 6,704 diseases, 5,615 drugs, and 52 organs. The database has a web-based user interface that allows users to search, display and download a single gene/transcript/protein, custom gene set, pathway, disease, drug, organ related alternative splicing. Moreover, the quality of the database was validated with comparison to other

  16. Evaluation of matrix-assisted laser desorption/ionization time of flight mass spectrometry for the identification of ceratopogonid and culicid larvae.

    PubMed

    Steinmann, I C; Pflüger, V; Schaffner, F; Mathis, A; Kaufmann, C

    2013-03-01

    Matrix-assisted laser desorption/ionization time of flight mass spectrometry (MALDI-TOF MS) was evaluated for the rapid identification of ceratopogonid larvae. Optimal sample preparation as evaluated with laboratory-reared biting midges Culicoides nubeculosus was the homogenization of gut-less larvae in 10% formic acid, and analysis of 0.2 mg/ml crude protein homogenate mixed with SA matrix at a ratio of 1:1.5. Using 5 larvae each of 4 ceratopogonid species (C. nubeculosus, C. obsoletus, C. decor, and Dasyhelea sp.) and of 2 culicid species (Aedes aegypti, Ae. japonicus), biomarker mass sets between 27 and 33 masses were determined. In a validation study, 67 larvae belonging to the target species were correctly identified by automated database-based identification (91%) or manual full comparison (9%). Four specimens of non-target species did not yield identification. As anticipated for holometabolous insects, the biomarker mass sets of adults cannot be used for the identification of larvae, and vice versa, because they share only very few similar masses as shown for C. nubeculosus, C. obsoletus, and Ae. japonicus. Thus, protein profiling by MALDI-TOF as a quick, inexpensive and accurate alternative tool is applicable to identify insect larvae of vector species collected in the field.

  17. ClassLess: A Comprehensive Database of Young Stellar Objects

    NASA Astrophysics Data System (ADS)

    Hillenbrand, Lynne A.; Baliber, Nairn

    2015-08-01

    We have designed and constructed a database intended to house catalog and literature-published measurements of Young Stellar Objects (YSOs) within ~1 kpc of the Sun. ClassLess, so called because it includes YSOs in all stages of evolution, is a relational database in which user interaction is conducted via HTML web browsers, queries are performed in scientific language, and all data are linked to the sources of publication. Each star is associated with a cluster (or clusters), and both spatially resolved and unresolved measurements are stored, allowing proper use of data from multiple star systems. With this fully searchable tool, myriad ground- and space-based instruments and surveys across wavelength regimes can be exploited. In addition to primary measurements, the database self-consistently calculates and serves higher level data products such as extinction, luminosity, and mass. As a result, searches for young stars with specific physical characteristics can be completed with just a few mouse clicks. We are in the database population phase now, and are eager to engage with interested experts worldwide on local galactic star formation and young stellar populations.

  18. Can computerized tomography accurately stage childhood renal tumors?

    PubMed

    Abdelhalim, Ahmed; Helmy, Tamer E; Harraz, Ahmed M; Abou-El-Ghar, Mohamed E; Dawaba, Mohamed E; Hafez, Ashraf T

    2014-07-01

    Staging of childhood renal tumors is crucial for treatment planning and outcome prediction. We sought to identify whether computerized tomography could accurately predict the local stage of childhood renal tumors. We retrospectively reviewed our database for patients diagnosed with childhood renal tumors and treated surgically between 1990 and 2013. Inability to retrieve preoperative computerized tomography, intraoperative tumor spillage and non-Wilms childhood renal tumors were exclusion criteria. Local computerized tomography stage was assigned by a single experienced pediatric radiologist blinded to the pathological stage, using a consensus similar to the Children's Oncology Group Wilms tumor staging system. Tumors were stratified into up-front surgery and preoperative chemotherapy groups. The radiological stage of each tumor was compared to the pathological stage. A total of 189 tumors in 179 patients met inclusion criteria. Computerized tomography staging matched pathological staging in 68% of up-front surgery (70 of 103), 31.8% of pre-chemotherapy (21 of 66) and 48.8% of post-chemotherapy scans (42 of 86). Computerized tomography overstaged 21.4%, 65.2% and 46.5% of tumors in the up-front surgery, pre-chemotherapy and post-chemotherapy scans, respectively, and understaged 10.7%, 3% and 4.7%. Computerized tomography staging was more accurate in tumors managed by up-front surgery (p < 0.001) and those without extracapsular extension (p < 0.001). The validity of computerized tomography staging of childhood renal tumors remains doubtful. This staging is more accurate for tumors treated with up-front surgery and those without extracapsular extension. Preoperative computerized tomography can help to exclude capsular breach. Treatment strategy should be based on surgical and pathological staging to avoid the hazards of inaccurate staging. Copyright © 2014 American Urological Association Education and Research, Inc. Published by Elsevier Inc. All rights reserved.
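
    The agreement percentages quoted in this abstract can be rechecked from the counts it reports. The short sketch below does only that arithmetic; the function name and the dictionary layout are illustrative choices, not part of the study.

```python
def percent_agreement(matched: int, total: int) -> float:
    """Return the share of scans whose CT stage matched pathology, in percent."""
    if total <= 0:
        raise ValueError("total must be positive")
    return round(100.0 * matched / total, 1)


# (matched, total) counts as quoted in the abstract above.
counts = {
    "up-front surgery": (70, 103),
    "pre-chemotherapy": (21, 66),
    "post-chemotherapy": (42, 86),
}

agreement = {group: percent_agreement(m, t) for group, (m, t) in counts.items()}

for group, pct in agreement.items():
    print(f"{group}: {pct}%")
```

    The recomputed values agree with the 68%, 31.8% and 48.8% given in the abstract once rounding is taken into account.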

  19. Accurate determinations of alpha(s) from realistic lattice QCD.

    PubMed

    Mason, Q; Trottier, H D; Davies, C T H; Foley, K; Gray, A; Lepage, G P; Nobes, M; Shigemitsu, J

    2005-07-29

    We obtain a new value for the QCD coupling constant by combining lattice QCD simulations with experimental data for hadron masses. Our lattice analysis is the first to (1) include vacuum polarization effects from all three light-quark flavors (using MILC configurations), (2) include third-order terms in perturbation theory, (3) systematically estimate fourth and higher-order terms, (4) use an unambiguous lattice spacing, and (5) use an $O(a^2)$-accurate QCD action. We use 28 different (but related) short-distance quantities to obtain $\alpha_{\overline{\mathrm{MS}}}^{(5)}(M_Z) = 0.1170(12)$.

  20. A Database of Supercooled Large Droplet Ice Accretions [Supplement

    NASA Technical Reports Server (NTRS)

    VanZante, Judith Foss

    2007-01-01

    A unique, publicly available database regarding supercooled large droplet (SLD) ice accretions has been developed in NASA Glenn's Icing Research Tunnel. Identical cloud and flight conditions were generated for five different airfoil models. The models chosen represent a variety of aircraft types from the horizontal stabilizer of a large transport aircraft to the wings of regional, business, and general aviation aircraft. In addition to the standard documentation methods of 2D ice shape tracing and imagery, ice mass measurements were also taken. This database will also be used to validate and verify the extension of the ice accretion code, LEWICE, into the SLD realm.

  1. A Method to Calculate and Analyze Residents' Evaluations by Using a Microcomputer Data-Base Management System.

    ERIC Educational Resources Information Center

    Mills, Myron L.

    1988-01-01

    A system developed for more efficient evaluation of graduate medical students' progress uses numerical scoring and a microcomputer database management system as an alternative to manual methods to produce accurate, objective, and meaningful summaries of resident evaluations. (Author/MSE)

  2. A Mass Spectrometer Simulator in Your Computer

    ERIC Educational Resources Information Center

    Gagnon, Michel

    2012-01-01

    Introduced to study components of ionized gas, the mass spectrometer has evolved into a highly accurate device now used in many undergraduate and research laboratories. Unfortunately, despite their importance in the formation of future scientists, mass spectrometers remain beyond the financial reach of many high schools and colleges. As a result,…

  3. Consensus and conflict cards for metabolic pathway databases.

    PubMed

    Stobbe, Miranda D; Swertz, Morris A; Thiele, Ines; Rengaw, Trebor; van Kampen, Antoine H C; Moerland, Perry D

    2013-06-26

    The metabolic network of H. sapiens and many other organisms is described in multiple pathway databases. The level of agreement between these descriptions, however, has proven to be low. We can use these different descriptions to our advantage by identifying conflicting information and combining their knowledge into a single, more accurate, and more complete description. This task is, however, far from trivial. We introduce the concept of Consensus and Conflict Cards (C₂Cards) to provide concise overviews of what the databases do or do not agree on. Each card is centered at a single gene, EC number or reaction. These three complementary perspectives make it possible to distinguish disagreements on the underlying biology of a metabolic process from differences that can be explained by different decisions on how and in what detail to represent knowledge. As a proof-of-concept, we implemented C₂Cards(Human), as a web application http://www.molgenis.org/c2cards, covering five human pathway databases. C₂Cards can contribute to ongoing reconciliation efforts by simplifying the identification of consensus and conflicts between pathway databases and lowering the threshold for experts to contribute. Several case studies illustrate the potential of the C₂Cards in identifying disagreements on the underlying biology of a metabolic process. The overviews may also point out controversial biological knowledge that should be subject of further research. Finally, the examples provided emphasize the importance of manual curation and the need for a broad community involvement.

  4. DEEP: Database of Energy Efficiency Performance

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Hong, Tianzhen; Piette, Mary; Lee, Sang Hoon

    A database of energy efficiency performance (DEEP) is a presimulated database to enable quick and accurate assessment of energy retrofit of commercial buildings. DEEP was compiled from results of about 10 million EnergyPlus simulations. DEEP provides energy savings for screening and evaluation of retrofit measures targeting the small and medium-sized office and retail buildings in California. The prototype building models are developed for a comprehensive assessment of building energy performance based on DOE commercial reference buildings and the California DEER [sic] prototype buildings. The prototype buildings represent seven building types across six vintages of constructions and 16 California climate zones. DEEP uses these prototypes to evaluate energy performance of about 100 energy conservation measures covering envelope, lighting, heating, ventilation, air conditioning, plug loads, and domestic hot water. DEEP consists of the energy simulation results for individual retrofit measures as well as packages of measures to consider interactive effects between multiple measures. The large scale EnergyPlus simulations are being conducted on the supercomputers at the National Energy Research Scientific Computing Center (NERSC) of Lawrence Berkeley National Laboratory. The pre-simulation database is a part of the CEC PIER project to develop a web-based retrofit toolkit for small and medium-sized commercial buildings in California, which provides real-time energy retrofit feedback by querying DEEP with recommended measures, estimated energy savings and financial payback period based on users' decision criteria of maximizing energy savings, energy cost savings, carbon reduction, or payback of investment. The pre-simulated database and associated comprehensive measure analysis enhances the ability to perform assessments of retrofits to reduce energy use for small and medium buildings and business owners who typically do not have resources to conduct costly

  5. Impact of Accurate 30-Day Status on Operative Mortality: Wanted Dead or Alive, Not Unknown.

    PubMed

    Ring, W Steves; Edgerton, James R; Herbert, Morley; Prince, Syma; Knoff, Cathy; Jenkins, Kristin M; Jessen, Michael E; Hamman, Baron L

    2017-12-01

    Risk-adjusted operative mortality is the most important quality metric in cardiac surgery for determining The Society of Thoracic Surgeons (STS) Composite Score for star ratings. Accurate 30-day status is required to determine STS operative mortality. The goal of this study was to determine the effect of unknown or missing 30-day status on risk-adjusted operative mortality in a regional STS Adult Cardiac Surgery Database cooperative and demonstrate the ability to correct these deficiencies by matching with an administrative database. STS Adult Cardiac Surgery Database data were submitted by 27 hospitals from five hospital systems to the Texas Quality Initiative (TQI), a regional quality collaborative. TQI data were matched with a regional hospital claims database to resolve unknown 30-day status. The risk-adjusted operative mortality observed-to-expected (O/E) ratio was determined before and after matching to determine the effect of unknown status on the operative mortality O/E. TQI found an excessive (22%) unknown 30-day status for STS isolated coronary artery bypass grafting cases. Matching the TQI data to the administrative claims database reduced the unknowns to 7%. The STS process of imputing unknown 30-day status as alive underestimates the true operative mortality O/E (1.27 before vs 1.30 after match), while excluding unknowns overestimates the operative mortality O/E (1.57 before vs 1.37 after match) for isolated coronary artery bypass grafting. The current STS algorithm of imputing unknown 30-day status as alive and a strategy of excluding cases with unknown 30-day status both result in erroneous calculation of operative mortality and operative mortality O/E. However, external validation by matching with an administrative database can improve the accuracy of clinical databases such as the STS Adult Cardiac Surgery Database. Copyright © 2017 The Society of Thoracic Surgeons. Published by Elsevier Inc. All rights reserved.
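
    The effect described in this abstract, where the handling of unknown 30-day status shifts the observed-to-expected (O/E) ratio, can be illustrated with a small calculation. The sketch below is not the STS risk model: it assumes a single expected mortality rate for all cases, and every count in it is hypothetical and chosen only to make the arithmetic easy to follow.

```python
def oe_ratios(deaths: int, alive: int, unknown: int, expected_rate: float):
    """Return (O/E with unknowns imputed as alive, O/E with unknowns excluded).

    Simplifying assumption: one expected mortality rate applies to every case,
    so the expected rate does not change when cases are dropped.
    """
    if expected_rate <= 0:
        raise ValueError("expected_rate must be positive")
    total = deaths + alive + unknown   # unknowns kept and counted as survivors
    known = deaths + alive             # unknowns dropped from the denominator
    observed_imputed = deaths / total
    observed_excluded = deaths / known
    return observed_imputed / expected_rate, observed_excluded / expected_rate


# Hypothetical cohort: 1000 cases, 22% with unknown 30-day status.
imputed, excluded = oe_ratios(deaths=20, alive=760, unknown=220, expected_rate=0.02)
print(f"O/E, unknowns imputed as alive: {imputed:.2f}")
print(f"O/E, unknowns excluded:         {excluded:.2f}")
```

    Imputing unknowns as alive can only lower the observed rate relative to the truth, because any deaths hidden among the unknowns are counted as survivors. Excluding unknowns shrinks the denominator instead, which in this sketch raises the ratio; the direction in real data depends on how many of the unknown cases actually died.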

  6. A review of automatic mass detection and segmentation in mammographic images.

    PubMed

    Oliver, Arnau; Freixenet, Jordi; Martí, Joan; Pérez, Elsa; Pont, Josep; Denton, Erika R E; Zwiggelaar, Reyer

    2010-04-01

    The aim of this paper is to review existing approaches to the automatic detection and segmentation of masses in mammographic images, highlighting the key-points and main differences between the used strategies. The key objective is to point out the advantages and disadvantages of the various approaches. In contrast with other reviews which only describe and compare different approaches qualitatively, this review also provides a quantitative comparison. The performance of seven mass detection methods is compared using two different mammographic databases: a public digitised database and a local full-field digital database. The results are given in terms of Receiver Operating Characteristic (ROC) and Free-response Receiver Operating Characteristic (FROC) analysis. Copyright 2009 Elsevier B.V. All rights reserved.

  7. Accurate registration of temporal CT images for pulmonary nodules detection

    NASA Astrophysics Data System (ADS)

    Yan, Jichao; Jiang, Luan; Li, Qiang

    2017-02-01

    Interpretation of temporal CT images could help the radiologists to detect some subtle interval changes in the sequential examinations. The purpose of this study was to develop a fully automated scheme for accurate registration of temporal CT images for pulmonary nodule detection. Our method consisted of three major registration steps. Firstly, affine transformation was applied in the segmented lung region to obtain global coarse registration images. Secondly, B-splines based free-form deformation (FFD) was used to refine the coarse registration images. Thirdly, Demons algorithm was performed to align the feature points extracted from the registered images in the second step and the reference images. Our database consisted of 91 temporal CT cases obtained from Beijing 301 Hospital and Shanghai Changzheng Hospital. The preliminary results showed that approximately 96.7% cases could obtain accurate registration based on subjective observation. The subtraction images of the reference images and the rigid and non-rigid registered images could effectively remove the normal structures (i.e. blood vessels) and retain the abnormalities (i.e. pulmonary nodules). This would be useful for the screening of lung cancer in our future study.

  8. PARPs database: A LIMS systems for protein-protein interaction data mining or laboratory information management system

    PubMed Central

    Droit, Arnaud; Hunter, Joanna M; Rouleau, Michèle; Ethier, Chantal; Picard-Cloutier, Aude; Bourgais, David; Poirier, Guy G

    2007-01-01

    Background In the "post-genome" era, mass spectrometry (MS) has become an important method for the analysis of proteins and the rapid advancement of this technique, in combination with other proteomics methods, results in an increasing amount of proteome data. This data must be archived and analysed using specialized bioinformatics tools. Description We herein describe "PARPs database," a data analysis and management pipeline for liquid chromatography tandem mass spectrometry (LC-MS/MS) proteomics. PARPs database is a web-based tool whose features include experiment annotation, protein database searching, protein sequence management, as well as data-mining of the peptides and proteins identified. Conclusion Using this pipeline, we have successfully identified several interactions of biological significance between PARP-1 and other proteins, namely RFC-1, 2, 3, 4 and 5. PMID:18093328

  9. RCDB: Renal Cancer Gene Database.

    PubMed

    Ramana, Jayashree

    2012-05-18

    Renal cell carcinoma or RCC is one of the common and most lethal urological cancers, with 40% of the patients succumbing to death because of metastatic progression of the disease. Treatment of metastatic RCC remains highly challenging because of its resistance to chemotherapy as well as radiotherapy, besides surgical resection. Whereas RCC comprises tumors with differing histological types, clear cell RCC remains the most common. A major problem in the clinical management of patients presenting with localized ccRCC is the inability to determine tumor aggressiveness and accurately predict the risk of metastasis following surgery. As a measure to improve the diagnosis and prognosis of RCC, researchers have identified several molecular markers through a number of techniques. However, the wealth of information available is scattered in literature and not easily amenable to data-mining. To reduce this gap, this work describes a comprehensive repository called Renal Cancer Gene Database, as an integrated gateway to study renal cancer related data. Renal Cancer Gene Database is a manually curated compendium of 240 protein-coding and 269 miRNA genes contributing to the etiology and pathogenesis of various forms of renal cell carcinomas. The protein coding genes have been classified according to the kind of gene alteration observed in RCC. RCDB also includes the miRNAs dysregulated in RCC, along with the corresponding information regarding the type of RCC and/or metastatic or prognostic significance. While some of the miRNA genes showed an association with other types of cancers, few were unique to RCC. Users can query the database using keywords, category and chromosomal location of the genes. The knowledgebase can be freely accessed via a user-friendly web interface at http://www.juit.ac.in/attachments/jsr/rcdb/homenew.html. It is hoped that this database would serve as a useful complement to the existing public resources and as a good starting point for researchers and

  10. Optical measurements of paintings and the creation of an artwork database for authenticity

    PubMed Central

    Hwang, Seonhee; Song, Hyerin; Cho, Soon-Woo; Kim, Chang Eun; Kim, Chang-Seok; Kim, Kyujung

    2017-01-01

    Paintings have high cultural and commercial value and therefore need to be preserved. Many techniques have been attempted to analyze properties of paintings, including X-ray analysis and optical coherence tomography (OCT) methods, and to protect paintings from forgery. In this paper, we suggest a simple and accurate optical analysis system to protect them from counterfeiting, comprising fiber optics reflectance spectroscopy (FORS) and line laser-based topographic analysis. The system is designed to cover the whole area of a painting regardless of its size for accurate analysis. For additional assessments, a line laser-based high-resolution OCT was utilized. For the experiments, forgeries of genuine paintings in three different styles were created by experts. After measuring surface properties of the paintings, we observed that the results from the genuine works and the forgeries have distinctive characteristics. The forgeries could be distinguished at a rate of up to 76.5% with RGB spectra obtained by FORS and 100% by topographic analysis. Through several executions, the reliability of the system was confirmed. We could verify that the measurement system is worthwhile for the conservation of valuable paintings. To store the surface information of the paintings at micron scale, we created a numerical database. Consequently, we secured databases of three different famous Korean paintings for accurate authentication. PMID:28151981

  11. Optical measurements of paintings and the creation of an artwork database for authenticity.

    PubMed

    Hwang, Seonhee; Song, Hyerin; Cho, Soon-Woo; Kim, Chang Eun; Kim, Chang-Seok; Kim, Kyujung

    2017-01-01

    Paintings have high cultural and commercial value and therefore need to be preserved. Many techniques have been attempted to analyze properties of paintings, including X-ray analysis and optical coherence tomography (OCT) methods, and to protect paintings from forgery. In this paper, we suggest a simple and accurate optical analysis system to protect them from counterfeiting, comprising fiber optics reflectance spectroscopy (FORS) and line laser-based topographic analysis. The system is designed to cover the whole area of a painting regardless of its size for accurate analysis. For additional assessments, a line laser-based high-resolution OCT was utilized. For the experiments, forgeries of genuine paintings in three different styles were created by experts. After measuring surface properties of the paintings, we observed that the results from the genuine works and the forgeries have distinctive characteristics. The forgeries could be distinguished at a rate of up to 76.5% with RGB spectra obtained by FORS and 100% by topographic analysis. Through several executions, the reliability of the system was confirmed. We could verify that the measurement system is worthwhile for the conservation of valuable paintings. To store the surface information of the paintings at micron scale, we created a numerical database. Consequently, we secured databases of three different famous Korean paintings for accurate authentication.

  12. A georeferenced Landsat digital database for forest insect-damage assessment

    NASA Technical Reports Server (NTRS)

    Williams, D. L.; Nelson, R. F.; Dottavio, C. L.

    1985-01-01

    In 1869, the gypsy moth caterpillar was introduced in the U.S. in connection with the experiments of a French scientist. Throughout the insect's period of establishment, gypsy moth populations have periodically increased to epidemic proportions. For programs concerned with preventing the insect's spread, it would be highly desirable to be able to employ a survey technique which could provide timely, accurate, and standardized assessments at a reasonable cost. A project was, therefore, initiated with the aim to demonstrate the usefulness of satellite remotely sensed data for monitoring the insect defoliation of hardwood forests in Pennsylvania. A major effort within this project involved the development of a map-registered Landsat digital database. A complete description of the database developed is provided along with information regarding the employed data management system.

  13. New DMSP database of precipitating auroral electrons and ions

    NASA Astrophysics Data System (ADS)

    Redmon, Robert J.; Denig, William F.; Kilcommons, Liam M.; Knipp, Delores J.

    2017-08-01

    Since the mid-1970s, the Defense Meteorological Satellite Program (DMSP) spacecraft have operated instruments for monitoring the space environment from low Earth orbit. As the program evolved, so have the measurement capabilities such that modern DMSP spacecraft include a comprehensive suite of instruments providing estimates of precipitating electron and ion fluxes, cold/bulk plasma composition and moments, the geomagnetic field, and optical emissions in the far and extreme ultraviolet. We describe the creation of a new public database of precipitating electrons and ions from the Special Sensor J (SSJ) instrument, complete with original counts, calibrated differential fluxes adjusted for penetrating radiation, estimates of the total kinetic energy flux and characteristic energy, uncertainty estimates, and accurate ephemerides. These are provided in a common and self-describing format that covers 30+ years of DMSP spacecraft from F06 (launched in 1982) to F18 (launched in 2009). This new database is accessible at the National Centers for Environmental Information and the Coordinated Data Analysis Web. We describe how the new database is being applied to high-latitude studies of the colocation of kinetic and electromagnetic energy inputs, ionospheric conductivity variability, field-aligned currents, and auroral boundary identification. We anticipate that this new database will support a broad range of space science endeavors from single observatory studies to coordinated system science investigations.

  14. New DMSP Database of Precipitating Auroral Electrons and Ions.

    PubMed

    Redmon, Robert J; Denig, William F; Kilcommons, Liam M; Knipp, Delores J

    2017-08-01

    Since the mid-1970s, the Defense Meteorological Satellite Program (DMSP) spacecraft have operated instruments for monitoring the space environment from low Earth orbit. As the program evolved, so too have the measurement capabilities, such that modern DMSP spacecraft include a comprehensive suite of instruments providing estimates of precipitating electron and ion fluxes, cold/bulk plasma composition and moments, the geomagnetic field, and optical emissions in the far and extreme ultraviolet. We describe the creation of a new public database of precipitating electrons and ions from the Special Sensor J (SSJ) instrument, complete with original counts, calibrated differential fluxes adjusted for penetrating radiation, estimates of the total kinetic energy flux and characteristic energy, uncertainty estimates, and accurate ephemerides. These are provided in a common and self-describing format that covers 30+ years of DMSP spacecraft from F06 (launched in 1982) through F18 (launched in 2009). This new database is accessible at the National Centers for Environmental Information (NCEI) and the Coordinated Data Analysis Web (CDAWeb). We describe how the new database is being applied to high-latitude studies of the co-location of kinetic and electromagnetic energy inputs, ionospheric conductivity variability, field-aligned currents, and auroral boundary identification. We anticipate that this new database will support a broad range of space science endeavors from single observatory studies to coordinated system science investigations.

  15. Construction and validation of a population-based bone densitometry database.

    PubMed

    Leslie, William D; Caetano, Patricia A; Macwilliam, Leonard R; Finlayson, Gregory S

    2005-01-01

    Utilization of dual-energy X-ray absorptiometry (DXA) for the initial diagnostic assessment of osteoporosis and in monitoring treatment has risen dramatically in recent years. Population-based studies of the impact of DXA and osteoporosis remain challenging because of incomplete and fragmented test data that exist in most regions. Our aim was to create and assess completeness of a database of all clinical DXA services and test results for the province of Manitoba, Canada and to present descriptive data resulting from testing. A regionally based bone density program for the province of Manitoba, Canada was established in 1997. Subsequent DXA services were prospectively captured in a program database. This database was retrospectively populated with earlier DXA results dating back to 1990 (the year that the first DXA scanner was installed) by integrating multiple data sources. A random chart audit was performed to assess completeness and accuracy of this dataset. For comparison, testing rates determined from the DXA database were compared with physician administrative claims data. There was a high level of completeness of this database (>99%) and accurate personal identifier information sufficient for linkage with other health care administrative data (>99%). This contrasted with physician billing data that were found to be markedly incomplete. Descriptive data provide a profile of individuals receiving DXA and their test results. In conclusion, the Manitoba bone density database has great potential as a resource for clinical and health policy research because it is population based with a high level of completeness and accuracy.

  16. Crowdsourcing-Assisted Radio Environment Database for V2V Communication.

    PubMed

    Katagiri, Keita; Sato, Koya; Fujii, Takeo

    2018-04-12

    To realize reliable Vehicle-to-Vehicle (V2V) communication systems for autonomous driving, the recognition of radio propagation becomes an important technology. However, in current wireless distributed network systems, it is difficult to accurately estimate the radio propagation characteristics because of the locality of the radio propagation caused by surrounding buildings and geographical features. In this paper, we propose a measurement-based radio environment database for improving the accuracy of radio environment estimation in V2V communication systems. The database first gathers measurement datasets of the received signal strength indicator (RSSI) related to the transmission/reception locations from V2V systems. Using these datasets, average received power maps linked with transmitter and receiver locations are generated. We have performed measurement campaigns of V2V communications in a real environment to observe RSSI for the database construction. Our results show that the proposed method estimates radio propagation more accurately than the conventional path loss model-based estimation.
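
    The map-building step described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the grid size, the function names (`cell`, `build_power_map`, `estimate`), and the choice to average directly in dB are all assumptions made for the example.

```python
from collections import defaultdict


def cell(position, size_m):
    """Map an (x, y) position in metres to an integer grid-cell index."""
    x, y = position
    return (int(x // size_m), int(y // size_m))


def build_power_map(records, size_m=10.0):
    """Average RSSI (dBm) per (transmitter cell, receiver cell) pair.

    records: iterable of (tx_xy, rx_xy, rssi_dbm) tuples.
    Assumption: values are averaged directly in dB; averaging in
    linear milliwatts would be an equally plausible design.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for tx_xy, rx_xy, rssi_dbm in records:
        key = (cell(tx_xy, size_m), cell(rx_xy, size_m))
        sums[key] += rssi_dbm
        counts[key] += 1
    return {key: sums[key] / counts[key] for key in sums}


def estimate(power_map, tx_xy, rx_xy, size_m=10.0):
    """Return the stored average for a link, or None if that cell pair was never measured."""
    return power_map.get((cell(tx_xy, size_m), cell(rx_xy, size_m)))
```

    A lookup for a new transmitter/receiver pair then returns the average of all earlier measurements that fell into the same pair of grid cells, which is how such a database can capture local effects that a distance-only path loss model misses.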

  17. An ab initio electronic transport database for inorganic materials

    DOE PAGES

    Ricci, Francesco; Chen, Wei; Aydemir, Umut; ...

    2017-07-04

    Electronic transport in materials is governed by a series of tensorial properties such as conductivity, Seebeck coefficient, and effective mass. These quantities are paramount to the understanding of materials in many fields from thermoelectrics to electronics and photovoltaics. Transport properties can be calculated from a material’s band structure using the Boltzmann transport theory framework. We present here the largest computational database of electronic transport properties based on a large set of 48,000 materials originating from the Materials Project database. Our results were obtained through the interpolation approach developed in the BoltzTraP software, assuming a constant relaxation time. We present the workflow to generate the data, the data validation procedure, and the database structure. In conclusion, our aim is to target the large community of scientists developing materials selection strategies and performing studies involving transport properties.

  18. An ab initio electronic transport database for inorganic materials

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Ricci, Francesco; Chen, Wei; Aydemir, Umut

    Electronic transport in materials is governed by a series of tensorial properties such as conductivity, Seebeck coefficient, and effective mass. These quantities are paramount to the understanding of materials in many fields from thermoelectrics to electronics and photovoltaics. Transport properties can be calculated from a material’s band structure using the Boltzmann transport theory framework. We present here the largest computational database of electronic transport properties based on a large set of 48,000 materials originating from the Materials Project database. Our results were obtained through the interpolation approach developed in the BoltzTraP software, assuming a constant relaxation time. We present the workflow to generate the data, the data validation procedure, and the database structure. In conclusion, our aim is to target the large community of scientists developing materials selection strategies and performing studies involving transport properties.

  19. Phylogenomics databases for facilitating functional genomics in rice.

    PubMed

    Jung, Ki-Hong; Cao, Peijian; Sharma, Rita; Jain, Rashmi; Ronald, Pamela C

    2015-12-01

    The completion of the whole genome sequence of rice (Oryza sativa) has significantly accelerated functional genomics studies. Prior to the release of the sequence, only a few genes were assigned a function each year. Since sequencing was completed in 2005, the rate has exponentially increased. As of 2014, 1,021 genes have been described and added to the collection at The Overview of functionally characterized Genes in Rice online database (OGRO). Despite this progress, that number is still very low compared with the total number of genes estimated in the rice genome. One limitation to progress is the presence of functional redundancy among members of the same rice gene family, which covers 51.6% of all non-transposable element-encoding genes. There remains a significant portion of rice genes that are not functionally redundant, as reflected in the recovery of loss-of-function mutants. To more accurately analyze functional redundancy in the rice genome, we have developed phylogenomics databases for six large gene families in rice, including those for glycosyltransferases, glycoside hydrolases, kinases, transcription factors, transporters, and cytochrome P450 monooxygenases. In this review, we introduce key features and applications of these databases. We expect that they will serve as a very useful guide in the post-genomics era of research.

  20. Recent Advances in the GLIMS Glacier Database

    NASA Astrophysics Data System (ADS)

    Raup, Bruce; Cogley, Graham; Zemp, Michael; Glaus, Ladina

    2017-04-01

    Glaciers are shrinking almost without exception. Glacier losses have impacts on local water availability and hazards, and contribute to sea level rise. To understand these impacts and the processes behind them, it is crucial to monitor glaciers through time by mapping their areal extent, changes in volume, elevation distribution, snow lines, ice flow velocities, and changes to associated water bodies. The glacier database of the Global Land Ice Measurements from Space (GLIMS) initiative is the only multi-temporal glacier database capable of tracking all these glacier measurements and providing them to the scientific community and broader public. Here we present recent results in 1) expansion of the geographic and temporal coverage of the GLIMS Glacier Database by drawing on the Randolph Glacier Inventory (RGI) and other new data sets; 2) improved tools for visualizing and downloading GLIMS data in a choice of formats and data models; and 3) a new data model for handling multiple glacier records through time while avoiding double-counting of glacier number or area. The result of this work is a more complete glacier data repository that shows not only the current state of glaciers on Earth, but how they have changed in recent decades. The database is useful for tracking changes in water resources, hazards, and mass budgets of the world's glaciers.

  1. RaftProt: mammalian lipid raft proteome database.

    PubMed

    Shah, Anup; Chen, David; Boda, Akash R; Foster, Leonard J; Davis, Melissa J; Hill, Michelle M

    2015-01-01

    RaftProt (http://lipid-raft-database.di.uq.edu.au/) is a database of mammalian lipid raft-associated proteins as reported in high-throughput mass spectrometry studies. Lipid rafts are specialized membrane microdomains enriched in cholesterol and sphingolipids thought to act as dynamic signalling and sorting platforms. Given their fundamental roles in cellular regulation, there is a plethora of information on the size, composition and regulation of these membrane microdomains, including a large number of proteomics studies. To facilitate the mining and analysis of published lipid raft proteomics studies, we have developed a searchable database, RaftProt. In addition to browsing the studies, performing basic queries by protein and gene names, and searching experiments by cell, tissue and organism, we have implemented several advanced features to facilitate data mining. To address the issue of potential bias due to the biochemical preparation procedures used, we have captured the lipid raft preparation methods and implemented an advanced search option for methodology and sample treatment conditions, such as cholesterol depletion. Furthermore, we have identified a list of high confidence proteins, and enabled searching only from this list of likely bona fide lipid raft proteins. Given the apparent biological importance of lipid rafts and their associated proteins, this database would constitute a key resource for the scientific community. © The Author(s) 2014. Published by Oxford University Press on behalf of Nucleic Acids Research.

  2. Review of Spatial-Database System Usability: Recommendations for the ADDNS Project

    DTIC Science & Technology

    2007-12-01

    basic GIS background information, with a closer look at spatial databases. A GIS is also a computer-based system designed to capture, manage...foundation for deploying enterprise-wide spatial information systems. According to Oracle® [18], it enables accurate delivery of location-based services...Toronto TR 2007-141 Lanter, D.P. (1991). Design of a lineage-based meta-data base for GIS. Cartography and Geographic Information Systems, 18

  3. Accurate proteome-wide protein quantification from high-resolution 15N mass spectra

    PubMed Central

    2011-01-01

    In quantitative mass spectrometry-based proteomics, the metabolic incorporation of a single source of 15N-labeled nitrogen has many advantages over using stable isotope-labeled amino acids. However, the lack of a robust computational framework for analyzing the resulting spectra has impeded wide use of this approach. We have addressed this challenge by introducing a new computational methodology for analyzing 15N spectra in which quantification is integrated with identification. Application of this method to an Escherichia coli growth transition reveals significant improvement in quantification accuracy over previous methods. PMID:22182234

  4. FreeSolv: A database of experimental and calculated hydration free energies, with input files

    PubMed Central

    Mobley, David L.; Guthrie, J. Peter

    2014-01-01

    This work provides a curated database of experimental and calculated hydration free energies for small neutral molecules in water, along with molecular structures, input files, references, and annotations. We call this the Free Solvation Database, or FreeSolv. Experimental values were taken from prior literature and will continue to be curated, with updated experimental references and data added as they become available. Calculated values are based on alchemical free energy calculations using molecular dynamics simulations. These used the GAFF small molecule force field in TIP3P water with AM1-BCC charges. Values were calculated with the GROMACS simulation package, with full details given in references cited within the database itself. This database builds in part on a previous, 504-molecule database containing similar information. However, additional curation of both experimental data and calculated values has been done here, and the total number of molecules is now up to 643. Additional information is now included in the database, such as SMILES strings, PubChem compound IDs, accurate reference DOIs, and others. One version of the database is provided in the Supporting Information of this article, but as ongoing updates are envisioned, the database is now versioned and hosted online. In addition to providing the database, this work describes its construction process. The database is available free-of-charge via http://www.escholarship.org/uc/item/6sd403pz. PMID:24928188

  5. Development of a Searchable Database of Cryoablation Simulations for Use in Treatment Planning

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Boas, F. Edward, E-mail: boasf@mskcc.org; Srimathveeravalli, Govindarajan, E-mail: srimaths@mskcc.org; Durack, Jeremy C., E-mail: durackj@mskcc.org

    Purpose: To create and validate a planning tool for multiple-probe cryoablation, using simulations of ice ball size and shape for various ablation probe configurations, ablation times, and types of tissue ablated. Materials and Methods: Ice ball size and shape were simulated using the Pennes bioheat equation. Five thousand six hundred and seventy different cryoablation procedures were simulated, using 1–6 cryoablation probes and 1–2 cm spacing between probes. The resulting ice ball was measured along three perpendicular axes and recorded in a database. Simulated ice ball sizes were compared to gel experiments (26 measurements) and clinical cryoablation cases (42 measurements). The clinical cryoablation measurements were obtained from a HIPAA-compliant retrospective review of kidney and liver cryoablation procedures between January 2015 and February 2016. Finally, we created a web-based cryoablation planning tool, which uses the cryoablation simulation database to look up the probe spacing and ablation time that produces the desired ice ball shape and dimensions. Results: Average absolute error between the simulated and experimentally measured ice balls was 1 mm in gel experiments and 4 mm in clinical cryoablation cases. The simulations accurately predicted the degree of synergy in multiple-probe ablations. The cryoablation simulation database covers a wide range of ice ball sizes and shapes up to 9.8 cm. Conclusion: Cryoablation simulations accurately predict the ice ball size in multiple-probe ablations. The cryoablation database can be used to plan ablation procedures: given the desired ice ball size and shape, it will find the number and type of probes, probe configuration and spacing, and ablation time required.
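
    For reference, the Pennes bioheat equation named in this abstract is usually written in the following textbook form. The symbols are the conventional ones and are assumptions here; the abstract does not state the authors' exact parameterization, and a freezing simulation would additionally need to account for phase change.

```latex
% Pennes bioheat equation (standard textbook form; symbols assumed, not taken from the abstract)
%   \rho, c, k        : tissue density, specific heat, thermal conductivity
%   \omega_b          : blood perfusion rate
%   \rho_b, c_b       : blood density and specific heat
%   T_a               : arterial blood temperature
%   Q_m               : metabolic heat generation
\rho c \,\frac{\partial T}{\partial t}
  = \nabla \cdot \left( k \,\nabla T \right)
  + \omega_b \,\rho_b \,c_b \left( T_a - T \right)
  + Q_m
```

    The perfusion term acts as a heat source wherever the tissue is colder than arterial blood, which is one reason ice balls in perfused organs tend to be smaller than those in gel.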

  6. Exploring Site-Specific N-Glycosylation Microheterogeneity of Haptoglobin using Glycopeptide CID Tandem Mass Spectra and Glycan Database Search

    PubMed Central

    Chandler, Kevin Brown; Pompach, Petr; Goldman, Radoslav

    2013-01-01

    Glycosylation is a common protein modification with a significant role in many vital cellular processes and human diseases, making the characterization of protein-attached glycan structures important for understanding cell biology and disease processes. Direct analysis of protein N-glycosylation by tandem mass spectrometry of glycopeptides promises site-specific elucidation of N-glycan microheterogeneity, something which detached N-glycan and de-glycosylated peptide analyses cannot provide. However, successful implementation of direct N-glycopeptide analysis by tandem mass spectrometry remains a challenge. In this work, we consider algorithmic techniques for the analysis of LC-MS/MS data acquired from glycopeptide-enriched fractions of enzymatic digests of purified proteins. We implement a computational strategy which takes advantage of the properties of CID fragmentation spectra of N-glycopeptides, matching the MS/MS spectra to peptide-glycan pairs from protein sequences and glycan structure databases. Significantly, we also propose a novel false-discovery-rate estimation technique to estimate and manage the number of false identifications. We use a human glycoprotein standard, haptoglobin, digested with trypsin and GluC, enriched for glycopeptides using HILIC chromatography, and analyzed by LC-MS/MS to demonstrate our algorithmic strategy and evaluate its performance. Our software, GlycoPeptideSearch (GPS), assigned glycopeptide identifications to 246 of the spectra at false-discovery-rate 5.58%, identifying 42 distinct haptoglobin peptide-glycan pairs at each of the four haptoglobin N-linked glycosylation sites. We further demonstrate the effectiveness of this approach by analyzing plasma-derived haptoglobin, identifying 136 N-linked glycopeptide spectra at false-discovery-rate 0.4%, representing 15 distinct glycopeptides on at least three of the four N-linked glycosylation sites. The software, GlycoPeptideSearch, is available for download from http

  7. Mass and Reliability System (MaRS)

    NASA Technical Reports Server (NTRS)

    Barnes, Sarah

    2016-01-01

    The Safety and Mission Assurance (S&MA) Directorate is responsible for mitigating risk, providing system safety, and lowering risk for space programs from ground to space. The S&MA is divided into 4 divisions: The Space Exploration Division (NC), the International Space Station Division (NE), the Safety & Test Operations Division (NS), and the Quality and Flight Equipment Division (NT). The interns, myself and Arun Aruljothi, will be working with the Risk & Reliability Analysis Branch under the NC Division. The mission of this division is to identify, characterize, diminish, and communicate risk by implementing an efficient and effective assurance model. The team utilizes Reliability and Maintainability (R&M) and Probabilistic Risk Assessment (PRA) to ensure decisions concerning risks are informed, vehicles are safe and reliable, and program/project requirements are realistic and realized. This project pertains to the Orion mission, so it is geared toward a long duration Human Space Flight Program(s). For space missions, payload is a critical concept; balancing what hardware can be replaced at the component level versus by Orbital Replacement Units (ORU) or subassemblies is key. For this effort a database was created that combines mass and reliability data, called Mass and Reliability System or MaRS. The U.S. International Space Station (ISS) components are used as reference parts in the MaRS database. Using ISS components as a platform is beneficial because of the historical context and the environmental similarities to a space flight mission. MaRS uses a combination of systems: International Space Station PART for failure data, Vehicle Master Database (VMDB) for ORU & components, Maintenance & Analysis Data Set (MADS) for operation hours and other pertinent data, & Hardware History Retrieval System (HHRS) for unit weights. MaRS is populated using a Visual Basic Application. Once populated, the Excel spreadsheet comprises information on ISS components including

  8. TFBSshape: a motif database for DNA shape features of transcription factor binding sites.

    PubMed

    Yang, Lin; Zhou, Tianyin; Dror, Iris; Mathelier, Anthony; Wasserman, Wyeth W; Gordân, Raluca; Rohs, Remo

    2014-01-01

    Transcription factor binding sites (TFBSs) are most commonly characterized by the nucleotide preferences at each position of the DNA target. Whereas these sequence motifs are quite accurate descriptions of DNA binding specificities of transcription factors (TFs), proteins recognize DNA as a three-dimensional object. DNA structural features refine the description of TF binding specificities and provide mechanistic insights into protein-DNA recognition. Existing motif databases contain extensive nucleotide sequences identified in binding experiments based on their selection by a TF. To utilize DNA shape information when analysing the DNA binding specificities of TFs, we developed a new tool, the TFBSshape database (available at http://rohslab.cmb.usc.edu/TFBSshape/), for calculating DNA structural features from nucleotide sequences provided by motif databases. The TFBSshape database can be used to generate heat maps and quantitative data for DNA structural features (i.e., minor groove width, roll, propeller twist and helix twist) for 739 TF datasets from 23 different species derived from the motif databases JASPAR and UniPROBE. As demonstrated for the basic helix-loop-helix and homeodomain TF families, our TFBSshape database can be used to compare, qualitatively and quantitatively, the DNA binding specificities of closely related TFs and, thus, uncover differential DNA binding specificities that are not apparent from nucleotide sequence alone.

  9. TFBSshape: a motif database for DNA shape features of transcription factor binding sites

    PubMed Central

    Yang, Lin; Zhou, Tianyin; Dror, Iris; Mathelier, Anthony; Wasserman, Wyeth W.; Gordân, Raluca; Rohs, Remo

    2014-01-01

    Transcription factor binding sites (TFBSs) are most commonly characterized by the nucleotide preferences at each position of the DNA target. Whereas these sequence motifs are quite accurate descriptions of DNA binding specificities of transcription factors (TFs), proteins recognize DNA as a three-dimensional object. DNA structural features refine the description of TF binding specificities and provide mechanistic insights into protein–DNA recognition. Existing motif databases contain extensive nucleotide sequences identified in binding experiments based on their selection by a TF. To utilize DNA shape information when analysing the DNA binding specificities of TFs, we developed a new tool, the TFBSshape database (available at http://rohslab.cmb.usc.edu/TFBSshape/), for calculating DNA structural features from nucleotide sequences provided by motif databases. The TFBSshape database can be used to generate heat maps and quantitative data for DNA structural features (i.e., minor groove width, roll, propeller twist and helix twist) for 739 TF datasets from 23 different species derived from the motif databases JASPAR and UniPROBE. As demonstrated for the basic helix-loop-helix and homeodomain TF families, our TFBSshape database can be used to compare, qualitatively and quantitatively, the DNA binding specificities of closely related TFs and, thus, uncover differential DNA binding specificities that are not apparent from nucleotide sequence alone. PMID:24214955

  10. Binary Lenses in OGLE-III EWS Database. Seasons 2002-2003

    NASA Astrophysics Data System (ADS)

    Jaroszynski, M.; Udalski, A.; Kubiak, M.; Szymanski, M.; Pietrzynski, G.; Soszynski, I.; Zebrun, K.; Szewczyk, O.; Wyrzykowski, L.

    2004-06-01

    We present 15 binary lens candidates from the OGLE-III Early Warning System database for seasons 2002-2003. We also found 15 events interpreted as single mass lensing of double sources. The candidates were selected by visual inspection of light curves. Examining the models of binary lenses of this and our previous study (10 caustic crossing events of OGLE-II seasons 1997-1999) we find one case of extreme mass ratio binary (q ≈ 0.005) and the rest in the range 0.1...mass ratios and separations for binary stars.

  11. Doubling down on peptide phosphorylation as a variable mass modification

    USDA-ARS's Scientific Manuscript database

    Some mass spectrometrists believe that searching for variable post-translational modifications like phosphorylation of serine or threonine when using database-search algorithms to interpret peptide tandem mass spectra will increase false positive rates. The basis for this is the premise that the al...

  12. Matrix-assisted Laser Desorption Ionization-Time of Flight Mass Spectrometry (MALDI-TOF MS) Can Precisely Discriminate the Lineages of Listeria monocytogenes and Species of Listeria

    PubMed Central

    Yamamoto, Naomi; Takahashi, Hajime; Tamura, Hiroto

    2016-01-01

    The genetic lineages of Listeria monocytogenes and other species of the genus Listeria are correlated with pathogenesis in humans. Although matrix-assisted laser desorption ionization-time of flight mass spectrometry (MALDI-TOF MS) has become a prevailing tool for rapid and reliable microbial identification, the precise discrimination of Listeria species and lineages remains a crucial issue in clinical settings and for food safety. In this study, we constructed an accurate and reliable MS database to discriminate the lineages of L. monocytogenes and the species of Listeria (L. monocytogenes, L. innocua, L. welshimeri, L. seeligeri, L. ivanovii, L. grayi, and L. rocourtiae) based on the S10-spc-alpha operon gene encoded ribosomal protein mass spectrum (S10-GERMS) proteotyping method, which relies on both genetic information (genomics) and observed MS peaks in MALDI-TOF MS (proteomics). The specific set of eight biomarkers (ribosomal proteins L24, L6, L18, L15, S11, S9, L31 type B, and S16) yielded characteristic MS patterns for the lineages of L. monocytogenes and the different species of Listeria, and led to the construction of an MS database that was successful in discriminating between these organisms in MALDI-TOF MS fingerprinting analysis followed by analysis with the advanced proteotyping software Strain Solution. We also validated the constructed database on the proteotyping software Strain Solution by using 23 Listeria strains collected from natural sources. PMID:27442502

  13. Biological agents database in the armed forces.

    PubMed

    Niemcewicz, Marcin; Kocik, Janusz; Bielecka, Anna; Wierciński, Michał

    2014-10-01

    Rapid detection and identification of the biological agent during both natural and deliberate outbreaks is crucial for implementation of appropriate control measures and procedures in order to mitigate the spread of disease. Determination of pathogen etiology may not only support epidemiological investigation and the safety of human beings, but also enhance forensic efforts in pathogen tracing, collection of evidence and correct inference. The article presents the objectives of the Biological Agents Database, which was developed for the Ministry of National Defense of the Republic of Poland under the European Defence Agency framework. The Biological Agents Database is an electronic catalogue of genetic markers of highly dangerous pathogens and biological agents of weapon of mass destruction concern, which provides full identification of biological threats emerging in Poland and in locations of activity of Polish troops. The Biological Agents Database is a supportive tool used for tracing the origin of biological agents as well as for rapid identification of the agent causing a disease of unknown etiology. It also provides support in diagnosis, analysis, response and exchange of information between institutions that use the information contained in it. Therefore, it can be used not only for military purposes, but also in a civilian environment.

  14. An analytical platform for mass spectrometry-based identification and chemical analysis of RNA in ribonucleoprotein complexes.

    PubMed

    Taoka, Masato; Yamauchi, Yoshio; Nobe, Yuko; Masaki, Shunpei; Nakayama, Hiroshi; Ishikawa, Hideaki; Takahashi, Nobuhiro; Isobe, Toshiaki

    2009-11-01

    We describe here a mass spectrometry (MS)-based analytical platform for RNA, which combines direct nano-flow reversed-phase liquid chromatography (RPLC) on a spray tip column and a high-resolution LTQ-Orbitrap mass spectrometer. Operating RPLC under a very low flow rate with volatile solvents and MS in the negative mode, we could estimate highly accurate mass values sufficient to predict the nucleotide composition of an approximately 21-nucleotide small interfering RNA, detect post-transcriptional modifications in yeast tRNA, and perform collision-induced dissociation/tandem MS-based structural analysis of nucleolytic fragments of RNA at a sub-femtomole level. Importantly, the method allowed the identification and chemical analysis of small RNAs in ribonucleoprotein (RNP) complexes, such as the pre-spliceosomal RNP complex, which was pulled down from cultured cells with a tagged protein cofactor as bait. We have recently developed a unique genome-oriented database search engine, Ariadne, which allows tandem MS-based identification of RNAs in biological samples. Thus, the method presented here has broad potential for automated analysis of RNA; it complements conventional molecular biology-based techniques and is particularly suited for simultaneous analysis of the composition, structure, interaction, and dynamics of RNA and protein components in various cellular RNP complexes.

  15. A robust and accurate numerical method for transcritical turbulent flows at supercritical pressure with an arbitrary equation of state

    NASA Astrophysics Data System (ADS)

    Kawai, Soshi; Terashima, Hiroshi; Negishi, Hideyo

    2015-11-01

    This paper addresses issues in high-fidelity numerical simulations of transcritical turbulent flows at supercritical pressure. The proposed strategy builds on a tabulated look-up table method based on REFPROP database for an accurate estimation of non-linear behaviors of thermodynamic and fluid transport properties at the transcritical conditions. Based on the look-up table method we propose a numerical method that satisfies high-order spatial accuracy, spurious-oscillation-free property, and capability of capturing the abrupt variation in thermodynamic properties across the transcritical contact surface. The method introduces artificial mass diffusivity to the continuity and momentum equations in a physically-consistent manner in order to capture the steep transcritical thermodynamic variations robustly while maintaining spurious-oscillation-free property in the velocity field. The pressure evolution equation is derived from the full compressible Navier-Stokes equations and solved instead of solving the total energy equation to achieve the spurious pressure oscillation free property with an arbitrary equation of state including the present look-up table method. Flow problems with and without physical diffusion are employed for the numerical tests to validate the robustness, accuracy, and consistency of the proposed approach.

  16. A robust and accurate numerical method for transcritical turbulent flows at supercritical pressure with an arbitrary equation of state

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Kawai, Soshi, E-mail: kawai@cfd.mech.tohoku.ac.jp; Terashima, Hiroshi; Negishi, Hideyo

    2015-11-01

    This paper addresses issues in high-fidelity numerical simulations of transcritical turbulent flows at supercritical pressure. The proposed strategy builds on a tabulated look-up table method based on REFPROP database for an accurate estimation of non-linear behaviors of thermodynamic and fluid transport properties at the transcritical conditions. Based on the look-up table method we propose a numerical method that satisfies high-order spatial accuracy, spurious-oscillation-free property, and capability of capturing the abrupt variation in thermodynamic properties across the transcritical contact surface. The method introduces artificial mass diffusivity to the continuity and momentum equations in a physically-consistent manner in order to capture the steep transcritical thermodynamic variations robustly while maintaining spurious-oscillation-free property in the velocity field. The pressure evolution equation is derived from the full compressible Navier–Stokes equations and solved instead of solving the total energy equation to achieve the spurious pressure oscillation free property with an arbitrary equation of state including the present look-up table method. Flow problems with and without physical diffusion are employed for the numerical tests to validate the robustness, accuracy, and consistency of the proposed approach.

  17. A generalised multiple-mass based method for the determination of the live mass of a force transducer

    NASA Astrophysics Data System (ADS)

    Montalvão, Diogo; Baker, Thomas; Ihracska, Balazs; Aulaqi, Muhammad

    2017-01-01

    Many applications in Experimental Modal Analysis (EMA) require that the sensors' masses are known. This is because the added mass from sensors will affect the structure's mode shapes and, in particular, its natural frequencies. EMA requires the measurement of the exciting forces at given coordinates, which is often made using piezoelectric force transducers. In such a case, the live mass of the force transducer, i.e. the mass as 'seen' by the structure in perpendicular directions, must be measured somehow, so that compensation methods like mass cancelation can be performed. This, however, raises the problem of how to obtain an accurate measurement of the live mass. If the system is perfectly calibrated, then a reasonably accurate estimate can be made using a straightforward method available in most classical textbooks based on Newton's second law. However, this is often not the case (for example when the transducer's sensitivity has changed over time, when it is unknown or when the connection influences the transmission of the force). In a self-calibrating iterative method, both the live mass and calibration factor are determined, but this paper shows that the problem may be ill-conditioned, producing misleading results if certain conditions are not met. Therefore, a more robust method is presented and discussed in this paper, reducing the ill-conditioning problems and the need to know the calibration factors beforehand. The three methods will be compared and discussed through numerical and experimental examples, showing that classical EMA is still a field of research that deserves the attention of scientists and engineers.
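
    The textbook estimate mentioned in this abstract rests on Newton's second law, so the ratio of measured force to measured acceleration should behave like a mass. The sketch below is only an illustration of that idea and is not the authors' procedure. It assumes a simple model in which known masses m_i are attached to the transducer and the recorded ratio obeys F_i / a_i = c * (m_live + m_i), with c an unknown scale (calibration) factor. The function name fit_live_mass and all sample values are hypothetical.

```python
def fit_live_mass(added_masses, force_over_accel):
    """Least-squares line fit of F/a against the added mass.

    Assumed model (illustrative only): F_i / a_i = c * (m_live + m_i),
    so the slope equals c and the intercept equals c * m_live.
    Returns the pair (m_live, c).
    """
    n = len(added_masses)
    if n < 2 or n != len(force_over_accel):
        raise ValueError("need at least two paired observations")
    mean_x = sum(added_masses) / n
    mean_y = sum(force_over_accel) / n
    sxx = sum((x - mean_x) ** 2 for x in added_masses)
    if sxx == 0:
        raise ValueError("added masses must not all be equal")
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(added_masses, force_over_accel))
    slope = sxy / sxx                      # estimate of c
    intercept = mean_y - slope * mean_x    # estimate of c * m_live
    return intercept / slope, slope


# Hypothetical, noise-free data: true live mass 0.020 kg, scale factor 1.05.
masses = [0.0, 0.05, 0.10, 0.20]
ratios = [1.05 * (0.020 + m) for m in masses]
m_live, c = fit_live_mass(masses, ratios)
print(f"live mass ~ {m_live:.4f} kg, scale factor ~ {c:.3f}")
```

    With noisy data and a narrow spread of added masses, the intercept of such a fit can become poorly determined, which may be one way the ill-conditioning described in the abstract shows up.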

  18. High-Resolution Metabolomics Assessment of Military Personnel: Evaluating Analytical Strategies for Chemical Detection.

    PubMed

    Liu, Ken H; Walker, Douglas I; Uppal, Karan; Tran, ViLinh; Rohrbeck, Patricia; Mallon, Timothy M; Jones, Dean P

    2016-08-01

    The aim of this study was to maximize detection of serum metabolites with high-resolution metabolomics (HRM). Department of Defense Serum Repository (DoDSR) samples were analyzed using ultrahigh resolution mass spectrometry with three complementary chromatographic phases and four ionization modes. Chemical coverage was evaluated by number of ions detected and accurate mass matches to a human metabolomics database. Individual HRM platforms provided accurate mass matches for up to 58% of the KEGG metabolite database. Combining two analytical methods increased matches to 72% and included metabolites in most major human metabolic pathways and chemical classes. Detection and feature quality varied by analytical configuration. Dual chromatography HRM with positive and negative electrospray ionization provides an effective generalized method for metabolic assessment of military personnel.
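
    Accurate mass matching of the kind described in this abstract can be sketched as a comparison between a measured m/z and the theoretical m/z of each database entry, accepting hits within a parts-per-million tolerance. The example below is a minimal sketch under stated assumptions, not the study's pipeline: it considers only [M+H]+ ions, uses a 5 ppm tolerance chosen for illustration, and searches a two-entry toy list rather than KEGG. The function names and the measured values are hypothetical.

```python
PROTON_MASS = 1.007276  # Da; added to a neutral mass to model an [M+H]+ ion


def ppm_error(measured_mz, theoretical_mz):
    """Signed mass error in parts per million."""
    return (measured_mz - theoretical_mz) / theoretical_mz * 1e6


def match_accurate_mass(measured_mz, database, tol_ppm=5.0):
    """Return (name, ppm error) for each entry whose [M+H]+ m/z is within tol_ppm."""
    hits = []
    for name, neutral_mass in database:
        theoretical_mz = neutral_mass + PROTON_MASS
        err = ppm_error(measured_mz, theoretical_mz)
        if abs(err) <= tol_ppm:
            hits.append((name, err))
    return hits


# Toy database for illustration: (name, neutral monoisotopic mass in Da).
toy_db = [("glucose", 180.06339), ("citric acid", 192.02700)]

hits = match_accurate_mass(181.0707, toy_db)    # hypothetical measured m/z
misses = match_accurate_mass(181.0800, toy_db)  # too far from any entry
print(hits, misses)
```

    A real workflow would also consider other adducts, negative-mode ions and isomers that share the same mass, so a match at this level is a candidate annotation rather than an identification.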

  19. High-resolution metabolomics assessment of military personnel: Evaluating analytical strategies for chemical detection

    PubMed Central

    Liu, Ken H.; Walker, Douglas I.; Uppal, Karan; Tran, ViLinh; Rohrbeck, Patricia; Mallon, Timothy M.; Jones, Dean P.

    2016-01-01

    Objective: To maximize detection of serum metabolites with high-resolution metabolomics (HRM). Methods: Department of Defense Serum Repository (DoDSR) samples were analyzed using ultra-high resolution mass spectrometry with three complementary chromatographic phases and four ionization modes. Chemical coverage was evaluated by number of ions detected and accurate mass matches to a human metabolomics database. Results: Individual HRM platforms provided accurate mass matches for up to 58% of the KEGG metabolite database. Combining two analytical methods increased matches to 72%, and included metabolites in most major human metabolic pathways and chemical classes. Detection and feature quality varied by analytical configuration. Conclusions: Dual chromatography HRM with positive and negative electrospray ionization provides an effective generalized method for metabolic assessment of military personnel. PMID:27501105

  20. Advances in Proteomics Data Analysis and Display Using an Accurate Mass and Time Tag Approach

    PubMed Central

    Zimmer, Jennifer S.D.; Monroe, Matthew E.; Qian, Wei-Jun; Smith, Richard D.

    2007-01-01

    Proteomics has recently demonstrated utility in understanding cellular processes on the molecular level as a component of systems biology approaches and for identifying potential biomarkers of various disease states. The large amount of data generated by utilizing high efficiency (e.g., chromatographic) separations coupled to high mass accuracy mass spectrometry for high-throughput proteomics analyses presents challenges related to data processing, analysis, and display. This review focuses on recent advances in nanoLC-FTICR-MS-based proteomics approaches and the accompanying data processing tools that have been developed to display and interpret the large volumes of data being produced. PMID:16429408

  1. Accurate read-based metagenome characterization using a hierarchical suite of unique signatures

    PubMed Central

    Freitas, Tracey Allen K.; Li, Po-E; Scholz, Matthew B.; Chain, Patrick S. G.

    2015-01-01

    A major challenge in the field of shotgun metagenomics is the accurate identification of organisms present within a microbial community, based on classification of short sequence reads. Though existing microbial community profiling methods have attempted to rapidly classify the millions of reads output from modern sequencers, the combination of incomplete databases, similarity among otherwise divergent genomes, errors and biases in sequencing technologies, and the large volumes of sequencing data required for metagenome sequencing has led to unacceptably high false discovery rates (FDR). Here, we present the application of a novel, gene-independent and signature-based metagenomic taxonomic profiling method with significantly and consistently smaller FDR than any other available method. Our algorithm circumvents false positives using a series of non-redundant signature databases and examines Genomic Origins Through Taxonomic CHAllenge (GOTTCHA). GOTTCHA was tested and validated on 20 synthetic and mock datasets ranging in community composition and complexity, was applied successfully to data generated from spiked environmental and clinical samples, and robustly demonstrates superior performance compared with other available tools. PMID:25765641

  2. Field Validation of Food Service Listings: A Comparison of Commercial and Online Geographic Information System Databases

    PubMed Central

    Seliske, Laura; Pickett, William; Bates, Rebecca; Janssen, Ian

    2012-01-01

    Many studies examining the food retail environment rely on geographic information system (GIS) databases for location information. The purpose of this study was to validate information provided by two GIS databases, comparing the positional accuracy of food service places within a 1 km circular buffer surrounding 34 schools in Ontario, Canada. A commercial database (InfoCanada) and an online database (Yellow Pages) provided the addresses of food service places. Actual locations were measured using a global positioning system (GPS) device. The InfoCanada and Yellow Pages GIS databases provided the locations for 973 and 675 food service places, respectively. Overall, 749 (77.1%) and 595 (88.2%) of these were located in the field. The online database had a higher proportion of food service places found in the field. The GIS locations of 25% of the food service places were located within approximately 15 m of their actual location, 50% were within 25 m, and 75% were within 50 m. This validation study provided a detailed assessment of errors in the measurement of the location of food service places in the two databases. The location information was more accurate for the online database, however, when matching criteria were more conservative, there were no observed differences in error between the databases. PMID:23066385
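
    The positional accuracy reported in this abstract amounts to a distance between each database location and the GPS-measured location, summarized by quartiles. The sketch below shows one way such a comparison could be computed. It is an assumption-laden illustration, not the study's method: it uses the haversine great-circle formula on a spherical Earth, and the coordinate pairs are invented for the example.

```python
import math
import statistics

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in metres (spherical approximation)


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two latitude/longitude points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


# Hypothetical (database, GPS) coordinate pairs for four food service places.
pairs = [
    ((44.23120, -76.48600), (44.23128, -76.48610)),
    ((44.23500, -76.49000), (44.23520, -76.49030)),
    ((44.24000, -76.50000), (44.24010, -76.50005)),
    ((44.22800, -76.48000), (44.22840, -76.48060)),
]

errors_m = [haversine_m(*db, *gps) for db, gps in pairs]
quartiles = statistics.quantiles(errors_m, n=4)  # 25th, 50th and 75th percentiles
print([round(q, 1) for q in quartiles])
```

    Over distances of tens of metres the spherical approximation is usually adequate, although a projected coordinate system would be an equally reasonable choice.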

  3. Field validation of food service listings: a comparison of commercial and online geographic information system databases.

    PubMed

    Seliske, Laura; Pickett, William; Bates, Rebecca; Janssen, Ian

    2012-08-01

    Many studies examining the food retail environment rely on geographic information system (GIS) databases for location information. The purpose of this study was to validate information provided by two GIS databases, comparing the positional accuracy of food service places within a 1 km circular buffer surrounding 34 schools in Ontario, Canada. A commercial database (InfoCanada) and an online database (Yellow Pages) provided the addresses of food service places. Actual locations were measured using a global positioning system (GPS) device. The InfoCanada and Yellow Pages GIS databases provided the locations for 973 and 675 food service places, respectively. Overall, 749 (77.1%) and 595 (88.2%) of these were located in the field. The online database had a higher proportion of food service places found in the field. The GIS locations of 25% of the food service places were located within approximately 15 m of their actual location, 50% were within 25 m, and 75% were within 50 m. This validation study provided a detailed assessment of errors in the measurement of the location of food service places in the two databases. The location information was more accurate for the online database, however, when matching criteria were more conservative, there were no observed differences in error between the databases.

  4. ClassLess: A Comprehensive Database of Young Stellar Objects

    NASA Astrophysics Data System (ADS)

    Hillenbrand, Lynne; Baliber, Nairn

    2015-01-01

    We have designed and constructed a database housing published measurements of Young Stellar Objects (YSOs) within ~1 kpc of the Sun. ClassLess, so called because it includes YSOs in all stages of evolution, is a relational database in which user interaction is conducted via HTML web browsers, queries are performed in scientific language, and all data are linked to the sources of publication. Each star is associated with a cluster (or clusters), and both spatially resolved and unresolved measurements are stored, allowing proper use of data from multiple star systems. With this fully searchable tool, myriad ground- and space-based instruments and surveys across wavelength regimes can be exploited. In addition to primary measurements, the database self consistently calculates and serves higher level data products such as extinction, luminosity, and mass. As a result, searches for young stars with specific physical characteristics can be completed with just a few mouse clicks.

  5. The white dwarf mass-radius relation with Gaia, Hubble and FUSE

    NASA Astrophysics Data System (ADS)

    Joyce, Simon R. G.; Barstow, Martin A.; Casewell, Sarah L.; Holberg, Jay B.; Bond, Howard E.

    2018-04-01

    White dwarfs are becoming useful tools for many areas of astronomy. They can be used as accurate chronometers over Gyr timescales. They are also clues to the history of star formation in our galaxy. Many of these studies require accurate estimates of the mass of the white dwarf. The theoretical mass-radius relation is often invoked to provide these mass estimates. While the theoretical mass-radius relation is well developed, observational tests of this relation show a much larger scatter in the results than expected. High precision observational tests to confirm this relation are required. Gaia is providing distance measurements which will remove one of the main sources of uncertainty affecting most previous observations. We combine Gaia distances with spectra from the Hubble and FUSE satellites to make precise tests of the white dwarf mass-radius relation.

  6. Consensus and conflict cards for metabolic pathway databases

    PubMed Central

    2013-01-01

    Background: The metabolic network of H. sapiens and many other organisms is described in multiple pathway databases. The level of agreement between these descriptions, however, has proven to be low. We can use these different descriptions to our advantage by identifying conflicting information and combining their knowledge into a single, more accurate, and more complete description. This task is, however, far from trivial. Results: We introduce the concept of Consensus and Conflict Cards (C2Cards) to provide concise overviews of what the databases do or do not agree on. Each card is centered at a single gene, EC number or reaction. These three complementary perspectives make it possible to distinguish disagreements on the underlying biology of a metabolic process from differences that can be explained by different decisions on how and in what detail to represent knowledge. As a proof-of-concept, we implemented C2CardsHuman as a web application (http://www.molgenis.org/c2cards) covering five human pathway databases. Conclusions: C2Cards can contribute to ongoing reconciliation efforts by simplifying the identification of consensus and conflicts between pathway databases and lowering the threshold for experts to contribute. Several case studies illustrate the potential of the C2Cards in identifying disagreements on the underlying biology of a metabolic process. The overviews may also point out controversial biological knowledge that should be the subject of further research. Finally, the examples provided emphasize the importance of manual curation and the need for broad community involvement. PMID:23803311

  7. Crux: Rapid Open Source Protein Tandem Mass Spectrometry Analysis

    PubMed Central

    2015-01-01

    Efficiently and accurately analyzing big protein tandem mass spectrometry data sets requires robust software that incorporates state-of-the-art computational, machine learning, and statistical methods. The Crux mass spectrometry analysis software toolkit (http://cruxtoolkit.sourceforge.net) is an open source project that aims to provide users with a cross-platform suite of analysis tools for interpreting protein mass spectrometry data. PMID:25182276

  8. Microorganisms in cryopreserved semen and culture media used in the in vitro production (IVP) of bovine embryos identified by matrix-assisted laser desorption ionization mass spectrometry (MALDI-MS).

    PubMed

    Zampieri, Dávila; Santos, Vanessa G; Braga, Patrícia A C; Ferreira, Christina R; Ballottin, Daniela; Tasic, Ljubica; Basso, Andréa C; Sanches, Bruno V; Pontes, José H F; da Silva, Bárbara Pereira; Garboggini, Fabiana Fantinatti; Eberlin, Marcos N; Tata, Alessandra

    2013-09-01

    Commercial cattle breeders produce their own herd offspring for the dairy and beef market using artificial insemination. The procedure involves sanitary risks associated with the collection and commercialization of the germplasm, and the in vitro production and transfer of the bovine embryos must be monitored by strict health surveillance. To avoid the spreading of infectious diseases, one must rely on using controlled and monitored germplasm, media, and reagents that are guaranteed free of pathogens. In this article, we investigated the use of a new mass spectrometric approach for fast and accurate identification of bacteria and fungi in bovine semen and in culture media employed in the embryo in vitro production process. The microorganisms isolated from samples obtained in a commercial bovine embryo IVP setting were identified in a few minutes by their conserved peptide/protein profile, obtained by applying matrix-assisted laser desorption ionization mass spectrometry (MALDI-MS) and matched against a commercial database. The successful MS identification of the microorganisms was confirmed by DNA amplification and sequencing. Therefore, the MS technique seems to offer a powerful tool for rapid and accurate microorganism identification in semen and culture media samples. Copyright © 2013 Elsevier Inc. All rights reserved.

  9. Toward Phase IV, Populating the WOVOdat Database

    NASA Astrophysics Data System (ADS)

    Ratdomopurbo, A.; Newhall, C. G.; Schwandner, F. M.; Selva, J.; Ueda, H.

    2009-12-01

    One of the challenges for volcanologists is the fact that more and more people are likely to live on volcanic slopes. Information about volcanic activity during unrest should be accurate and rapidly distributed. As unrest may lead to eruption, evacuation may be necessary to minimize damage and casualties. The decision to evacuate people is usually based on the interpretation of monitoring data. Over the past several decades, monitoring volcanoes has used more and more sophisticated instruments. A huge volume of data is collected in order to understand the state of activity and behaviour of a volcano. WOVOdat, the World Organization of Volcano Observatories (WOVO) Database of Volcanic Unrest, will provide context within which scientists can interpret the state of their own volcano, during and between crises. After a decision during the 2000 IAVCEI General Assembly to create WOVOdat, development has passed through several phases: Concept Development (Phase-I in 2000-2002), Database Design (Phase-II, 2003-2006) and Pilot Testing (Phase-III in 2007-2008). For WOVOdat to be operational, there are still two (2) steps to complete, which are: Database Population (Phase-IV) and Enhancement and Maintenance (Phase-V). Since January 2009, the WOVOdat project has been hosted by the Earth Observatory of Singapore for at least a 5-year period. According to the original planning in 2002, this 5-year period will be used for completing Phase-IV. As the WOVOdat design is not yet tested for all types of data, 2009 is still reserved for building the back-end relational database management system (RDBMS) of WOVOdat and testing it with more complex data. Fine-tuning of the WOVOdat’s RDBMS design is being done with each new upload of observatory data. The next and main phase of WOVOdat development will be data population, managing data transfer from multiple observatory formats to WOVOdat format. Data population will depend on two important things, the availability of SQL database in volcano

  10. AUTOMATED DETERMINATION OF PRECURSOR ION, PRODUCT ION, AND NEUTRAL LOSS COMPOSITIONS AND DECONVOLUTION OF COMPOSITE MASS SPECTRA USING ION CORRELATION BASED ON EXACT MASSES AND RELATIVE ISOTOPIC ABUNDANCES

    EPA Science Inventory

    After a dispersive event, rapid determination of elemental compositions of ions in mass spectra is essential for tentatively identifying compounds. A Direct Analysis in Real Time (DART)® ion source interfaced to a JEOL AccuTOF® mass spectrometer provided exact masses accurate to ...

  11. Solid renal masses in adults

    PubMed Central

    Mittal, Mahesh Kumar; Sureka, Binit

    2016-01-01

    With the ever-increasing use of cross-sectional imaging, incidental detection of small solid renal masses has dramatically multiplied. Coincidentally, the number of asymptomatic benign lesions being detected has also increased. The role of radiologists is not only to identify these lesions, but also to go one step further and accurately characterize various renal masses. Earlier detection of small renal cell carcinomas means identifying them at the initial stage, which has an impact on prognosis, patient management and healthcare costs. In this review article we share our experience with the typical and atypical solid renal masses encountered in adults in routine daily practice. PMID:28104933

  12. CBD: a biomarker database for colorectal cancer.

    PubMed

    Zhang, Xueli; Sun, Xiao-Feng; Cao, Yang; Ye, Benchen; Peng, Qiliang; Liu, Xingyun; Shen, Bairong; Zhang, Hong

    2018-01-01

    The colorectal cancer (CRC) biomarker database (CBD) was established based on 870 identified CRC biomarkers and their relevant information from 1115 original articles in PubMed published from 1986 to 2017. In this version of the CBD, CRC biomarker data were collected, sorted, displayed and analysed. With its credible contents, the CBD is a powerful and time-saving tool that provides more comprehensive and accurate information for further CRC biomarker research. The CBD was constructed on a MySQL server. HTML, PHP and JavaScript were used to implement the web interface. Apache was selected as the HTTP server. All of these web operations were implemented under the Windows system. The CBD provides users with information on individual biomarkers, categorized into the biological category, source and application of biomarkers; the experiment methods, results, authors and publication resources; and the research region, the average age of cohort, gender, race, the number of tumours, tumour location and stage. We only collect data from articles with clear and credible results that show the biomarkers are useful in the diagnosis, treatment or prognosis of CRC. The CBD can also provide a professional platform for researchers who are interested in CRC research to communicate, exchange their research ideas and further design high-quality research in CRC. They can submit their new findings to our database via the submission page and communicate with us in the CBD. Database URL: http://sysbio.suda.edu.cn/CBD/.

  13. CBD: a biomarker database for colorectal cancer

    PubMed Central

    Zhang, Xueli; Sun, Xiao-Feng; Ye, Benchen; Peng, Qiliang; Liu, Xingyun; Shen, Bairong; Zhang, Hong

    2018-01-01

    The colorectal cancer (CRC) biomarker database (CBD) was established based on 870 identified CRC biomarkers and their relevant information from 1115 original articles in PubMed published from 1986 to 2017. In this version of the CBD, CRC biomarker data were collected, sorted, displayed and analysed. With its credible contents, the CBD is a powerful and time-saving tool that provides more comprehensive and accurate information for further CRC biomarker research. The CBD was constructed on a MySQL server. HTML, PHP and JavaScript were used to implement the web interface. Apache was selected as the HTTP server. All of these web operations were implemented under the Windows system. The CBD provides users with information on individual biomarkers, categorized into the biological category, source and application of biomarkers; the experiment methods, results, authors and publication resources; and the research region, the average age of cohort, gender, race, the number of tumours, tumour location and stage. We only collect data from articles with clear and credible results that show the biomarkers are useful in the diagnosis, treatment or prognosis of CRC. The CBD can also provide a professional platform for researchers who are interested in CRC research to communicate, exchange their research ideas and further design high-quality research in CRC. They can submit their new findings to our database via the submission page and communicate with us in the CBD. Database URL: http://sysbio.suda.edu.cn/CBD/ PMID:29846545

  14. Advances in Multicollector ICPMS for precise and accurate isotope ratio measurements of Uranium isotopes

    NASA Astrophysics Data System (ADS)

    Bouman, C.; Lloyd, N. S.; Schwieters, J.

    2011-12-01

    The accurate and precise determination of uranium isotopes is challenging, because of the large dynamic range posed by the U isotope abundances and the limited available sample material. Various mass spectrometric techniques are used for the measurement of U isotopes, where TIMS is the most accepted and accurate one. Multicollector inductively coupled plasma mass spectrometry (MC-ICPMS) can offer higher productivity compared to TIMS, but is traditionally limited by low efficiency of sample utilisation. This contribution will discuss progress in MC-ICPMS for detecting 234U, 235U, 236U and 238U in various uranium reference materials from IRMM and NBL. The Thermo Scientific NEPTUNE Plus with Jet Interface offers a modified dry plasma ICP interface using a large interface pump combined with a special set of sample and skimmer cones giving ultimate sensitivity for all elements across the mass range. For uranium, an ion yield of > 3 % was reported previously [1]. The NEPTUNE Plus also offers Multi Ion Counting using discrete dynode electron multipliers as well as two high abundance-sensitivity filters to discriminate against peak tailing effects on 234U and 236U originating from the major uranium beams. These improvements in sensitivity and dynamic range allow accurate measurements of 234U, 235U and 236U abundances on very small samples and at low concentration. In our approach, minor U isotopes 234U and 236U were detected on ion counters with high abundance sensitivity filters, whereas 235U and 238U were detected on Faraday Cups using a high gain current amplifier (10e12 Ohm) for 235U. Precisions and accuracies for 234U and 236U were down to ~1%. For 235U, subpermil levels were reached.

  15. Development of a data entry auditing protocol and quality assurance for a tissue bank database.

    PubMed

    Khushi, Matloob; Carpenter, Jane E; Balleine, Rosemary L; Clarke, Christine L

    2012-03-01

    Human transcription error is an acknowledged risk when extracting information from paper records for entry into a database. For a tissue bank, it is critical that accurate data are provided to researchers with approved access to tissue bank material. The challenges of tissue bank data collection include manual extraction of data from complex medical reports that are accessed from a number of sources and that differ in style and layout. As a quality assurance measure, the Breast Cancer Tissue Bank (http://www.abctb.org.au) has implemented an auditing protocol and, in order to efficiently execute the process, has developed an open source database plug-in tool (eAuditor) to assist in auditing of data held in our tissue bank database. Using eAuditor, we have identified that human entry errors range from 0.01% when entering donor's clinical follow-up details, to 0.53% when entering pathological details, highlighting the importance of an audit protocol tool such as eAuditor in a tissue bank database. eAuditor was developed and tested on the Caisis open source clinical-research database; however, it can be integrated in other databases where similar functionality is required.

  16. Introduction of the American Academy of Facial Plastic and Reconstructive Surgery FACE TO FACE Database.

    PubMed

    Abraham, Manoj T; Rousso, Joseph J; Hu, Shirley; Brown, Ryan F; Moscatello, Augustine L; Finn, J Charles; Patel, Neha A; Kadakia, Sameep P; Wood-Smith, Donald

    2017-07-01

    The American Academy of Facial Plastic and Reconstructive Surgery FACE TO FACE database was created to gather and organize patient data primarily from international humanitarian surgical mission trips, as well as local humanitarian initiatives. Similar to cloud-based Electronic Medical Records, this web-based user-generated database allows for more accurate tracking of provider and patient information and outcomes, regardless of site, and is useful when coordinating follow-up care for patients. The database is particularly useful on international mission trips as there are often different surgeons who may provide care to patients on subsequent missions, and patients who may visit more than 1 mission site. Ultimately, by pooling data across multiple sites and over time, the database has the potential to be a useful resource for population-based studies and outcome data analysis. The objective of this paper is to delineate the process involved in creating the AAFPRS FACE TO FACE database, to assess its functional utility, to draw comparisons to electronic medical records systems that are now widely implemented, and to explain the specific benefits and disadvantages of the use of the database as it was implemented on recent international surgical mission trips.

  17. EDDIX--a database of ionisation double differential cross sections.

    PubMed

    MacGibbon, J H; Emerson, S; Liamsuwan, T; Nikjoo, H

    2011-02-01

    The use of Monte Carlo track structure is a choice method in biophysical modelling and calculations. To precisely model 3D and 4D tracks, the cross section for the ionisation by an incoming ion, double differential in the outgoing electron energy and angle, is required. However, the double differential cross section cannot be theoretically modelled over the full range of parameters. To address this issue, a database of all available experimental data has been constructed. Currently, the database of Experimental Double Differential Ionisation Cross sections (EDDIX) contains over 1200 digitalised experimentally measured datasets from the 1960s to the present, covering all available ion species (hydrogen to uranium) and all available target species. Double differential cross sections are also presented with the aid of an eight-parameter function fitted to the cross sections. The parameters include projectile species and charge, target nuclear charge and atomic mass, projectile atomic mass and energy, electron energy and deflection angle. It is planned to freely distribute EDDIX and make it available to the radiation research community for use in the analytical and numerical modelling of track structure.

  18. USDA food and nutrient databases provide the infrastructure for food and nutrition research, policy, and practice.

    PubMed

    Ahuja, Jaspreet K C; Moshfegh, Alanna J; Holden, Joanne M; Harris, Ellen

    2013-02-01

    The USDA food and nutrient databases provide the basic infrastructure for food and nutrition research, nutrition monitoring, policy, and dietary practice. They have had a long history that goes back to 1892 and are unique, as they are the only databases available in the public domain that perform these functions. There are 4 major food and nutrient databases released by the Beltsville Human Nutrition Research Center (BHNRC), part of the USDA's Agricultural Research Service. These include the USDA National Nutrient Database for Standard Reference, the Dietary Supplement Ingredient Database, the Food and Nutrient Database for Dietary Studies, and the USDA Food Patterns Equivalents Database. The users of the databases are diverse and include federal agencies, the food industry, health professionals, restaurants, software application developers, academia and research organizations, international organizations, and foreign governments, among others. Many of these users have partnered with BHNRC to leverage funds and/or scientific expertise to work toward common goals. The use of the databases has increased tremendously in the past few years, especially the breadth of uses. These new uses of the data are bound to increase with the increased availability of technology and public health emphasis on diet-related measures such as sodium and energy reduction. Hence, continued improvement of the databases is important, so that they can better address these challenges and provide reliable and accurate data.

  19. Database Access Systems.

    ERIC Educational Resources Information Center

    Dalrymple, Prudence W.; Roderer, Nancy K.

    1994-01-01

    Highlights the changes that have occurred from 1987-93 in database access systems. Topics addressed include types of databases, including CD-ROMs; enduser interface; database selection; database access management, including library instruction and use of primary literature; economic issues; database users; the search process; and improving…

  20. New standards for reducing gravity data: The North American gravity database

    USGS Publications Warehouse

    Hinze, W. J.; Aiken, C.; Brozena, J.; Coakley, B.; Dater, D.; Flanagan, G.; Forsberg, R.; Hildenbrand, T.; Keller, Gordon R.; Kellogg, J.; Kucks, R.; Li, X.; Mainville, A.; Morin, R.; Pilkington, M.; Plouff, D.; Ravat, D.; Roman, D.; Urrutia-Fucugauchi, J.; Veronneau, M.; Webring, M.; Winester, D.

    2005-01-01

    The North American gravity database as well as databases from Canada, Mexico, and the United States are being revised to improve their coverage, versatility, and accuracy. An important part of this effort is revising procedures for calculating gravity anomalies, taking into account our enhanced computational power, improved terrain databases and datums, and increased interest in more accurately defining long-wavelength anomaly components. Users of the databases may note minor differences between previous and revised database values as a result of these procedures. Generally, the differences do not impact the interpretation of local anomalies but do improve regional anomaly studies. The most striking revision is the use of the internationally accepted terrestrial ellipsoid for the height datum of gravity stations rather than the conventionally used geoid or sea level. Principal facts of gravity observations and anomalies based on both revised and previous procedures together with germane metadata will be available on an interactive Web-based data system as well as from national agencies and data centers. The use of the revised procedures is encouraged for gravity data reduction because of the widespread use of the global positioning system in gravity fieldwork and the need for increased accuracy and precision of anomalies and consistency with North American and national databases. Anomalies based on the revised standards should be preceded by the adjective "ellipsoidal" to differentiate anomalies calculated using heights with respect to the ellipsoid from those based on conventional elevations referenced to the geoid. © 2005 Society of Exploration Geophysicists. All rights reserved.

  1. Crowdsourcing-Assisted Radio Environment Database for V2V Communication

    PubMed Central

    Katagiri, Keita; Fujii, Takeo

    2018-01-01

    In order to realize reliable Vehicle-to-Vehicle (V2V) communication systems for autonomous driving, the recognition of radio propagation becomes an important technology. However, in the current wireless distributed network systems, it is difficult to accurately estimate the radio propagation characteristics because of the locality of the radio propagation caused by surrounding buildings and geographical features. In this paper, we propose a measurement-based radio environment database for improving the accuracy of the radio environment estimation in the V2V communication systems. The database first gathers measurement datasets of the received signal strength indicator (RSSI) related to the transmission/reception locations from V2V systems. By using the datasets, the average received power maps linked with transmitter and receiver locations are generated. We have performed measurement campaigns of V2V communications in the real environment to observe RSSI for the database construction. Our results show that the proposed method has higher accuracy of the radio propagation estimation than the conventional path loss model-based estimation. PMID:29649174
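
    The map-building step described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the 10 m square mesh, the function names `cell_index` and `build_average_power_map`, and the choice to average in the linear (mW) domain before converting back to dBm are all assumptions made for the example.

```python
from collections import defaultdict
import math


def cell_index(x_m, y_m, mesh_m=10.0):
    """Map a position in metres to a square mesh cell (mesh size is an assumption)."""
    return (math.floor(x_m / mesh_m), math.floor(y_m / mesh_m))


def build_average_power_map(measurements, mesh_m=10.0):
    """Average received power per (transmitter cell, receiver cell) pair.

    measurements: iterable of (tx_xy, rx_xy, rssi_dbm) tuples.
    Averaging is done in the linear domain (assumed), then converted to dBm.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for tx_xy, rx_xy, rssi_dbm in measurements:
        key = (cell_index(*tx_xy, mesh_m), cell_index(*rx_xy, mesh_m))
        sums[key] += 10 ** (rssi_dbm / 10.0)  # dBm -> mW
        counts[key] += 1
    return {key: 10 * math.log10(sums[key] / counts[key]) for key in sums}


# Hypothetical measurements: two fall in the same cell pair, one in another.
samples = [
    ((1.0, 2.0), (33.0, 4.0), -70.0),
    ((5.0, 5.0), (38.0, 9.0), -70.0),
    ((15.0, 2.0), (33.0, 4.0), -80.0),
]
power_map = build_average_power_map(samples)
```

    A lookup for a given transmitter and receiver position then reduces to computing the two cell indices and reading the stored average, which is what lets such a database capture location-specific propagation that a single path loss model would miss.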

  2. "Mr. Database" : Jim Gray and the History of Database Technologies.

    PubMed

    Hanwahr, Nils C

    2017-12-01

    Although the widespread use of the term "Big Data" is comparatively recent, it invokes a phenomenon in the developments of database technology with distinct historical contexts. The database engineer Jim Gray, known as "Mr. Database" in Silicon Valley before his disappearance at sea in 2007, was involved in many of the crucial developments since the 1970s that constitute the foundation of exceedingly large and distributed databases. Jim Gray was involved in the development of relational database systems based on the concepts of Edgar F. Codd at IBM in the 1970s before he went on to develop principles of Transaction Processing that enable the parallel and highly distributed performance of databases today. He was also involved in creating forums for discourse between academia and industry, which influenced industry performance standards as well as database research agendas. As a co-founder of the San Francisco branch of Microsoft Research, Gray increasingly turned toward scientific applications of database technologies, e.g., leading the TerraServer project, an online database of satellite images. Inspired by Vannevar Bush's idea of the memex, Gray laid out his vision of a Personal Memex as well as a World Memex, eventually postulating a new era of data-based scientific discovery termed "Fourth Paradigm Science". This article gives an overview of Gray's contributions to the development of database technology as well as his research agendas and shows that central notions of Big Data have been occupying database engineers for much longer than the actual term has been in use.

  3. Can use of an administrative database improve accuracy of hospital-reported readmission rates?

    PubMed

    Edgerton, James R; Herbert, Morley A; Hamman, Baron L; Ring, W Steves

    2018-05-01

    Readmission rates after cardiac surgery are being used as a quality indicator; they are also being collected by Medicare and are tied to reimbursement. Accurate knowledge of readmission rates may be difficult to achieve because patients may be readmitted to different hospitals. In our area, 81 hospitals share administrative claims data; 28 of these hospitals (from 5 different hospital systems) do cardiac surgery and share Society of Thoracic Surgeons (STS) clinical data. We used these 2 sources to compare the readmissions data for accuracy. A total of 45,539 STS records from January 2008 to December 2016 were matched with the hospital billing data records. Using the index visit as the start date, the billing records were queried for any subsequent in-patient visits for that patient. The billing records included date of readmission and hospital of readmission data and were compared with the data captured in the STS record. We found 1153 (2.5%) patients who had STS records that were marked "No" or "missing," but there were billing records that showed a readmission. The reported STS readmission rate of 4796 (10.5%) underreported the readmission rate by 2.5 actual percentage points. The true rate should have been 13.0%. Actual readmission rate was 23.8% higher than reported by the clinical database. Approximately 36% of readmissions were to a hospital that was a part of a different hospital system. It is important to know accurate readmission rates for quality improvement processes and institutional financial planning. Matching patient records to an administrative database showed that the clinical database may fail to capture many readmissions. Combining data with an administrative database can enhance accuracy of reporting. Copyright © 2017 The American Association for Thoracic Surgery. Published by Elsevier Inc. All rights reserved.
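
    The rates quoted above can be recomputed from the counts in the abstract. The sketch below uses only those three counts; the variable names are illustrative. Working from the raw counts gives about 13.1% for the combined rate and about 24% for the relative increase, slightly different from the abstract's 13.0% and 23.8%, which appear to be derived from the rounded percentages (10.5 + 2.5 and 2.5 / 10.5).

```python
total_records = 45_539  # STS records matched to billing data
reported = 4_796        # readmissions recorded in the STS data
missed = 1_153          # readmissions found only in the billing records


def pct(part, whole):
    """Return part/whole as a percentage."""
    return 100.0 * part / whole


reported_rate = pct(reported, total_records)           # rate per STS data alone
missed_rate = pct(missed, total_records)               # percentage points missed
combined_rate = pct(reported + missed, total_records)  # rate after matching
# How much higher the combined rate is, relative to the reported rate.
relative_increase = 100.0 * (combined_rate - reported_rate) / reported_rate

print(f"reported {reported_rate:.1f}%, missed {missed_rate:.1f} points, "
      f"combined {combined_rate:.1f}%, relative increase {relative_increase:.1f}%")
```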

  4. Mining of high utility-probability sequential patterns from uncertain databases

    PubMed Central

    Zhang, Binbin; Fournier-Viger, Philippe; Li, Ting

    2017-01-01

    High-utility sequential pattern mining (HUSPM) has become an important issue in the field of data mining. Several HUSPM algorithms have been designed to mine high-utility sequential patterns (HUPSPs). They have been applied in several real-life situations such as for consumer behavior analysis and event detection in sensor networks. Nonetheless, most studies on HUSPM have focused on mining HUPSPs in precise data. But in real life, uncertainty is an important factor as data is collected using various types of sensors that are more or less accurate. Hence, data collected in a real-life database can be annotated with existing probabilities. This paper presents a novel pattern mining framework called high utility-probability sequential pattern mining (HUPSPM) for mining high utility-probability sequential patterns (HUPSPs) in uncertain sequence databases. A baseline algorithm with three optional pruning strategies is presented to mine HUPSPs. Moreover, to speed up the mining process, a projection mechanism is designed to create a database projection for each processed sequence, which is smaller than the original database. Thus, the number of unpromising candidates can be greatly reduced, as well as the execution time for mining HUPSPs. Substantial experiments both on real-life and synthetic datasets show that the designed algorithm performs well in terms of runtime, number of candidates, memory usage, and scalability for different minimum utility and minimum probability thresholds. PMID:28742847
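
    To make the two thresholds concrete, the sketch below scores one candidate pattern against a toy uncertain sequence database. It is a simplified illustration under stated assumptions, not the paper's algorithm: each sequence element is a single (item, utility) pair rather than an itemset, each whole sequence carries one existence probability, a pattern's utility in a sequence is taken as the maximum over its occurrences, and its probability is the sum of the probabilities of the sequences that contain it. The function names are hypothetical, and no pruning or projection is shown.

```python
def max_utility(pattern, sequence):
    """Largest total utility over all occurrences of `pattern` as a
    subsequence of `sequence`; None if the pattern does not occur.

    sequence: list of (item, utility) pairs.
    """
    # best[j] holds the best utility for matching the first j pattern items.
    best = [0] + [None] * len(pattern)
    for item, utility in sequence:
        # Go right to left so one element is never matched twice.
        for j in range(len(pattern), 0, -1):
            if item == pattern[j - 1] and best[j - 1] is not None:
                candidate = best[j - 1] + utility
                if best[j] is None or candidate > best[j]:
                    best[j] = candidate
    return best[len(pattern)]


def utility_and_probability(pattern, database):
    """database: list of (sequence, probability) pairs (assumed model)."""
    total_utility, total_probability = 0, 0.0
    for sequence, probability in database:
        u = max_utility(pattern, sequence)
        if u is not None:
            total_utility += u
            total_probability += probability
    return total_utility, total_probability


# Toy database with made-up utilities and probabilities.
toy_db = [
    ([("a", 2), ("b", 3), ("a", 5), ("b", 1)], 0.9),
    ([("b", 4), ("a", 1)], 0.5),
    ([("a", 1), ("c", 2), ("b", 6)], 0.4),
]
utility, probability = utility_and_probability(["a", "b"], toy_db)
```

    Under these assumptions a pattern would be reported only if both totals meet the user-chosen minimum utility and minimum probability thresholds.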

  5. Controlling Mass Transport in Microfluidic Devices

    PubMed Central

    Kuo, Jason S.; Chiu, Daniel T.

    2017-01-01

    Microfluidic platforms offer exquisite capabilities in controlling mass transport for biological studies. In this review, we focus on recent developments in manipulating chemical concentrations at the microscale. Some techniques prevent or accelerate mixing, whereas others shape the concentration gradients of chemical and biological molecules. We also highlight several in vitro biological studies in the areas of organ engineering, cancer, and blood coagulation that have benefited from accurate control of mass transfer. PMID:21456968

  6. Pseudomonas Genome Database: facilitating user-friendly, comprehensive comparisons of microbial genomes.

    PubMed

    Winsor, Geoffrey L; Van Rossum, Thea; Lo, Raymond; Khaira, Bhavjinder; Whiteside, Matthew D; Hancock, Robert E W; Brinkman, Fiona S L

    2009-01-01

    Pseudomonas aeruginosa is a well-studied opportunistic pathogen that is particularly known for its intrinsic antimicrobial resistance, diverse metabolic capacity, and its ability to cause life threatening infections in cystic fibrosis patients. The Pseudomonas Genome Database (http://www.pseudomonas.com) was originally developed as a resource for peer-reviewed, continually updated annotation for the Pseudomonas aeruginosa PAO1 reference strain genome. In order to facilitate cross-strain and cross-species genome comparisons with other Pseudomonas species of importance, we have now expanded the database capabilities to include all Pseudomonas species, and have developed or incorporated methods to facilitate high quality comparative genomics. The database contains robust assessment of orthologs, a novel ortholog clustering method, and incorporates five views of the data at the sequence and annotation levels (Gbrowse, Mauve and custom views) to facilitate genome comparisons. A choice of simple and more flexible user-friendly Boolean search features allows researchers to search and compare annotations or sequences within or between genomes. Other features include more accurate protein subcellular localization predictions and a user-friendly, Boolean searchable log file of updates for the reference strain PAO1. This database aims to continue to provide a high quality, annotated genome resource for the research community and is available under an open source license.

  7. STRAPS v1.0: evaluating a methodology for predicting electron impact ionisation mass spectra for the aerosol mass spectrometer

    NASA Astrophysics Data System (ADS)

    Topping, David O.; Allan, James; Rami Alfarra, M.; Aumont, Bernard

    2017-06-01

    , any internal mechanisms or instrument features impacting on fragmentation are implicitly accounted for in the fitted model. Whilst one might expect a collection of keys specifically designed according to EI fragmentation principles to offer a robust basis, the suitability of a range of commonly available fingerprints is evaluated. Using available fingerprints in isolation, initial results suggest the generic public MACCS fingerprints provide the most accurate trained model when combined with both decision trees and random forests, with median cosine angles of 0.94-0.97 between modelled and measured spectra. There is some sensitivity to choice of fingerprint, but most sensitivity is in choice of regression technique. Support vector machines perform the worst, with median values of 0.78-0.85 and lower ranges approaching 0.4, depending on the fingerprint used. More detailed analysis of modelled versus mass spectra demonstrates important composition-dependent sensitivities on a compound-by-compound basis. This is further demonstrated when we apply the trained methods to a model α-pinene SOA system, using output from the GECKO-A model. This shows that use of a generic fingerprint referred to as FP4 and one designed for vapour pressure predictions (Nanoolal) gives plausible mass spectra, whilst the use of the MACCS keys in isolation performs poorly in this application, demonstrating the need for evaluating model performance against other SOA systems rather than existing laboratory databases on single compounds. Given the limited number of compounds used within the AMS training dataset, it is difficult to prescribe which combination of approach would lead to a robust generic model across all expected compositions. Nonetheless, the study demonstrates the use of a methodology that would be improved with more training data, fingerprints designed explicitly for fragmentation mechanisms occurring within the AMS, and data from additional mixed systems for
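
    The cosine score used above to compare modelled and measured spectra can be sketched as below. This is a minimal illustration that assumes each spectrum is stored as a mapping from m/z channel to intensity; the function name and the example intensities are hypothetical and are not taken from the STRAPS code.

```python
import math


def cosine_similarity(spectrum_a, spectrum_b):
    """Cosine of the angle between two spectra given as {m/z: intensity} dicts.

    Channels missing from one spectrum are treated as zero intensity.
    """
    channels = set(spectrum_a) | set(spectrum_b)
    dot = sum(spectrum_a.get(mz, 0.0) * spectrum_b.get(mz, 0.0) for mz in channels)
    norm_a = math.sqrt(sum(v * v for v in spectrum_a.values()))
    norm_b = math.sqrt(sum(v * v for v in spectrum_b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for an all-zero spectrum")
    return dot / (norm_a * norm_b)


# Made-up intensities at two m/z channels.
measured = {43: 3.0, 44: 4.0}
modelled = {43: 4.0, 44: 3.0}
score = cosine_similarity(measured, modelled)
```

    A score of 1 means the two spectra have identical relative intensities, and a score of 0 means they share no populated channels, so the measure is insensitive to overall signal magnitude.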

  8. [Establishment of database with standard 3D tooth crowns based on 3DS MAX].

    PubMed

    Cheng, Xiaosheng; An, Tao; Liao, Wenhe; Dai, Ning; Yu, Qing; Lu, Peijun

    2009-08-01

    The database with standard 3D tooth crowns has laid the groundwork for dental CAD/CAM system. In this paper, we design the standard tooth crowns in 3DS MAX 9.0 and create a database with these models successfully. Firstly, some key lines are collected from standard tooth pictures. Then we use 3DS MAX 9.0 to design the digital tooth model based on these lines. During the design process, it is important to refer to the standard plaster tooth model. After some tests, the standard tooth models designed with this method are accurate and adaptable; furthermore, it is very easy to perform some operations on the models such as deforming and translating. This method provides a new idea to build the database with standard 3D tooth crowns and a basis for dental CAD/CAM system.

  9. Recent Progress in the Development of Metabolome Databases for Plant Systems Biology

    PubMed Central

    Fukushima, Atsushi; Kusano, Miyako

    2013-01-01

    Metabolomics has grown greatly as a functional genomics tool, and has become an invaluable diagnostic tool for biochemical phenotyping of biological systems. Over the past decades, a number of databases involving information related to mass spectra, compound names and structures, statistical/mathematical models and metabolic pathways, and metabolite profile data have been developed. Such databases complement each other and support efficient growth in this area, although the data resources remain scattered across the World Wide Web. Here, we review available metabolome databases and summarize the present status of development of related tools, particularly focusing on the plant metabolome. Data sharing discussed here will pave way for the robust interpretation of metabolomic data and advances in plant systems biology. PMID:23577015

  10. [Scrotal masses in adulthood].

    PubMed

    Leskinen, Markku; Ala-Lipasti, Mika; Marttila, Timo; Paaso, Ilkka; Raitanen, Mika

    2009-01-01

    GPs often encounter patients with scrotal masses. Most of these are easily diagnosed by means of palpation and transillumination. Scrotal ultrasound is the primary radiological investigation, and should be done in cases of uncertain clinical diagnosis or when a solid mass in the scrotum is suspected. Benign processes such as hydrocele, spermatocele and varicocele are the most common causes for scrotal abnormalities. Symptomatic cases are treated with surgery or percutaneous sclerotherapy/embolization. Malignant testicular tumours, although less frequently seen, need prompt and accurate diagnosis and treatment. Radical orchidectomy is the primary treatment, followed by radiochemotherapy in selected cases.

  11. A tuberculosis biomarker database: the key to novel TB diagnostics.

    PubMed

    Yerlikaya, Seda; Broger, Tobias; MacLean, Emily; Pai, Madhukar; Denkinger, Claudia M

    2017-03-01

    New diagnostic innovations for tuberculosis (TB), including point-of-care solutions, are critical to reach the goals of the End TB Strategy. However, despite decades of research, numerous reports on new biomarker candidates, and significant investment, no well-performing, simple and rapid TB diagnostic test is yet available on the market, and the search for accurate, non-DNA biomarkers remains a priority. To help overcome this 'biomarker pipeline problem', FIND and partners are working on the development of a well-curated and user-friendly TB biomarker database. The web-based database will enable the dynamic tracking of evidence surrounding biomarker candidates in relation to target product profiles (TPPs) for needed TB diagnostics. It will be able to accommodate raw datasets and facilitate the verification of promising biomarker candidates and the identification of novel biomarker combinations. As such, the database will simplify data and knowledge sharing, empower collaboration, help in the coordination of efforts and allocation of resources, streamline the verification and validation of biomarker candidates, and ultimately lead to an accelerated translation into clinically useful tools. Copyright © 2017 The Author(s). Published by Elsevier Ltd. All rights reserved.

  12. Prediction of the $B_c^*$ mass in full lattice QCD.

    PubMed

    Gregory, E B; Davies, C T H; Follana, E; Gamiz, E; Kendall, I D; Lepage, G P; Na, H; Shigemitsu, J; Wong, K Y

    2010-01-15

    By using the highly improved staggered quark formalism to handle charm, strange, and light valence quarks in full lattice QCD, and NRQCD to handle bottom valence quarks, we are able to determine accurately ratios of the B meson vector-pseudoscalar mass splittings, in particular, $[m(B_c^*)-m(B_c)]/[m(B_s^*)-m(B_s)]$. We find this ratio to be 1.15(15), showing the "light" quark mass dependence of this splitting to be very small. Hence we predict $m(B_c^*)=6.330(7)(2)(6)$ GeV, where the first two errors are from the lattice calculation and the third from existing experiment. This is the most accurate prediction of a gold-plated hadron mass from lattice QCD to date.
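
    The step from the ratio to the mass prediction is a rearrangement, written out below. The symbol $R$ is introduced here for the ratio. The abstract does not give the values of $m(B_c)$ or of the splitting $m(B_s^*)-m(B_s)$ that enter the right-hand side, so they are left symbolic; the reference to "existing experiment" suggests that at least one of them is an experimental input, but that reading is an assumption.

```latex
R \equiv \frac{m(B_c^*) - m(B_c)}{m(B_s^*) - m(B_s)} = 1.15(15)
\quad\Longrightarrow\quad
m(B_c^*) = m(B_c) + R\,\bigl[\,m(B_s^*) - m(B_s)\,\bigr]
```

    Because the vector-pseudoscalar splitting is small compared with the meson mass itself, even the 13% uncertainty on $R$ translates into an uncertainty of only a few MeV on $m(B_c^*)$.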

  13. Quantitative aspects of inductively coupled plasma mass spectrometry

    NASA Astrophysics Data System (ADS)

    Bulska, Ewa; Wagner, Barbara

    2016-10-01

    Accurate determination of elements in various kinds of samples is essential for many areas, including environmental science, medicine, as well as industry. Inductively coupled plasma mass spectrometry (ICP-MS) is a powerful tool enabling multi-elemental analysis of numerous matrices with high sensitivity and good precision. Various calibration approaches can be used to perform accurate quantitative measurements by ICP-MS. They include the use of pure standards, matrix-matched standards, or relevant certified reference materials, assuring traceability of the reported results. This review critically evaluates the advantages and limitations of different calibration approaches, which are used in quantitative analyses by ICP-MS. Examples of such analyses are provided. This article is part of the themed issue 'Quantitative mass spectrometry'.

  14. Pathogen Research Databases

    Science.gov Websites

    Hepatitis C Virus (HCV) database project is funded by the Division of Microbiology and Infectious Diseases of the National Institute of Allergies and Infectious Diseases (NIAID). The HCV database project started as a spin-off from the HIV database project. There are two databases for HCV, a sequence database

  15. Establishment of a matrix-assisted laser desorption ionization time-of-flight mass spectrometry database for rapid identification of infectious achlorophyllous green micro-algae of the genus Prototheca.

    PubMed

    Murugaiyan, J; Ahrholdt, J; Kowbel, V; Roesler, U

    2012-05-01

    The possibility of using matrix-assisted laser desorption ionization time-of-flight mass spectrometry (MALDI-TOF MS) for rapid identification of pathogenic and non-pathogenic species of the genus Prototheca has been recently demonstrated. A unique reference database of MALDI-TOF MS profiles for type and reference strains of the six generally accepted Prototheca species was established. The database quality was reinforced after the acquisition of 27 spectra for selected Prototheca strains, with three biological and technical replicates for each of 18 type and reference strains of Prototheca and four strains of Chlorella. This provides reproducible and unique spectra covering a wide m/z range (2000-20 000 Da) for each of the strains used in the present study. The reproducibility of the spectra was further confirmed by employing composite correlation index calculation and main spectra library (MSP) dendrogram creation, available with MALDI Biotyper software. The MSP dendrograms obtained were comparable with the 18S rDNA sequence-based dendrograms. These reference spectra were successfully added to the Bruker database, and the efficiency of identification was evaluated by cross-reference-based and unknown Prototheca identification. It is proposed that the addition of further strains would reinforce the reference spectra library for rapid identification of Prototheca strains to the genus and species/genotype level. © 2011 The Authors. Clinical Microbiology and Infection © 2011 European Society of Clinical Microbiology and Infectious Diseases.

  16. Making proteomics data accessible and reusable: Current state of proteomics databases and repositories

    PubMed Central

    Perez-Riverol, Yasset; Alpi, Emanuele; Wang, Rui; Hermjakob, Henning; Vizcaíno, Juan Antonio

    2015-01-01

    Compared to other data-intensive disciplines such as genomics, public deposition and storage of MS-based proteomics data are still less developed due to, among other reasons, the inherent complexity of the data and the variety of data types and experimental workflows. In order to address this need, several public repositories for MS proteomics experiments have been developed, each with different purposes in mind. The most established resources are the Global Proteome Machine Database (GPMDB), PeptideAtlas, and the PRIDE database. Additionally, there are other useful (in many cases recently developed) resources such as ProteomicsDB, Mass Spectrometry Interactive Virtual Environment (MassIVE), Chorus, MaxQB, PeptideAtlas SRM Experiment Library (PASSEL), Model Organism Protein Expression Database (MOPED), and the Human Proteinpedia. In addition, the ProteomeXchange consortium has been recently developed to enable better integration of public repositories and the coordinated sharing of proteomics information, maximizing its benefit to the scientific community. Here, we will review each of the major proteomics resources independently and some tools that enable the integration, mining and reuse of the data. We will also discuss some of the major challenges and current pitfalls in the integration and sharing of the data. PMID:25158685

  17. Aptamer Database

    PubMed Central

    Lee, Jennifer F.; Hesselberth, Jay R.; Meyers, Lauren Ancel; Ellington, Andrew D.

    2004-01-01

    The aptamer database is designed to contain comprehensive sequence information on aptamers and unnatural ribozymes that have been generated by in vitro selection methods. Such data are not normally collected in ‘natural’ sequence databases, such as GenBank. Besides serving as a storehouse of sequences that may have diagnostic or therapeutic utility, the database serves as a valuable resource for theoretical biologists who describe and explore fitness landscapes. The database is updated monthly and is publicly available at http://aptamer.icmb.utexas.edu/. PMID:14681367

  18. Solving Relational Database Problems with ORDBMS in an Advanced Database Course

    ERIC Educational Resources Information Center

    Wang, Ming

    2011-01-01

    This paper introduces how to use the object-relational database management system (ORDBMS) to solve relational database (RDB) problems in an advanced database course. The purpose of the paper is to provide a guideline for database instructors who desire to incorporate the ORDB technology in their traditional database courses. The paper presents…

  19. The integrated contaminant elution and tracer test toolkit, ICET3, for improved characterization of mass transfer, attenuation, and mass removal

    NASA Astrophysics Data System (ADS)

    Brusseau, Mark L.; Guo, Zhilin

    2018-01-01

    It is evident based on historical data that groundwater contaminant plumes persist at many sites, requiring costly long-term management. High-resolution site-characterization methods are needed to support accurate risk assessments and to select, design, and operate effective remediation operations. Most subsurface characterization methods are generally limited in their ability to provide unambiguous, real-time delineation of specific processes affecting mass-transfer, transformation, and mass removal, and accurate estimation of associated rates. An integrated contaminant elution and tracer test toolkit, comprising a set of local-scale groundwater extraction-and-injection tests, was developed to ameliorate the primary limitations associated with standard characterization methods. The test employs extended groundwater extraction to stress the system and induce hydraulic and concentration gradients. Clean water can be injected, which removes the resident aqueous contaminant mass present in the higher-permeability zones and isolates the test zone from the surrounding plume. This ensures that the concentrations and fluxes measured within the isolated area are directly and predominantly influenced by the local mass-transfer and transformation processes controlling mass removal. A suite of standard and novel tracers can be used to delineate specific mass-transfer and attenuation processes that are active at a given site, and to quantify the associated mass-transfer and transformation rates. The conceptual basis for the test is first presented, followed by an illustrative application based on simulations produced with a 3-D mathematical model and a brief case study application.

  20. Characterizing the genetic structure of a forensic DNA database using a latent variable approach.

    PubMed

    Kruijver, Maarten

    2016-07-01

    Several problems in forensic genetics require a representative model of a forensic DNA database. Obtaining an accurate representation of the offender database can be difficult, since databases typically contain groups of persons with unregistered ethnic origins in unknown proportions. We propose to estimate the allele frequencies of the subpopulations comprising the offender database and their proportions from the database itself using a latent variable approach. We present a model for which parameters can be estimated using the expectation maximization (EM) algorithm. This approach does not rely on relatively small and possibly unrepresentative population surveys, but is driven by the actual genetic composition of the database only. We fit the model to a snapshot of the Dutch offender database (2014), which contains close to 180,000 profiles, and find that three subpopulations suffice to describe a large fraction of the heterogeneity in the database. We demonstrate the utility and reliability of the approach with three applications. First, we use the model to predict the number of false leads obtained in database searches. We assess how well the model predicts the number of false leads obtained in mock searches in the Dutch offender database, both for the case of familial searching for first degree relatives of a donor and searching for contributors to three-person mixtures. Second, we study the degree of partial matching between all pairs of profiles in the Dutch database and compare this to what is predicted using the latent variable approach. Third, we use the model to provide evidence to support that the Dutch practice of estimating match probabilities using the Balding-Nichols formula with a native Dutch reference database and θ=0.03 is conservative. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.
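
    For reference, the Balding-Nichols match probabilities mentioned above can be sketched for a single locus as follows. The formulas are the standard θ-corrected genotype probabilities; the function name, its argument layout, and the example allele frequencies are assumptions for illustration and are not taken from the paper.

```python
def balding_nichols_match_probability(p_i, p_j=None, theta=0.03):
    """Single-locus match probability with the Balding-Nichols theta correction.

    p_i, p_j: allele frequencies in the reference database.
    Pass only p_i for a homozygous genotype, both for a heterozygous one.
    """
    denominator = (1 + theta) * (1 + 2 * theta)
    if p_j is None:
        # Homozygous genotype A_i A_i.
        numerator = (2 * theta + (1 - theta) * p_i) * (3 * theta + (1 - theta) * p_i)
    else:
        # Heterozygous genotype A_i A_j.
        numerator = 2 * (theta + (1 - theta) * p_i) * (theta + (1 - theta) * p_j)
    return numerator / denominator


# Hypothetical allele frequencies of 0.1 and 0.2.
homozygote = balding_nichols_match_probability(0.1)
heterozygote = balding_nichols_match_probability(0.1, 0.2)
# With theta = 0 the formulas reduce to the Hardy-Weinberg values p^2 and 2pq.
hw_homozygote = balding_nichols_match_probability(0.1, theta=0.0)
hw_heterozygote = balding_nichols_match_probability(0.1, 0.2, theta=0.0)
```

    For these example frequencies, θ=0.03 gives larger match probabilities than the uncorrected product rule, which illustrates the sense in which a θ correction is conservative; whether a given θ is conservative for a real database is the empirical question the paper addresses.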

  1. Generalized Database Management System Support for Numeric Database Environments.

    ERIC Educational Resources Information Center

    Dominick, Wayne D.; Weathers, Peggy G.

    1982-01-01

    This overview of potential for utilizing database management systems (DBMS) within numeric database environments highlights: (1) major features, functions, and characteristics of DBMS; (2) applicability to numeric database environment needs and user needs; (3) current applications of DBMS technology; and (4) research-oriented and…

  2. WikiPathways: a multifaceted pathway database bridging metabolomics to other omics research.

    PubMed

    Slenter, Denise N; Kutmon, Martina; Hanspers, Kristina; Riutta, Anders; Windsor, Jacob; Nunes, Nuno; Mélius, Jonathan; Cirillo, Elisa; Coort, Susan L; Digles, Daniela; Ehrhart, Friederike; Giesbertz, Pieter; Kalafati, Marianthi; Martens, Marvin; Miller, Ryan; Nishida, Kozo; Rieswijk, Linda; Waagmeester, Andra; Eijssen, Lars M T; Evelo, Chris T; Pico, Alexander R; Willighagen, Egon L

    2018-01-04

    WikiPathways (wikipathways.org) captures the collective knowledge represented in biological pathways. By providing a database in a curated, machine readable way, omics data analysis and visualization is enabled. WikiPathways and other pathway databases are used to analyze experimental data by research groups in many fields. Due to the open and collaborative nature of the WikiPathways platform, our content keeps growing and is getting more accurate, making WikiPathways a reliable and rich pathway database. Previously, however, the focus was primarily on genes and proteins, leaving many metabolites with only limited annotation. Recent curation efforts focused on improving the annotation of metabolism and metabolic pathways by associating unmapped metabolites with database identifiers and providing more detailed interaction knowledge. Here, we report the outcomes of the continued growth and curation efforts, such as a doubling of the number of annotated metabolite nodes in WikiPathways. Furthermore, we introduce an OpenAPI documentation of our web services and the FAIR (Findable, Accessible, Interoperable and Reusable) annotation of resources to increase the interoperability of the knowledge encoded in these pathways and experimental omics data. New search options, monthly downloads, more links to metabolite databases, and new portals make pathway knowledge more effortlessly accessible to individual researchers and research communities. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.

  3. Maximum heat of mass concrete - phase 2.

    DOT National Transportation Integrated Search

    2016-11-01

    The main findings and recommendations from this study are as follows: (1) The database of adiabatic temperature rise tables developed in this study can be used in the DIANA software for the modeling of mass concrete structures. (2) Clas...

  4. Mass Spectra-Based Framework for Automated Structural Elucidation of Metabolome Data to Explore Phytochemical Diversity

    PubMed Central

    Matsuda, Fumio; Nakabayashi, Ryo; Sawada, Yuji; Suzuki, Makoto; Hirai, Masami Y.; Kanaya, Shigehiko; Saito, Kazuki

    2011-01-01

    A novel framework for automated elucidation of metabolite structures in liquid chromatography–mass spectrometer metabolome data was constructed by integrating databases. High-resolution tandem mass spectra data automatically acquired from each metabolite signal were used for database searches. Three distinct databases, KNApSAcK, ReSpect, and the PRIMe standard compound database, were employed for the structural elucidation. The outputs were retrieved using the CAS metabolite identifier for identification and putative annotation. A simple metabolite ontology system was also introduced to attain putative characterization of the metabolite signals. The automated method was applied for the metabolome data sets obtained from the rosette leaves of 20 Arabidopsis accessions. Phenotypic variations in novel Arabidopsis metabolites among these accessions could be investigated using this method. PMID:22645535

  5. Advances in structure elucidation of small molecules using mass spectrometry

    PubMed Central

    Fiehn, Oliver

    2010-01-01

    The structural elucidation of small molecules using mass spectrometry plays an important role in modern life sciences and bioanalytical approaches. This review covers different soft and hard ionization techniques and figures of merit for modern mass spectrometers, such as mass resolving power, mass accuracy, isotopic abundance accuracy, accurate mass multiple-stage MS(n) capability, as well as hybrid mass spectrometric and orthogonal chromatographic approaches. The latter part discusses mass spectral data handling strategies, which includes background and noise subtraction, adduct formation and detection, charge state determination, accurate mass measurements, elemental composition determinations, and complex data-dependent setups with ion maps and ion trees. The importance of mass spectral library search algorithms for tandem mass spectra and multiple-stage MS(n) mass spectra as well as mass spectral tree libraries that combine multiple-stage mass spectra are outlined. The successive chapter discusses mass spectral fragmentation pathways, biotransformation reactions and drug metabolism studies, the mass spectral simulation and generation of in silico mass spectra, expert systems for mass spectral interpretation, and the use of computational chemistry to explain gas-phase phenomena. A single chapter discusses data handling for hyphenated approaches including mass spectral deconvolution for clean mass spectra, cheminformatics approaches and structure retention relationships, and retention index predictions for gas and liquid chromatography. The last section reviews the current state of electronic data sharing of mass spectra and discusses the importance of software development for the advancement of structure elucidation of small molecules. Electronic supplementary material The online version of this article (doi:10.1007/s12566-010-0015-9) contains supplementary material, which is available to authorized users. PMID:21289855

  6. Globular Cluster Orbits from HST Proper Motions: Constraining the Formation and Mass of the Milky Way Halo

    NASA Astrophysics Data System (ADS)

    Sohn, S. Tony; Van Der Marel, Roeland P.; Deason, Alis J.; Bellini, Andrea; Besla, Gurtina; Watkins, Laura

    2016-06-01

    The globular cluster (GC) system of the Milky Way (MW) provides important information on the MW's present structure and past evolution. GCs in the halo are particularly useful tracers; because of their long dynamical timescales, their orbits retain imprints of their origin or accretion history. Full 3D motions are required to calculate past orbits. While most GCs have known line of sight velocities, accurate proper motion (PM) measurements are currently available for only a few halo GCs. Our goal is to create the first high-quality PM database for halo GCs. We have identified suitable 1st-epoch data in the HST Archive for 20 halo GCs at 10-100 kpc from the Galactic Center. We are in the process of obtaining the necessary 2nd-epoch data to determine absolute PMs of the target GCs through our HST program GO-14235. We will use the same advanced astrometric techniques that allowed us to measure the PMs of M31 and Leo I. Previous studies of the halo GC system based on e.g., stellar populations, metallicities, RR Lyrae properties, and structural properties have revealed a dichotomy between old and young halo GCs. This may reflect distinct formation scenarios (in situ vs. accreted). Orbit calculations based on our PMs will directly test this. The PMs will also yield the best handle yet on the velocity anisotropy profile of any tracer population in the halo. This will resolve the mass-anisotropy degeneracy to provide an improved estimate of the MW mass, which is at present poorly known. In summary, our project will deliver the first accurate PMs for halo GCs, and will significantly increase our understanding of the formation, evolution, and mass of the MW.

  7. Modelling non-adiabatic effects in H_3^+: Solution of the rovibrational Schrödinger equation with motion-dependent masses and mass surfaces

    NASA Astrophysics Data System (ADS)

    Mátyus, Edit; Szidarovszky, Tamás; Császár, Attila G.

    2014-10-01

    Introducing different rotational and vibrational masses in the nuclear-motion Hamiltonian is a simple phenomenological way to model rovibrational non-adiabaticity. It is shown on the example of the molecular ion H_3^+, for which a global adiabatic potential energy surface accurate to better than 0.1 cm^-1 exists [M. Pavanello, L. Adamowicz, A. Alijah, N. F. Zobov, I. I. Mizus, O. L. Polyansky, J. Tennyson, T. Szidarovszky, A. G. Császár, M. Berg et al., Phys. Rev. Lett. 108, 023002 (2012)], that the motion-dependent mass concept yields much more accurate rovibrational energy levels but, unusually, the results are dependent upon the choice of the embedding of the molecule-fixed frame. Correct degeneracies and an improved agreement with experimental data are obtained if an Eckart embedding corresponding to a reference structure of D_3h point-group symmetry is employed. The vibrational mass of the proton in H_3^+ is optimized by minimizing the root-mean-square (rms) deviation between the computed and recent high-accuracy experimental transitions. The best vibrational mass obtained is larger than the nuclear mass of the proton by approximately one third of an electron mass, m_{opt,p}^{(v)} = m_{nuc,p} + 0.31224 m_e. This optimized vibrational mass, along with a nuclear rotational mass, reduces the rms deviation of the experimental and computed rovibrational transitions by an order of magnitude. Finally, it is shown that an extension of the algorithm allowing the use of motion-dependent masses can deal with coordinate-dependent mass surfaces in the rovibrational Hamiltonian, as well.

  8. Accurate Measurements of Aircraft Engine Soot Emissions Using a CAPS PMssa Monitor

    NASA Astrophysics Data System (ADS)

    Onasch, Timothy; Thompson, Kevin; Renbaum-Wolff, Lindsay; Smallwood, Greg; Make-Lye, Richard; Freedman, Andrew

    2016-04-01

    We present results of aircraft engine soot emissions measurements during the VARIAnT2 campaign using CAPS PMssa monitors. VARIAnT2, an aircraft engine non-volatile particulate matter (nvPM) emissions field campaign, was focused on understanding the variability in nvPM mass measurements using different measurement techniques and accounting for possible nvPM sampling system losses. The CAPS PMssa monitor accurately measures both the optical extinction and scattering (and thus single scattering albedo and absorption) of an extracted sample using the same sample volume for both measurements with a time resolution of 1 second and sensitivity of better than 1 Mm^-1. Absorption is obtained by subtracting the scattering signal from the total extinction. Given that the single scattering albedo of the particulates emitted from the aircraft engine measured at both 630 and 660 nm was on the order of 0.1, any inaccuracy in the scattering measurement has little impact on the accuracy of the determined absorption coefficient. The absorption is converted into nvPM mass using a documented Mass Absorption Coefficient (MAC). Results of soot emission indices (mass soot emitted per mass of fuel consumed) for a turbojet engine as a function of engine power will be presented and compared to results obtained using an EC/OC monitor.
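
    The arithmetic described in this abstract (absorption as extinction minus scattering, then conversion to mass via a MAC) can be sketched as follows. The numerical inputs, including the MAC value, are hypothetical and chosen only so that the single scattering albedo is 0.1; they are not measurements from the campaign.

```python
def single_scattering_albedo(extinction, scattering):
    """SSA = scattering / extinction (both in Mm^-1)."""
    return scattering / extinction


def absorption(extinction, scattering):
    """Absorption coefficient in Mm^-1, obtained by difference."""
    return extinction - scattering


def mass_concentration(absorption_coeff, mac):
    """Mass concentration in ug/m^3.

    absorption_coeff is in Mm^-1 and mac in m^2/g, so
    (1e-6 m^-1) / (m^2/g) = 1e-6 g/m^3 = 1 ug/m^3 per unit ratio.
    """
    return absorption_coeff / mac


# Hypothetical inputs, for illustration only.
ext, scat, mac = 100.0, 10.0, 7.5
print(single_scattering_albedo(ext, scat))
print(mass_concentration(absorption(ext, scat), mac))

# A 10% error in the scattering signal changes the absorption by about 1%,
# because scattering is only a small part of the extinction at SSA ~ 0.1.
biased = absorption(ext, scat * 1.10)
print(abs(biased - absorption(ext, scat)) / absorption(ext, scat))
```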

  9. MetAlign: interface-driven, versatile metabolomics tool for hyphenated full-scan mass spectrometry data preprocessing.

    PubMed

    Lommen, Arjen

    2009-04-15

    Hyphenated full-scan MS technology creates large amounts of data. A versatile, easy-to-handle automation tool aiding in the data analysis is very important in handling such a data stream. MetAlign software, as described in this manuscript, handles a broad range of accurate mass and nominal mass GC/MS and LC/MS data. It is capable of automatic format conversions, accurate mass calculations, baseline corrections, peak-picking, saturation and mass-peak artifact filtering, as well as alignment of up to 1000 data sets. A 100 to 1000-fold data reduction is achieved. MetAlign software output is compatible with most multivariate statistics programs.

  10. Mass Detection in Mammographic Images Using Wavelet Processing and Adaptive Threshold Technique.

    PubMed

    Vikhe, P S; Thool, V R

    2016-04-01

    Detection of masses in mammograms for early diagnosis of breast cancer is a significant task in the reduction of the mortality rate. However, in some cases, screening for masses is a difficult task for the radiologist, due to variation in contrast, fuzzy edges and noisy mammograms. Masses and micro-calcifications are the distinctive signs for diagnosis of breast cancer. This paper presents a method for mass enhancement in mammographic images using a piecewise linear operator in combination with wavelet processing. The method includes artifact suppression and pectoral muscle removal based on morphological operations. Finally, mass segmentation for detection using an adaptive threshold technique is carried out to separate the mass from the background. The proposed method has been tested on 130 (45 + 85) images with 90.9 and 91 % True Positive Fraction (TPF) at 2.35 and 2.1 average False Positives Per Image (FP/I) from two different databases, namely Mammographic Image Analysis Society (MIAS) and Digital Database for Screening Mammography (DDSM). The obtained results show that the proposed technique gives improved diagnosis in early breast cancer detection.

  11. Extracting Databases from Dark Data with DeepDive

    PubMed Central

    Zhang, Ce; Shin, Jaeho; Ré, Christopher; Cafarella, Michael; Niu, Feng

    2016-01-01

    DeepDive is a system for extracting relational databases from dark data: the mass of text, tables, and images that are widely collected and stored but which cannot be exploited by standard relational tools. If the information in dark data — scientific papers, Web classified ads, customer service notes, and so on — were instead in a relational database, it would give analysts a massive and valuable new set of “big data.” DeepDive is distinctive when compared to previous information extraction systems in its ability to obtain very high precision and recall at reasonable engineering cost; in a number of applications, we have used DeepDive to create databases with accuracy that meets that of human annotators. To date we have successfully deployed DeepDive to create data-centric applications for insurance, materials science, genomics, paleontologists, law enforcement, and others. The data unlocked by DeepDive represents a massive opportunity for industry, government, and scientific researchers. DeepDive is enabled by an unusual design that combines large-scale probabilistic inference with a novel developer interaction cycle. This design is enabled by several core innovations around probabilistic training and inference. PMID:28316365

  12. Extracting Databases from Dark Data with DeepDive.

    PubMed

    Zhang, Ce; Shin, Jaeho; Ré, Christopher; Cafarella, Michael; Niu, Feng

    2016-01-01

    DeepDive is a system for extracting relational databases from dark data: the mass of text, tables, and images that are widely collected and stored but which cannot be exploited by standard relational tools. If the information in dark data - scientific papers, Web classified ads, customer service notes, and so on - were instead in a relational database, it would give analysts a massive and valuable new set of "big data." DeepDive is distinctive when compared to previous information extraction systems in its ability to obtain very high precision and recall at reasonable engineering cost; in a number of applications, we have used DeepDive to create databases with accuracy that meets that of human annotators. To date we have successfully deployed DeepDive to create data-centric applications for insurance, materials science, genomics, paleontologists, law enforcement, and others. The data unlocked by DeepDive represents a massive opportunity for industry, government, and scientific researchers. DeepDive is enabled by an unusual design that combines large-scale probabilistic inference with a novel developer interaction cycle. This design is enabled by several core innovations around probabilistic training and inference.

  13. MOCCA-SURVEY Database. I. Eccentric Black Hole Mergers during Binary–Single Interactions in Globular Clusters

    NASA Astrophysics Data System (ADS)

    Samsing, Johan; Askar, Abbas; Giersz, Mirek

    2018-03-01

    We estimate the population of eccentric gravitational wave (GW) binary black hole (BBH) mergers forming during binary–single interactions in globular clusters (GCs), using ∼800 GC models that were evolved using the MOCCA code for star cluster simulations as part of the MOCCA-Survey Database I project. By re-simulating BH binary–single interactions extracted from this set of GC models using an N-body code that includes GW emission at the 2.5 post-Newtonian level, we find that ∼10% of all the BBHs assembled in our GC models that merge at present time form during chaotic binary–single interactions, and that about half of this sample have an eccentricity >0.1 at 10 Hz. We explicitly show that this derived rate of eccentric mergers is ∼100 times higher than one would find with a purely Newtonian N-body code. Furthermore, we demonstrate that the eccentric fraction can be accurately estimated using a simple analytical formalism when the interacting BHs are of similar mass, a result that serves as the first successful analytical description of eccentric GW mergers forming during three-body interactions in realistic GCs.

  14. Database Administrator

    ERIC Educational Resources Information Center

    Moore, Pam

    2010-01-01

    The Internet and electronic commerce (e-commerce) generate lots of data. Data must be stored, organized, and managed. Database administrators, or DBAs, work with database software to find ways to do this. They identify user needs, set up computer databases, and test systems. They ensure that systems perform as they should and add people to the…

  15. FIREMON Database

    Treesearch

    John F. Caratti

    2006-01-01

    The FIREMON database software allows users to enter data, store, analyze, and summarize plot data, photos, and related documents. The FIREMON database software consists of a Java application and a Microsoft® Access database. The Java application provides the user interface with FIREMON data through data entry forms, data summary reports, and other data management tools...

  16. Precision Timing of PSR J0437-4715: An Accurate Pulsar Distance, a High Pulsar Mass, and a Limit on the Variation of Newton's Gravitational Constant

    NASA Astrophysics Data System (ADS)

    Verbiest, J. P. W.; Bailes, M.; van Straten, W.; Hobbs, G. B.; Edwards, R. T.; Manchester, R. N.; Bhat, N. D. R.; Sarkissian, J. M.; Jacoby, B. A.; Kulkarni, S. R.

    2008-05-01

    Analysis of 10 years of high-precision timing data on the millisecond pulsar PSR J0437-4715 has resulted in a model-independent kinematic distance based on an apparent orbital period derivative, Ṗ_b, determined at the 1.5% level of precision (D_k = 157.0 ± 2.4 pc), making it one of the most accurate stellar distance estimates published to date. The discrepancy between this measurement and a previously published parallax distance estimate is attributed to errors in the DE200 solar system ephemerides. The precise measurement of Ṗ_b allows a limit on the variation of Newton's gravitational constant, |Ġ/G| ≤ 23 × 10^-12 yr^-1. We also constrain any anomalous acceleration along the line of sight to the pulsar to |a_⊙/c| ≤ 1.5 × 10^-18 s^-1 at 95% confidence, and derive a pulsar mass, m_psr = 1.76 ± 0.20 M_⊙, one of the highest estimates so far obtained.

  17. Contaminant screening of wastewater with HPLC-IM-qTOF-MS and LC+LC-IM-qTOF-MS using a CCS database.

    PubMed

    Stephan, Susanne; Hippler, Joerg; Köhler, Timo; Deeb, Ahmad A; Schmidt, Torsten C; Schmitz, Oliver J

    2016-09-01

    Non-target analysis has become an important tool in the field of water analysis since a broad variety of pollutants from different sources are released to the water cycle. For identification of compounds in such complex samples, liquid chromatography coupled to high resolution mass spectrometry is often used. The introduction of ion mobility spectrometry provides an additional separation dimension and allows determining collision cross sections (CCS) of the analytes as a further physicochemical constant supporting the identification. A CCS database with more than 500 standard substances including drug-like compounds and pesticides was used for CCS database search in this work. A non-target analysis of a wastewater sample was initially performed with high performance liquid chromatography (HPLC) coupled to an ion mobility-quadrupole-time of flight mass spectrometer (IM-qTOF-MS). A database search including exact mass (±5 ppm) and CCS (±1 %) delivered 22 different compounds. Furthermore, the same sample was analyzed with a two-dimensional LC method, called LC+LC, developed in our group for the coupling to IM-qTOF-MS. This four dimensional separation platform revealed 53 different compounds, identified over exact mass and CCS, in the examined wastewater sample. It is demonstrated that the CCS database can also help to distinguish between isobaric structures, exemplified for cyclophosphamide and ifosfamide.
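
    The two-criterion database search described here (exact mass within ±5 ppm and CCS within ±1 %) can be sketched as below. This is a minimal illustration, not the authors' software; the entry names, m/z values and CCS values are hypothetical placeholders, with two entries sharing one m/z to mimic an isobaric pair.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    name: str
    mz: float   # reference m/z of the ion
    ccs: float  # reference collision cross section, in square angstroms


def ppm_error(measured, reference):
    """Relative mass deviation in parts per million."""
    return (measured - reference) / reference * 1e6


def search(db, mz, ccs, ppm_tol=5.0, ccs_tol_pct=1.0):
    """Return names of entries matching both the mass and the CCS window."""
    hits = []
    for entry in db:
        mass_ok = abs(ppm_error(mz, entry.mz)) <= ppm_tol
        ccs_ok = abs(ccs - entry.ccs) / entry.ccs * 100.0 <= ccs_tol_pct
        if mass_ok and ccs_ok:
            hits.append(entry.name)
    return hits


# Hypothetical database: isomer_A and isomer_B share an m/z but differ in CCS.
DB = [
    Entry("isomer_A", 261.0321, 150.0),
    Entry("isomer_B", 261.0321, 155.0),
    Entry("other", 300.1000, 170.0),
]

# Mass alone cannot separate the two isomers; adding the CCS window can.
print(search(DB, 261.0325, 150.8, ccs_tol_pct=100.0))
print(search(DB, 261.0325, 150.8))
```

    Widening the CCS tolerance in the first call effectively reduces the search to a mass-only match, which shows why the second criterion matters for isobaric compounds.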

  18. Modelling baryonic effects on galaxy cluster mass profiles

    NASA Astrophysics Data System (ADS)

    Shirasaki, Masato; Lau, Erwin T.; Nagai, Daisuke

    2018-06-01

    Gravitational lensing is a powerful probe of the mass distribution of galaxy clusters and cosmology. However, accurate measurements of the cluster mass profiles are limited by uncertainties in cluster astrophysics. In this work, we present a physically motivated model of baryonic effects on the cluster mass profiles, which self-consistently takes into account the impact of baryons on the concentration as well as mass accretion histories of galaxy clusters. We calibrate this model using the Omega500 hydrodynamical cosmological simulations of galaxy clusters with varying baryonic physics. Our model will enable us to simultaneously constrain cluster mass, concentration, and cosmological parameters using stacked weak lensing measurements from upcoming optical cluster surveys.

  19. Quantifying the line-of-sight mass distributions for time-delay lenses with stellar masses

    NASA Astrophysics Data System (ADS)

    Rusu, Cristian; Fassnacht, Chris; Treu, Tommaso; Suyu, Sherry; Auger, Matt; Koopmans, Leon; Marshall, Phil; Wong, Kenneth; Collett, Thomas; Agnello, Adriano; Blandford, Roger; Courbin, Frederic; Hilbert, Stefan; Meylan, Georges; Sluse, Dominique

    2014-12-01

    Measuring cosmological parameters with a realistic account of systematic uncertainties is currently one of the principal challenges of physical cosmology. Building on our recent successes with two gravitationally lensed systems, we have started a program to achieve accurate cosmographic measurements from five gravitationally lensed quasars. We aim at measuring H_0 with an accuracy better than 4%, comparable to but independent from measurements by current BAO, SN or Cepheid programs. The largest current contributor to the error budget in our sample is uncertainty about the line-of-sight mass distribution and environment of the lens systems. In this proposal, we request wide-field u-band imaging of the only lens in our sample without already available Spitzer/IRAC observations, B1608+656. The proposed observations are critical for reducing these uncertainties by providing accurate redshifts and in particular stellar masses for galaxies in the light cones of the target lens system. This will establish lensing as a powerful and independent tool for determining cosmography, in preparation for the hundreds of time-delay lenses that will be discovered by future surveys.

  20. Quantitative aspects of inductively coupled plasma mass spectrometry

    PubMed Central

    Wagner, Barbara

    2016-01-01

    Accurate determination of elements in various kinds of samples is essential for many areas, including environmental science, medicine, as well as industry. Inductively coupled plasma mass spectrometry (ICP-MS) is a powerful tool enabling multi-elemental analysis of numerous matrices with high sensitivity and good precision. Various calibration approaches can be used to perform accurate quantitative measurements by ICP-MS. They include the use of pure standards, matrix-matched standards, or relevant certified reference materials, assuring traceability of the reported results. This review critically evaluates the advantages and limitations of different calibration approaches, which are used in quantitative analyses by ICP-MS. Examples of such analyses are provided. This article is part of the themed issue ‘Quantitative mass spectrometry’. PMID:27644971

  1. Airport databases for 3D synthetic-vision flight-guidance displays: database design, quality assessment, and data generation

    NASA Astrophysics Data System (ADS)

    Friedrich, Axel; Raabe, Helmut; Schiefele, Jens; Doerr, Kai Uwe

    1999-07-01

    In future aircraft cockpit designs SVS (Synthetic Vision System) databases will be used to display 3D physical and virtual information to pilots. In contrast to pure warning systems (TAWS, MSAW, EGPWS), SVS serve to enhance pilot spatial awareness through 3-dimensional perspective views of the objects in the environment. Therefore all kinds of aeronautically relevant data have to be integrated into the SVS database: navigation data, terrain data, obstacles and airport data. For the integration of all these data the concept of a GIS (Geographical Information System) based HQDB (High-Quality Database) has been created at the TUD (Technical University Darmstadt). To enable database certification, quality-assessment procedures according to ICAO Annex 4, 11, 14 and 15 and RTCA DO-200A/EUROCAE ED76 were established in the concept. They can be differentiated into object-related quality-assessment methods following the keywords accuracy, resolution, timeliness, traceability, assurance-level, completeness, format, and GIS-related quality-assessment methods with the keywords system-tolerances, logical consistency and visual quality assessment. An airport database is integrated in the concept as part of the High-Quality Database. The contents of the HQDB are chosen so that they support both Flight-Guidance-SVS and other aeronautical applications like SMGCS (Surface Movement and Guidance Systems) and flight simulation as well. Most airport data are not available. Even though data for runways, threshold, taxilines and parking positions were to be generated by the end of 1997 (ICAO Annex 11 and 15), only a few countries fulfilled these requirements. For that reason methods of creating and certifying airport data have to be found. Remote sensing and digital photogrammetry serve as means to acquire large amounts of airport objects with high spatial resolution and accuracy in much shorter time than with classical surveying methods. Remotely sensed images can be acquired from satellite

  2. Using pseudoalignment and base quality to accurately quantify microbial community composition

    PubMed Central

    Novembre, John

    2018-01-01

    Pooled DNA from multiple unknown organisms arises in a variety of contexts, for example microbial samples from ecological or human health research. Determining the composition of pooled samples can be difficult, especially at the scale of modern sequencing data and reference databases. Here we propose a novel method for taxonomic profiling in pooled DNA that combines the speed and low-memory requirements of k-mer based pseudoalignment with a likelihood framework that uses base quality information to better resolve multiply mapped reads. We apply the method to the problem of classifying 16S rRNA reads using a reference database of known organisms, a common challenge in microbiome research. Using simulations, we show the method is accurate across a variety of read lengths, with different length reference sequences, at different sample depths, and when samples contain reads originating from organisms absent from the reference. We also assess performance in real 16S data, where we reanalyze previous genetic association data to show our method discovers a larger number of quantitative trait associations than other widely used methods. We implement our method in the software Karp, for k-mer based analysis of read pools, to provide a novel combination of speed and accuracy that is uniquely suited for enhancing discoveries in microbial studies. PMID:29659582

  3. Mass type-specific sparse representation for mass classification in computer-aided detection on mammograms

    PubMed Central

    2013-01-01

    Background: Breast cancer is a leading cause of cancer incidence and mortality among women. For this reason, much research effort has been devoted to developing Computer-Aided Detection (CAD) systems for early detection of breast cancers on mammograms. In this paper, we propose a novel dictionary configuration underpinning sparse representation based classification (SRC). The key idea of the proposed algorithm is to improve the sparsity in terms of mass margins for the purpose of improving classification performance in CAD systems. Methods: The aim of the proposed SRC framework is to construct separate dictionaries according to the types of mass margins. The underlying idea behind our method is that the separated dictionaries can enhance the sparsity of the mass class (true-positive), leading to an improved performance for differentiating mammographic masses from normal tissues (false-positive). When a mass sample is given for classification, the sparse solutions based on the corresponding dictionaries are separately solved and combined at score level. Experiments have been performed on both the Digital Database for Screening Mammography (DDSM) database (DB) and clinical Full Field Digital Mammogram (FFDM) DBs. In our experiments, sparsity concentration in the true class (SCTC) and area under the Receiver operating characteristic (ROC) curve (AUC) were measured for the comparison between the proposed method and a conventional single dictionary based approach. In addition, a support vector machine (SVM) was used for comparing our method with a state-of-the-art classifier extensively used for mass classification. Results: Compared with the conventional single dictionary configuration, the proposed approach is able to improve SCTC by up to 13.9% and 23.6% on DDSM and FFDM DBs, respectively. Moreover, the proposed method is able to improve AUC by 8.2% and 22.1% on DDSM and FFDM DBs, respectively. Compared to the SVM classifier, the proposed method improves

  4. Systematization of the mass spectra for speciation of inorganic salts with static secondary ion mass spectrometry.

    PubMed

    Van Ham, Rita; Van Vaeck, Luc; Adams, Freddy C; Adriaens, Annemie

    2004-05-01

    The analytical use of mass spectra from static secondary ion mass spectrometry for the molecular identification of inorganic analytes in real life surface layers and microobjects requires an empirical insight in the signals to be expected from a given compound. A comprehensive database comprising over 50 salts has been assembled to complement prior data on oxides. The present study allows the systematic trends in the relationship between the detected signals and molecular composition of the analyte to be delineated. The mass spectra provide diagnostic information by means of atomic ions, structural fragments, molecular ions, and adduct ions of the analyte neutrals. The prediction of mass spectra from a given analyte must account for the charge state of the ions in the salt, the formation of oxide-type neutrals from oxy salts, and the occurrence of oxidation-reduction processes.

  5. Imaging mass spectrometry statistical analysis.

    PubMed

    Jones, Emrys A; Deininger, Sören-Oliver; Hogendoorn, Pancras C W; Deelder, André M; McDonnell, Liam A

    2012-08-30

    Imaging mass spectrometry is increasingly used to identify new candidate biomarkers. This clinical application of imaging mass spectrometry is highly multidisciplinary: expertise in mass spectrometry is necessary to acquire high quality data, histology is required to accurately label the origin of each pixel's mass spectrum, disease biology is necessary to understand the potential meaning of the imaging mass spectrometry results, and statistics to assess the confidence of any findings. Imaging mass spectrometry data analysis is further complicated because of the unique nature of the data (within the mass spectrometry field); several of the assumptions implicit in the analysis of LC-MS/profiling datasets are not applicable to imaging. The very large size of imaging datasets and the reporting of many data analysis routines, combined with inadequate training and accessible reviews, have exacerbated this problem. In this paper we provide an accessible review of the nature of imaging data and the different strategies by which the data may be analyzed. Particular attention is paid to the assumptions of the data analysis routines to ensure that the reader is apprised of their correct usage in imaging mass spectrometry research. Copyright © 2012 Elsevier B.V. All rights reserved.

  6. Comparison of the NCI open database with seven large chemical structural databases.

    PubMed

    Voigt, J H; Bienfait, B; Wang, S; Nicklaus, M C

    2001-01-01

    Eight large chemical databases have been analyzed and compared to each other. Central to this comparison is the open National Cancer Institute (NCI) database, consisting of approximately 250 000 structures. The other databases analyzed are the Available Chemicals Directory ("ACD," from MDL, release 1.99, 3D-version); the ChemACX ("ACX," from CamSoft, Version 4.5); the Maybridge Catalog and the Asinex database (both as distributed by CamSoft as part of ChemInfo 4.5); the Sigma-Aldrich Catalog (CD-ROM, 1999 Version); the World Drug Index ("WDI," Derwent, version 1999.03); and the organic part of the Cambridge Crystallographic Database ("CSD," from Cambridge Crystallographic Data Center, 1999 Version 5.18). The database properties analyzed are internal duplication rates; compounds unique to each database; cumulative occurrence of compounds in an increasing number of databases; overlap of identical compounds between two databases; similarity overlap; diversity; and others. The crystallographic database CSD and the WDI show somewhat less overlap with the other databases than the other databases show with each other. In particular, the collections of commercial compounds and compilations of vendor catalogs have a substantial degree of overlap with each other. Still, no database is completely a subset of any other, and each appears to have its own niche and thus "raison d'être". The NCI database has by far the highest number of compounds that are unique to it. Approximately 200 000 of the NCI structures were not found in any of the other analyzed databases.
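
    Three of the properties listed above (internal duplication rate, compounds unique to each database, and pairwise overlap of identical compounds) reduce to set operations once every structure has a canonical identifier. The following is only a minimal sketch under that assumption; the function name, the identifier strings, and the toy data are hypothetical and are not taken from the study.

```python
from itertools import combinations


def compare_databases(dbs):
    """Compare databases given as {name: [canonical structure id, ...]}.

    Assumes identical compounds share an identical id string
    (a hypothetical canonicalisation step is done beforehand).
    """
    sets = {name: set(ids) for name, ids in dbs.items()}
    report = {}
    for name, ids in dbs.items():
        # Union of every other database, used to find compounds unique to this one.
        others = set().union(*(s for n, s in sets.items() if n != name))
        report[name] = {
            # Fraction of records that repeat a structure already in the same database.
            "duplication_rate": 1 - len(sets[name]) / len(ids) if ids else 0.0,
            "unique": len(sets[name] - others),
        }
    # Number of identical compounds shared by each pair of databases.
    overlap = {
        (a, b): len(sets[a] & sets[b]) for a, b in combinations(sorted(sets), 2)
    }
    return report, overlap


# Toy data only; the ids are placeholders, not real structures.
toy = {"NCI": ["a", "b", "c", "c"], "ACD": ["b", "d"], "WDI": ["d", "e"]}
report, overlap = compare_databases(toy)
print(report)
print(overlap)
```

    Similarity overlap and diversity need a structural similarity measure and are not covered by this sketch.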

  7. A spatio-temporal landslide inventory for the NW of Spain: BAPA database

    NASA Astrophysics Data System (ADS)

    Valenzuela, Pablo; Domínguez-Cuesta, María José; Mora García, Manuel Antonio; Jiménez-Sánchez, Montserrat

    2017-09-01

    A landslide database has been created for the Principality of Asturias, NW Spain: the BAPA (Base de datos de Argayos del Principado de Asturias - Principality of Asturias Landslide Database). Data collection is mainly performed through searching local newspaper archives. Moreover, a BAPA App and a BAPA website (http://geol.uniovi.es/BAPA) have been developed to obtain additional information from citizens and institutions. Presently, the dataset covers the period 1980-2015, recording 2063 individual landslides. The use of free cartographic servers, such as Google Maps, Google Street View and Iberpix (Government of Spain), combined with the spatial descriptions and pictures contained in the press news, makes it possible to assess different levels of spatial accuracy. In the database, 59% of the records show an exact spatial location, and 51% of the records provided accurate dates, showing the usefulness of press archives as temporal records. Thus, 32% of the landslides show the highest spatial and temporal accuracy levels. The database also gathers information about the type and characteristics of the landslides, the triggering factors and the damage and costs caused. Field work was conducted to validate the methodology used in assessing the spatial location, temporal occurrence and characteristics of the landslides.

  8. PrimateLit Database

    Science.gov Websites

    PrimateLit: A bibliographic database for primatology. The PrimateLit database is no longer being ... Resources, National Institutes of Health. The database is a collaborative project of the Wisconsin Primate

  9. A Database of Young Star Clusters for Five Hundred Galaxies

    NASA Astrophysics Data System (ADS)

    Whitmore, Brad

    2009-07-01

    We propose to use the source lists developed as part of the Hubble Legacy Archive (HLA: Data Release 1 - February 8, 2008) to obtain a large (N ~ 50 galaxies for multi-wavelength, N ~ 500 galaxies for ACS F814W), uniform (ACS + WFPC2 + NICMOS: DAOphot used for object detection) database of super star clusters in nearby star-forming galaxies in order to address two fundamental astronomical questions: (1) To what degree is the cluster luminosity (and mass) function of star clusters universal? (2) What fraction of super star clusters are "missing" in optical studies (i.e., are hidden by dust)? This database will also support comparisons with new Monte-Carlo simulations that have independently been developed in the past few years by co-I Larsen and PI Whitmore, and will be used to test the Whitmore, Chandar, Fall (2007) framework designed to understand the demographics of star clusters in all star-forming galaxies. The catalogs will increase the number of galaxies with measured mass and luminosity functions by an order of magnitude, and will provide a powerful new tool for comparative studies, both ours and the community's.

  10. Optimization of breast mass classification using sequential forward floating selection (SFFS) and a support vector machine (SVM) model

    PubMed Central

    Tan, Maxine; Pu, Jiantao; Zheng, Bin

    2014-01-01

    Purpose: Improving radiologists’ performance in classification between malignant and benign breast lesions is important to increase cancer detection sensitivity and reduce false-positive recalls. For this purpose, developing computer-aided diagnosis (CAD) schemes has been attracting research interest in recent years. In this study, we investigated a new feature selection method for the task of breast mass classification. Methods: We initially computed 181 image features based on mass shape, spiculation, contrast, presence of fat or calcifications, texture, isodensity, and other morphological features. From this large image feature pool, we used a sequential forward floating selection (SFFS)-based feature selection method to select relevant features, and analyzed their performance using a support vector machine (SVM) model trained for the classification task. On a database of 600 benign and 600 malignant mass regions of interest (ROIs), we performed the study using a ten-fold cross-validation method. Feature selection and optimization of the SVM parameters were conducted on the training subsets only. Results: The area under the receiver operating characteristic curve (AUC) = 0.805±0.012 was obtained for the classification task. The results also showed that the most frequently-selected features by the SFFS-based algorithm in 10-fold iterations were those related to mass shape, isodensity and presence of fat, which are consistent with the image features frequently used by radiologists in the clinical environment for mass classification. The study also indicated that accurately computing mass spiculation features from the projection mammograms was difficult, and failed to perform well for the mass classification task due to tissue overlap within the benign mass regions. Conclusions: In conclusion, this comprehensive feature analysis study provided new and valuable information for optimizing computerized mass classification schemes that may have potential to be

  11. Detection and quantitation of trace phenolphthalein (in pharmaceutical preparations and in forensic exhibits) by liquid chromatography-tandem mass spectrometry, a sensitive and accurate method.

    PubMed

    Sharma, Kakali; Sharma, Shiba P; Lahiri, Sujit C

    2013-01-01

    Phenolphthalein, an acid-base indicator and laxative, is important as a constituent of widely used weight-reducing multicomponent food formulations. Phenolphthalein is a useful reagent in forensic science for the identification of blood stains of suspected victims and for apprehending erring officials accepting bribes in graft or trap cases. The pink-colored alkaline hand washes originating from the phenolphthalein-smeared notes can easily be determined spectrophotometrically. But in many cases, the colored solution turns colorless with time, which renders the genuineness of bribe cases doubtful to the judiciary. Until now, no method has been known for the detection and identification of phenolphthalein in colorless forensic exhibits with positive proof. Liquid chromatography-tandem mass spectrometry was found to be the most sensitive and accurate method capable of detection and quantitation of trace phenolphthalein in commercial formulations and colorless forensic exhibits with positive proof. The detection limit of phenolphthalein was found to be 1.66 pg/µL (ng/mL), and the calibration curve shows good linearity (r(2) = 0.9974). © 2012 American Academy of Forensic Sciences.

  12. Nuclear ground-state masses and deformations: FRDM(2012)

    DOE PAGES

    Moller, P.; Sierk, A. J.; Ichikawa, T.; ...

    2016-03-25

    Here, we tabulate the atomic mass excesses and binding energies, ground-state shell-plus-pairing corrections, ground-state microscopic corrections, and nuclear ground-state deformations of 9318 nuclei ranging from 16O to A=339. The calculations are based on the finite-range droplet macroscopic and the folded-Yukawa single-particle microscopic nuclear-structure models, which are completely specified. Relative to our FRDM(1992) mass table in Möller et al. (1995), the results are obtained in the same model, but with considerably improved treatment of deformation and fewer of the approximations that were necessary earlier, due to limitations in computer power. The more accurate execution of the model and the more extensive and more accurate experimental mass database now available allow us to determine one additional macroscopic-model parameter, the density-symmetry coefficient L, which was not varied in the previous calculation, but set to zero. Because we now realize that the FRDM is inaccurate for some highly deformed shapes occurring in fission, because some effects are derived in terms of perturbations around a sphere, we only adjust its macroscopic parameters to ground-state masses.

  13. Metabolomic approaches for orange origin discrimination by ultra-high performance liquid chromatography coupled to quadrupole time-of-flight mass spectrometry.

    PubMed

    Díaz, Ramon; Pozo, Oscar J; Sancho, Juan V; Hernández, Félix

    2014-08-15

    In this work, a hybrid quadrupole time-of-flight mass spectrometer (QTOF MS) coupled to ultra-high performance liquid chromatography (UHPLC) has been used to identify biomarkers for the correct authentication of Valencia (Spain) oranges. Differentiation from foreign Argentinean, Brazilian and South African oranges has been carried out by applying the XCMS application and multivariate analysis to UHPLC-(Q)TOF MS data acquired in both positive and negative ionisation modes. Several markers have been found and corroborated by analysing samples from two seasons. A season-independent marker was found and its structure elucidated using accurate mass data and MS(E) fragmentation spectrum information. The empirical formula was searched in the Reaxys database, applying sub-structure filtering from the fragments obtained. Three possible structures were found, and citrusin D, a compound present in sweet oranges, has been identified as the most plausible, as it fits better with the product ion scan performed for this compound. As a result of the data obtained in this work, citrusin D is suggested as a potential marker to distinguish the geographic origin of oranges. Copyright © 2014 Elsevier Ltd. All rights reserved.

  14. Remote visual analysis of large turbulence databases at multiple scales

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Pulido, Jesus; Livescu, Daniel; Kanov, Kalin

    The remote analysis and visualization of raw large turbulence datasets is challenging. Current accurate direct numerical simulations (DNS) of turbulent flows generate datasets with billions of points per time-step and several thousand time-steps per simulation. Until recently, the analysis and visualization of such datasets was restricted to scientists with access to large supercomputers. The public Johns Hopkins Turbulence database simplifies access to multi-terabyte turbulence datasets and facilitates the computation of statistics and extraction of features through the use of commodity hardware. In this paper, we present a framework designed around wavelet-based compression for high-speed visualization of large datasets and methods supporting multi-resolution analysis of turbulence. By integrating common technologies, this framework enables remote access to tools available on supercomputers and over 230 terabytes of DNS data over the Web. Finally, the database toolset is expanded by providing access to exploratory data analysis tools, such as wavelet decomposition capabilities and coherent feature extraction.
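
    To illustrate what wavelet-based compression and wavelet decomposition mean in general, the sketch below applies one level of the orthonormal Haar transform to a 1-D signal and zeroes small detail coefficients. The abstract does not say which wavelet family or thresholding rule the framework uses, so the Haar choice, the hard threshold, and all function names here are assumptions made purely for illustration.

```python
import math

_S = 1 / math.sqrt(2)  # normalisation factor of the orthonormal Haar transform


def haar_step(signal):
    """One decomposition level: returns (approximation, detail); length must be even."""
    pairs = list(zip(signal[0::2], signal[1::2]))
    approx = [(a + b) * _S for a, b in pairs]
    detail = [(a - b) * _S for a, b in pairs]
    return approx, detail


def inverse_haar_step(approx, detail):
    """Rebuild the signal from one level of Haar coefficients."""
    out = []
    for a, d in zip(approx, detail):
        out.extend([(a + d) * _S, (a - d) * _S])
    return out


def compress(signal, threshold):
    """Lossy compression: zero every detail coefficient smaller than the threshold."""
    approx, detail = haar_step(signal)
    kept = [d if abs(d) >= threshold else 0.0 for d in detail]
    return approx, kept


signal = [4.0, 4.0, 2.0, 6.0]
approx, detail = compress(signal, threshold=5.0)
print(inverse_haar_step(approx, detail))  # coarse version: each pair replaced by its mean
```

    Applying `haar_step` repeatedly to the approximation coefficients gives the coarser levels of a multi-resolution decomposition.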

  15. Remote visual analysis of large turbulence databases at multiple scales

    DOE PAGES

    Pulido, Jesus; Livescu, Daniel; Kanov, Kalin; ...

    2018-06-15

    The remote analysis and visualization of raw large turbulence datasets is challenging. Current accurate direct numerical simulations (DNS) of turbulent flows generate datasets with billions of points per time-step and several thousand time-steps per simulation. Until recently, the analysis and visualization of such datasets was restricted to scientists with access to large supercomputers. The public Johns Hopkins Turbulence database simplifies access to multi-terabyte turbulence datasets and facilitates the computation of statistics and extraction of features through the use of commodity hardware. In this paper, we present a framework designed around wavelet-based compression for high-speed visualization of large datasets and methods supporting multi-resolution analysis of turbulence. By integrating common technologies, this framework enables remote access to tools available on supercomputers and over 230 terabytes of DNS data over the Web. Finally, the database toolset is expanded by providing access to exploratory data analysis tools, such as wavelet decomposition capabilities and coherent feature extraction.

  16. Multi-centre evaluation of mass spectrometric identification of anaerobic bacteria using the VITEK® MS system.

    PubMed

    Garner, O; Mochon, A; Branda, J; Burnham, C-A; Bythrow, M; Ferraro, M; Ginocchio, C; Jennemann, R; Manji, R; Procop, G W; Richter, S; Rychert, J; Sercia, L; Westblade, L; Lewinski, M

    2014-04-01

    Accurate and timely identification of anaerobic bacteria is critical to successful treatment. Classic phenotypic methods for identification require long turnaround times and can exhibit poor species level identification. Matrix-assisted laser desorption ionization time-of-flight mass spectrometry (MALDI-TOF MS) is an identification method that can provide rapid identification of anaerobes. We present a multi-centre study assessing the clinical performance of the VITEK® MS in the identification of anaerobic bacteria. Five different test sites analysed a collection of 651 unique anaerobic isolates comprising 11 different genera. Multiple species were included for several of the genera. Briefly, anaerobic isolates were applied directly to a well of a target plate. Matrix solution (α-cyano-4-hydroxycinnamic acid) was added and allowed to dry. Mass spectra results were generated with the VITEK® MS, and the comparative spectral analysis and organism identification were determined using the VITEK® MS database 2.0. Results were confirmed by 16S rRNA gene sequencing. Of the 651 isolates analysed, 91.2% (594/651) exhibited the correct species identification. An additional eight isolates were correctly identified to genus level, raising the rate of identification to 92.5%. Genus-level identification consisted of Actinomyces, Bacteroides and Prevotella species. Fusobacterium nucleatum, Actinomyces neuii and Bacteroides uniformis were notable for an increased percentage of no-identification results compared with the other anaerobes tested. VITEK® MS identification of clinically relevant anaerobes is highly accurate and represents a dramatic improvement over other phenotypic methods in accuracy and turnaround time. © 2013 The Authors Clinical Microbiology and Infection © 2013 European Society of Clinical Microbiology and Infectious Diseases.
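
    The two identification rates quoted above follow directly from the reported counts (651 isolates, 594 correct to species level, 8 more correct to genus level only). The short check below reproduces that arithmetic; the function name is ours and is not part of the study.

```python
def identification_rates(total, species_correct, genus_only):
    """Return (species-level %, species-or-genus-level %), rounded to one decimal."""
    species_rate = 100 * species_correct / total
    combined_rate = 100 * (species_correct + genus_only) / total
    return round(species_rate, 1), round(combined_rate, 1)


# Counts taken from the abstract.
print(identification_rates(651, 594, 8))  # (91.2, 92.5)
```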

  17. Genome databases

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Courteau, J.

    1991-10-11

    Since the Genome Project began several years ago, a plethora of databases have been developed or are in the works. They range from the massive Genome Data Base at Johns Hopkins University, the central repository of all gene mapping information, to small databases focusing on single chromosomes or organisms. Some are publicly available, others are essentially private electronic lab notebooks. Still others limit access to a consortium of researchers working on, say, a single human chromosome. An increasing number incorporate sophisticated search and analytical software, while others operate as little more than data lists. In consultation with numerous experts in the field, a list has been compiled of some key genome-related databases. The list was not limited to map and sequence databases but also included the tools investigators use to interpret and elucidate genetic data, such as protein sequence and protein structure databases. Because a major goal of the Genome Project is to map and sequence the genomes of several experimental animals, including E. coli, yeast, fruit fly, nematode, and mouse, the available databases for those organisms are listed as well. The author also includes several databases that are still under development - including some ambitious efforts that go beyond data compilation to create what are being called electronic research communities, enabling many users, rather than just one or a few curators, to add or edit the data and tag it as raw or confirmed.

  18. A Novel Method for Accurate Operon Predictions in All Sequenced Prokaryotes

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Price, Morgan N.; Huang, Katherine H.; Alm, Eric J.

    2004-12-01

    We combine comparative genomic measures and the distance separating adjacent genes to predict operons in 124 completely sequenced prokaryotic genomes. Our method automatically tailors itself to each genome using sequence information alone, and thus can be applied to any prokaryote. For Escherichia coli K12 and Bacillus subtilis, our method is 85 and 83% accurate, respectively, which is similar to the accuracy of methods that use the same features but are trained on experimentally characterized transcripts. In Halobacterium NRC-1 and in Helicobacter pylori, our method correctly infers that genes in operons are separated by shorter distances than they are in E. coli, and its predictions using distance alone are more accurate than distance-only predictions trained on a database of E. coli transcripts. We use microarray data from six phylogenetically diverse prokaryotes to show that combining intergenic distance with comparative genomic measures further improves accuracy and that our method is broadly effective. Finally, we survey operon structure across 124 genomes, and find several surprises: H. pylori has many operons, contrary to previous reports; Bacillus anthracis has an unusual number of pseudogenes within conserved operons; and Synechocystis PCC6803 has many operons even though it has unusually wide spacings between conserved adjacent genes.

  19. Micro-mass standards to calibrate the sensitivity of mass comparators

    NASA Astrophysics Data System (ADS)

    Madec, Tanguy; Mann, Gaëlle; Meury, Paul-André; Rabault, Thierry

    2007-10-01

    In mass metrology, the standards currently used are calibrated by a chain of comparisons, performed using mass comparators, that extends ultimately from the international prototype (which is the definition of the unit of mass) to the standards in routine use. The differences measured in the course of these comparisons become smaller and smaller as the standards approach the definitions of their units, precisely because of how accurately they have been adjusted. One source of uncertainty in the determination of the difference of mass between the mass compared and the reference mass is the sensitivity error of the comparator used. Unfortunately, in the market there are no mass standards small enough (of the order of a few hundreds of micrograms) for a valid evaluation of this source of uncertainty. The users of these comparators therefore have no choice but to rely on the characteristics claimed by the makers of the comparators, or else to determine this sensitivity error at higher values (at least 1 mg) and interpolate from this result to smaller differences of mass. For this reason, the LNE decided to produce and calibrate micro-mass standards having nominal values between 100 µg and 900 µg. These standards were developed, then tested in multiple comparisons on an A5 type automatic comparator. They have since been qualified and calibrated in a weighing design, repeatedly and over an extended period of time, to establish their stability with respect to oxidation and the harmlessness of the handling and storage procedure associated with their use. Finally, the micro-standards so qualified were used to characterize the sensitivity errors of two of the LNE's mass comparators, including the one used to tie France's Platinum reference standard (Pt 35) to stainless steel and superalloy standards.

  20. Qualitative metabolome analysis of human cerebrospinal fluid by 13C-/12C-isotope dansylation labeling combined with liquid chromatography Fourier transform ion cyclotron resonance mass spectrometry.

    PubMed

    Guo, Kevin; Bamforth, Fiona; Li, Liang

    2011-02-01

    Metabolome analysis of human cerebrospinal fluid (CSF) is challenging because of low abundance of metabolites present in a small volume of sample. We describe and apply a sensitive isotope labeling LC-MS technique for qualitative analysis of the CSF metabolome. After a CSF sample is divided into two aliquots, they are labeled by 13C-dansyl and 12C-dansyl chloride, respectively. The differentially labeled aliquots are then mixed and subjected to LC-MS using Fourier-transform ion cyclotron resonance mass spectrometry (FTICR MS). Dansylation offers significant improvement in the performance of chromatography separation and detection sensitivity. Moreover, peaks detected in the mass spectra can be readily analyzed for ion pair recognition and database search based on accurate mass and/or retention time information. It is shown that about 14,000 features can be detected in a 25-min LC-FTICR MS run of a dansyl-labeled CSF sample, from which about 500 metabolites can be profiled. Results from four CSF samples are compared to gauge the detectability of metabolites by this method. About 261 metabolites are commonly detected in replicate runs of four samples. In total, 1132 unique metabolite ion pairs are detected and 347 pairs (31%) matched with at least one metabolite in the Human Metabolome Database. We also report a dansylation library of 220 standard compounds and, using this library, about 85 metabolites can be positively identified. Among them, 21 metabolites have never been reported to be associated with CSF. These results illustrate that the dansylation LC-FTICR MS method can be used to analyze the CSF metabolome in a more comprehensive manner. © American Society for Mass Spectrometry, 2011
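
    Ion pair recognition and accurate-mass database search can both be expressed as mass comparisons within a parts-per-million (ppm) tolerance. The sketch below is not the authors' software: it assumes singly charged ions carrying a single tag, two 13C atoms per heavy tag, and a 5 ppm tolerance, none of which is stated in the abstract, and the compound names and m/z values are placeholders.

```python
C13_C12_DELTA = 1.0033548  # mass difference between 13C and 12C, in Da


def ppm_error(measured, reference):
    """Signed mass error of `measured` relative to `reference`, in ppm."""
    return (measured - reference) / reference * 1e6


def find_ion_pairs(mz_values, labels_per_tag=2, tol_ppm=5.0):
    """Pair each light peak with a heavy peak at the expected isotope spacing.

    labels_per_tag (number of 13C atoms per tag) is an assumption of this sketch.
    """
    spacing = labels_per_tag * C13_C12_DELTA
    peaks = sorted(mz_values)
    pairs = []
    for i, light in enumerate(peaks):
        expected_heavy = light + spacing
        for heavy in peaks[i + 1:]:
            if abs(ppm_error(heavy, expected_heavy)) <= tol_ppm:
                pairs.append((light, heavy))
    return pairs


def search_database(mz, database, tol_ppm=5.0):
    """Return names of database entries whose reference m/z lies within tol_ppm."""
    return [
        name for name, ref in database.items()
        if abs(ppm_error(mz, ref)) <= tol_ppm
    ]


# Placeholder peak list and database, for illustration only.
peaks = [300.1000, 302.1067096, 305.0000, 310.5000]
database = {"compound_a": 300.1003, "compound_b": 300.2000}
for light, heavy in find_ion_pairs(peaks):
    print(light, heavy, search_database(light, database))
```

    A fuller version would also compare retention times, since both members of a pair are expected to co-elute, and would handle multiply tagged or multiply charged species.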