USDA-ARS's Scientific Manuscript database
Welcome to the Morchella MLST database. This dedicated database was set up at the CBS-KNAW Biodiversity Center by Vincent Robert in February 2012, using BioloMICS software (Robert et al., 2011), to facilitate DNA sequence-based identifications of Morchella species via the Internet. The current datab...
ApiEST-DB: analyzing clustered EST data of the apicomplexan parasites.
Li, Li; Crabtree, Jonathan; Fischer, Steve; Pinney, Deborah; Stoeckert, Christian J; Sibley, L David; Roos, David S
2004-01-01
ApiEST-DB (http://www.cbil.upenn.edu/paradbs-servlet/) provides integrated access to publicly available EST data from protozoan parasites in the phylum Apicomplexa. The database currently incorporates a total of nearly 100,000 ESTs from several parasite species of clinical and/or veterinary interest, including Eimeria tenella, Neospora caninum, Plasmodium falciparum, Sarcocystis neurona and Toxoplasma gondii. To facilitate analysis of these data, EST sequences were clustered and assembled to form consensus sequences for each organism, and these assemblies were then subjected to automated annotation via similarity searches against protein and domain databases. The underlying relational database infrastructure, Genomics Unified Schema (GUS), enables complex biologically based queries, facilitating validation of gene models, identification of alternative splicing, detection of single nucleotide polymorphisms, identification of stage-specific genes and recognition of phylogenetically conserved and phylogenetically restricted sequences.
Competitive code-based fast palmprint identification using a set of cover trees
NASA Astrophysics Data System (ADS)
Yue, Feng; Zuo, Wangmeng; Zhang, David; Wang, Kuanquan
2009-06-01
A palmprint identification system recognizes a query palmprint image by searching for its nearest neighbor from among all the templates in a database. When applied on a large-scale identification system, it is often necessary to speed up the nearest-neighbor searching process. We use competitive code, which has very fast feature extraction and matching speed, for palmprint identification. To speed up the identification process, we extend the cover tree method and propose to use a set of cover trees to facilitate the fast and accurate nearest-neighbor searching. We can use the cover tree method because, as we show, the angular distance used in competitive code can be decomposed into a set of metrics. Using the Hong Kong PolyU palmprint database (version 2) and a large-scale palmprint database, our experimental results show that the proposed method searches for nearest neighbors faster than brute force searching.
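For readers who want a concrete baseline, the brute-force search that the cover-tree method is compared against can be sketched as below. This is a minimal illustration, not the authors' implementation: it assumes each palmprint is stored as a sequence of orientation indices in the range 0 to 5 (six quantised orientations is an assumption here), and the names `angular_distance` and `nearest_template` are hypothetical.

```python
from typing import Mapping, Sequence, Tuple

N_ORIENTATIONS = 6  # assumed number of quantised orientations per code element


def angular_distance(a: Sequence[int], b: Sequence[int]) -> float:
    """Normalised angular distance between two equal-length orientation codes.

    Orientations wrap around, so indices 0 and 5 are one step apart.
    The result lies in [0, 1].
    """
    if len(a) != len(b) or not a:
        raise ValueError("codes must be non-empty and of equal length")
    total = 0
    for p, q in zip(a, b):
        d = abs(p - q)
        total += min(d, N_ORIENTATIONS - d)
    # The largest possible per-element distance is N_ORIENTATIONS // 2.
    return total / (len(a) * (N_ORIENTATIONS // 2))


def nearest_template(
    query: Sequence[int], templates: Mapping[str, Sequence[int]]
) -> Tuple[float, str]:
    """Brute-force nearest neighbour: compare the query with every template."""
    return min((angular_distance(query, code), name) for name, code in templates.items())


if __name__ == "__main__":
    # Hypothetical four-element templates, for illustration only.
    templates = {"palm_a": [0, 1, 2, 3], "palm_b": [5, 4, 2, 0]}
    print(nearest_template([0, 1, 2, 2], templates))
```

The linear scan in `nearest_template` is the step that the paper's set of cover trees is designed to speed up.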
USDA-ARS's Scientific Manuscript database
Because less than one-third of clinically relevant fusaria can be accurately identified to species level using phenotypic data (i.e., morphological species recognition), we constructed a three-locus DNA sequence database to facilitate molecular identification of the 69 Fusarium species associated wi...
Jacobs, Jeffrey P.; Pasquali, Sara K.; Austin, Erle; Gaynor, J. William; Backer, Carl; Hirsch-Romano, Jennifer C.; Williams, William G.; Caldarone, Christopher A.; McCrindle, Brian W.; Graham, Karen E.; Dokholyan, Rachel S.; Shook, Gregory J.; Poteat, Jennifer; Baxi, Maulik V.; Karamlou, Tara; Blackstone, Eugene H.; Mavroudis, Constantine; Mayer, John E.; Jonas, Richard A.; Jacobs, Marshall L.
2014-01-01
Purpose: The Society of Thoracic Surgeons Congenital Heart Surgery Database (STS-CHSD) is the largest Registry in the world of patients who have undergone congenital and pediatric cardiac surgical operations. The Congenital Heart Surgeons’ Society Database (CHSS-D) is an Academic Database designed for specialized detailed analyses of specific congenital cardiac malformations and related treatment strategies. The goal of this project was to create a link between the STS-CHSD and the CHSS-D in order to facilitate studies not possible using either individual database alone and to help identify patients who are potentially eligible for enrollment in CHSS studies. Methods: Centers were classified on the basis of participation in the STS-CHSD, the CHSS-D, or both. Five matrices, based on CHSS inclusionary criteria and STS-CHSD codes, were created to facilitate the automated identification of patients in the STS-CHSD who meet eligibility criteria for the five active CHSS studies. The matrices were evaluated with a manual adjudication process and were iteratively refined. The sensitivity and specificity of the original matrices and the refined matrices were assessed. Results: In January 2012, a total of 100 centers participated in the STS-CHSD and 74 centers participated in the CHSS. A total of 70 centers participate in both and 40 of these 70 agreed to participate in this linkage project. The manual adjudication process and the refinement of the matrices resulted in an increase in the sensitivity of the matrices from 93% to 100% and an increase in the specificity of the matrices from 94% to 98%. Conclusion: Matrices were created to facilitate the automated identification of patients potentially eligible for the five active CHSS studies using the STS-CHSD. These matrices have a sensitivity of 100% and a specificity of 98%.
In addition to facilitating identification of patients potentially eligible for enrollment in CHSS studies, these matrices will allow (1) estimation of the denominator of patients potentially eligible for CHSS studies and (2) comparison of eligible and enrolled patients to potentially eligible and not enrolled patients to assess the generalizability of CHSS studies. PMID:24668974
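The sensitivity and specificity quoted above are standard confusion-matrix ratios. The sketch below shows the arithmetic only; the counts are invented for illustration and are not taken from the study, and the function names are hypothetical.

```python
def sensitivity(true_pos: int, false_neg: int) -> float:
    """Fraction of truly eligible patients that the matrix flags as eligible."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg: int, false_pos: int) -> float:
    """Fraction of truly ineligible patients that the matrix leaves unflagged."""
    return true_neg / (true_neg + false_pos)


if __name__ == "__main__":
    # Hypothetical adjudication counts, chosen only to illustrate the formulas.
    print(f"sensitivity = {sensitivity(true_pos=93, false_neg=7):.2f}")
    print(f"specificity = {specificity(true_neg=94, false_pos=6):.2f}")
```

In this framing, the manually adjudicated eligibility decision plays the role of the reference standard against which each matrix is scored.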
Dhanasekaran, A Ranjitha; Pearson, Jon L; Ganesan, Balasubramanian; Weimer, Bart C
2015-02-25
Mass spectrometric analysis of microbial metabolism provides a long list of possible compounds. Restricting the identification of the possible compounds to those produced by the specific organism would benefit the identification process. Currently, identification of mass spectrometry (MS) data is commonly done using empirically derived compound databases. Unfortunately, most databases contain relatively few compounds, leaving long lists of unidentified molecules. Incorporating genome-encoded metabolism enables MS output identification that may not be included in databases. Using an organism's genome as a database restricts metabolite identification to only those compounds that the organism can produce. To address the challenge of metabolomic analysis from MS data, a web-based application to directly search genome-constructed metabolic databases was developed. The user query returns a genome-restricted list of possible compound identifications along with the putative metabolic pathways based on the name, formula, SMILES structure, and the compound mass as defined by the user. Multiple queries can be done simultaneously by submitting a text file created by the user or obtained from the MS analysis software. The user can also provide parameters specific to the experiment's MS analysis conditions, such as mass deviation, adducts, and detection mode during the query so as to provide additional levels of evidence to produce the tentative identification. The query results are provided as an HTML page and downloadable text file of possible compounds that are restricted to a specific genome. Hyperlinks provided in the HTML file connect the user to the curated metabolic databases housed in ProCyc, a Pathway Tools platform, as well as the KEGG Pathway database for visualization and metabolic pathway analysis. Metabolome Searcher, a web-based tool, facilitates putative compound identification of MS output based on genome-restricted metabolic capability. 
This enables researchers to rapidly extend the possible identifications of large data sets for metabolites that are not in compound databases. Putative compound names with their associated metabolic pathways from metabolomics data sets are returned to the user for additional biological interpretation and visualization. This novel approach enables compound identification by restricting the possible masses to those encoded in the genome.
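A genome-restricted mass query of the kind described above can be sketched roughly as follows. This is not Metabolome Searcher's code: the three-compound table stands in for a genome-derived metabolite list, and the function name, adduct labels, and default tolerance are assumptions made for illustration.

```python
from typing import Dict, List

PROTON = 1.007276  # proton mass in Da

# Shift added to the neutral monoisotopic mass to give the observed m/z
# (singly charged ions only).
ADDUCT_SHIFT: Dict[str, float] = {
    "[M+H]+": PROTON,
    "[M-H]-": -PROTON,
    "[M+Na]+": 22.989218,
}

# Hypothetical genome-restricted compound list: name -> monoisotopic mass (Da).
GENOME_COMPOUNDS: Dict[str, float] = {
    "D-glucose": 180.063388,
    "pyruvic acid": 88.016044,
    "L-alanine": 89.047678,
}


def search_by_mz(mz: float, adduct: str, ppm: float = 10.0) -> List[str]:
    """Return compounds whose neutral mass matches the observed m/z within ppm."""
    neutral = mz - ADDUCT_SHIFT[adduct]
    return sorted(
        name
        for name, mass in GENOME_COMPOUNDS.items()
        if abs(neutral - mass) / mass * 1e6 <= ppm
    )


if __name__ == "__main__":
    print(search_by_mz(181.0707, "[M+H]+"))  # candidate list for a protonated ion
```

Restricting `GENOME_COMPOUNDS` to what the organism can produce is what keeps the candidate list short; the mass-deviation and adduct parameters mirror the user-supplied MS conditions mentioned in the abstract.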
Handwriting Identification, Matching, and Indexing in Noisy Document Images
2006-01-01
algorithm to detect all parallel lines simultaneously. Our method can detect 96.8% of the severely broken rule lines in the Arabic database we collected...in the database to guide later processing. It is widely used in banks, post offices, and tax offices where the types of forms are most often pre...be used for different fields), and output the recognition results to a database. Although special anchors may be available to facilitate form
Burnett, Leslie; Barlow-Stewart, Kris; Proos, Anné L; Aizenberg, Harry
2003-05-01
This article describes a generic model for access to samples and information in human genetic databases. The model utilises a "GeneTrustee", a third-party intermediary independent of the subjects and of the investigators or database custodians. The GeneTrustee model has been implemented successfully in various community genetics screening programs and has facilitated research access to genetic databases while protecting the privacy and confidentiality of research subjects. The GeneTrustee model could also be applied to various types of non-conventional genetic databases, including neonatal screening Guthrie card collections, and to forensic DNA samples.
Ortseifen, Vera; Stolze, Yvonne; Maus, Irena; Sczyrba, Alexander; Bremges, Andreas; Albaum, Stefan P; Jaenicke, Sebastian; Fracowiak, Jochen; Pühler, Alfred; Schlüter, Andreas
2016-08-10
To study the metaproteome of a biogas-producing microbial community, fermentation samples were taken from an agricultural biogas plant for microbial cell and protein extraction and corresponding metagenome analyses. Based on metagenome sequence data, taxonomic community profiling was performed to elucidate the composition of bacterial and archaeal sub-communities. The community's cytosolic metaproteome was represented in a 2D-PAGE approach. Metaproteome databases for protein identification were compiled based on the assembled metagenome sequence dataset for the biogas plant analyzed and non-corresponding biogas metagenomes. Protein identification results revealed that the corresponding biogas protein database facilitated the highest identification rate followed by other biogas-specific databases, whereas common public databases yielded insufficient identification rates. Proteins of the biogas microbiome identified as highly abundant were assigned to the pathways involved in methanogenesis, transport and carbon metabolism. Moreover, the integrated metagenome/-proteome approach enabled the examination of genetic-context information for genes encoding identified proteins by studying neighboring genes on the corresponding contig. Exemplarily, this approach led to the identification of a Methanoculleus sp. contig encoding 16 methanogenesis-related gene products, three of which were also detected as abundant proteins within the community's metaproteome. Thus, metagenome contigs provide additional information on the genetic environment of identified abundant proteins. Copyright © 2016 Elsevier B.V. All rights reserved.
MIDAS: a database-searching algorithm for metabolite identification in metabolomics.
Wang, Yingfeng; Kora, Guruprasad; Bowen, Benjamin P; Pan, Chongle
2014-10-07
A database searching approach can be used for metabolite identification in metabolomics by matching measured tandem mass spectra (MS/MS) against the predicted fragments of metabolites in a database. Here, we present the open-source MIDAS algorithm (Metabolite Identification via Database Searching). To evaluate a metabolite-spectrum match (MSM), MIDAS first enumerates possible fragments from a metabolite by systematic bond dissociation, then calculates the plausibility of the fragments based on their fragmentation pathways, and finally scores the MSM to assess how well the experimental MS/MS spectrum from collision-induced dissociation (CID) is explained by the metabolite's predicted CID MS/MS spectrum. MIDAS was designed to search high-resolution tandem mass spectra acquired on time-of-flight or Orbitrap mass spectrometers against a metabolite database in an automated and high-throughput manner. The accuracy of metabolite identification by MIDAS was benchmarked using four sets of standard tandem mass spectra from MassBank. On average, for 77% of original spectra and 84% of composite spectra, MIDAS correctly ranked the true compounds as the first MSMs out of all MetaCyc metabolites as decoys. MIDAS correctly identified 46% more original spectra and 59% more composite spectra at the first MSMs than an existing database-searching algorithm, MetFrag. MIDAS was showcased by searching a published real-world measurement of a metabolome from Synechococcus sp. PCC 7002 against the MetaCyc metabolite database. MIDAS identified many metabolites missed in the previous study. MIDAS identifications should be considered only as candidate metabolites, which need to be confirmed using standard compounds. To facilitate manual validation, MIDAS provides annotated spectra for MSMs and labels observed mass spectral peaks with predicted fragments. The database searching and manual validation can be performed online at http://midas.omicsbio.org.
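To make the idea of a metabolite-spectrum match concrete, the sketch below ranks candidates by the fraction of observed peak intensity that falls within a tolerance of any predicted fragment m/z. This is a deliberately simplified stand-in, not the MIDAS scoring function, which also weighs fragment plausibility; all names, peak values, and the tolerance are hypothetical.

```python
from typing import Dict, List, Sequence, Tuple

Peak = Tuple[float, float]  # (m/z, intensity)


def explained_fraction(
    observed: Sequence[Peak], predicted_mz: Sequence[float], tol: float = 0.01
) -> float:
    """Fraction of total observed intensity matched by a predicted fragment."""
    total = sum(intensity for _, intensity in observed)
    if total == 0:
        return 0.0
    matched = sum(
        intensity
        for mz, intensity in observed
        if any(abs(mz - p) <= tol for p in predicted_mz)
    )
    return matched / total


def rank_candidates(
    observed: Sequence[Peak], candidates: Dict[str, Sequence[float]]
) -> List[Tuple[str, float]]:
    """Sort candidate metabolites from best to worst explained fraction."""
    scored = [(name, explained_fraction(observed, frags)) for name, frags in candidates.items()]
    return sorted(scored, key=lambda item: (-item[1], item[0]))


if __name__ == "__main__":
    # Hypothetical spectrum and predicted fragment lists.
    spectrum = [(100.0, 50.0), (150.0, 30.0), (200.0, 20.0)]
    candidates = {"metabolite_A": [100.005, 150.0], "metabolite_B": [200.0]}
    for name, score in rank_candidates(spectrum, candidates):
        print(name, round(score, 2))
```

As the abstract stresses, a top-ranked candidate from any such score is only a hypothesis until it is confirmed with a standard compound.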
Zhang, Lin; Vranckx, Katleen; Janssens, Koen; Sandrin, Todd R.
2015-01-01
MALDI-TOF mass spectrometry has been shown to be a rapid and reliable tool for identification of bacteria at the genus and species, and in some cases, strain levels. Commercially available and open source software tools have been developed to facilitate identification; however, no universal/standardized data analysis pipeline has been described in the literature. Here, we provide a comprehensive and detailed demonstration of bacterial identification procedures using a MALDI-TOF mass spectrometer. Mass spectra were collected from 15 diverse bacteria isolated from Kartchner Caverns, AZ, USA, and identified by 16S rDNA sequencing. Databases were constructed in BioNumerics 7.1. Follow-up analyses of mass spectra were performed, including cluster analyses, peak matching, and statistical analyses. Identification was performed using blind-coded samples randomly selected from these 15 bacteria. Two identification methods are presented: similarity coefficient-based and biomarker-based methods. Results show that both identification methods can identify the bacteria to the species level. PMID:25590854
Zhang, Lin; Vranckx, Katleen; Janssens, Koen; Sandrin, Todd R
2015-01-02
MALDI-TOF mass spectrometry has been shown to be a rapid and reliable tool for identification of bacteria at the genus and species, and in some cases, strain levels. Commercially available and open source software tools have been developed to facilitate identification; however, no universal/standardized data analysis pipeline has been described in the literature. Here, we provide a comprehensive and detailed demonstration of bacterial identification procedures using a MALDI-TOF mass spectrometer. Mass spectra were collected from 15 diverse bacteria isolated from Kartchner Caverns, AZ, USA, and identified by 16S rDNA sequencing. Databases were constructed in BioNumerics 7.1. Follow-up analyses of mass spectra were performed, including cluster analyses, peak matching, and statistical analyses. Identification was performed using blind-coded samples randomly selected from these 15 bacteria. Two identification methods are presented: similarity coefficient-based and biomarker-based methods. Results show that both identification methods can identify the bacteria to the species level.
USDA-ARS's Scientific Manuscript database
The CBS-KNAW Fungal Biodiversity Centre’s Fusarium MLST website (http://www.cbs.knaw.nl/Fusarium), and the corresponding Fusarium-ID site hosted at the Pennsylvania State University (http://isolate.fusariumdb.org; Geiser et al. 2004, Park et al. 2010) were constructed to facilitate identification of...
Yang, Chunguang G; Granite, Stephen J; Van Eyk, Jennifer E; Winslow, Raimond L
2006-11-01
Protein identification using MS is an important technique in proteomics as well as a major generator of proteomics data. We have designed the protein identification data object model (PDOM) and developed a parser based on this model to facilitate the analysis and storage of these data. The parser works with HTML or XML files saved or exported from MASCOT MS/MS ions search in peptide summary report or MASCOT PMF search in protein summary report. The program creates PDOM objects, eliminates redundancy in the input file, and has the capability to output any PDOM object to a relational database. This program facilitates additional analysis of MASCOT search results and aids the storage of protein identification information. The implementation is extensible and can serve as a template to develop parsers for other search engines. The parser can be used as a stand-alone application or can be driven by other Java programs. It is currently being used as the front end for a system that loads HTML and XML result files of MASCOT searches into a relational database. The source code is freely available at http://www.ccbm.jhu.edu and the program uses only free and open-source Java libraries.
2011-01-01
Background: Improvements in the techniques for metabolomics analyses and growing interest in metabolomic approaches are resulting in the generation of increasing numbers of metabolomic profiles. Platforms are required for profile management, as a function of experimental design, and for metabolite identification, to facilitate the mining of the corresponding data. Various databases have been created, including organism-specific knowledgebases and analytical technique-specific spectral databases. However, there is currently no platform meeting the requirements for both profile management and metabolite identification for nuclear magnetic resonance (NMR) experiments. Description: MeRy-B, the first platform for plant 1H-NMR metabolomic profiles, is designed (i) to provide a knowledgebase of curated plant profiles and metabolites obtained by NMR, together with the corresponding experimental and analytical metadata, (ii) for queries and visualization of the data, (iii) to discriminate between profiles with spectrum visualization tools and statistical analysis, (iv) to facilitate compound identification. It contains lists of plant metabolites and unknown compounds, with information about experimental conditions, the factors studied and metabolite concentrations for several plant species, compiled from more than one thousand annotated NMR profiles for various organs or tissues. Conclusion: MeRy-B manages all the data generated by NMR-based plant metabolomics experiments, from description of the biological source to identification of the metabolites and determinations of their concentrations. It is the first database allowing the display and overlay of NMR metabolomic profiles selected through queries on data or metadata. MeRy-B is available from http://www.cbib.u-bordeaux2.fr/MERYB/index.php. PMID:21668943
Resources for Functional Genomics Studies in Drosophila melanogaster
Mohr, Stephanie E.; Hu, Yanhui; Kim, Kevin; Housden, Benjamin E.; Perrimon, Norbert
2014-01-01
Drosophila melanogaster has become a system of choice for functional genomic studies. Many resources, including online databases and software tools, are now available to support design or identification of relevant fly stocks and reagents or analysis and mining of existing functional genomic, transcriptomic, proteomic, etc. datasets. These include large community collections of fly stocks and plasmid clones, “meta” information sites like FlyBase and FlyMine, and an increasing number of more specialized reagents, databases, and online tools. Here, we introduce key resources useful to plan large-scale functional genomics studies in Drosophila and to analyze, integrate, and mine the results of those studies in ways that facilitate identification of highest-confidence results and generation of new hypotheses. We also discuss ways in which existing resources can be used and might be improved and suggest a few areas of future development that would further support large- and small-scale studies in Drosophila and facilitate use of Drosophila information by the research community more generally. PMID:24653003
Sana, Theodore R; Roark, Joseph C; Li, Xiangdong; Waddell, Keith; Fischer, Steven M
2008-09-01
In an effort to simplify and streamline compound identification from metabolomics data generated by liquid chromatography time-of-flight mass spectrometry, we have created software for constructing Personalized Metabolite Databases with content from over 15,000 compounds pulled from the public METLIN database (http://metlin.scripps.edu/). Moreover, we have added extra functionalities to the database that (a) permit the addition of user-defined retention times as an orthogonal searchable parameter to complement accurate mass data; and (b) allow interfacing to separate software, a Molecular Formula Generator (MFG), that facilitates reliable interpretation of any database matches from the accurate mass spectral data. To test the utility of this identification strategy, we added retention times to a subset of masses in this database, representing a mixture of 78 synthetic urine standards. The synthetic mixture was analyzed and screened against this METLIN urine database, resulting in 46 accurate mass and retention time matches. Human urine samples were subsequently analyzed under the same analytical conditions and screened against this database. A total of 1387 ions were detected in human urine; 16 of these ions matched both accurate mass and retention time parameters for the 78 urine standards in the database. Another 374 had only an accurate mass match to the database, with 163 of those masses also having the highest MFG score. Furthermore, MFG calculated a formula for a further 849 ions that had no match to the database. Taken together, these results suggest that the METLIN Personal Metabolite database and MFG software offer a robust strategy for confirming the formula of database matches. In the event of no database match, it also suggests possible formulas that may be helpful in interpreting the experimental results.
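The three outcomes reported above (mass and retention-time match, mass-only match, no match) can be illustrated with a small classifier. This is a sketch under stated assumptions, not the software described in the abstract: inputs are treated as neutral monoisotopic masses, the retention times and tolerances are invented, and `Entry` and `classify_ion` are hypothetical names.

```python
from typing import NamedTuple, Optional, Sequence


class Entry(NamedTuple):
    name: str
    mass: float              # neutral monoisotopic mass in Da
    rt_min: Optional[float]  # user-defined retention time in minutes, if any


def classify_ion(
    mass: float,
    rt_min: float,
    database: Sequence[Entry],
    ppm: float = 10.0,
    rt_window: float = 0.2,
) -> str:
    """Label an ion as 'mass+rt', 'mass only', or 'no match'."""
    mass_hit = False
    for entry in database:
        if abs(mass - entry.mass) / entry.mass * 1e6 > ppm:
            continue
        # Retention time is the orthogonal check; it only applies when the
        # entry carries a user-defined retention time.
        if entry.rt_min is not None and abs(rt_min - entry.rt_min) <= rt_window:
            return "mass+rt"
        mass_hit = True
    return "mass only" if mass_hit else "no match"


if __name__ == "__main__":
    # Hypothetical two-entry database; the retention time is made up.
    db = [Entry("creatinine", 113.058912, 1.5), Entry("urea", 60.032363, None)]
    print(classify_ion(113.0589, 1.55, db))
    print(classify_ion(60.0324, 0.80, db))
    print(classify_ion(250.0000, 1.00, db))
```

Ions that end up in the "no match" group are the ones for which a formula generator, such as the MFG mentioned above, would be the next step.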
Advancing the large-scale CCS database for metabolomics and lipidomics at the machine-learning era.
Zhou, Zhiwei; Tu, Jia; Zhu, Zheng-Jiang
2018-02-01
Metabolomics and lipidomics aim to comprehensively measure the dynamic changes of all metabolites and lipids that are present in biological systems. The use of ion mobility-mass spectrometry (IM-MS) for metabolomics and lipidomics has facilitated the separation and the identification of metabolites and lipids in complex biological samples. The collision cross-section (CCS) value derived from IM-MS is a valuable physicochemical property for the unambiguous identification of metabolites and lipids. However, CCS values obtained from experimental measurement and computational modeling are of limited availability, which significantly restricts the application of IM-MS. In this review, we will discuss the recently developed machine-learning based prediction approach, which could efficiently generate precise CCS databases on a large scale. We will also highlight the applications of CCS databases to support metabolomics and lipidomics. Copyright © 2017 Elsevier Ltd. All rights reserved.
TIPdb-3D: the three-dimensional structure database of phytochemicals from Taiwan indigenous plants
Tung, Chun-Wei; Lin, Ying-Chi; Chang, Hsun-Shuo; Wang, Chia-Chi; Chen, Ih-Sheng; Jheng, Jhao-Liang; Li, Jih-Heng
2014-01-01
The rich indigenous and endemic plants in Taiwan serve as a resourceful bank for biologically active phytochemicals. Based on our TIPdb database curating bioactive phytochemicals from Taiwan indigenous plants, this study presents a three-dimensional (3D) chemical structure database named TIPdb-3D to support the discovery of novel pharmacologically active compounds. The Merck Molecular Force Field (MMFF94) was used to generate 3D structures of phytochemicals in TIPdb. The 3D structures could facilitate the analysis of 3D quantitative structure–activity relationship, the exploration of chemical space and the identification of potential pharmacologically active compounds using protein–ligand docking. Database URL: http://cwtung.kmu.edu.tw/tipdb. PMID:24930145
Casting the Net: The Development of a Resource Collection for an Internet Database.
ERIC Educational Resources Information Center
McKiernan, Gerry
CyberStacks(sm), a demonstration prototype World Wide Web information service, was established on the home page server at Iowa State University with the intent of facilitating identification and use of significant Internet resources in science and technology. CyberStacks(sm) was created in response to perceived deficiencies in early efforts to…
TIPdb-3D: the three-dimensional structure database of phytochemicals from Taiwan indigenous plants.
Tung, Chun-Wei; Lin, Ying-Chi; Chang, Hsun-Shuo; Wang, Chia-Chi; Chen, Ih-Sheng; Jheng, Jhao-Liang; Li, Jih-Heng
2014-01-01
The rich indigenous and endemic plants in Taiwan serve as a resourceful bank for biologically active phytochemicals. Based on our TIPdb database curating bioactive phytochemicals from Taiwan indigenous plants, this study presents a three-dimensional (3D) chemical structure database named TIPdb-3D to support the discovery of novel pharmacologically active compounds. The Merck Molecular Force Field (MMFF94) was used to generate 3D structures of phytochemicals in TIPdb. The 3D structures could facilitate the analysis of 3D quantitative structure-activity relationship, the exploration of chemical space and the identification of potential pharmacologically active compounds using protein-ligand docking. Database URL: http://cwtung.kmu.edu.tw/tipdb. © The Author(s) 2014. Published by Oxford University Press.
Konc, Janez; Cesnik, Tomo; Konc, Joanna Trykowska; Penca, Matej; Janežič, Dušanka
2012-02-27
ProBiS-Database is a searchable repository of precalculated local structural alignments in proteins detected by the ProBiS algorithm in the Protein Data Bank. Identification of functionally important binding regions of the protein is facilitated by structural similarity scores mapped to the query protein structure. PDB structures that have been aligned with a query protein may be rapidly retrieved from the ProBiS-Database, which is thus able to generate hypotheses concerning the roles of uncharacterized proteins. Presented with an uncharacterized protein structure, ProBiS-Database can discern relationships between such a query protein and other better known proteins in the PDB. Fast access and a user-friendly graphical interface promote easy exploration of this database of over 420 million local structural alignments. The ProBiS-Database is updated weekly and is freely available online at http://probis.cmm.ki.si/database.
Allmer, Jens; Kuhlgert, Sebastian; Hippler, Michael
2008-07-07
The amount of information stemming from proteomics experiments involving (multidimensional) separation techniques, mass spectrometric analysis, and computational analysis is ever-increasing. Data from such an experimental workflow needs to be captured, related and analyzed. Biological experiments within this scope produce heterogeneous data ranging from pictures of one- or two-dimensional protein maps and spectra recorded by tandem mass spectrometry to text-based identifications made by algorithms which analyze these spectra. Additionally, peptide and corresponding protein information needs to be displayed. In order to handle the large amount of data from computational processing of mass spectrometric experiments, automatic import scripts are available and the necessity for manual input to the database has been minimized. Information is in a generic format which abstracts from specific software tools typically used in such an experimental workflow. The software is therefore capable of storing and cross-analysing results from many algorithms. A novel feature and a focus of this database is to facilitate protein identification by using peptides identified from mass spectrometry and link this information directly to respective protein maps. Additionally, our application employs spectral counting for quantitative presentation of the data. All information can be linked to hot spots on images to place the results into an experimental context. A summary of identified proteins, containing all relevant information per hot spot, is automatically generated, usually upon either a change in the underlying protein models or due to newly imported identifications. The supporting information for this report can be accessed in multiple ways using the user interface provided by the application. We present a proteomics database which aims to greatly reduce evaluation time of results from mass spectrometric experiments and enhance result quality by allowing consistent data handling.
Import functionality, automatic protein detection, and summary creation act together to facilitate data analysis. In addition, supporting information for these findings is readily accessible via the graphical user interface provided. The database schema and the implementation, which can easily be installed on virtually any server, can be downloaded in the form of a compressed file from our project webpage.
Rose, Annkatrin; Manikantan, Sankaraganesh; Schraegle, Shannon J.; Maloy, Michael A.; Stahlberg, Eric A.; Meier, Iris
2004-01-01
Increasing evidence demonstrates the importance of long coiled-coil proteins for the spatial organization of cellular processes. Although several protein classes with long coiled-coil domains have been studied in animals and yeast, our knowledge about plant long coiled-coil proteins is very limited. The repeat nature of the coiled-coil sequence motif often prevents the simple identification of homologs of animal coiled-coil proteins by generic sequence similarity searches. As a consequence, counterparts of many animal proteins with long coiled-coil domains, like lamins, golgins, or microtubule organization center components, have not been identified yet in plants. Here, all Arabidopsis proteins predicted to contain long stretches of coiled-coil domains were identified by applying the algorithm MultiCoil to a genome-wide screen. A searchable protein database, ARABI-COIL (http://www.coiled-coil.org/arabidopsis), was established that integrates information on number, size, and position of predicted coiled-coil domains with subcellular localization signals, transmembrane domains, and available functional annotations. ARABI-COIL serves as a tool to sort and browse Arabidopsis long coiled-coil proteins to facilitate the identification and selection of candidate proteins of potential interest for specific research areas. Using the database, candidate proteins were identified for Arabidopsis membrane-bound, nuclear, and organellar long coiled-coil proteins. PMID:15020757
THE ROLE OF FORENSIC DENTIST FOLLOWING MASS DISASTER
Kolude, B.; Adeyemi, B.F.; Taiwo, J.O.; Sigbeku, O.F.; Eze, U.O.
2010-01-01
This review article focuses on mass disaster situations that may arise from natural or manmade circumstances and the significant role of forensic dental personnel in human identification following such occurrences. The various forensic dental modalities of identification that include matching techniques, postmortem profiling, genetic fingerprinting, dental fossil assessment and dental biometrics with digital subtraction were considered. The varying extent of use of forensic dental techniques and the resulting positive impact on human identification were considered. The importance of preparation by way of special training for forensic dental personnel, mock disaster rehearsal, and use of modern day technology was stressed. The need for international standardization of identification through the use of Interpol Disaster Victim Identification (DVI) forms was further emphasized. Recommendations for improved human identification in the Nigerian situation include reform of the National Emergency Management Association (NEMA), incorporation of dental care in primary health care to facilitate proper ante mortem database of the populace and commencement of identification at site of disaster. PMID:25161478
TDR Targets: a chemogenomics resource for neglected diseases
Magariños, María P.; Carmona, Santiago J.; Crowther, Gregory J.; Ralph, Stuart A.; Roos, David S.; Shanmugam, Dhanasekaran; Van Voorhis, Wesley C.; Agüero, Fernán
2012-01-01
The TDR Targets Database (http://tdrtargets.org) has been designed and developed as an online resource to facilitate the rapid identification and prioritization of molecular targets for drug development, focusing on pathogens responsible for neglected human diseases. The database integrates pathogen specific genomic information with functional data (e.g. expression, phylogeny, essentiality) for genes collected from various sources, including literature curation. This information can be browsed and queried using an extensive web interface with functionalities for combining, saving, exporting and sharing the query results. Target genes can be ranked and prioritized using numerical weights assigned to the criteria used for querying. In this report we describe recent updates to the TDR Targets database, including the addition of new genomes (specifically helminths), and integration of chemical structure, property and bioactivity information for biological ligands, drugs and inhibitors and cheminformatic tools for querying and visualizing these chemical data. These changes greatly facilitate exploration of linkages (both known and predicted) between genes and small molecules, yielding insight into whether particular proteins may be druggable, effectively allowing the navigation of chemical space in a genomics context. PMID:22116064
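The weighted prioritization described in the abstract above can be illustrated with a minimal Python sketch. It is not the TDR Targets implementation: the gene records, field names, criteria and weights below are hypothetical, and the scoring rule (sum the weights of every criterion a gene satisfies) is an assumption made for illustration.

```python
# Hypothetical gene records; identifiers and fields are illustrative only.
genes = [
    {"id": "geneA", "essential": True, "has_human_ortholog": False, "expressed": True},
    {"id": "geneB", "essential": False, "has_human_ortholog": False, "expressed": True},
    {"id": "geneC", "essential": True, "has_human_ortholog": True, "expressed": False},
]

# Each query criterion is paired with a user-chosen numerical weight (assumed values).
criteria = [
    (lambda g: g["essential"], 50.0),
    (lambda g: not g["has_human_ortholog"], 30.0),
    (lambda g: g["expressed"], 10.0),
]


def score(gene, criteria):
    """Sum the weights of all criteria that the gene satisfies."""
    return sum(weight for predicate, weight in criteria if predicate(gene))


def rank_targets(genes, criteria):
    """Return genes ordered from highest to lowest combined score."""
    return sorted(genes, key=lambda g: score(g, criteria), reverse=True)


ranked = rank_targets(genes, criteria)
for gene in ranked:
    print(gene["id"], score(gene, criteria))
```

Changing a weight changes the ordering without re-running the underlying queries, which is the practical appeal of separating the criteria from their weights.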
Wimmer, Helge; Gundacker, Nina C; Griss, Johannes; Haudek, Verena J; Stättner, Stefan; Mohr, Thomas; Zwickl, Hannes; Paulitschke, Verena; Baron, David M; Trittner, Wolfgang; Kubicek, Markus; Bayer, Editha; Slany, Astrid; Gerner, Christopher
2009-06-01
Interpretation of proteome data with a focus on biomarker discovery largely relies on comparative proteome analyses. Here, we introduce a database-assisted interpretation strategy based on proteome profiles of primary cells. Both 2-D-PAGE and shotgun proteomics are applied. We obtain high data concordance with these two different techniques. When applying mass analysis of tryptic spot digests from 2-D gels of cytoplasmic fractions, we typically identify several hundred proteins. Using the same protein fractions, we usually identify more than a thousand proteins by shotgun proteomics. The data consistency obtained when comparing these independent data sets exceeds 99% of the proteins identified in the 2-D gels. Many characteristic differences in protein expression of different cells can thus be independently confirmed. Our self-designed SQL database (CPL/MUW - database of the Clinical Proteomics Laboratories at the Medical University of Vienna, accessible via www.meduniwien.ac.at/proteomics/database) facilitates (i) quality management of protein identification data, which are based on MS, (ii) the detection of cell type-specific proteins and (iii) the detection of molecular signatures of specific functional cell states. Here, we demonstrate how the interpretation of proteome profiles obtained from human liver tissue and hepatocellular carcinoma tissue is assisted by the Clinical Proteomics Laboratories at the Medical University of Vienna database. Therefore, we suggest that the use of reference experiments supported by a tailored database may substantially facilitate data interpretation of proteome profiling experiments.
2010-01-01
Background: Papaver somniferum (opium poppy) is the source for several pharmaceutical benzylisoquinoline alkaloids including morphine, codeine and sanguinarine. In response to treatment with a fungal elicitor, the biosynthesis and accumulation of sanguinarine is induced along with other plant defense responses in opium poppy cell cultures. The transcriptional induction of alkaloid metabolism in cultured cells provides an opportunity to identify components of this process via the integration of deep transcriptome and proteome databases generated using next-generation technologies. Results: A cDNA library was prepared for opium poppy cell cultures treated with a fungal elicitor for 10 h. Using 454 GS-FLX Titanium pyrosequencing, 427,369 expressed sequence tags (ESTs) with an average length of 462 bp were generated. Assembly of these sequences yielded 93,723 unigenes, of which 23,753 were assigned Gene Ontology annotations. Transcripts encoding all known sanguinarine biosynthetic enzymes were identified in the EST database, 5 of which were represented among the 50 most abundant transcripts. Liquid chromatography-tandem mass spectrometry (LC-MS/MS) of total protein extracts from cell cultures treated with a fungal elicitor for 50 h facilitated the identification of 1,004 proteins. Proteins were fractionated by one-dimensional SDS-PAGE and digested with trypsin prior to LC-MS/MS analysis. Query of an opium poppy-specific EST database substantially enhanced peptide identification. Eight out of 10 known sanguinarine biosynthetic enzymes and many relevant primary metabolic enzymes were represented in the peptide database. Conclusions: The integration of deep transcriptome and proteome analyses provides an effective platform to catalogue the components of secondary metabolism, and to identify genes encoding uncharacterized enzymes.
The establishment of corresponding transcript and protein databases generated by next-generation technologies in a system with a well-defined metabolite profile facilitates an improved linkage between genes, enzymes, and pathway components. The proteome database represents the most relevant alkaloid-producing enzymes, compared with the much deeper and more complete transcriptome library. The transcript database contained full-length mRNAs encoding most alkaloid biosynthetic enzymes, which is a key requirement for the functional characterization of novel gene candidates. PMID:21083930
Bade, Richard; Causanilles, Ana; Emke, Erik; Bijlsma, Lubertus; Sancho, Juan V; Hernandez, Felix; de Voogt, Pim
2016-11-01
A screening approach was applied to influent and effluent wastewater samples. After injection in an LC-LTQ-Orbitrap, data analysis was performed using two deconvolution tools, MsXelerator (modules MPeaks and MS Compare) and Sieve 2.1. The outputs were searched incorporating an in-house database of >200 pharmaceuticals and illicit drugs or ChemSpider. This hidden target screening approach led to the detection of numerous compounds including the illicit drug cocaine and its metabolite benzoylecgonine and the pharmaceuticals carbamazepine, gemfibrozil and losartan. The compounds found using both approaches were combined, and isotopic pattern and retention time prediction were used to filter out false positives. The remaining potential positives were reanalysed in MS/MS mode and their product ions were compared with literature and/or mass spectral libraries. The inclusion of the chemical database ChemSpider led to the tentative identification of several metabolites, including paraxanthine, theobromine, theophylline and carboxylosartan, as well as the pharmaceutical phenazone. The first three of these compounds are isomers and they were subsequently distinguished based on their product ions and predicted retention times. This work has shown that the use of deconvolution tools facilitates non-target screening and enables the identification of a higher number of compounds. Copyright © 2016 Elsevier B.V. All rights reserved.
LeishCyc: a guide to building a metabolic pathway database and visualization of metabolomic data.
Saunders, Eleanor C; MacRae, James I; Naderer, Thomas; Ng, Milica; McConville, Malcolm J; Likić, Vladimir A
2012-01-01
The complexity of the metabolic networks in even the simplest organisms has raised new challenges in organizing metabolic information. To address this, specialized computer frameworks have been developed to capture, manage, and visualize metabolic knowledge. The leading databases of metabolic information are those organized under the umbrella of the BioCyc project, which consists of the reference database MetaCyc, and a number of pathway/genome databases (PGDBs) each focussed on a specific organism. A number of PGDBs have been developed for bacterial, fungal, and protozoan pathogens, greatly facilitating dissection of the metabolic potential of these organisms and the identification of new drug targets. Leishmania are protozoan parasites belonging to the family Trypanosomatidae that cause a broad spectrum of diseases in humans. In this work we use the LeishCyc database, the BioCyc database for Leishmania major, to describe how to build a BioCyc database from genomic sequences and associated annotations. By using metabolomic data generated in our group, we show how such databases can be utilized to elucidate specific changes in parasite metabolism.
The role of insurance claims databases in drug therapy outcomes research.
Lewis, N J; Patwell, J T; Briesacher, B A
1993-11-01
The use of insurance claims databases in drug therapy outcomes research holds great promise as a cost-effective alternative to post-marketing clinical trials. Claims databases uniquely capture information about episodes of care across healthcare services and settings. They also facilitate the examination of drug therapy effects on cohorts of patients and specific patient subpopulations. However, there are limitations to the use of insurance claims databases including incomplete diagnostic and provider identification data. The characteristics of the population included in the insurance plan, the plan benefit design, and the variables of the database itself can influence the research results. Given the current concerns regarding the completeness of insurance claims databases, and the validity of their data, outcomes research usually requires original data to validate claims data or to obtain additional information. Improvements to claims databases such as standardisation of claims information reporting, addition of pertinent clinical and economic variables, and inclusion of information relative to patient severity of illness, quality of life, and satisfaction with provided care will enhance the benefit of such databases for outcomes research.
Sharma, Amit K; Gohel, Sangeeta; Singh, Satya P
2012-01-01
Actinobase is a relational database of molecular diversity, phylogeny and biocatalytic potential of haloalkaliphilic actinomycetes. The main objective of this database is to provide easy access to a range of information, data storage, comparison and analysis, while reducing data redundancy and data entry, storage and retrieval costs and improving data security. Information related to habitat, cell morphology, Gram reaction, biochemical characterization and molecular features would help researchers understand the identification and stress adaptation of existing and new candidates belonging to salt-tolerant alkaliphilic actinomycetes. The PHP front end helps to add nucleotide and protein sequences of reported entries, which directly helps researchers to obtain the required details. Analysis of the genus-wise status of the salt-tolerant alkaliphilic actinomycetes indicated 6 different genera among the 40 classified entries. The results indicated widespread occurrence of salt-tolerant alkaliphilic actinomycetes belonging to diverse taxonomic positions. Entries and information related to actinomycetes in the database are publicly accessible at http://www.actinobase.in. On clustalW/X multiple sequence alignment of the alkaline protease gene sequences, different clusters emerged among the groups. The narrow search and limit options of the constructed database provided comparable information. The user-friendly PHP front end would facilitate addition of sequences of reported entries. The database is available for free at http://www.actinobase.in.
Abbott, Kenneth L; Nyre, Erik T; Abrahante, Juan; Ho, Yen-Yi; Isaksson Vogel, Rachel; Starr, Timothy K
2015-01-01
Identification of cancer driver gene mutations is crucial for advancing cancer therapeutics. Due to the overwhelming number of passenger mutations in the human tumor genome, it is difficult to pinpoint causative driver genes. Using transposon mutagenesis in mice many laboratories have conducted forward genetic screens and identified thousands of candidate driver genes that are highly relevant to human cancer. Unfortunately, this information is difficult to access and utilize because it is scattered across multiple publications using different mouse genome builds and strength metrics. To improve access to these findings and facilitate meta-analyses, we developed the Candidate Cancer Gene Database (CCGD, http://ccgd-starrlab.oit.umn.edu/). The CCGD is a manually curated database containing a unified description of all identified candidate driver genes and the genomic location of transposon common insertion sites (CISs) from all currently published transposon-based screens. To demonstrate relevance to human cancer, we performed a modified gene set enrichment analysis using KEGG pathways and show that human cancer pathways are highly enriched in the database. We also used hierarchical clustering to identify pathways enriched in blood cancers compared to solid cancers. The CCGD is a novel resource available to scientists interested in the identification of genetic drivers of cancer. © The Author(s) 2014. Published by Oxford University Press on behalf of Nucleic Acids Research.
A database de-identification framework to enable direct queries on medical data for secondary use.
Erdal, B S; Liu, J; Ding, J; Chen, J; Marsh, C B; Kamal, J; Clymer, B D
2012-01-01
To qualify the use of patient clinical records as non-human-subject for research purposes, electronic medical record data must be de-identified so there is minimum risk of protected health information exposure. This study demonstrated a robust framework for structured data de-identification that can be applied to any relational data source that needs to be de-identified. Using a real world clinical data warehouse, a pilot implementation of limited subject areas was used to demonstrate and evaluate this new de-identification process. Query results and performance are compared between source and target systems to validate data accuracy and usability. The combination of hashing, pseudonyms, and a session-dependent randomizer provides a rigorous de-identification framework to guard against 1) source identifier exposure; 2) an internal data analyst manually linking to source identifiers; and 3) identifier cross-linking among different researchers or multiple query sessions by the same researcher. In addition, a query rejection option is provided to refuse queries resulting in fewer than preset numbers of subjects and total records, to prevent users from accidental subject identification due to low volume of data. This framework does not prevent subject re-identification based on prior knowledge and sequence of events. Also, it does not deal with medical free text de-identification, although text de-identification using natural language processing can be included due to its modular design. We demonstrated a framework resulting in HIPAA Compliant databases that can be directly queried by researchers. This technique can be augmented to facilitate inter-institutional research data sharing through existing middleware such as caGrid.
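The combination of hashing, pseudonyms, a session-dependent randomizer and query rejection described in the abstract above can be sketched in Python roughly as follows. This is an illustration under stated assumptions, not the authors' framework: the key handling, the class name QuerySession, the pseudonym format and the thresholds are all hypothetical.

```python
import hashlib
import hmac
import secrets

# Hypothetical site-wide secret; a real deployment would manage this key securely.
SITE_KEY = b"replace-with-a-secret-key"


def hashed_id(source_id: str) -> str:
    """Keyed hash of the source identifier, so the raw identifier is never exposed."""
    return hmac.new(SITE_KEY, source_id.encode(), hashlib.sha256).hexdigest()


class QuerySession:
    """One researcher query session with its own random key (assumed design)."""

    def __init__(self, min_subjects: int = 10, min_records: int = 10):
        # Session-dependent randomizer: pseudonyms differ between sessions,
        # which prevents cross-linking results across sessions or researchers.
        self._session_key = secrets.token_bytes(32)
        self.min_subjects = min_subjects
        self.min_records = min_records

    def pseudonym(self, source_id: str) -> str:
        digest = hmac.new(
            self._session_key, hashed_id(source_id).encode(), hashlib.sha256
        ).hexdigest()
        return "P" + digest[:12]

    def release(self, rows):
        """Return pseudonymized rows, or None if the result set is too small."""
        subjects = {source_id for source_id, _ in rows}
        if len(subjects) < self.min_subjects or len(rows) < self.min_records:
            return None  # query rejected: low volume risks accidental identification
        return [(self.pseudonym(source_id), value) for source_id, value in rows]


session = QuerySession(min_subjects=2, min_records=2)
print(session.release([("MRN001", 5.1)]))                    # rejected
print(session.release([("MRN001", 5.1), ("MRN002", 6.3)]))   # pseudonymized rows
```

Within one session the same subject always receives the same pseudonym, so longitudinal queries still work, while a second session yields unrelated pseudonyms.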
MRPrimerV: a database of PCR primers for RNA virus detection
Kim, Hyerin; Kang, NaNa; An, KyuHyeon; Kim, Doyun; Koo, JaeHyung; Kim, Min-Soo
2017-01-01
Many infectious diseases are caused by viral infections, and in particular by RNA viruses such as MERS, Ebola and Zika. To understand viral disease, detection and identification of these viruses are essential. Although PCR is widely used for rapid virus identification due to its low cost and high sensitivity and specificity, very few online database resources have compiled PCR primers for RNA viruses. To effectively detect viruses, the MRPrimerV database (http://MRPrimerV.com) contains 152 380 247 PCR primer pairs for detection of 1818 viruses, covering 7144 coding sequences (CDSs), representing 100% of the RNA viruses in the most up-to-date NCBI RefSeq database. Due to rigorous similarity testing against all human and viral sequences, every primer in MRPrimerV is highly target-specific. Because MRPrimerV ranks CDSs by the penalty scores of their best primer, users need only use the first primer pair for a single-phase PCR or the first two primer pairs for two-phase PCR. Moreover, MRPrimerV provides the list of genome neighbors that can be detected using each primer pair, covering 22 192 variants of 532 RefSeq RNA viruses. We believe that the public availability of MRPrimerV will facilitate viral metagenomics studies aimed at evaluating the variability of viruses, as well as other scientific tasks. PMID:27899620
De-identifying an EHR database - anonymity, correctness and readability of the medical record.
Pantazos, Kostas; Lauesen, Soren; Lippert, Soren
2011-01-01
Electronic health records (EHR) contain a large amount of structured data and free text. Exploring and sharing clinical data can improve healthcare and facilitate the development of medical software. However, revealing confidential information is against ethical principles and laws. We de-identified a Danish EHR database with 437,164 patients. The goal was to generate a version with real medical records, but related to artificial persons. We developed a de-identification algorithm that uses lists of named entities, simple language analysis, and special rules. Our algorithm consists of 3 steps: collect lists of identifiers from the database and external resources, define a replacement for each identifier, and replace identifiers in structured data and free text. Some patient records could not be safely de-identified, so the de-identified database has 323,122 patient records with an acceptable degree of anonymity, readability and correctness (F-measure of 95%). The algorithm has to be adjusted for each culture, language and database.
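The three steps named in the abstract above (collect identifiers, define a replacement for each, replace them in structured data and free text) can be sketched in Python as follows. This is a deliberately simplified illustration: it handles only exact matches of patient names, whereas the abstract also mentions language analysis and special rules; the record layout, the sample names and the "Person0001" replacement scheme are hypothetical.

```python
import re

# Hypothetical records: one structured field ("name") and one free-text field ("note").
records = [
    {"name": "Jens Hansen", "note": "Jens Hansen reports chest pain. Referred by Mette Olsen."},
    {"name": "Mette Olsen", "note": "Follow-up visit for Mette Olsen."},
]


def collect_identifiers(records):
    """Step 1: gather identifiers (here only names from the structured field)."""
    return {rec["name"] for rec in records}


def define_replacements(identifiers):
    """Step 2: map each identifier to an artificial person label."""
    return {
        ident: f"Person{i:04d}"
        for i, ident in enumerate(sorted(identifiers), start=1)
    }


def replace_identifiers(records, mapping):
    """Step 3: replace identifiers in structured data and in free text."""
    # Longest identifiers first, so a longer name wins over a shorter prefix.
    alternatives = "|".join(
        re.escape(ident) for ident in sorted(mapping, key=len, reverse=True)
    )
    pattern = re.compile(r"\b(?:" + alternatives + r")\b")
    return [
        {
            "name": mapping[rec["name"]],
            "note": pattern.sub(lambda m: mapping[m.group(0)], rec["note"]),
        }
        for rec in records
    ]


mapping = define_replacements(collect_identifiers(records))
deidentified = replace_identifiers(records, mapping)
for rec in deidentified:
    print(rec)
```

Because the same mapping is applied to both fields, a person mentioned in another patient's note keeps a consistent artificial identity, which preserves readability of the record.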
Protein Information Resource: a community resource for expert annotation of protein data
Barker, Winona C.; Garavelli, John S.; Hou, Zhenglin; Huang, Hongzhan; Ledley, Robert S.; McGarvey, Peter B.; Mewes, Hans-Werner; Orcutt, Bruce C.; Pfeiffer, Friedhelm; Tsugita, Akira; Vinayaka, C. R.; Xiao, Chunlin; Yeh, Lai-Su L.; Wu, Cathy
2001-01-01
The Protein Information Resource, in collaboration with the Munich Information Center for Protein Sequences (MIPS) and the Japan International Protein Information Database (JIPID), produces the most comprehensive and expertly annotated protein sequence database in the public domain, the PIR-International Protein Sequence Database. To provide timely and high quality annotation and promote database interoperability, the PIR-International employs rule-based and classification-driven procedures based on controlled vocabulary and standard nomenclature and includes status tags to distinguish experimentally determined from predicted protein features. The database contains about 200 000 non-redundant protein sequences, which are classified into families and superfamilies and their domains and motifs identified. Entries are extensively cross-referenced to other sequence, classification, genome, structure and activity databases. The PIR web site features search engines that use sequence similarity and database annotation to facilitate the analysis and functional identification of proteins. The PIR-International databases and search tools are accessible on the PIR web site at http://pir.georgetown.edu/ and at the MIPS web site at http://www.mips.biochem.mpg.de. The PIR-International Protein Sequence Database and other files are also available by FTP. PMID:11125041
SerpentinaDB: a database of plant-derived molecules of Rauvolfia serpentina.
Pathania, Shivalika; Ramakrishnan, Sai Mukund; Randhawa, Vinay; Bagler, Ganesh
2015-08-04
Plant-derived molecules (PDMs) are known to be a rich source of diverse scaffolds that could serve as a basis for rational drug design. Structured compilation of phytochemicals from traditional medicinal plants can facilitate prospection for novel PDMs and their analogs as therapeutic agents. Rauvolfia serpentina is an important medicinal plant, endemic to the Himalayan mountain ranges of the Indian subcontinent, reported to be of immense therapeutic value against various diseases. We present SerpentinaDB, a structured compilation of 147 R. serpentina PDMs, inclusive of their plant part source, chemical classification, IUPAC, SMILES, physicochemical properties, and 3D chemical structures with associated references. It also provides a refined search option for identification of analogs of natural molecules against the ZINC database at a user-defined cut-off. SerpentinaDB is an exhaustive resource of R. serpentina molecules facilitating prospection for therapeutic molecules from a medicinally important source of natural products. It also provides a refined search option to explore the neighborhood of chemical space against the ZINC database to identify analogs of natural molecules obtained as leads. In a previous study, we have demonstrated the utility of this resource by identifying novel aldose reductase inhibitors towards intervention of complications of diabetes.
ProtaBank: A repository for protein design and engineering data.
Wang, Connie Y; Chang, Paul M; Ary, Marie L; Allen, Benjamin D; Chica, Roberto A; Mayo, Stephen L; Olafson, Barry D
2018-03-25
We present ProtaBank, a repository for storing, querying, analyzing, and sharing protein design and engineering data in an actively maintained and updated database. ProtaBank provides a format to describe and compare all types of protein mutational data, spanning a wide range of properties and techniques. It features a user-friendly web interface and programming layer that streamlines data deposition and allows for batch input and queries. The database schema design incorporates a standard format for reporting protein sequences and experimental data that facilitates comparison of results across different data sets. A suite of analysis and visualization tools are provided to facilitate discovery, to guide future designs, and to benchmark and train new predictive tools and algorithms. ProtaBank will provide a valuable resource to the protein engineering community by storing and safeguarding newly generated data, allowing for fast searching and identification of relevant data from the existing literature, and exploring correlations between disparate data sets. ProtaBank invites researchers to contribute data to the database to make it accessible for search and analysis. ProtaBank is available at https://protabank.org. © 2018 The Authors Protein Science published by Wiley Periodicals, Inc. on behalf of The Protein Society.
Raw Cow Milk Bacterial Population Shifts Attributable to Refrigeration
Lafarge, Véronique; Ogier, Jean-Claude; Girard, Victoria; Maladen, Véronique; Leveau, Jean-Yves; Gruss, Alexandra; Delacroix-Buchet, Agnès
2004-01-01
We monitored the dynamic changes in the bacterial population in milk associated with refrigeration. Direct analyses of DNA by using temporal temperature gel electrophoresis (TTGE) and denaturing gradient gel electrophoresis (DGGE) allowed us to make accurate species assignments for bacteria with low-GC-content (low-GC%) (<55%) and medium- or high-GC% (>55%) genomes, respectively. We examined raw milk samples before and after 24-h conservation at 4°C. Bacterial identification was facilitated by comparison with an extensive bacterial reference database (∼150 species) that we established with DNA fragments of pure bacterial strains. Cloning and sequencing of fragments missing from the database were used to achieve complete species identification. Considerable evolution of bacterial populations occurred during conservation at 4°C. TTGE and DGGE are shown to be a powerful tool for identifying the main bacterial species of the raw milk samples and for monitoring changes in bacterial populations during conservation at 4°C. The emergence of psychrotrophic bacteria such as Listeria spp. or Aeromonas hydrophila is demonstrated. PMID:15345453
PRIDE: new developments and new datasets.
Jones, Philip; Côté, Richard G; Cho, Sang Yun; Klie, Sebastian; Martens, Lennart; Quinn, Antony F; Thorneycroft, David; Hermjakob, Henning
2008-01-01
The PRIDE (http://www.ebi.ac.uk/pride) database of protein and peptide identifications was previously described in the NAR Database Special Edition in 2006. Since this publication, the volume of public data in the PRIDE relational database has increased by more than an order of magnitude. Several significant public datasets have been added, including identifications and processed mass spectra generated by the HUPO Brain Proteome Project and the HUPO Liver Proteome Project. The PRIDE software development team has made several significant changes and additions to the user interface and tool set associated with PRIDE. The focus of these changes has been to facilitate the submission process and to improve the mechanisms by which PRIDE can be queried. The PRIDE team has developed a Microsoft Excel workbook that allows the required data to be collated in a series of relatively simple spreadsheets, with automatic generation of PRIDE XML at the end of the process. The ability to query PRIDE has been augmented by the addition of a BioMart interface allowing complex queries to be constructed. Collaboration with groups outside the EBI has been fruitful in extending PRIDE, including an approach to encode iTRAQ quantitative data in PRIDE XML.
BEAUTY-X: enhanced BLAST searches for DNA queries.
Worley, K C; Culpepper, P; Wiese, B A; Smith, R F
1998-01-01
BEAUTY (BLAST Enhanced Alignment Utility) is an enhanced version of the BLAST database search tool that facilitates identification of the functions of matched sequences. Three recent improvements to the BEAUTY program described here make the enhanced output (1) available for DNA queries, (2) available for searches of any protein database, and (3) more up-to-date, with periodic updates of the domain information. BEAUTY searches of the NCBI and EMBL non-redundant protein sequence databases are available from the BCM Search Launcher Web pages (http://gc.bcm.tmc.edu:8088/search-launcher/launcher.html). BEAUTY Post-Processing of submitted search results is available using the BCM Search Launcher Batch Client (version 2.6) (ftp://gc.bcm.tmc.edu/pub/software/search-launcher/). Example figures are available at http://dot.bcm.tmc.edu:9331/papers/beautypp.html (kworley,culpep)@bcm.tmc.edu
Use of the World Wide Web for multisite data collection.
Subramanian, A K; McAfee, A T; Getzinger, J P
1997-08-01
As access to the Internet becomes increasingly available, research applications in medicine will increase. This paper describes the use of the Internet, and, more specifically, the World Wide Web (WWW), as a channel of communication between EDs throughout the world and investigators who are interested in facilitating the collection of data from multiple sites. Data entered into user-friendly electronic surveys can be transmitted over the Internet to a database located at the site of the study, rendering geographic separation less of a barrier to the conduct of multisite studies. The electronic format of the data can enable real-time statistical processing while data are stored using existing database technologies. In theory, automated processing of variables within such a database enables early identification of data trends. Methods of ensuring validity, security, and compliance are discussed.
van Walraven, Carl; Austin, Peter C; Manuel, Douglas; Knoll, Greg; Jennings, Allison; Forster, Alan J
2010-12-01
Administrative databases commonly use codes to indicate diagnoses. These codes alone are often inadequate to accurately identify patients with particular conditions. In this study, we determined whether we could quantify the probability that a person has a particular disease (in this case, renal failure) using other routinely collected information available in an administrative data set. This would allow the accurate identification of a disease cohort in an administrative database. We determined whether patients in a randomly selected 100,000 hospitalizations had kidney disease (defined as two or more sequential serum creatinines or the single admission creatinine indicating a calculated glomerular filtration rate less than 60 mL/min/1.73 m²). The independent association of patient- and hospitalization-level variables with renal failure was measured using a multivariate logistic regression model in a random 50% sample of the patients. The model was validated in the remaining patients. Twenty thousand seven hundred thirteen patients had kidney disease (20.7%). A diagnostic code of kidney disease was strongly associated with kidney disease (relative risk: 34.4), but the accuracy of the code was poor (sensitivity: 37.9%; specificity: 98.9%). Twenty-nine patient- and hospitalization-level variables entered the kidney disease model. This model had excellent discrimination (c-statistic: 90.1%) and accurately predicted the probability of true renal failure. The probability threshold that maximized sensitivity and specificity for the identification of true kidney disease was 21.3% (sensitivity: 80.0%; specificity: 82.2%). Multiple variables available in administrative databases can be combined to quantify the probability that a person has a particular disease. This process permits accurate identification of a disease cohort in an administrative database.
These methods may be extended to other diagnoses or procedures and could both facilitate and clarify the use of administrative databases for research and quality improvement. Copyright © 2010 Elsevier Inc. All rights reserved.
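The threshold selection reported in the abstract above (choosing the predicted-probability cut-off that maximizes sensitivity and specificity) can be illustrated with a short Python sketch. The probabilities and labels below are synthetic, not the study's data, and the selection rule (maximize the sum of sensitivity and specificity, equivalent to Youden's J) is an assumption about how "maximized" might be operationalized.

```python
def sensitivity_specificity(probs, labels, threshold):
    """Classify as diseased when probability >= threshold; return (sens, spec)."""
    tp = sum(1 for p, y in zip(probs, labels) if p >= threshold and y == 1)
    fn = sum(1 for p, y in zip(probs, labels) if p < threshold and y == 1)
    tn = sum(1 for p, y in zip(probs, labels) if p < threshold and y == 0)
    fp = sum(1 for p, y in zip(probs, labels) if p >= threshold and y == 0)
    return tp / (tp + fn), tn / (tn + fp)


def best_threshold(probs, labels):
    """Pick the observed probability that maximizes sensitivity + specificity."""
    candidates = sorted(set(probs))
    return max(
        candidates,
        key=lambda t: sum(sensitivity_specificity(probs, labels, t)),
    )


# Synthetic model outputs and true disease status (1 = disease present).
probs = [0.05, 0.10, 0.15, 0.30, 0.40, 0.60, 0.80, 0.90]
labels = [0, 0, 0, 0, 1, 0, 1, 1]

threshold = best_threshold(probs, labels)
sens, spec = sensitivity_specificity(probs, labels, threshold)
print(threshold, sens, spec)
```

In practice the threshold would be chosen on a derivation sample and its sensitivity and specificity reported on a separate validation sample, as the abstract describes for the 50% split.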
9 CFR 55.25 - Animal identification.
Code of Federal Regulations, 2014 CFR
2014-01-01
... CWD National Database or in an approved State database. The second animal identification must be... CWD National Database or in an approved State database. The means of animal identification must be...
9 CFR 55.25 - Animal identification.
Code of Federal Regulations, 2013 CFR
2013-01-01
... CWD National Database or in an approved State database. The second animal identification must be... CWD National Database or in an approved State database. The means of animal identification must be...
Yang, Jin; Liang, Qian; Wang, Mei; Jeffries, Cynthia; Smithson, David; Tu, Ying; Boulos, Nidal; Jacob, Melissa R; Shelat, Anang A; Wu, Yunshan; Ravu, Ranga Rao; Gilbertson, Richard; Avery, Mitchell A; Khan, Ikhlas A; Walker, Larry A; Guy, R Kiplin; Li, Xing-Cong
2014-04-25
The generation of natural product libraries containing column fractions, each with only a few small molecules, using a high-throughput, automated fractionation system, has made it possible to implement an improved dereplication strategy for selection and prioritization of leads in a natural product discovery program. Analysis of databased UPLC-MS-ELSD-PDA information of three leads from a biological screen employing the ependymoma cell line EphB2-EPD generated details on the possible structures of active compounds present. The procedure allows the rapid identification of known compounds and guides the isolation of unknown compounds of interest. Three previously known flavanone-type compounds, homoeriodictyol (1), hesperetin (2), and sterubin (3), were identified in a selected fraction derived from the leaves of Eriodictyon angustifolium. The lignan compound deoxypodophyllotoxin (8) was confirmed to be an active constituent in two lead fractions derived from the bark and leaves of Thuja occidentalis. In addition, two new but inactive labdane-type diterpenoids with an uncommon triol side chain were also identified as coexisting with deoxypodophyllotoxin in a lead fraction from the bark of T. occidentalis. Both diterpenoids were isolated in acetylated form, and their structures were determined as 14S,15-diacetoxy-13R-hydroxylabd-8(17)-en-19-oic acid (9) and 14R,15-diacetoxy-13S-hydroxylabd-8(17)-en-19-oic acid (10), respectively, by spectroscopic data interpretation and X-ray crystallography. This work demonstrates that a UPLC-MS-ELSD-PDA database produced during fractionation may be used as a powerful dereplication tool to facilitate compound identification from chromatographically tractable small-molecule natural product libraries.
Gallagher, Sarah A; Smith, Angela B; Matthews, Jonathan E; Potter, Clarence W; Woods, Michael E; Raynor, Mathew; Wallen, Eric M; Rathmell, W Kimryn; Whang, Young E; Kim, William Y; Godley, Paul A; Chen, Ronald C; Wang, Andrew; You, Chaochen; Barocas, Daniel A; Pruthi, Raj S; Nielsen, Matthew E; Milowsky, Matthew I
2014-01-01
The management of genitourinary malignancies requires a multidisciplinary care team composed of urologists, medical oncologists, and radiation oncologists. A genitourinary (GU) oncology clinical database is an invaluable resource for patient care and research. Although electronic medical records provide a single web-based record used for clinical care, billing, and scheduling, information is typically stored in a discipline-specific manner and data extraction is often not applicable to a research setting. A GU oncology database may be used for the development of multidisciplinary treatment plans, analysis of disease-specific practice patterns, and identification of patients for research studies. Despite the potential utility, there are many important considerations that must be addressed when developing and implementing a discipline-specific database. The creation of the GU oncology database including prostate, bladder, and kidney cancers with the identification of necessary variables was facilitated by meetings of stakeholders in medical oncology, urology, and radiation oncology at the University of North Carolina (UNC) at Chapel Hill with a template data dictionary provided by the Department of Urologic Surgery at Vanderbilt University Medical Center. Utilizing Research Electronic Data Capture (REDCap, version 4.14.5), the UNC Genitourinary OncoLogy Database (UNC GOLD) was designed and implemented. The process of designing and implementing a discipline-specific clinical database requires many important considerations. The primary consideration is determining the relationship between the database and the Institutional Review Board (IRB) given the potential applications for both clinical and research uses. 
Several other necessary steps include ensuring information technology security and federal regulation compliance; determination of a core complete dataset; creation of standard operating procedures; standardizing entry of free text fields; use of data exports, queries, and de-identification strategies; inclusion of individual investigators' data; and strategies for prioritizing specific projects and data entry. A discipline-specific database requires a buy-in from all stakeholders, meticulous development, and data entry resources to generate a unique platform for housing information that may be used for clinical care and research with IRB approval. The steps and issues identified in the development of UNC GOLD provide a process map for others interested in developing a GU oncology database. Copyright © 2014 Elsevier Inc. All rights reserved.
Code of Federal Regulations, 2014 CFR
2014-04-01
... Unique Device Identification Database. 830.350 Section 830.350 Food and Drugs FOOD AND DRUG... Global Unique Device Identification Database § 830.350 Correction of information submitted to the Global Unique Device Identification Database. (a) If FDA becomes aware that any information submitted to the...
Database for Safety-Oriented Tracking of Chemicals
NASA Technical Reports Server (NTRS)
Stump, Jacob; Carr, Sandra; Plumlee, Debrah; Slater, Andy; Samson, Thomas M.; Holowaty, Toby L.; Skeete, Darren; Haenz, Mary Alice; Hershman, Scot; Raviprakash, Pushpa
2010-01-01
SafetyChem is a computer program that maintains a relational database for tracking chemicals and associated hazards at Johnson Space Center (JSC) by use of a Web-based graphical user interface. The SafetyChem database is accessible to authorized users via a JSC intranet. All new chemicals pass through a safety office, where information on hazards, required personal protective equipment (PPE), fire-protection warnings, and target organ effects (TOEs) is extracted from material safety data sheets (MSDSs) and recorded in the database. The database facilitates real-time management of inventory with attention to such issues as stability, shelf life, reduction of waste through transfer of unused chemicals to laboratories that need them, quantification of chemical wastes, and identification of chemicals for which disposal is required. Upon searching the database for a chemical, the user receives information on physical properties of the chemical, hazard warnings, required PPE, a link to the MSDS, and references to the applicable International Standards Organization (ISO) 9000 standard work instructions and the applicable job hazard analysis. Also, to reduce the labor hours needed to comply with reporting requirements of the Occupational Safety and Health Administration, the data can be directly exported into the JSC hazardous-materials database.
DNA-barcoding of forensically important blow flies (Diptera: Calliphoridae) in the Caribbean Region
Agnarsson, Ingi
2017-01-01
Correct identification of forensically important insects, such as flies in the family Calliphoridae, is a crucial step for them to be used as evidence in legal investigations. Traditional identification based on morphology has been effective, but has some limitations when it comes to identifying immature stages of certain species. DNA-barcoding, using COI, has demonstrated potential for rapid and accurate identification of Calliphoridae, however, this gene does not reliably distinguish among some recently diverged species, raising questions about its use for delimitation of species of forensic importance. To facilitate DNA based identification of Calliphoridae in the Caribbean we developed a vouchered reference collection from across the region, and a DNA sequence database, and further added the nuclear ITS2 as a second marker to increase accuracy of identification through barcoding. We morphologically identified freshly collected specimens, did phylogenetic analyses and employed several species delimitation methods for a total of 468 individuals representing 19 described species. Our results show that combination of COI + ITS2 genes yields more accurate identification and diagnoses, and better agreement with morphological data, than the mitochondrial barcodes alone. All of our results from independent and concatenated trees and most of the species delimitation methods yield considerably higher diversity estimates than the distance based approach and morphology. Molecular data support at least 24 distinct clades within Calliphoridae in this study, recovering substantial geographic variation for Lucilia eximia, Lucilia retroversa, Lucilia rica and Chloroprocta idioidea, probably indicating several cryptic species. 
In sum, our study demonstrates the importance of employing a second nuclear marker for barcoding analyses and species delimitation of calliphorids, and the power of molecular data in combination with a complete reference database to enable identification of taxonomically and geographically diverse insects of forensic importance. PMID:28761780
MetaPro-IQ: a universal metaproteomic approach to studying human and mouse gut microbiota.
Zhang, Xu; Ning, Zhibin; Mayne, Janice; Moore, Jasmine I; Li, Jennifer; Butcher, James; Deeke, Shelley Ann; Chen, Rui; Chiang, Cheng-Kang; Wen, Ming; Mack, David; Stintzi, Alain; Figeys, Daniel
2016-06-24
The gut microbiota has been shown to be closely associated with human health and disease. While next-generation sequencing can be readily used to profile the microbiota taxonomy and metabolic potential, metaproteomics is better suited for deciphering microbial biological activities. However, the application of gut metaproteomics has largely been limited due to the low efficiency of protein identification. Thus, a high-performance and easy-to-implement gut metaproteomic approach is required. In this study, we developed a high-performance and universal workflow for gut metaproteome identification and quantification (named MetaPro-IQ) by using the close-to-complete human or mouse gut microbial gene catalog as database and an iterative database search strategy. On average, 38% and 33% of the acquired tandem mass spectrometry (MS) spectra were confidently identified for the studied mouse stool and human mucosal-luminal interface samples, respectively. In total, we accurately quantified 30,749 protein groups for the mouse metaproteome and 19,011 protein groups for the human metaproteome. Moreover, the MetaPro-IQ approach enabled comparable identifications with the matched metagenome database search strategy that is widely used but needs prior metagenomic sequencing. The response of gut microbiota to high-fat diet in mice was then assessed, which showed distinct metaproteome patterns for high-fat-fed mice and identified 849 proteins as significant responders to high-fat feeding in comparison to low-fat feeding. We present MetaPro-IQ, a metaproteomic approach for highly efficient intestinal microbial protein identification and quantification, which functions as a universal workflow for metaproteomic studies, and will thus facilitate the application of metaproteomics for better understanding the functions of gut microbiota in health and disease.
HEDD: Human Enhancer Disease Database
Wang, Zhen; Zhang, Quanwei; Zhang, Wen; Lin, Jhih-Rong; Cai, Ying; Mitra, Joydeep
2018-01-01
Enhancers, as specialized genomic cis-regulatory elements, activate transcription of their target genes and play an important role in pathogenesis of many human complex diseases. Despite recent systematic identification of them in the human genome, currently there is an urgent need for comprehensive annotation databases of human enhancers with a focus on their disease connections. In response, we built the Human Enhancer Disease Database (HEDD) to facilitate studies of enhancers and their potential roles in human complex diseases. HEDD currently provides comprehensive genomic information for ∼2.8 million human enhancers identified by ENCODE, FANTOM5 and RoadMap with disease association scores based on enhancer–gene and gene–disease connections. It also provides Web-based analytical tools to visualize enhancer networks and score enhancers given a set of selected genes in a specific gene network. HEDD is freely accessible at http://zdzlab.einstein.yu.edu/1/hedd.php. PMID:29077884
RatMap—rat genome tools and data
Petersen, Greta; Johnson, Per; Andersson, Lars; Klinga-Levan, Karin; Gómez-Fabre, Pedro M.; Ståhl, Fredrik
2005-01-01
The rat genome database RatMap (http://ratmap.org or http://ratmap.gen.gu.se) has been one of the main resources for rat genome information since 1994. The database is maintained by CMB–Genetics at Göteborg University in Sweden and provides information on rat genes, polymorphic rat DNA-markers and rat quantitative trait loci (QTLs), all curated at RatMap. The database is under the supervision of the Rat Gene and Nomenclature Committee (RGNC); thus much attention is paid to rat gene nomenclature. RatMap presents information on rat idiograms, karyotypes and provides a unified presentation of the rat genome sequence and integrated rat linkage maps. A set of tools is also available to facilitate the identification and characterization of rat QTLs, as well as the estimation of exon/intron number and sizes in individual rat genes. Furthermore, comparative gene maps of rat in regard to mouse and human are provided. PMID:15608244
T3SEdb: data warehousing of virulence effectors secreted by the bacterial Type III Secretion System.
Tay, Daniel Ming Ming; Govindarajan, Kunde Ramamoorthy; Khan, Asif M; Ong, Terenze Yao Rui; Samad, Hanif M; Soh, Wei Wei; Tong, Minyan; Zhang, Fan; Tan, Tin Wee
2010-10-15
Effectors of Type III Secretion System (T3SS) play a pivotal role in establishing and maintaining pathogenicity in the host and therefore the identification of these effectors is important in understanding virulence. However, the effectors display high level of sequence diversity, therefore making the identification a difficult process. There is a need to collate and annotate existing effector sequences in public databases to enable systematic analyses of these sequences for development of models for screening and selection of putative novel effectors from bacterial genomes that can be validated by a smaller number of key experiments. Herein, we present T3SEdb http://effectors.bic.nus.edu.sg/T3SEdb, a specialized database of annotated T3SS effector (T3SE) sequences containing 1089 records from 46 bacterial species compiled from the literature and public protein databases. Procedures have been defined for i) comprehensive annotation of experimental status of effectors, ii) submission and curation review of records by users of the database, and iii) the regular update of T3SEdb existing and new records. Keyword fielded and sequence searches (BLAST, regular expression) are supported for both experimentally verified and hypothetical T3SEs. More than 171 clusters of T3SEs were detected based on sequence identity comparisons (intra-cluster difference up to ~60%). Owing to this high level of sequence diversity of T3SEs, the T3SEdb provides a large number of experimentally known effector sequences with wide species representation for creation of effector predictors. We created a reliable effector prediction tool, integrated into the database, to demonstrate the application of the database for such endeavours. T3SEdb is the first specialised database reported for T3SS effectors, enriched with manual annotations that facilitated systematic construction of a reliable prediction model for identification of novel effectors. 
The T3SEdb represents a platform for inclusion of additional annotations of metadata for future developments of sophisticated effector prediction models for screening and selection of putative novel effectors from bacterial genomes/proteomes that can be validated by a small number of key experiments.
Sys-BodyFluid: a systematical database for human body fluid proteome research
Li, Su-Jun; Peng, Mao; Li, Hong; Liu, Bo-Shu; Wang, Chuan; Wu, Jia-Rui; Li, Yi-Xue; Zeng, Rong
2009-01-01
Recently, body fluids have widely become an important target for proteomic research and proteomic study has produced more and more body fluid related protein data. A database is needed to collect and analyze these proteome data. Thus, we developed this web-based body fluid proteome database Sys-BodyFluid. It contains eleven kinds of body fluid proteomes, including plasma/serum, urine, cerebrospinal fluid, saliva, bronchoalveolar lavage fluid, synovial fluid, nipple aspirate fluid, tear fluid, seminal fluid, human milk and amniotic fluid. Over 10 000 proteins are presented in the Sys-BodyFluid. Sys-BodyFluid provides the detailed protein annotations, including protein description, Gene Ontology, domain information, protein sequence and involved pathways. These proteome data can be retrieved by using protein name, protein accession number and sequence similarity. In addition, users can query between these different body fluids to get the different proteins identification information. Sys-BodyFluid database can facilitate the body fluid proteomics and disease proteomics research as a reference database. It is available at http://www.biosino.org/bodyfluid/. PMID:18978022
Update of the Diatom EST Database: a new tool for digital transcriptomics
Maheswari, Uma; Mock, Thomas; Armbrust, E. Virginia; Bowler, Chris
2009-01-01
The Diatom Expressed Sequence Tag (EST) Database was constructed to provide integral access to ESTs from these ecologically and evolutionarily interesting microalgae. It has now been updated with 130 000 Phaeodactylum tricornutum ESTs from 16 cDNA libraries and 77 000 Thalassiosira pseudonana ESTs from seven libraries, derived from cells grown in different nutrient and stress regimes. The updated relational database incorporates results from statistical analyses such as log-likelihood ratios and hierarchical clustering, which help to identify differentially expressed genes under different conditions, and allow similarities in gene expression in different libraries to be investigated in a functional context. The database also incorporates links to the recently sequenced genomes of P. tricornutum and T. pseudonana, enabling an easy cross-talk between the expression pattern of diatom orthologs and the genome browsers. These improvements will facilitate exploration of diatom responses to conditions of ecological relevance and will aid gene function identification of diatom-specific genes and in silico gene prediction in this largely unexplored class of eukaryotes. The updated Diatom EST Database is available at http://www.biologie.ens.fr/diatomics/EST3. PMID:19029140
SATPdb: a database of structurally annotated therapeutic peptides
Singh, Sandeep; Chaudhary, Kumardeep; Dhanda, Sandeep Kumar; Bhalla, Sherry; Usmani, Salman Sadullah; Gautam, Ankur; Tuknait, Abhishek; Agrawal, Piyush; Mathur, Deepika; Raghava, Gajendra P.S.
2016-01-01
SATPdb (http://crdd.osdd.net/raghava/satpdb/) is a database of structurally annotated therapeutic peptides, curated from 22 public domain peptide databases/datasets including 9 of our own. The current version holds 19192 unique experimentally validated therapeutic peptide sequences having length between 2 and 50 amino acids. It covers peptides having natural, non-natural and modified residues. These peptides were systematically grouped into 10 categories based on their major function or therapeutic property like 1099 anticancer, 10585 antimicrobial, 1642 drug delivery and 1698 antihypertensive peptides. We assigned or annotated structure of these therapeutic peptides using structural databases (Protein Data Bank) and state-of-the-art structure prediction methods like I-TASSER, HHsearch and PEPstrMOD. In addition, SATPdb facilitates users in performing various tasks that include: (i) structure and sequence similarity search, (ii) peptide browsing based on their function and properties, (iii) identification of moonlighting peptides and (iv) searching of peptides having desired structure and therapeutic activities. We hope this database will be useful for researchers working in the field of peptide-based therapeutics. PMID:26527728
galaxie--CGI scripts for sequence identification through automated phylogenetic analysis.
Nilsson, R Henrik; Larsson, Karl-Henrik; Ursing, Björn M
2004-06-12
The prevalent use of similarity searches like BLAST to identify sequences and species implicitly assumes the reference database to be of extensive sequence sampling. This is often not the case, restraining the correctness of the outcome as a basis for sequence identification. Phylogenetic inference outperforms similarity searches in retrieving correct phylogenies and consequently sequence identities, and a project was initiated to design a freely available script package for sequence identification through automated Web-based phylogenetic analysis. Three CGI scripts were designed to facilitate qualified sequence identification from a Web interface. Query sequences are aligned to pre-made alignments or to alignments made by ClustalW with entries retrieved from a BLAST search. The subsequent phylogenetic analysis is based on the PHYLIP package for inferring neighbor-joining and parsimony trees. The scripts are highly configurable. A service installation and a version for local use are found at http://andromeda.botany.gu.se/galaxiewelcome.html and http://galaxie.cgb.ki.se
Sandhu, Maninder; Sureshkumar, V; Prakash, Chandra; Dixit, Rekha; Solanke, Amolkumar U; Sharma, Tilak Raj; Mohapatra, Trilochan; S V, Amitha Mithra
2017-09-30
Genome-wide microarrays have enabled the development of robust databases for functional genomics studies in rice. However, such databases do not directly cater to the needs of breeders. Here, we have attempted to develop a web interface which combines the information from functional genomic studies across different genetic backgrounds with DNA markers so that they can be readily deployed in crop improvement. In the current version of the database, we have included drought and salinity stress studies since these two are the major abiotic stresses in rice. RiceMetaSys, a user-friendly and freely available web interface, provides comprehensive information on salt responsive genes (SRGs) and drought responsive genes (DRGs) across genotypes, crop development stages and tissues, identified from multiple microarray datasets. 'Physical position search' is an attractive tool for those using a QTL-based approach for dissecting tolerance to salt and drought stress since it can provide the list of SRGs and DRGs in any physical interval. To identify robust candidate genes for use in crop improvement, the 'common genes across varieties' search tool is useful. Graphical visualization of expression profiles across genes and rice genotypes has been enabled to assist the user and to make the comparisons more impactful. Simple Sequence Repeat (SSR) search in the SRGs and DRGs is a valuable tool for fine mapping and marker-assisted selection since it provides primers for survey of polymorphism. An external link to intron-specific markers is also provided for this purpose. Bulk retrieval of data without any limit has been enabled for locus and SSR searches. The aim of this database is to provide users with simple and straightforward search options for identifying robust candidate genes from among thousands of SRGs and DRGs, so as to facilitate linking variation in expression profiles to variation in phenotype. Database URL: http://14.139.229.201.
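A 'physical position search' of the kind described here amounts to an interval-overlap query. The sketch below is only an illustration under assumed data structures: the `Gene` record, its field names and the example loci are hypothetical and do not reflect the actual RiceMetaSys schema or contents.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class Gene:
    locus: str      # hypothetical locus identifier
    chrom: str      # chromosome name
    start: int      # start coordinate in bp
    end: int        # end coordinate in bp
    category: str   # "SRG" (salt responsive) or "DRG" (drought responsive)


def genes_in_interval(genes: Iterable[Gene], chrom: str,
                      start: int, end: int) -> List[Gene]:
    """Return genes on `chrom` that overlap the closed interval [start, end]."""
    return [g for g in genes
            if g.chrom == chrom and g.start <= end and g.end >= start]
```

For a QTL spanning 1,000 to 5,000 bp on a chromosome, the function returns every SRG or DRG record whose coordinates overlap that span, including genes that only partly fall inside it.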
PeroxisomeDB: a database for the peroxisomal proteome, functional genomics and disease
Schlüter, Agatha; Fourcade, Stéphane; Domènech-Estévez, Enric; Gabaldón, Toni; Huerta-Cepas, Jaime; Berthommier, Guillaume; Ripp, Raymond; Wanders, Ronald J. A.; Poch, Olivier; Pujol, Aurora
2007-01-01
Peroxisomes are essential organelles of eukaryotic origin, ubiquitously distributed in cells and organisms, playing key roles in lipid and antioxidant metabolism. Loss or malfunction of peroxisomes causes more than 20 fatal inherited conditions. We have created a peroxisomal database that includes the complete peroxisomal proteome of Homo sapiens and Saccharomyces cerevisiae, by gathering, updating and integrating the available genetic and functional information on peroxisomal genes. PeroxisomeDB is structured in interrelated sections ‘Genes’, ‘Functions’, ‘Metabolic pathways’ and ‘Diseases’, that include hyperlinks to selected features of NCBI, ENSEMBL and UCSC databases. We have designed graphical depictions of the main peroxisomal metabolic routes and have included updated flow charts for diagnosis. Precomputed BLAST, PSI-BLAST, multiple sequence alignment (MUSCLE) and phylogenetic trees are provided to assist in direct multispecies comparison to study evolutionary conserved functions and pathways. Highlights of the PeroxisomeDB include new tools developed for facilitating (i) identification of novel peroxisomal proteins, by means of identifying proteins carrying peroxisome targeting signal (PTS) motifs, (ii) detection of peroxisomes in silico, particularly useful for screening the deluge of newly sequenced genomes. PeroxisomeDB should contribute to the systematic characterization of the peroxisomal proteome and facilitate system biology approaches on the organelle. PMID:17135190
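Screening proteins for a PTS motif can be sketched as a simple pattern match. The example assumes the commonly cited simplified PTS1 consensus, a C-terminal tripeptide of the form (S/A/C)-(K/R/H)-(L/M); the abstract does not say which motif definitions PeroxisomeDB uses, so the pattern and the function name are illustrative assumptions, not the database's method.

```python
import re

# Assumption: simplified PTS1 consensus (S/A/C)-(K/R/H)-(L/M) at the C-terminus.
PTS1_PATTERN = re.compile(r"[SAC][KRH][LM]$")


def has_pts1(protein_sequence: str) -> bool:
    """True if the sequence ends with a tripeptide matching the assumed PTS1 consensus."""
    return bool(PTS1_PATTERN.search(protein_sequence.strip().upper()))
```

A sequence ending in `SKL` matches, whereas the same tripeptide in the middle of a sequence does not, because the pattern is anchored to the C-terminus. Real predictors typically also weigh upstream residues, so a match here is at most a first-pass filter.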
Age-related differences in audiovisual interactions of semantically different stimuli.
Viggiano, Maria Pia; Giovannelli, Fabio; Giganti, Fiorenza; Rossi, Arianna; Metitieri, Tiziana; Rebai, Mohamed; Guerrini, Renzo; Cincotta, Massimo
2017-01-01
Converging results have shown that adults benefit from congruent multisensory stimulation in the identification of complex stimuli, whereas the developmental trajectory of the ability to integrate multisensory inputs in children is less well understood. In this study we explored the effects of audiovisual semantic congruency on identification of visually presented stimuli belonging to different categories, using a cross-modal approach. Four groups of children ranging in age from 6 to 13 years and adults were administered an object identification task of visually presented pictures belonging to living and nonliving entities. Stimuli were presented in visual, congruent audiovisual, incongruent audiovisual, and noise conditions. Results showed that children under 12 years of age did not benefit from multisensory presentation in speeding up the identification. In children the incoherent audiovisual condition had an interfering effect, especially for the identification of living things. These data suggest that the facilitating effect of the audiovisual interaction into semantic factors undergoes developmental changes and the consolidation of adult-like processing of multisensory stimuli begins in late childhood. (PsycINFO Database Record (c) 2016 APA, all rights reserved).
GMDD: a database of GMO detection methods.
Dong, Wei; Yang, Litao; Shen, Kailin; Kim, Banghyun; Kleter, Gijs A; Marvin, Hans J P; Guo, Rong; Liang, Wanqi; Zhang, Dabing
2008-06-04
More than one hundred genetically modified organism (GMO) events have been developed and approved for commercialization worldwide, so GMO analysis methods are essential for the enforcement of GMO labelling regulations. Protein- and nucleic acid-based detection techniques have been developed and utilized for GMO identification and quantification. However, information for the harmonization and standardization of GMO analysis methods at the global level is needed. The GMO Detection method Database (GMDD) has collected almost all previously developed and reported GMO detection methods, which have been grouped by strategy (screen-, gene-, construct-, and event-specific), and also provides a user-friendly search service for detection methods by GMO event name, exogenous gene, or protein information, etc. In this database, users can obtain the sequences of exogenous integration, which will facilitate the design of PCR primers and probes. Information on endogenous genes, certified reference materials, reference molecules, and the validation status of developed methods is also included in this database. Furthermore, registered users can submit new detection methods and sequences to this database, and the newly submitted information will be released soon after being checked. GMDD contains comprehensive information on GMO detection methods. The database will make GMO analysis much easier.
Metal oxide based multisensor array and portable database for field analysis of antioxidants
Sharpe, Erica; Bradley, Ryan; Frasco, Thalia; Jayathilaka, Dilhani; Marsh, Amanda; Andreescu, Silvana
2014-01-01
We report a novel chemical sensing array based on metal oxide nanoparticles as a portable and inexpensive paper-based colorimetric method for polyphenol detection and field characterization of antioxidant containing samples. Multiple metal oxide nanoparticles with various polyphenol binding properties were used as active sensing materials to develop the sensor array and establish a database of polyphenol standards that include epigallocatechin gallate, gallic acid, resveratrol, and Trolox among others. Unique charge-transfer complexes are formed between each polyphenol and each metal oxide on the surface of individual sensors in the array, creating distinct optically detectable signals which have been quantified and logged into a reference database for polyphenol identification. The field-portable Pantone/X-Rite© CapSure® color reader was used to create this database and to facilitate rapid colorimetric analysis. The use of multiple metal-oxide sensors allows for cross-validation of results and increases accuracy of analysis. The database has enabled successful identification and quantification of antioxidant constituents within real botanical extractions including green tea. Formation of charge-transfer complexes is also correlated with antioxidant activity exhibiting electron transfer capabilities of each polyphenol. The antioxidant activity of each sample was calculated and validated against the oxygen radical absorbance capacity (ORAC) assay showing good comparability. The results indicate that this method can be successfully used for a more comprehensive analysis of antioxidant containing samples as compared to conventional methods. This technology can greatly simplify investigations into plant phenolics and make possible the on-site determination of antioxidant composition and activity in remote locations. PMID:24610993
Discovering Knowledge from AIS Database for Application in VTS
NASA Astrophysics Data System (ADS)
Tsou, Ming-Cheng
The widespread use of the Automatic Identification System (AIS) has had a significant impact on maritime technology. AIS enables the Vessel Traffic Service (VTS) not only to offer commonly known functions such as identification, tracking and monitoring of vessels, but also to provide rich real-time information that is useful for marine traffic investigation, statistical analysis and theoretical research. However, due to the rapid accumulation of AIS observation data, the VTS platform is often unable to absorb and analyze it quickly and effectively. Traditional observation and analysis methods are becoming less suitable for the modern AIS generation of VTS. In view of this, we applied the same data mining technique used for business intelligence discovery (in Customer Relationship Management (CRM) business marketing) to the analysis of AIS observation data. This recasts the marine traffic problem as a business-marketing problem and integrates technologies such as Geographic Information Systems (GIS), database management systems, data warehousing and data mining to facilitate the discovery of hidden and valuable information in a huge amount of observation data. Consequently, this provides marine traffic managers with a useful strategic planning resource.
Sharma, Vishal K; Fraulin, Frankie Og; Harrop, A Robertson; McPhalen, Donald F
2011-01-01
Databases are useful tools in clinical settings. The authors review the benefits and challenges associated with the development and implementation of an efficient electronic database for the multidisciplinary Vascular Birthmark Clinic at the Alberta Children's Hospital, Calgary, Alberta. The content and structure of the database were designed using the technical expertise of a data analyst from the Calgary Health Region. Relevant clinical and demographic data fields were included with the goal of documenting ongoing care of individual patients, and facilitating future epidemiological studies of this patient population. After completion of this database, 10 challenges encountered during development were retrospectively identified. Practical solutions for these challenges are presented. The challenges identified during the database development process included: identification of relevant data fields; balancing simplicity and user-friendliness with complexity and comprehensive data storage; database expertise versus clinical expertise; software platform selection; linkage of data from the previous spreadsheet to a new data management system; ethics approval for the development of the database and its utilization for research studies; ensuring privacy and limited access to the database; integration of digital photographs into the database; adoption of the database by support staff in the clinic; and maintaining up-to-date entries in the database. There are several challenges involved in the development of a useful and efficient clinical database. Awareness of these potential obstacles, in advance, may simplify the development of clinical databases by others in various surgical settings.
Federal Register 2010, 2011, 2012, 2013, 2014
2013-09-24
...] Global Unique Device Identification Database; Draft Guidance for Industry; Availability AGENCY: Food and... the availability of the draft guidance entitled ``Global Unique Device Identification Database (GUDID... manufacturer) will interface with the GUDID, as well as information on the database elements that must be...
Park, Ju Heon; Shin, Jong Hee; Choi, Min Ji; Choi, Jin Un; Park, Yeon-Joon; Jang, Sook Jin; Won, Eun Jeong; Kim, Soo Hyun; Kee, Seung Jung; Shin, Myung Geun; Suh, Soon Pal
2017-01-01
We evaluated the ability of the Filamentous Fungi Library 1.0 of the MALDI-TOF MS Biotyper system to identify 345 clinical Aspergillus isolates from 11 Korean hospitals. Compared with results of the internal transcribed spacer region sequencing, the frequencies of correct identification at the species-complex level were 94.5% and 98.8% with cutoff values of 2.0 and 1.7, respectively. Compared with results of β-tubulin gene sequencing, the frequencies of correct identification at the species level were 96.0% (cutoff 2.0) and 100% (cutoff 1.7) for 303 Aspergillus isolates of five common, non-cryptic species, but only 4.8% (cutoff 1.7) and 0% (cutoff 2.0) for 42 Aspergillus isolates of six cryptic species (identifiable by β-tubulin or calmodulin sequencing). These results show that the MALDI Biotyper using the Filamentous Fungi Library version 1.0 enables reliable identification of the majority of common clinical Aspergillus isolates, although the database should be expanded to facilitate identification of cryptic species. Copyright © 2016 Elsevier Inc. All rights reserved.
Arvind, Akanksha; Jain, Vaibhav; Saravanan, Parameswaran; Mohan, C Gopi
2013-12-01
Mycobacterium tuberculosis (Mtb) is the causative agent of tuberculosis (TB), a disease that has affected approximately 2 billion people worldwide. Due to the emergence of resistance towards the existing drugs, discovery of new anti-TB drugs is an important global healthcare challenge. To address this problem, there is an urgent need to identify new drug targets in Mtb. In the present study, the subtractive genomics approach has been employed for the identification of new drug targets against TB. Screening the Mtb proteome against the Database of Essential Genes (DEG) and the human proteome resulted in the identification of 60 key proteins which have no eukaryotic counterparts. Critical analysis of these proteins using the Kyoto Encyclopedia of Genes and Genomes (KEGG) metabolic pathways database revealed the enzyme uridine monophosphate kinase (UMPK) as a potential drug target for developing novel anti-TB drugs. A homology model of Mtb-UMPK was constructed for the first time on the basis of the crystal structure of E. coli-UMPK, in order to understand its structure-function relationships, which would in turn facilitate structure-based inhibitor design. Furthermore, a structural similarity search was carried out using UTP, the physiological inhibitor of Mtb-UMPK, to virtually screen the ZINC database. Retrieved hits were further screened by implementing several filters such as ADME and toxicity, followed by molecular docking. Finally, on the basis of the Glide docking score and the mode of binding, 6 putative leads were identified as inhibitors of this enzyme which can potentially emerge as future drugs for the treatment of TB.
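The subtractive step in this abstract amounts to a set filter: keep pathogen proteins that are essential and have no host homolog. The sketch below shows only that logic. The identifiers are invented, the function name `subtractive_screen` is hypothetical, and the homology searches that would produce the two hit sets in a real study are assumed to have been run already.

```python
def subtractive_screen(pathogen_proteins, essential_ids, host_homolog_ids):
    """Keep pathogen proteins that are essential and lack a host homolog."""
    return sorted(
        p for p in pathogen_proteins
        if p in essential_ids and p not in host_homolog_ids
    )

# Hypothetical identifiers, for illustration only.
proteome = {"P1", "P2", "P3", "P4", "P5"}
essential = {"P1", "P2", "P4"}   # e.g. proteins with a hit in an essential-gene database
host_homologs = {"P2", "P5"}     # e.g. proteins with a hit in the host proteome

candidates = subtractive_screen(proteome, essential, host_homologs)
print(candidates)
```

In practice the thresholds used to call a "hit" (for example, an E-value cutoff) decide how many candidates survive; those thresholds are not given in the abstract and are left out here.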
Challenges in developing medicinal plant databases for sharing ethnopharmacological knowledge.
Ningthoujam, Sanjoy Singh; Talukdar, Anupam Das; Potsangbam, Kumar Singh; Choudhury, Manabendra Dutta
2012-05-07
Major research contributions in ethnopharmacology have generated a vast amount of data associated with medicinal plants. Computerized databases facilitate data management and analysis, making coherent information available to researchers, planners and other users. Web-based databases also facilitate knowledge transmission and feed the circle of information exchange between ethnopharmacological studies and the public audience. However, despite the development of many medicinal plant databases, a lack of uniformity is still discernible. This calls for defining a common standard to achieve the common objectives of ethnopharmacology. The aim of the study is to review the diversity of approaches to storing ethnopharmacological information in databases and to provide some minimal standards for these databases. A survey of articles on medicinal plant databases was carried out on the Internet using selected keywords. Grey literature and printed materials were also searched for information. Listed resources were critically analyzed for their approaches to content type, focus area and software technology. A need was identified for rapid incorporation of traditional knowledge by compiling primary data. While citation collection is a common approach to information compilation, it cannot fully assimilate local literature that reflects traditional knowledge. There is also a need to define standards for systematic evaluation and for checking the quality and authenticity of the data. Databases focussing on thematic areas, viz., traditional medicine system, regional aspect, disease and phytochemical information are analyzed. Issues pertaining to data standards, data linking and unique identification need to be addressed in addition to general issues like lack of updates and sustainability. Against the background of the present study, suggestions have been made on some minimum standards for the development of medicinal plant databases.
In spite of variations in approaches, the existence of many overlapping features indicates redundancy of resources and efforts. As the development of global data in a single database may not be possible in view of culture-specific differences, efforts can be directed to specific regional areas. The existing scenario calls for a collaborative approach to defining a common standard in medicinal plant databases for knowledge sharing and scientific advancement. Copyright © 2012 Elsevier Ireland Ltd. All rights reserved.
Content-based video indexing and searching with wavelet transformation
NASA Astrophysics Data System (ADS)
Stumpf, Florian; Al-Jawad, Naseer; Du, Hongbo; Jassim, Sabah
2006-05-01
Biometric databases form an essential tool in the fight against international terrorism, organised crime and fraud. Various government and law enforcement agencies have their own biometric databases consisting of combinations of fingerprints, iris codes, face images/videos and speech records for an increasing number of persons. In many cases personal data linked to biometric records are incomplete and/or inaccurate. Besides, biometric data in different databases for the same individual may be recorded with different personal details. Following the recent terrorist atrocities, law enforcement agencies collaborate more than before and have greater reliance on database sharing. In such an environment, reliable biometric-based identification must not only determine who you are but also who else you are. In this paper we propose a compact content-based video signature and indexing scheme that can facilitate retrieval of multiple records in face biometric databases that belong to the same person even if their associated personal data are inconsistent. We shall assess the performance of our system using a benchmark audio visual face biometric database that has multiple videos for each subject but with different identity claims. We shall demonstrate that retrieving a relatively small number of videos that are nearest, in terms of the proposed index, to any video in the database yields a significant proportion of that individual's biometric data.
Sönksen, Ute Wolff; Christensen, Jens Jørgen; Nielsen, Lisbeth; Hesselbjerg, Annemarie; Hansen, Dennis Schrøder; Bruun, Brita
2010-12-31
Taxonomy and identification of fastidious Gram negatives are evolving and challenging. We compared identifications achieved with the Vitek 2 Neisseria-Haemophilus (NH) card and partial 16S rRNA gene sequence (526 bp stretch) analysis with identifications obtained with extensive phenotypic characterization using 100 fastidious Gram negative bacteria. Seventy-five strains represented 21 of the 26 taxa included in the Vitek 2 NH database and 25 strains represented related species not included in the database. Of the 100 strains, 31 were the type strains of the species. Vitek 2 NH identification results: 48 of 75 database strains were correctly identified, 11 strains gave 'low discrimination', seven strains were unidentified, and nine strains were misidentified. Identification of 25 non-database strains resulted in 14 strains incorrectly identified as belonging to species in the database. Partial 16S rRNA gene sequence analysis results: For 76 strains phenotypic and sequencing identifications were identical, for 23 strains the sequencing identifications were either probable or possible, and for one strain only the genus was confirmed. Thus, the Vitek 2 NH system identifies most of the commonly occurring species included in the database. Some strains of rarely occurring species and strains of non-database species closely related to database species cause problems. Partial 16S rRNA gene sequence analysis performs well, but does not always suffice, additional phenotypical characterization being useful for final identification. PMID:21347215
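The counts in this abstract can be checked and turned into rates with a short tally. The sketch below uses only the numbers reported above; the helper name `percentages` and the dictionary labels are hypothetical.

```python
def percentages(counts):
    """Convert a dict of counts to percentages of their total, to one decimal."""
    total = sum(counts.values())
    return {label: round(100 * n / total, 1) for label, n in counts.items()}

# Counts as reported in the abstract.
vitek_database_strains = {
    "correct": 48,
    "low discrimination": 11,
    "unidentified": 7,
    "misidentified": 9,
}
sequencing_all_strains = {
    "identical to phenotypic identification": 76,
    "probable or possible": 23,
    "genus only": 1,
}

# The categories should account for every strain in each group.
assert sum(vitek_database_strains.values()) == 75
assert sum(sequencing_all_strains.values()) == 100

print(percentages(vitek_database_strains))
print(percentages(sequencing_all_strains))
```

Both groups add up as stated (75 database strains, 100 strains in total), so the reported categories are exhaustive.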
Text mining for metabolic pathways, signaling cascades, and protein networks.
Hoffmann, Robert; Krallinger, Martin; Andres, Eduardo; Tamames, Javier; Blaschke, Christian; Valencia, Alfonso
2005-05-10
The complexity of the information stored in databases and publications on metabolic and signaling pathways, the high throughput of experimental data, and the growing number of publications make it imperative to provide systems to help the researcher navigate through these interrelated information resources. Text-mining methods have started to play a key role in the creation and maintenance of links between the information stored in biological databases and its original sources in the literature. These links will be extremely useful for database updating and curation, especially if a number of technical problems can be solved satisfactorily, including the identification of protein and gene names (entities in general) and the characterization of their types of interactions. The first generation of openly accessible text-mining systems, such as iHOP (Information Hyperlinked over Proteins), provides additional functions to facilitate the reconstruction of protein interaction networks, combine database and text information, and support the scientist in the formulation of novel hypotheses. The next challenge is the generation of comprehensive information regarding the general function of signaling pathways and protein interaction networks.
A tuberculosis biomarker database: the key to novel TB diagnostics.
Yerlikaya, Seda; Broger, Tobias; MacLean, Emily; Pai, Madhukar; Denkinger, Claudia M
2017-03-01
New diagnostic innovations for tuberculosis (TB), including point-of-care solutions, are critical to reach the goals of the End TB Strategy. However, despite decades of research, numerous reports on new biomarker candidates, and significant investment, no well-performing, simple and rapid TB diagnostic test is yet available on the market, and the search for accurate, non-DNA biomarkers remains a priority. To help overcome this 'biomarker pipeline problem', FIND and partners are working on the development of a well-curated and user-friendly TB biomarker database. The web-based database will enable the dynamic tracking of evidence surrounding biomarker candidates in relation to target product profiles (TPPs) for needed TB diagnostics. It will be able to accommodate raw datasets and facilitate the verification of promising biomarker candidates and the identification of novel biomarker combinations. As such, the database will simplify data and knowledge sharing, empower collaboration, help in the coordination of efforts and allocation of resources, streamline the verification and validation of biomarker candidates, and ultimately lead to an accelerated translation into clinically useful tools. Copyright © 2017 The Author(s). Published by Elsevier Ltd. All rights reserved.
21 CFR 830.320 - Submission of unique device identification information.
Code of Federal Regulations, 2014 CFR
2014-04-01
... Identification Database § 830.320 Submission of unique device identification information. (a) Designation of... Unique Device Identification Database (GUDID) in a format that we can process, review, and archive...
21 CFR 830.340 - Voluntary submission of ancillary device identification information.
Code of Federal Regulations, 2014 CFR
2014-04-01
... Identification Database § 830.340 Voluntary submission of ancillary device identification information. (a) You may not submit any information to the Global Unique Device Identification Database (GUDID) other than...
Enhancing AstroInformatics and Science Discovery from Data in Journal Articles
NASA Astrophysics Data System (ADS)
Mazzarella, Joseph
2011-05-01
Traditional methods of publishing scientific data and metadata in journal articles are in need of major upgrades to reach the full potential of astronomical databases and astroinformatics techniques to facilitate semi-automated, and eventually autonomous, methods of science discovery. I will review a growing collaboration involving the NASA/IPAC Extragalactic Database (NED), the Astrophysics Data System (ADS), the Virtual Astronomical Observatory (VAO), the AAS Journals and IOP, and the Data Conservancy that is aimed toward transforming the methodology used to publish, capture and link data associated with astrophysics journal articles. We are planning a web-based workflow to assist astronomers during the publication of journal articles. The primary goals are to facilitate the application of structure and standards to (meta)data, reduce errors, remove ambiguities in the identification of astrophysical objects and regions of sky, capture and preserve the images and spectral data files used to make plots, and accelerate the ingestion of the data into relevant repositories, search engines and integration services. The outcome of this community wide effort will address a recent public policy mandate to publish scientific data in open formats to allow reproducibility of results and to facilitate new discoveries. Equally important, this work has the potential to usher in a new wave of science discovery based on seamless connectivity between data relationships that are continuously growing in size and complexity, and increasingly sophisticated data visualization and analysis applications.
Gwak, Seongshin; Arroyo-Mora, Luis E; Almirall, José R
2015-02-01
Designer drugs are analogues or derivatives of illicit drugs with a modification of their chemical structure in order to circumvent current legislation for controlled substances. Designer drugs of abuse have increased dramatically in popularity all over the world over the past few years. Currently, qualitative seized-drug analysis is mainly performed by gas chromatography-electron ionization-mass spectrometry (GC-EI-MS), in which most of these emerging designer drug derivatives are extensively fragmented and do not present a molecular ion in their mass spectra. The absence of a molecular ion and/or similar fragmentation patterns among these derivatives may cause the equivocal identification of unknown seized substances. In this study, the qualitative identification of 34 designer drugs, mainly synthetic cannabinoids and synthetic cathinones, was performed by gas chromatography-triple quadrupole-tandem mass spectrometry with two different ionization techniques, electron ionization (EI) and chemical ionization (CI). The study focuses only on qualitative seized-drug analysis and does not take a toxicological point of view. The implementation of a CI source facilitates the determination of molecular mass and the identification of seized designer drugs. The developed multiple reaction monitoring (MRM) mode may increase sensitivity and selectivity in the analysis of seized designer drugs. In addition, CI mass spectra and MRM mass spectra of these designer drug derivatives can be used as a potential supplemental database along with the EI mass spectral database. Copyright © 2014 John Wiley & Sons, Ltd.
GMDD: a database of GMO detection methods
Dong, Wei; Yang, Litao; Shen, Kailin; Kim, Banghyun; Kleter, Gijs A; Marvin, Hans JP; Guo, Rong; Liang, Wanqi; Zhang, Dabing
2008-01-01
Background: More than one hundred genetically modified organism (GMO) events have been developed and approved for commercialization worldwide, so GMO analysis methods are essential for the enforcement of GMO labelling regulations. Protein- and nucleic acid-based detection techniques have been developed and used for GMO identification and quantification. However, information to support harmonization and standardization of GMO analysis methods at the global level is needed. Results: The GMO Detection method Database (GMDD) has collected almost all previously developed and reported GMO detection methods, grouped by strategy (screen-, gene-, construct-, and event-specific), and provides a user-friendly search service for detection methods by GMO event name, exogenous gene, or protein information, etc. In this database, users can obtain the sequences of exogenous integration, which will facilitate the design of PCR primers and probes. Information on endogenous genes, certified reference materials, reference molecules, and the validation status of developed methods is also included. Furthermore, registered users can submit new detection methods and sequences to the database, and newly submitted information will be released soon after being checked. Conclusion: GMDD contains comprehensive information on GMO detection methods. The database will make GMO analysis much easier. PMID:18522755
dEMBF: A Comprehensive Database of Enzymes of Microalgal Biofuel Feedstock
Misra, Namrata; Panda, Prasanna Kumar; Parida, Bikram Kumar; Mishra, Barada Kanta
2016-01-01
Microalgae have attracted wide attention as one of the most versatile renewable feedstocks for production of biofuel. To develop genetically engineered high lipid yielding algal strains, a thorough understanding of the lipid biosynthetic pathway and the underpinning enzymes is essential. In this work, we have systematically mined the genomes of fifteen diverse algal species belonging to Chlorophyta, Heterokontophyta, Rhodophyta, and Haptophyta, to identify and annotate the putative enzymes of lipid metabolic pathway. Consequently, we have also developed a database, dEMBF (Database of Enzymes of Microalgal Biofuel Feedstock), which catalogues the complete list of identified enzymes along with their computed annotation details including length, hydrophobicity, amino acid composition, subcellular location, gene ontology, KEGG pathway, orthologous group, Pfam domain, intron-exon organization, transmembrane topology, and secondary/tertiary structural data. Furthermore, to facilitate functional and evolutionary study of these enzymes, a collection of built-in applications for BLAST search, motif identification, sequence and phylogenetic analysis have been seamlessly integrated into the database. dEMBF is the first database that brings together all enzymes responsible for lipid synthesis from available algal genomes, and provides an integrative platform for enzyme inquiry and analysis. This database will be extremely useful for algal biofuel research. It can be accessed at http://bbprof.immt.res.in/embf. PMID:26727469
Choosing an Optimal Database for Protein Identification from Tandem Mass Spectrometry Data.
Kumar, Dhirendra; Yadav, Amit Kumar; Dash, Debasis
2017-01-01
Database searching is the preferred method for protein identification from digital spectra of mass to charge ratios (m/z) detected for protein samples through mass spectrometers. The search database is one of the major influencing factors in discovering proteins present in the sample and thus in deriving biological conclusions. In most cases the choice of search database is arbitrary. Here we describe common search databases used in proteomic studies and their impact on final list of identified proteins. We also elaborate upon factors like composition and size of the search database that can influence the protein identification process. In conclusion, we suggest that choice of the database depends on the type of inferences to be derived from proteomics data. However, making additional efforts to build a compact and concise database for a targeted question should generally be rewarding in achieving confident protein identifications.
Wada, Yoshinao; Nakano, Norihiko
2004-01-01
The identification of proteins by mass spectrometry has revolutionized the basic method of identifying proteins constituting an intracellular unit or network for certain biological functions. The gel-based strategy following immunoprecipitation was applied to elucidating proteins associated with the aryl hydrocarbon receptor (AhR). Two hundred femtomoles of AhR was recovered from approximately 2 × 10^7 HepG2 cells by immunoprecipitation and was sufficient for identification by peptide mass fingerprinting. Possible candidates for the AhR-associated proteins were also identified. Improvements of the current strategy to increase the overall sensitivity tenfold are required to clarify the AhR complex in full detail. For example, a combination of trypsin and Achromobacter protease I for in-gel digestion allows the number of missed cleavage sites to be set at zero for database searching, thereby reducing random matches and facilitating identification. There is also room for improvement in each step of sample preparation prior to mass spectrometry.
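Setting the number of missed cleavage sites to zero means the search engine considers only fully cleaved peptides. A minimal in silico digest shows what that peptide list looks like. The sketch assumes the common convention that trypsin cleaves after K or R unless the next residue is P; that rule, the function name `tryptic_peptides`, and the test sequence are assumptions for illustration and are not taken from the abstract. It does not model the added Achromobacter protease I step.

```python
def tryptic_peptides(sequence):
    """Fully cleaved peptides: cut after K or R unless followed by P."""
    peptides = []
    start = 0
    for i, residue in enumerate(sequence):
        next_residue = sequence[i + 1] if i + 1 < len(sequence) else ""
        if residue in "KR" and next_residue != "P":
            peptides.append(sequence[start:i + 1])
            start = i + 1
    # Keep any C-terminal remainder that does not end in K or R.
    if start < len(sequence):
        peptides.append(sequence[start:])
    return peptides

# Made-up sequence for illustration only.
print(tryptic_peptides("MKRPLLKAAR"))
```

Allowing missed cleavages would add longer peptides that span one or more cut sites to this list, which enlarges the search space and is one way random matches arise.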
Merriam, Tim; Kaufmann, Rolf; Ebert, Lars; Figi, Renato; Erni, Rolf; Pauer, Robin; Sieberth, Till
2018-06-01
Today, post-mortem computed tomography (CT) is routinely used for forensic identification. Mobile energy-dispersive X-ray fluorescence (EDXRF) spectroscopy of a dentition is a method of identification that has the potential to be easier and cheaper than CT, although it cannot be used with every dentition. In challenging cases, combining both techniques could facilitate the process of identification and prove to be advantageous over chemical analyses. Nine dental restorative material brands were analyzed using EDXRF spectroscopy. Their differentiability was assessed by comparing each material's x-ray fluorescence spectrum and then comparing the spectra to previous research investigating differentiability in CT. To verify EDXRF's precision and accuracy, select dental specimens underwent comparative electron beam excited x-ray spectroscopy (EDS) scans, while the impact of the restorative surface area was studied by scanning a row of dental specimens with varying restorative surface areas (n = 10). EDXRF was able to differentiate all 36 possible pairs of dental filling materials; however, dual-energy CT was only able to differentiate 33 out of 36. The EDS scans showed correlating x-ray fluorescence peaks on the x-ray spectra compared to our EDXRF. In addition, the surface area showed no influence on the differentiability of the dental filling materials. EDXRF has the potential to facilitate corpse identification by differentiating and comparing restorative materials, providing more information compared to post-mortem CT alone. Despite not being able to explicitly identify a brand without a control sample or database, its fast and mobile use could accelerate daily routines or mass victim identification processes. To achieve this goal, further development of EDXRF scanners for this application and further studies evaluating the method within a specific routine need to be performed.
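The figure of 36 possible pairs follows from choosing 2 of the 9 material brands. A short check, with placeholder brand names that are not from the study:

```python
from itertools import combinations

# Placeholder labels for the nine restorative material brands.
materials = [f"brand_{i}" for i in range(1, 10)]

# Every unordered pair of distinct materials.
pairs = list(combinations(materials, 2))
print(len(pairs))

# Share of pairs differentiated by each method, from the counts in the abstract.
edxrf_share = 36 / len(pairs)
dual_energy_ct_share = 33 / len(pairs)
print(f"EDXRF: {edxrf_share:.1%}, dual-energy CT: {dual_energy_ct_share:.1%}")
```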
Bousquet, P-J; Caillet, P; Coeuret-Pellicer, M; Goulard, H; Kudjawu, Y C; Le Bihan, C; Lecuyer, A I; Séguret, F
2017-10-01
The development and use of healthcare databases accentuates the need for dedicated tools, including validated algorithms for selecting patients with cancer. As part of the development of the French National Health Insurance System data network REDSIAM, the tumor taskforce established an inventory of national and international published algorithms in the field of cancer. This work aims to facilitate the choice of a best-suited algorithm. A non-systematic literature search was conducted for various cancers. Results are presented for lung, breast, colon, and rectum. Medline, Scopus, the French Database in Public Health, Google Scholar, and the summaries of the main French journals in oncology and public health were searched for publications until August 2016. An extraction grid adapted to oncology was constructed and used for the extraction process. A total of 18 publications were selected for lung cancer, 18 for breast cancer, and 12 for colorectal cancer. Validation studies of algorithms are scarce. When information is available, the performance and choice of an algorithm are dependent on the context, purpose, and location of the planned study. Accounting for cancer disease specificity, the proposed extraction chart is more detailed than the generic chart developed for other REDSIAM taskforces, but remains easily usable in practice. This study illustrates the complexity of cancer detection through sole reliance on healthcare databases and the lack of validated algorithms specifically designed for this purpose. Studies that standardize and facilitate validation of these algorithms should be developed and promoted. Copyright © 2017. Published by Elsevier Masson SAS.
Kennedy, Amy E.; Khoury, Muin J.; Ioannidis, John P.A.; Brotzman, Michelle; Miller, Amy; Lane, Crystal; Lai, Gabriel Y.; Rogers, Scott D.; Harvey, Chinonye; Elena, Joanne W.; Seminara, Daniela
2017-01-01
Background: We report on the establishment of a web-based Cancer Epidemiology Descriptive Cohort Database (CEDCD). The CEDCD’s goals are to enhance awareness of resources, facilitate interdisciplinary research collaborations, and support existing cohorts for the study of cancer-related outcomes. Methods: Comprehensive descriptive data were collected from large cohorts established to study cancer as primary outcome using a newly developed questionnaire. These included an inventory of baseline and follow-up data, biospecimens, genomics, policies, and protocols. Additional descriptive data extracted from publicly available sources were also collected. This information was entered in a searchable and publicly accessible database. We summarized the descriptive data across cohorts and reported the characteristics of this resource. Results: As of December 2015, the CEDCD includes data from 46 cohorts representing more than 6.5 million individuals (29% ethnic/racial minorities). Overall, 78% of the cohorts have collected blood at least once, 57% at multiple time points, and 46% collected tissue samples. Genotyping has been performed by 67% of the cohorts, while 46% have performed whole-genome or exome sequencing in subsets of enrolled individuals. Information on medical conditions other than cancer has been collected in more than 50% of the cohorts. More than 600,000 incident cancer cases and more than 40,000 prevalent cases are reported, with 24 cancer sites represented. Conclusions: The CEDCD assembles detailed descriptive information on a large number of cancer cohorts in a searchable database. Impact: Information from the CEDCD may assist the interdisciplinary research community by facilitating identification of well-established population resources and large-scale collaborative and integrative research. PMID:27439404
Normand, A C; Packeu, A; Cassagne, C; Hendrickx, M; Ranque, S; Piarroux, R
2018-05-01
Conventional dermatophyte identification is based on morphological features. However, recent studies have proposed to use the nucleotide sequences of the rRNA internal transcribed spacer (ITS) region as an identification barcode of all fungi, including dermatophytes. Several nucleotide databases are available to compare sequences and thus identify isolates; however, these databases often contain mislabeled sequences that impair sequence-based identification. We evaluated five of these databases on a clinical isolate panel. We selected 292 clinical dermatophyte strains that were prospectively subjected to an ITS2 nucleotide sequence analysis. Sequences were analyzed against the databases, and the results were compared to clusters obtained via DNA alignment of sequence segments. The DNA tree served as the identification standard throughout the study. According to the ITS2 sequence identification, the majority of strains (255/292) belonged to the genus Trichophyton, mainly T. rubrum complex (n = 184), T. interdigitale (n = 40), T. tonsurans (n = 26), and T. benhamiae (n = 5). Other genera included Microsporum (e.g., M. canis [n = 21], M. audouinii [n = 10], Nannizzia gypsea [n = 3], and Epidermophyton [n = 3]). Species-level identification of T. rubrum complex isolates was an issue. Overall, ITS DNA sequencing is a reliable tool to identify dermatophyte species given that a comprehensive and correctly labeled database is consulted. Since many inaccurate identification results exist in the DNA databases used for this study, reference databases must be verified frequently and amended in line with the current revisions of fungal taxonomy. Before describing a new species or adding a new DNA reference to the available databases, its position in the phylogenetic tree must be verified. Copyright © 2018 American Society for Microbiology.
New taxonomy and old collections: integrating DNA barcoding into the collection curation process.
Puillandre, N; Bouchet, P; Boisselier-Dubayle, M-C; Brisset, J; Buge, B; Castelin, M; Chagnoux, S; Christophe, T; Corbari, L; Lambourdière, J; Lozouet, P; Marani, G; Rivasseau, A; Silva, N; Terryn, Y; Tillier, S; Utge, J; Samadi, S
2012-05-01
Because they house large biodiversity collections and are also research centres with sequencing facilities, natural history museums are well placed to develop DNA barcoding best practices. The main difficulty is generally the vouchering system: it must ensure that all data produced remain attached to the corresponding specimen, from the field to publication in articles and online databases. The Museum National d'Histoire Naturelle in Paris is one of the leading laboratories in the Marine Barcode of Life (MarBOL) project, which was used as a pilot programme to include barcode collections for marine molluscs and crustaceans. The system is based on two relational databases. The first one classically records the data (locality and identification) attached to the specimens. In the second one, tissue-clippings, DNA extractions (both preserved in 2D barcode tubes) and PCR data (including primers) are linked to the corresponding specimen. All the steps of the process [sampling event, specimen identification, molecular processing, data submission to Barcode Of Life Database (BOLD) and GenBank] are thus linked together. Furthermore, we have developed several web-based tools to automatically upload data into the system, control the quality of the sequences produced and facilitate the submission to online databases. This work is the result of a joint effort from several teams in the Museum National d'Histoire Naturelle (MNHN), but also from a collaborative network of taxonomists and molecular systematists outside the museum, resulting in the vouchering so far of ∼41,000 sequences and the production of ∼11,000 COI sequences. © 2012 Blackwell Publishing Ltd.
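The linkage described above (specimen data in one database; tissue clippings, DNA extractions and PCR data in a second, each tied back to its specimen) can be sketched as a chain of foreign keys. This is an illustration only: all table and column names are hypothetical, the sample rows are invented, and a single in-memory SQLite database stands in for the two databases the abstract mentions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- "First database": data attached to the specimen (hypothetical columns).
CREATE TABLE specimen (
    voucher_id     TEXT PRIMARY KEY,
    locality       TEXT,
    identification TEXT
);
-- "Second database": molecular items, each linked back toward the specimen.
CREATE TABLE tissue (
    tube_barcode TEXT PRIMARY KEY,
    voucher_id   TEXT NOT NULL REFERENCES specimen(voucher_id)
);
CREATE TABLE extraction (
    tube_barcode   TEXT PRIMARY KEY,
    tissue_barcode TEXT NOT NULL REFERENCES tissue(tube_barcode)
);
CREATE TABLE pcr (
    pcr_id             INTEGER PRIMARY KEY,
    extraction_barcode TEXT NOT NULL REFERENCES extraction(tube_barcode),
    marker             TEXT,
    primers            TEXT
);
""")

# Invented example rows, for illustration only.
conn.execute("INSERT INTO specimen VALUES ('V-0001', 'hypothetical station 12', 'Gastropoda sp.')")
conn.execute("INSERT INTO tissue VALUES ('T-0001', 'V-0001')")
conn.execute("INSERT INTO extraction VALUES ('E-0001', 'T-0001')")
conn.execute(
    "INSERT INTO pcr (extraction_barcode, marker, primers) "
    "VALUES ('E-0001', 'COI', 'hypothetical primer pair')"
)

# Trace a PCR product back to its voucher specimen through every step.
row = conn.execute("""
    SELECT s.voucher_id, s.identification, p.marker
    FROM pcr p
    JOIN extraction e ON e.tube_barcode = p.extraction_barcode
    JOIN tissue t     ON t.tube_barcode = e.tissue_barcode
    JOIN specimen s   ON s.voucher_id = t.voucher_id
""").fetchone()
```

The point of the chain is that no sequence record can exist without a path back to a vouchered specimen, which is the property the abstract identifies as the main difficulty.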
Recon2Neo4j: applying graph database technologies for managing comprehensive genome-scale networks.
Balaur, Irina; Mazein, Alexander; Saqi, Mansoor; Lysenko, Artem; Rawlings, Christopher J; Auffray, Charles
2017-04-01
The goal of this work is to offer a computational framework for exploring data from the Recon2 human metabolic reconstruction model. Advanced user access features have been developed using the Neo4j graph database technology and this paper describes key features such as efficient management of the network data, examples of the network querying for addressing particular tasks, and how query results are converted back to the Systems Biology Markup Language (SBML) standard format. The Neo4j-based metabolic framework facilitates exploration of highly connected and comprehensive human metabolic data and identification of metabolic subnetworks of interest. A Java-based parser component has been developed to convert query results (available in the JSON format) into SBML and SIF formats in order to facilitate further results exploration, enhancement or network sharing. The Neo4j-based metabolic framework is freely available from: https://diseaseknowledgebase.etriks.org/metabolic/browser/ . The Java code files developed for this work are available from the following URL: https://github.com/ibalaur/MetabolicFramework . Contact: ibalaur@eisbm.org. Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press.
Recon2Neo4j: applying graph database technologies for managing comprehensive genome-scale networks
Mazein, Alexander; Saqi, Mansoor; Lysenko, Artem; Rawlings, Christopher J.; Auffray, Charles
2017-01-01
Abstract Summary: The goal of this work is to offer a computational framework for exploring data from the Recon2 human metabolic reconstruction model. Advanced user access features have been developed using the Neo4j graph database technology and this paper describes key features such as efficient management of the network data, examples of the network querying for addressing particular tasks, and how query results are converted back to the Systems Biology Markup Language (SBML) standard format. The Neo4j-based metabolic framework facilitates exploration of highly connected and comprehensive human metabolic data and identification of metabolic subnetworks of interest. A Java-based parser component has been developed to convert query results (available in the JSON format) into SBML and SIF formats in order to facilitate further results exploration, enhancement or network sharing. Availability and Implementation: The Neo4j-based metabolic framework is freely available from: https://diseaseknowledgebase.etriks.org/metabolic/browser/. The Java code files developed for this work are available from the following URL: https://github.com/ibalaur/MetabolicFramework. Contact: ibalaur@eisbm.org Supplementary information: Supplementary data are available at Bioinformatics online. PMID:27993779
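The JSON-to-SIF step mentioned in the abstract can be illustrated in a few lines. The sketch below is in Python rather than the authors' Java, and it assumes a simplified JSON shape (a list of objects with `source`, `relation` and `target` keys) that is not taken from the paper; the node and relation names are invented. SIF itself is a plain-text format with one interaction per line: source node, relationship type, target node.

```python
import json

def rows_to_sif(json_text):
    """Convert a JSON list of {source, relation, target} rows into SIF text.

    The input shape is an assumption made for this sketch; it is not the
    structure produced by the framework described in the abstract.
    """
    rows = json.loads(json_text)
    lines = []
    for row in rows:
        # One tab-delimited SIF line per edge: nodeA <relation> nodeB
        lines.append(f"{row['source']}\t{row['relation']}\t{row['target']}")
    return "\n".join(lines)

# Invented two-edge example: a metabolite feeding a reaction and its product.
example = json.dumps([
    {"source": "glc_D", "relation": "reactant_of", "target": "HEX1"},
    {"source": "HEX1", "relation": "produces", "target": "g6p"},
])
sif_text = rows_to_sif(example)
```

A flat edge list of this kind loads directly into network viewers, which is presumably why SIF is offered alongside the richer SBML export.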
Does filler database size influence identification accuracy?
Bergold, Amanda N; Heaton, Paul
2018-06-01
Police departments increasingly use large photo databases to select lineup fillers using facial recognition software, but this technological shift's implications have been largely unexplored in eyewitness research. Database use, particularly if coupled with facial matching software, could enable lineup constructors to increase filler-suspect similarity and thus enhance eyewitness accuracy (Fitzgerald, Oriet, Price, & Charman, 2013). However, with a large pool of potential fillers, such technologies might theoretically produce lineup fillers too similar to the suspect (Fitzgerald, Oriet, & Price, 2015; Luus & Wells, 1991; Wells, Rydell, & Seelau, 1993). This research proposes a new factor, filler database size, as a lineup feature affecting eyewitness accuracy. In a facial recognition experiment, we select lineup fillers in a legally realistic manner using facial matching software applied to filler databases of 5,000, 25,000, and 125,000 photos, and find that larger databases are associated with a higher objective similarity rating between suspects and fillers and lower overall identification accuracy. In target-present lineups, witnesses viewing lineups created from the larger databases were less likely to make correct identifications and more likely to select known innocent fillers. When the target was absent, database size was associated with a lower rate of correct rejections and a higher rate of filler identifications. Higher algorithmic similarity ratings were also associated with decreases in eyewitness identification accuracy. The results suggest that using facial matching software to select fillers from large photograph databases may reduce identification accuracy, and provide support for filler database size as a meaningful system variable. (PsycINFO Database Record (c) 2018 APA, all rights reserved).
Data Independent Acquisition analysis in ProHits 4.0.
Liu, Guomin; Knight, James D R; Zhang, Jian Ping; Tsou, Chih-Chiang; Wang, Jian; Lambert, Jean-Philippe; Larsen, Brett; Tyers, Mike; Raught, Brian; Bandeira, Nuno; Nesvizhskii, Alexey I; Choi, Hyungwon; Gingras, Anne-Claude
2016-10-21
Affinity purification coupled with mass spectrometry (AP-MS) is a powerful technique for the identification and quantification of physical interactions. AP-MS requires careful experimental design, appropriate control selection and quantitative workflows to successfully identify bona fide interactors amongst a large background of contaminants. We previously introduced ProHits, a Laboratory Information Management System for interaction proteomics, which tracks all samples in a mass spectrometry facility, initiates database searches and provides visualization tools for spectral counting-based AP-MS approaches. More recently, we implemented Significance Analysis of INTeractome (SAINT) within ProHits to provide scoring of interactions based on spectral counts. Here, we provide an update to ProHits to support Data Independent Acquisition (DIA) with identification software (DIA-Umpire and MSPLIT-DIA), quantification tools (through DIA-Umpire, or externally via targeted extraction), and assessment of quantitative enrichment (through mapDIA) and scoring of interactions (through SAINT-intensity). With additional improvements, notably support of the iProphet pipeline, facilitated deposition into ProteomeXchange repositories and enhanced export and viewing functions, ProHits 4.0 offers a comprehensive suite of tools to facilitate affinity proteomics studies. It remains challenging to score, annotate and analyze proteomics data in a transparent manner. ProHits was previously introduced as a LIMS to enable storing, tracking and analysis of standard AP-MS data. In this revised version, we expand ProHits to include integration with a number of identification and quantification tools based on Data-Independent Acquisition (DIA). ProHits 4.0 also facilitates data deposition into public repositories, and the transfer of data to new visualization tools. Copyright © 2016 Elsevier B.V. All rights reserved.
A data mining method to facilitate SAR transfer.
Wassermann, Anne Mai; Bajorath, Jürgen
2011-08-22
A challenging practical problem in medicinal chemistry is the transfer of SAR information from one chemical series to another. Currently, there are no computational methods available to rationalize or support this process. Herein, we present a data mining approach that enables the identification of alternative analog series with different core structures, corresponding substitution patterns, and comparable potency progression. Scaffolds can be exchanged between these series and new analogs suggested that incorporate preferred R-groups. The methodology can be applied to search for alternative analog series if one series is known or, alternatively, to systematically assess SAR transfer potential in compound databases.
Mugshot Identification Database (MID)
National Institute of Standards and Technology Data Gateway
NIST Mugshot Identification Database (MID) (Web, free access) NIST Special Database 18 is being distributed for use in development and testing of automated mugshot identification systems. The database consists of three CD-ROMs, containing a total of 3248 images of variable size using lossless compression. A newer version of the compression/decompression software on the CD-ROM can be found at the website http://www.nist.gov/itl/iad/ig/nigos.cfm as part of the NBIS package.
Internet-accessible DNA sequence database for identifying fusaria from human and animal infections.
O'Donnell, Kerry; Sutton, Deanna A; Rinaldi, Michael G; Sarver, Brice A J; Balajee, S Arunmozhi; Schroers, Hans-Josef; Summerbell, Richard C; Robert, Vincent A R G; Crous, Pedro W; Zhang, Ning; Aoki, Takayuki; Jung, Kyongyong; Park, Jongsun; Lee, Yong-Hwan; Kang, Seogchan; Park, Bongsoo; Geiser, David M
2010-10-01
Because less than one-third of clinically relevant fusaria can be accurately identified to species level using phenotypic data (i.e., morphological species recognition), we constructed a three-locus DNA sequence database to facilitate molecular identification of the 69 Fusarium species associated with human or animal mycoses encountered in clinical microbiology laboratories. The database comprises partial sequences from three nuclear genes: translation elongation factor 1α (EF-1α), the largest subunit of RNA polymerase (RPB1), and the second largest subunit of RNA polymerase (RPB2). These three gene fragments can be amplified by PCR and sequenced using primers that are conserved across the phylogenetic breadth of Fusarium. Phylogenetic analyses of the combined data set reveal that, with the exception of two monotypic lineages, all clinically relevant fusaria are nested in one of eight variously sized and strongly supported species complexes. The monophyletic lineages have been named informally to facilitate communication of an isolate's clade membership and genetic diversity. To identify isolates to the species included within the database, partial DNA sequence data from one or more of the three genes can be used as a BLAST query against the database which is Web accessible at FUSARIUM-ID (http://isolate.fusariumdb.org) and the Centraalbureau voor Schimmelcultures (CBS-KNAW) Fungal Biodiversity Center (http://www.cbs.knaw.nl/fusarium). Alternatively, isolates can be identified via phylogenetic analysis by adding sequences of unknowns to the DNA sequence alignment, which can be downloaded from the two aforementioned websites. 
The utility of this database should increase significantly as members of the clinical microbiology community deposit in internationally accessible culture collections (e.g., CBS-KNAW or the Fusarium Research Center) cultures of novel mycosis-associated fusaria, along with associated, corrected sequence chromatograms and data, so that the sequence results can be verified and isolates are made available for future study.
Code of Federal Regulations, 2012 CFR
2012-01-01
... 6 Domestic Security 1 2012-01-01 2012-01-01 false DMV databases. 37.33 Section 37.33 Domestic... IDENTIFICATION CARDS Other Requirements § 37.33 DMV databases. (a) States must maintain a State motor vehicle database that contains, at a minimum— (1) All data fields printed on driver's licenses and identification...
Code of Federal Regulations, 2010 CFR
2010-01-01
... 6 Domestic Security 1 2010-01-01 2010-01-01 false DMV databases. 37.33 Section 37.33 Domestic... IDENTIFICATION CARDS Other Requirements § 37.33 DMV databases. (a) States must maintain a State motor vehicle database that contains, at a minimum— (1) All data fields printed on driver's licenses and identification...
Code of Federal Regulations, 2014 CFR
2014-01-01
... 6 Domestic Security 1 2014-01-01 2014-01-01 false DMV databases. 37.33 Section 37.33 Domestic... IDENTIFICATION CARDS Other Requirements § 37.33 DMV databases. (a) States must maintain a State motor vehicle database that contains, at a minimum— (1) All data fields printed on driver's licenses and identification...
Code of Federal Regulations, 2013 CFR
2013-01-01
... 6 Domestic Security 1 2013-01-01 2013-01-01 false DMV databases. 37.33 Section 37.33 Domestic... IDENTIFICATION CARDS Other Requirements § 37.33 DMV databases. (a) States must maintain a State motor vehicle database that contains, at a minimum— (1) All data fields printed on driver's licenses and identification...
Code of Federal Regulations, 2011 CFR
2011-01-01
... 6 Domestic Security 1 2011-01-01 2011-01-01 false DMV databases. 37.33 Section 37.33 Domestic... IDENTIFICATION CARDS Other Requirements § 37.33 DMV databases. (a) States must maintain a State motor vehicle database that contains, at a minimum— (1) All data fields printed on driver's licenses and identification...
GMOMETHODS: the European Union database of reference methods for GMO analysis.
Bonfini, Laura; Van den Bulcke, Marc H; Mazzara, Marco; Ben, Enrico; Patak, Alexandre
2012-01-01
In order to provide reliable and harmonized information on methods for GMO (genetically modified organism) analysis we have published a database called "GMOMETHODS" that supplies information on PCR assays validated according to the principles and requirements of ISO 5725 and/or the International Union of Pure and Applied Chemistry protocol. In addition, the database contains methods that have been verified by the European Union Reference Laboratory for Genetically Modified Food and Feed in the context of compliance with a European Union legislative act. The web application provides search capabilities to retrieve primer and probe sequence information on the available methods. It further supplies core data required by analytical labs to carry out GM tests and comprises information on the applied reference material and plasmid standards. The GMOMETHODS database currently contains 118 different PCR methods allowing identification of 51 single GM events and 18 taxon-specific genes in a sample. It also provides screening assays for detection of eight different genetic elements commonly used for the development of GMOs. The application is referred to by the Biosafety Clearing House, a global mechanism set up by the Cartagena Protocol on Biosafety to facilitate the exchange of information on Living Modified Organisms. The publication of the GMOMETHODS database can be considered an important step toward worldwide standardization and harmonization in GMO analysis.
Toward a mtDNA locus-specific mutation database using the LOVD platform.
Elson, Joanna L; Sweeney, Mary G; Procaccio, Vincent; Yarham, John W; Salas, Antonio; Kong, Qing-Peng; van der Westhuizen, Francois H; Pitceathly, Robert D S; Thorburn, David R; Lott, Marie T; Wallace, Douglas C; Taylor, Robert W; McFarland, Robert
2012-09-01
The Human Variome Project (HVP) is a global effort to collect and curate all human genetic variation affecting health. Mutations of mitochondrial DNA (mtDNA) are an important cause of neurogenetic disease in humans; however, identification of the pathogenic mutations responsible can be problematic. In this article, we provide explanations as to why and suggest how such difficulties might be overcome. We put forward a case in support of a new Locus Specific Mutation Database (LSDB) implemented using the Leiden Open-source Variation Database (LOVD) system that will not only list primary mutations, but also present the evidence supporting their role in disease. Critically, we feel that this new database should have the capacity to store information on the observed phenotypes alongside the genetic variation, thereby facilitating our understanding of the complex and variable presentation of mtDNA disease. LOVD supports fast queries of both seen and hidden data and allows storage of sequence variants from high-throughput sequence analysis. The LOVD platform will allow construction of a secure mtDNA database; one that can fully utilize currently available data, as well as that being generated by high-throughput sequencing, to link genotype with phenotype enhancing our understanding of mitochondrial disease, with a view to providing better prognostic information. © 2012 Wiley Periodicals, Inc.
Toward a mtDNA Locus-Specific Mutation Database Using the LOVD Platform
Elson, Joanna L.; Sweeney, Mary G.; Procaccio, Vincent; Yarham, John W.; Salas, Antonio; Kong, Qing-Peng; van der Westhuizen, Francois H.; Pitceathly, Robert D.S.; Thorburn, David R.; Lott, Marie T.; Wallace, Douglas C.; Taylor, Robert W.; McFarland, Robert
2015-01-01
The Human Variome Project (HVP) is a global effort to collect and curate all human genetic variation affecting health. Mutations of mitochondrial DNA (mtDNA) are an important cause of neurogenetic disease in humans; however, identification of the pathogenic mutations responsible can be problematic. In this article, we provide explanations as to why and suggest how such difficulties might be overcome. We put forward a case in support of a new Locus Specific Mutation Database (LSDB) implemented using the Leiden Open-source Variation Database (LOVD) system that will not only list primary mutations, but also present the evidence supporting their role in disease. Critically, we feel that this new database should have the capacity to store information on the observed phenotypes alongside the genetic variation, thereby facilitating our understanding of the complex and variable presentation of mtDNA disease. LOVD supports fast queries of both seen and hidden data and allows storage of sequence variants from high-throughput sequence analysis. The LOVD platform will allow construction of a secure mtDNA database; one that can fully utilize currently available data, as well as that being generated by high-throughput sequencing, to link genotype with phenotype enhancing our understanding of mitochondrial disease, with a view to providing better prognostic information. PMID:22581690
Bhawna; Chaduvula, Pavan K; Bonthala, Venkata S; Manjusha, Verma; Siddiq, Ebrahimali A; Polumetla, Ananda K; Prasad, Gajula M N V
2015-01-01
Cucumis melo L., which belongs to the Cucurbitaceae family, ranks among the highest-valued horticultural crops cultivated across the globe. Besides its economic and medicinal importance, Cucumis melo L. is a valuable resource and model system for evolutionary studies of the cucurbit family. However, only a very limited number of molecular markers have been reported for Cucumis melo L. so far, which limits the pace of functional genomic research in melon and other similar horticultural crops. We developed the first whole-genome-based microsatellite DNA marker database of Cucumis melo L. and a comprehensive web resource that aids in variety identification and physical mapping in the Cucurbitaceae family. The Cucumis melo L. microsatellite database (CmMDb: http://65.181.125.102/cmmdb2/index.html) encompasses 39,072 SSR markers along with their motif repeat, motif length, motif sequence, marker ID, motif type and chromosomal locations. The database features a novel automated primer design facility to meet the needs of wet lab researchers. CmMDb is a freely available web resource that helps researchers select the most appropriate markers for marker-assisted selection in melons and improve breeding strategies.
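The fields listed for each SSR marker (motif sequence, motif length, repeat count, location) are what a basic microsatellite scan produces. The sketch below is a generic regex-based scan for perfect tandem repeats, not the pipeline used to build CmMDb; the function name, thresholds and test sequence are all illustrative assumptions.

```python
import re

def find_ssrs(seq, min_repeats=4, max_motif_len=6):
    """Return (start, motif, repeat_count) for each perfect tandem repeat.

    Illustrative thresholds only; real SSR tools typically use a different
    minimum repeat count for each motif length.
    """
    # Group 1 is the motif (lazy, so the shortest repeating unit is tried
    # first); \1{n,} then demands at least min_repeats copies in total.
    pattern = re.compile(
        r"([ACGT]{1,%d}?)\1{%d,}" % (max_motif_len, min_repeats - 1)
    )
    hits = []
    for m in pattern.finditer(seq.upper()):
        motif = m.group(1)
        hits.append((m.start(), motif, len(m.group(0)) // len(motif)))
    return hits

# Invented sequence containing (AC)5 starting at 0-based position 3.
ssr_hits = find_ssrs("TTGACACACACACGGT")
```

Each hit maps onto the database fields named above: the motif sequence, its length (`len(motif)`), the repeat count, and a start coordinate that a genome-wide scan would report per chromosome.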
probeBase—an online resource for rRNA-targeted oligonucleotide probes and primers: new features 2016
Greuter, Daniel; Loy, Alexander; Horn, Matthias; Rattei, Thomas
2016-01-01
probeBase (http://www.probebase.net) is a manually maintained and curated database of rRNA-targeted oligonucleotide probes and primers. Contextual information and multiple options for evaluating in silico hybridization performance against the most recent rRNA sequence databases are provided for each oligonucleotide entry, which makes probeBase an important and frequently used resource for microbiology research and diagnostics. Here we present a major update of probeBase, which was last featured in the NAR Database Issue 2007. This update describes a complete remodeling of the database architecture and environment to accommodate computationally efficient access. Improved search functions, sequence match tools and data output now extend the opportunities for finding suitable hierarchical probe sets that target an organism or taxon at different taxonomic levels. To facilitate the identification of complementary probe sets for organisms represented by short rRNA sequence reads generated by amplicon sequencing or metagenomic analysis with next generation sequencing technologies such as Illumina and IonTorrent, we introduce a novel tool that recovers surrogate near full-length rRNA sequences for short query sequences and finds matching oligonucleotides in probeBase. PMID:26586809
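At its simplest, checking whether a probe matches an rRNA sequence means looking for the reverse complement of the probe in the target. The sketch below shows only that perfect-match case with invented toy sequences; it is not probeBase's matching tool, which the abstract describes only in general terms, and the function name is hypothetical.

```python
# RNA complement table: A<->U, C<->G.
COMPLEMENT = str.maketrans("ACGU", "UGCA")

def probe_target_site(probe_dna, rrna):
    """Return the 0-based start of the probe's perfect-match site, or -1.

    The probe is given 5'->3' as DNA; it hybridizes to the stretch of rRNA
    that equals the reverse complement of the probe.
    """
    probe_rna = probe_dna.upper().replace("T", "U")
    target = probe_rna.translate(COMPLEMENT)[::-1]
    return rrna.upper().replace("T", "U").find(target)

# Toy example: probe GCTGCC pairs with the rRNA stretch GGCAGC.
site = probe_target_site("GCTGCC", "AAGGCAGCUU")
```

Practical in silico evaluation also has to tolerate mismatches and weigh their position, which a plain substring search does not capture.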
Ceyssens, Pieter-Jan; Soetaert, Karine; Timke, Markus; Van den Bossche, An; Sparbier, Katrin; De Cremer, Koen; Kostrzewa, Markus; Hendrickx, Marijke; Mathys, Vanessa
2017-02-01
Species identification and drug susceptibility testing (DST) of mycobacteria are important yet complex processes traditionally reserved for reference laboratories. Recent technical improvements in matrix-assisted laser desorption ionization-time of flight mass spectrometry (MALDI-TOF MS) have started to facilitate routine mycobacterial identifications in clinical laboratories. In this paper, we investigate the possibility of performing phenotypic MALDI-based DST in mycobacteriology using the recently described MALDI Biotyper antibiotic susceptibility test rapid assay (MBT-ASTRA). We randomly selected 72 clinical Mycobacterium tuberculosis and nontuberculous mycobacterial (NTM) strains, subjected them to MBT-ASTRA methodology, and compared its results to current gold-standard methods. Drug susceptibility was tested for rifampin, isoniazid, linezolid, and ethambutol (M. tuberculosis, n = 39), and clarithromycin and rifabutin (NTM, n = 33). Combined species identification was performed using the Biotyper Mycobacteria Library 4.0. Mycobacterium-specific MBT-ASTRA parameters were derived (calculation window, m/z 5,000 to 13,000, area under the curve [AUC] of >0.015, relative growth [RG] of <0.5; see the text for details). Using these settings, MBT-ASTRA analyses returned 175/177 M. tuberculosis and 65/66 NTM drug resistance profiles which corresponded to standard testing results. Turnaround times were not significantly different in M. tuberculosis testing, but the MBT-ASTRA method delivered results on average a week faster than routine DST in NTM. Database searches returned 90.4% correct species-level identifications, which increased to 98.6% when score thresholds were lowered to 1.65. In conclusion, the MBT-ASTRA technology holds promise to facilitate and accelerate mycobacterial DST and to combine it directly with high-confidence species-level identifications. 
Given the ease of interpretation, its application in NTM typing might be the first in finding its way to current diagnostic workflows. However, further validations and automation are required before routine implementation can be envisioned. Copyright © 2017 American Society for Microbiology.
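The AUC and RG parameters quoted in the abstract can be read as a two-step decision rule. The sketch below rests on assumptions that the abstract does not spell out: that the AUC threshold applies to the drug-free growth control, that RG is the ratio of the AUC with antibiotic to the AUC of that control, and that an RG below the cutoff indicates susceptibility. Function names are hypothetical and the AUC values in the example are invented.

```python
def relative_growth(auc_with_drug, auc_control, min_control_auc=0.015):
    """Ratio of spectral AUC with antibiotic to AUC of the drug-free control.

    Assumption: the control must exceed min_control_auc to show that the
    strain grew at all; otherwise the test is not evaluable.
    """
    if auc_control <= min_control_auc:
        raise ValueError("growth control too weak to evaluate")
    return auc_with_drug / auc_control

def call_susceptible(rg, cutoff=0.5):
    """Assumption: growth suppressed below the cutoff means susceptible."""
    return rg < cutoff

# Invented AUC values, for illustration only.
rg_suppressed = relative_growth(0.010, 0.050)   # strongly reduced growth
rg_unaffected = relative_growth(0.045, 0.050)   # growth nearly unchanged
```

Under these assumptions the first strain would be called susceptible and the second resistant; the full text of the paper should be consulted for the actual definitions.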
9 CFR 79.2 - Identification of sheep and goats in interstate commerce.
Code of Federal Regulations, 2014 CFR
2014-01-01
... prefix that has been linked in the National Scrapie Database with the assigned premises identification... official identification method or device approved by the Administrator. (3) The owner of the flock of... premises identification if they are linked to the premises in the National Scrapie Database) will be...
9 CFR 79.2 - Identification of sheep and goats in interstate commerce.
Code of Federal Regulations, 2013 CFR
2013-01-01
... prefix that has been linked in the National Scrapie Database with the assigned premises identification... official identification method or device approved by the Administrator. (3) The owner of the flock of... premises identification if they are linked to the premises in the National Scrapie Database) will be...
9 CFR 79.2 - Identification of sheep and goats in interstate commerce.
Code of Federal Regulations, 2012 CFR
2012-01-01
... prefix that has been linked in the National Scrapie Database with the assigned premises identification... official identification method or device approved by the Administrator. (3) The owner of the flock of... premises identification if they are linked to the premises in the National Scrapie Database) will be...
Linkage disequilibrium matches forensic genetic records to disjoint genomic marker sets.
Edge, Michael D; Algee-Hewitt, Bridget F B; Pemberton, Trevor J; Li, Jun Z; Rosenberg, Noah A
2017-05-30
Combining genotypes across datasets is central in facilitating advances in genetics. Data aggregation efforts often face the challenge of record matching: the identification of dataset entries that represent the same individual. We show that records can be matched across genotype datasets that have no shared markers based on linkage disequilibrium between loci appearing in different datasets. Using two datasets for the same 872 people, one with 642,563 genome-wide SNPs and the other with 13 short tandem repeats (STRs) used in forensic applications, we find that 90-98% of forensic STR records can be connected to corresponding SNP records and vice versa. Accuracy increases to 99-100% when ∼30 STRs are used. Our method expands the potential of data aggregation, but it also suggests privacy risks intrinsic in maintenance of databases containing even small numbers of markers, including databases of forensic significance.
[An application of the strategy results cycle to HIV/AIDS strategic planning in Latin America].
Rodríguez-García, Rosalía; Rosenberg, Hernán
2013-07-01
To describe the Strategy Results Cycle (SRC), a model that approaches planning as an ongoing cycle of seven phases that continually responds and adapts to existing evidence. Reliable sources were used for the preparation of databases and expenditure-costing data for resource needs analysis. The planning process took 6-9 months to complete a national strategic plan that was informed by evidence, focused on results and costed. Knowledge transfer facilitated national leadership and stakeholders' participation. Between 2007 and 2011, 13 of 16 countries adopted the Strategy Results Cycle model. The evidence supported the identification of results and the expenditure-costing analysis improved budget allocation efficiency. The SRC facilitated purposeful participation and added value to previous planning approaches by connecting "thinking" and "doing", which resulted in national strategic plans that are designed by stakeholders, relevant to local conditions, and can guide implementation and resource mobilization.
CSGRqtl: A Comparative Quantitative Trait Locus Database for Saccharinae Grasses.
Zhang, Dong; Paterson, Andrew H
2017-01-01
Conventional biparental quantitative trait locus (QTL) mapping has led to some successes in the identification of causal genes in many organisms. QTL likelihood intervals not only provide "prior information" for finer-resolution approaches such as GWAS but also provide better statistical power than GWAS to detect variants with low/rare frequency in a natural population. Here, we describe a new element of an ongoing effort to provide online resources to facilitate study and improvement of the important Saccharinae clade. The primary goal of this new resource is the anchoring of published QTLs for this clade to the Sorghum genome. Genetic map alignments translate a wealth of genomic information from sorghum to Saccharum spp., Miscanthus spp., and other taxa. In addition, genome alignments facilitate comparison of the Saccharinae QTL sets to those of other taxa that enjoy comparable resources, exemplified herein by rice.
NASA Technical Reports Server (NTRS)
Wassil-Grimm, Andrew D.
1997-01-01
More effective electronic communication processes are needed to transfer contractor and international partner data into NASA and prime contractor baseline database systems. It is estimated that the International Space Station Alpha (ISSA) parts database will contain up to one million parts each of which may require database capabilities for approximately one thousand bytes of data for each part. The resulting gigabyte database must provide easy access to users who will be preparing multiple analyses and reports in order to verify as-designed, as-built, launch, on-orbit, and return configurations for up to 45 missions associated with the construction of the ISSA. Additionally, Internet access to this data base is strongly indicated to allow multiple user access from clients located in many foreign countries. This summer's project involved familiarization and evaluation of the ISSA Electrical, Electronic, and Electromechanical (EEE) Parts data and the process of electronically managing these data. Particular attention was devoted to improving the interfaces among the many elements of the ISSA information system and its global customers and suppliers. Additionally, prototype queries were developed to facilitate the identification of data changes in the data base, verifications that the designs used only approved parts, and certifications that the flight hardware containing EEE parts was ready for flight. This project also resulted in specific recommendations to NASA for further development in the area of EEE parts database development and usage.
Yamamoto, Naoki; Suzuki, Tomohiro; Kobayashi, Masaaki; Dohra, Hideo; Sasaki, Yohei; Hirai, Hirofumi; Yokoyama, Koji; Kawagishi, Hirokazu; Yano, Kentaro
2014-12-03
The angel's wing oyster mushroom (Pleurocybella porrigens, Sugihiratake) is a well-known delicacy. However, its potential risk in acute encephalopathy was recently revealed by a food poisoning incident. To disclose the genes underlying the accident and provide mechanistic insight, we seek to develop an information infrastructure containing omics data. In our previous work, we sequenced the genome and transcriptome using next-generation sequencing techniques. The next step in achieving our goal is to develop a web database to facilitate the efficient mining of large-scale omics data and identification of genes specifically expressed in the mushroom. This paper introduces a web database A-WINGS (http://bioinf.mind.meiji.ac.jp/a-wings/) that provides integrated genomic and transcriptomic information for the angel's wing oyster mushroom. The database contains structure and functional annotations of transcripts and gene expressions. Functional annotations contain information on homologous sequences from NCBI nr and UniProt, Gene Ontology, and KEGG Orthology. Digital gene expression profiles were derived from RNA sequencing (RNA-seq) analysis in the fruiting bodies and mycelia. The omics information stored in the database is freely accessible through interactive and graphical interfaces by search functions that include 'GO TREE VIEW' browsing, keyword searches, and BLAST searches. The A-WINGS database will accelerate omics studies on specific aspects of the angel's wing oyster mushroom and the family Tricholomataceae.
DTREEv2, a computer-based support system for the risk assessment of genetically modified plants.
Pertry, Ine; Nothegger, Clemens; Sweet, Jeremy; Kuiper, Harry; Davies, Howard; Iserentant, Dirk; Hull, Roger; Mezzetti, Bruno; Messens, Kathy; De Loose, Marc; de Oliveira, Dulce; Burssens, Sylvia; Gheysen, Godelieve; Tzotzos, George
2014-03-25
Risk assessment of genetically modified organisms (GMOs) remains a contentious area and a major factor influencing the adoption of agricultural biotech. Methodologically, in many countries, risk assessment is conducted by expert committees with little or no recourse to databases and expert systems that can facilitate the risk assessment process. In this paper we describe DTREEv2, a computer-based decision support system for the identification of hazards related to the introduction of GM-crops into the environment. DTREEv2 structures hazard identification and evaluation by means of an Event-Tree type of analysis. The system produces an output flagging identified hazards and potential risks. It is intended to be used for the preparation and evaluation of biosafety dossiers and, as such, its usefulness extends to researchers, risk assessors and regulators in government and industry. Copyright © 2013 Elsevier B.V. All rights reserved.
SwePep, a database designed for endogenous peptides and mass spectrometry.
Fälth, Maria; Sköld, Karl; Norrman, Mathias; Svensson, Marcus; Fenyö, David; Andren, Per E
2006-06-01
A new database, SwePep, specifically designed for endogenous peptides, has been constructed to significantly speed up the identification process from complex tissue samples utilizing mass spectrometry. In the identification process the experimental peptide masses are compared with the peptide masses stored in the database both with and without possible post-translational modifications. This intermediate identification step is fast and singles out peptides that are potential endogenous peptides and can later be confirmed with tandem mass spectrometry data. Successful applications of this methodology are presented. The SwePep database is a relational database developed using MySql and Java. The database contains 4180 annotated endogenous peptides from different tissues originating from 394 different species as well as 50 novel peptides from brain tissue identified in our laboratory. Information about the peptides, including mass, isoelectric point, sequence, and precursor protein, is also stored in the database. This new approach holds great potential for removing the bottleneck that occurs during the identification process in the field of peptidomics. The SwePep database is available to the public.
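The mass-comparison step described above can be sketched as follows. This is a minimal illustration, not SwePep's actual implementation: the function names, the 10 ppm tolerance, the two example peptides and the short list of modifications are all assumptions chosen for the example. The residue and modification masses are standard monoisotopic values.

```python
# Monoisotopic residue masses (Da) for the few amino acids used in this sketch.
RESIDUE = {"G": 57.02146, "L": 113.08406, "M": 131.04049,
           "F": 147.06841, "Y": 163.06333}
WATER = 18.01056  # added once per peptide for the free termini

# Assumed, illustrative set of post-translational modification mass shifts (Da).
PTM_SHIFT = {"phospho": 79.96633, "acetyl": 42.01057, "amidation": -0.98402}

# Hypothetical stand-in for the stored peptide table.
PEPTIDES = {"Leu-enkephalin": "YGGFL", "Met-enkephalin": "YGGFM"}


def peptide_mass(seq):
    """Neutral monoisotopic mass of an unmodified peptide."""
    return sum(RESIDUE[aa] for aa in seq) + WATER


def candidates(observed, peptides, tol_ppm=10.0):
    """Return (name, ptm) pairs whose theoretical mass matches `observed`.

    Each stored peptide is tried without modification (ptm is None)
    and with each single modification in PTM_SHIFT.
    """
    hits = []
    for name, seq in peptides.items():
        base = peptide_mass(seq)
        for ptm, shift in [(None, 0.0)] + list(PTM_SHIFT.items()):
            theoretical = base + shift
            if abs(observed - theoretical) / theoretical * 1e6 <= tol_ppm:
                hits.append((name, ptm))
    return hits
```

Candidates returned by such a mass filter would still need confirmation, for example with tandem mass spectrometry data as the abstract notes.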
Kennedy, Amy E; Khoury, Muin J; Ioannidis, John P A; Brotzman, Michelle; Miller, Amy; Lane, Crystal; Lai, Gabriel Y; Rogers, Scott D; Harvey, Chinonye; Elena, Joanne W; Seminara, Daniela
2016-10-01
We report on the establishment of a web-based Cancer Epidemiology Descriptive Cohort Database (CEDCD). The CEDCD's goals are to enhance awareness of resources, facilitate interdisciplinary research collaborations, and support existing cohorts for the study of cancer-related outcomes. Comprehensive descriptive data were collected from large cohorts established to study cancer as primary outcome using a newly developed questionnaire. These included an inventory of baseline and follow-up data, biospecimens, genomics, policies, and protocols. Additional descriptive data extracted from publicly available sources were also collected. This information was entered in a searchable and publicly accessible database. We summarized the descriptive data across cohorts and reported the characteristics of this resource. As of December 2015, the CEDCD includes data from 46 cohorts representing more than 6.5 million individuals (29% ethnic/racial minorities). Overall, 78% of the cohorts have collected blood at least once, 57% at multiple time points, and 46% collected tissue samples. Genotyping has been performed by 67% of the cohorts, while 46% have performed whole-genome or exome sequencing in subsets of enrolled individuals. Information on medical conditions other than cancer has been collected in more than 50% of the cohorts. More than 600,000 incident cancer cases and more than 40,000 prevalent cases are reported, with 24 cancer sites represented. The CEDCD assembles detailed descriptive information on a large number of cancer cohorts in a searchable database. Information from the CEDCD may assist the interdisciplinary research community by facilitating identification of well-established population resources and large-scale collaborative and integrative research. Cancer Epidemiol Biomarkers Prev; 25(10); 1392-401. ©2016 AACR. ©2016 American Association for Cancer Research.
Zhou, Jin; Heim, Derek; Monk, Rebecca; Levy, Andrew; Pollard, Paul
2018-06-01
The "social lubrication" function of alcohol during interpersonal interactions is well documented. However, less is known about the effects of alcohol consumption on group-level behavior. Empirical findings from social psychological literature suggest that individuals tend to favor those who are considered as members of their own social group. Not yet evaluated is how alcohol intoxication interacts with this group-level bias. Therefore, the current study examined experimentally the effects of intoxication on group bias. Ninety-four individuals (M age = 20.18, SD = 2.36, 55 women, 39 men) were randomly assigned to consume an alcoholic (n = 48) or a placebo (n = 46) drink before completing manipulated allocation matrices, a task which measured the distribution of hypothetical monetary awards based on social groups. Results point to an interaction between drink condition and social group identification, whereby identification was significantly associated with in-group favoritism among intoxicated individuals only. Following alcohol consumption, participants with higher identification with their social group were more likely to demonstrate allocation strategies that favored their own group members. However, nonsignificant effects were observed for those in the placebo condition. The findings highlight how alcohol intoxication may facilitate group bias that results from social group identification. (PsycINFO Database Record (c) 2018 APA, all rights reserved).
Genetic identification of missing persons: DNA analysis of human remains and compromised samples.
Alvarez-Cubero, M J; Saiz, M; Martinez-Gonzalez, L J; Alvarez, J C; Eisenberg, A J; Budowle, B; Lorente, J A
2012-01-01
Human identification has made great strides over the past 2 decades due to the advent of DNA typing. Forensic DNA typing provides genetic data from a variety of materials and individuals, and is applied to many important issues that confront society. Part of the success of DNA typing is the generation of DNA databases to help identify missing persons and to develop investigative leads to assist law enforcement. DNA databases house DNA profiles from convicted felons (and in some jurisdictions arrestees), forensic evidence, human remains, and direct and family reference samples of missing persons. These databases are essential tools, which are becoming quite large (for example the US Database contains 10 million profiles). The scientific, governmental and private communities continue to work together to standardize genetic markers for more effective worldwide data sharing, to develop and validate robust DNA typing kits that contain the reagents necessary to type core identity genetic markers, to develop technologies that facilitate a number of analytical processes and to develop policies to make human identity testing more effective. Indeed, DNA typing is integral to resolving a number of serious criminal and civil concerns, such as solving missing person cases and identifying victims of mass disasters and children who may have been victims of human trafficking, and provides information for historical studies. As more refined capabilities are still required, novel approaches are being sought, such as genetic testing by next-generation sequencing, mass spectrometry, chip arrays and pyrosequencing. Single nucleotide polymorphisms offer the potential to analyze severely compromised biological samples, to determine the facial phenotype of decomposed human remains and to predict the bioancestry of individuals, a new focus in analyzing this type of markers. Copyright © 2012 S. Karger AG, Basel.
A Generic Nonlinear Aerodynamic Model for Aircraft
NASA Technical Reports Server (NTRS)
Grauer, Jared A.; Morelli, Eugene A.
2014-01-01
A generic model of the aerodynamic coefficients was developed using wind tunnel databases for eight different aircraft and multivariate orthogonal functions. For each database and each coefficient, models were determined using polynomials expanded about the state and control variables, and an orthogonalization procedure. A predicted squared-error criterion was used to automatically select the model terms. Modeling terms picked in at least half of the analyses, which totalled 45 terms, were retained to form the generic nonlinear aerodynamic (GNA) model. Least squares was then used to estimate the model parameters and associated uncertainty that best fit the GNA model to each database. Nonlinear flight simulations were used to demonstrate that the GNA model produces accurate trim solutions, local behavior (modal frequencies and damping ratios), and global dynamic behavior (91% accurate state histories and 80% accurate aerodynamic coefficient histories) under large-amplitude excitation. This compact aerodynamics model can be used to decrease on-board memory storage requirements, quickly change conceptual aircraft models, provide smooth analytical functions for control and optimization applications, and facilitate real-time parametric system identification.
Childs, Kevin L; Konganti, Kranti; Buell, C Robin
2012-01-01
Major feedstock sources for future biofuel production are likely to be high biomass producing plant species such as poplar, pine, switchgrass, sorghum and maize. One active area of research in these species is genome-enabled improvement of lignocellulosic biofuel feedstock quality and yield. To facilitate genomic-based investigations in these species, we developed the Biofuel Feedstock Genomic Resource (BFGR), a database and web-portal that provides high-quality, uniform and integrated functional annotation of gene and transcript assembly sequences from species of interest to lignocellulosic biofuel feedstock researchers. The BFGR includes sequence data from 54 species and permits researchers to view, analyze and obtain annotation at the gene, transcript, protein and genome level. Annotation of biochemical pathways permits the identification of key genes and transcripts central to the improvement of lignocellulosic properties in these species. The integrated nature of the BFGR in terms of annotation methods, orthologous/paralogous relationships and linkage to seven species with complete genome sequences allows comparative analyses for biofuel feedstock species with limited sequence resources. Database URL: http://bfgr.plantbiology.msu.edu.
Multi-Level Sequential Pattern Mining Based on Prime Encoding
NASA Astrophysics Data System (ADS)
Lianglei, Sun; Yun, Li; Jiang, Yin
Encoding is not only to express the hierarchical relationship, but also to facilitate the identification of the relationship between different levels, which will directly affect the efficiency of the algorithm in the area of mining the multi-level sequential pattern. In this paper, we prove that one step of division operation can decide the parent-child relationship between different levels by using prime encoding and present PMSM algorithm and CROSS-PMSM algorithm which are based on prime encoding for mining multi-level sequential pattern and cross-level sequential pattern respectively. Experimental results show that the algorithm can effectively extract multi-level and cross-level sequential pattern from the sequence database.
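The claim that a single division decides the relationship between levels can be illustrated with one possible prime-encoding scheme. The abstract does not give the exact encoding used by PMSM, so the scheme below is an assumption: every node of a concept hierarchy receives its own prime, and its code is that prime multiplied by its parent's code. Under this assumption, one node is an ancestor of another exactly when its code divides the other's code. The hierarchy and all names are hypothetical.

```python
from itertools import count


def primes():
    """Yield 2, 3, 5, ... by trial division (adequate for small hierarchies)."""
    found = []
    for n in count(2):
        if all(n % p for p in found):
            found.append(n)
            yield n


def encode(hierarchy, root):
    """Give each node the code (own prime) * (parent's code); the root's code is its prime."""
    gen = primes()
    codes = {}
    stack = [(root, 1)]
    while stack:
        node, parent_code = stack.pop()
        codes[node] = next(gen) * parent_code
        for child in hierarchy.get(node, []):
            stack.append((child, codes[node]))
    return codes


def is_ancestor(codes, a, b):
    """True if `a` lies strictly above `b`: decided by one modulo operation."""
    return a != b and codes[b] % codes[a] == 0


# Hypothetical three-level item taxonomy.
tree = {"food": ["drink", "bread"], "drink": ["milk", "juice"]}
codes = encode(tree, "food")
```

Because every prime belongs to exactly one node, a code is divisible by another only when the other node's whole path to the root is contained in its own path, which is why the single test suffices in this sketch. Codes grow quickly with depth, so a real implementation would need to bound the hierarchy size or use arbitrary-precision integers (which Python provides by default).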
NASA's Aviation Safety and Modeling Project
NASA Technical Reports Server (NTRS)
Chidester, Thomas R.; Statler, Irving C.
2006-01-01
The Aviation Safety Monitoring and Modeling (ASMM) Project of NASA's Aviation Safety program is cultivating sources of data and developing automated computer hardware and software to facilitate efficient, comprehensive, and accurate analyses of the data collected from large, heterogeneous databases throughout the national aviation system. The ASMM addresses the need to provide means for increasing safety by enabling the identification and correcting of predisposing conditions that could lead to accidents or to incidents that pose aviation risks. A major component of the ASMM Project is the Aviation Performance Measuring System (APMS), which is developing the next generation of software tools for analyzing and interpreting flight data.
Employment discrimination, segregation, and health.
Darity, William A
2003-02-01
The author examines available evidence on the effects of exposure to joblessness on emotional well-being according to race and sex. The impact of racism on general health outcomes also is considered, particularly racism in the specific form of wage discrimination. Perceptions of racism and measured exposures to racism may be distinct triggers for adverse health outcomes. Whether the effects of racism are best evaluated on the basis of self-classification or social classification of racial identity is unclear. Some research sorts between the effects of race and socioeconomic status on health. The development of a new longitudinal database will facilitate more accurate identification of connections between racism and negative health effects.
Phytochemica: a platform to explore phytochemicals of medicinal plants
Pathania, Shivalika; Ramakrishnan, Sai Mukund; Bagler, Ganesh
2015-01-01
Plant-derived molecules (PDMs) are known to be a rich source of diverse scaffolds that could serve as the basis for rational drug design. Structured compilation of phytochemicals from traditional medicinal plants can facilitate prospection for novel PDMs and their analogs as therapeutic agents. Atropa belladonna, Catharanthus roseus, Heliotropium indicum, Picrorhiza kurroa and Podophyllum hexandrum are important Himalayan medicinal plants, reported to have immense therapeutic properties against various diseases. We present Phytochemica, a structured compilation of 963 PDMs from these plants, inclusive of their plant part source, chemical classification, IUPAC names, SMILES notations, physicochemical properties and 3-dimensional structures with associated references. Phytochemica is an exhaustive resource of natural molecules facilitating prospection for therapeutic molecules from medicinally important plants. It also offers a refined search option to explore the neighbourhood of chemical space against the ZINC database to identify analogs of natural molecules at a user-defined cut-off. The availability of a structured phytochemical dataset may enable its direct use in in silico drug discovery, which will hasten the process of lead identification from natural products under the proposed hypothesis, and may help address the urgent need for phytomedicines. Compilation and accessibility of indigenous phytochemicals and their derivatives can be a source of considerable advantage to research institutes as well as industries. Database URL: home.iitj.ac.in/~bagler/webservers/Phytochemica PMID:26255307
Buske, Orion J.; Schiettecatte, François; Hutton, Benjamin; Dumitriu, Sergiu; Misyura, Andriy; Huang, Lijia; Hartley, Taila; Girdea, Marta; Sobreira, Nara; Mungall, Chris; Brudno, Michael
2016-01-01
Despite the increasing prevalence of clinical sequencing, the difficulty of identifying additional affected families is a key obstacle to solving many rare diseases. There may only be a handful of similar patients worldwide, and their data may be stored in diverse clinical and research databases. Computational methods are necessary to enable finding similar patients across the growing number of patient repositories and registries. We present the Matchmaker Exchange Application Programming Interface (MME API), a protocol and data format for exchanging phenotype and genotype profiles to enable matchmaking among patient databases, facilitate the identification of additional cohorts, and increase the rate with which rare diseases can be researched and diagnosed. We designed the API to be straightforward and flexible in order to simplify its adoption on a large number of data types and workflows. We also provide a public test data set, curated from the literature, to facilitate implementation of the API and development of new matching algorithms. The initial version of the API has been successfully implemented by three members of the Matchmaker Exchange and was immediately able to reproduce previously-identified matches and generate several new leads currently being validated. The API is available at https://github.com/ga4gh/mme-apis. PMID:26255989
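To make the idea of matchmaking on phenotype profiles concrete, the sketch below ranks stored patients by the overlap of their phenotype term sets with a query patient. It is only an illustration of one simple matching approach (Jaccard similarity); it is not the MME API's request format or any participating database's matching algorithm, and the patient identifiers, term lists and function names are hypothetical.

```python
def jaccard(a, b):
    """Jaccard similarity of two term sets: |a & b| / |a | b| (0.0 if both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def rank_matches(query_terms, patients):
    """Return (patient_id, score) pairs sorted by decreasing phenotype overlap."""
    scored = [(pid, jaccard(query_terms, terms)) for pid, terms in patients.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Hypothetical profiles, each a set of phenotype ontology term identifiers.
query = {"HP:0001250", "HP:0001263", "HP:0000252"}
patients = {
    "P1": {"HP:0001250", "HP:0001263"},
    "P2": {"HP:0000252", "HP:0000365"},
    "P3": {"HP:0001250", "HP:0001263", "HP:0000252"},
}
ranking = rank_matches(query, patients)
```

A production matcher would usually also weigh candidate genes and exploit the ontology structure rather than relying on exact term overlap alone.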
A multilocus database for the identification of Aspergillus and Penicillium species
USDA-ARS's Scientific Manuscript database
Identification of Aspergillus and Penicillium isolates using phenotypic methods is increasingly complex and difficult but genetic tools allow recognition and description of species formerly unrecognized or cryptic. We constructed a web-based taxonomic database using BIGSdb for the identification of ...
Ruskova, Lenka; Raclavsky, Vladislav
2011-09-01
Routine medical microbiology diagnostics relies on conventional cultivation followed by phenotypic techniques for identification of pathogenic bacteria and fungi. This is not only due to tradition and economy but also because it provides the pure culture needed for antibiotic susceptibility testing. This review focuses on the potential of High Resolution Melting Analysis (HRMA) of double-stranded DNA for future routine medical microbiology. We searched the MEDLINE database for publications showing the advantages of HRMA in routine medical microbiology, in particular for identification, strain typing and further characterization of pathogenic bacteria and fungi. The results show increasing numbers of newly developed and more tailor-made assays in this field. For microbiologists unfamiliar with technical aspects of HRMA, we also provide insight into the technique from the perspective of microbial characterization. We can anticipate that the routine availability of HRMA in medical microbiology laboratories will provide a strong stimulus to this field. This is already envisioned by the growing number of medical microbiology applications published recently. The speed, power, convenience and cost effectiveness of this technology virtually predestine that it will advance genetic characterization of microbes and streamline, facilitate and enrich diagnostics in routine medical microbiology without interfering with the proven advantages of conventional cultivation.
Mitchell, Joshua M.; Fan, Teresa W.-M.; Lane, Andrew N.; Moseley, Hunter N. B.
2014-01-01
Large-scale identification of metabolites is key to elucidating and modeling metabolism at the systems level. Advances in metabolomics technologies, particularly ultra-high resolution mass spectrometry (MS), enable comprehensive and rapid analysis of metabolites. However, a significant barrier to meaningful data interpretation is the identification of a wide range of metabolites including unknowns and the determination of their role(s) in various metabolic networks. Chemoselective (CS) probes to tag metabolite functional groups combined with high mass accuracy provide additional structural constraints for metabolite identification and quantification. We have developed a novel algorithm, Chemically Aware Substructure Search (CASS), that efficiently detects functional groups within existing metabolite databases, allowing for combined molecular formula and functional group (from CS tagging) queries to aid in metabolite identification without a priori knowledge. Analysis of the isomeric compounds in both Human Metabolome Database (HMDB) and KEGG Ligand demonstrated a high percentage of isomeric molecular formulae (43 and 28%, respectively), indicating the necessity for techniques such as CS-tagging. Furthermore, these two databases have only moderate overlap in molecular formulae. Thus, it is prudent to use multiple databases in metabolite assignment, since each major metabolite database represents different portions of metabolism within the biosphere. In silico analysis of various CS-tagging strategies under different conditions for adduct formation demonstrates that combined FT-MS derived molecular formulae and CS-tagging can uniquely identify up to 71% of KEGG and 37% of the combined KEGG/HMDB database vs. 41 and 17%, respectively, without adduct formation. This difference between database isomer disambiguation highlights the strength of CS-tagging for non-lipid metabolite identification. However, unique identification of complex lipids still needs additional information. PMID:25120557
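The two ideas in this abstract, measuring how often a molecular formula is ambiguous and narrowing candidates with a functional-group constraint, can be sketched on a toy table. This is not the CASS algorithm, which detects functional groups from structures; here the groups are simply given as precomputed labels. The compound table, the function names and the choice to measure ambiguity as a fraction of entries (rather than of distinct formulae) are assumptions made for the example.

```python
from collections import Counter

# Toy compound table: (name, molecular formula, functional groups present).
COMPOUNDS = [
    ("glucose", "C6H12O6", {"aldehyde", "hydroxyl"}),
    ("fructose", "C6H12O6", {"ketone", "hydroxyl"}),
    ("alanine", "C3H7NO2", {"amine", "carboxyl"}),
    ("pyruvic acid", "C3H4O3", {"ketone", "carboxyl"}),
]


def isomeric_fraction(compounds):
    """Fraction of entries whose formula is shared with at least one other entry."""
    counts = Counter(formula for _, formula, _ in compounds)
    shared = sum(1 for _, formula, _ in compounds if counts[formula] > 1)
    return shared / len(compounds)


def query(compounds, formula, group):
    """Names of entries matching both the formula and the tagged functional group."""
    return [name for name, f, groups in compounds if f == formula and group in groups]
```

In this toy table a formula-only query for C6H12O6 returns two candidates, while adding the constraint that the tagged group is a ketone leaves only fructose.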
Wang, Penghao; Wilson, Susan R
2013-01-01
Mass spectrometry-based protein identification is a very challenging task. The main identification approaches include de novo sequencing and database searching. Both approaches have shortcomings, so an integrative approach has been developed. The integrative approach firstly infers partial peptide sequences, known as tags, directly from tandem spectra through de novo sequencing, and then puts these sequences into a database search to see if a close peptide match can be found. However the current implementation of this integrative approach has several limitations. Firstly, simplistic de novo sequencing is applied and only very short sequence tags are used. Secondly, most integrative methods apply an algorithm similar to BLAST to search for exact sequence matches and do not accommodate sequence errors well. Thirdly, by applying these methods the integrated de novo sequencing makes a limited contribution to the scoring model which is still largely based on database searching. We have developed a new integrative protein identification method which can integrate de novo sequencing more efficiently into database searching. Evaluated on large real datasets, our method outperforms popular identification methods.
Gerlt, John A
2017-08-22
The exponentially increasing number of protein and nucleic acid sequences provides opportunities to discover novel enzymes, metabolic pathways, and metabolites/natural products, thereby adding to our knowledge of biochemistry and biology. The challenge has evolved from generating sequence information to mining the databases to integrating and leveraging the available information, i.e., the availability of "genomic enzymology" web tools. Web tools that allow identification of biosynthetic gene clusters are widely used by the natural products/synthetic biology community, thereby facilitating the discovery of novel natural products and the enzymes responsible for their biosynthesis. However, many novel enzymes with interesting mechanisms participate in uncharacterized small-molecule metabolic pathways; their discovery and functional characterization also can be accomplished by leveraging information in protein and nucleic acid databases. This Perspective focuses on two genomic enzymology web tools that assist the discovery of novel metabolic pathways: (1) Enzyme Function Initiative-Enzyme Similarity Tool (EFI-EST) for generating sequence similarity networks to visualize and analyze sequence-function space in protein families and (2) Enzyme Function Initiative-Genome Neighborhood Tool (EFI-GNT) for generating genome neighborhood networks to visualize and analyze the genome context in microbial and fungal genomes. Both tools have been adapted to other applications to facilitate target selection for enzyme discovery and functional characterization. As the natural products community has demonstrated, the enzymology community needs to embrace the essential role of web tools that allow the protein and genome sequence databases to be leveraged for novel insights into enzymological problems.
An Initial Design of ISO 19152:2012 LADM Based Valuation and Taxation Data Model
NASA Astrophysics Data System (ADS)
Çağdaş, V.; Kara, A.; van Oosterom, P.; Lemmen, C.; Işıkdağ, Ü.; Kathmann, R.; Stubkjær, E.
2016-10-01
A fiscal registry or database is supposed to record geometric, legal, physical, economic, and environmental characteristics in relation to property units, which are subject to immovable property valuation and taxation. Apart from procedural standards, there is no internationally accepted data standard that defines the semantics of fiscal databases. The ISO 19152:2012 Land Administration Domain Model (LADM), as an international land administration standard, focuses on legal requirements but considers the specification of external information systems, including valuation and taxation databases, to be out of scope. However, it provides a formalism which allows for an extension that responds to the fiscal requirements. This paper introduces an initial version of an LADM Fiscal Extension Module for the specification of databases used in immovable property valuation and taxation. The extension module is designed to facilitate all stages of immovable property taxation, namely the identification of properties and taxpayers, assessment of properties through single or mass appraisal procedures, automatic generation of sales statistics, and the management of tax collection, dealing with arrears and appeals. It is expected that the initial version will be refined through further activities held by a possible joint working group under FIG Commission 7 (Cadastre and Land Management) and FIG Commission 9 (Valuation and the Management of Real Estate) in collaboration with other relevant international bodies.
Definition of the Beijing/W lineage of Mycobacterium tuberculosis on the basis of genetic markers.
Kremer, Kristin; Glynn, Judith R; Lillebaek, Troels; Niemann, Stefan; Kurepina, Natalia E; Kreiswirth, Barry N; Bifani, Pablo J; van Soolingen, Dick
2004-09-01
Mycobacterium tuberculosis Beijing genotype strains are highly prevalent in Asian countries and in the territory of the former Soviet Union. They are increasingly reported in other areas of the world and are frequently associated with tuberculosis outbreaks and drug resistance. Beijing genotype strains, including W strains, have been characterized by their highly similar multicopy IS6110 restriction fragment length polymorphism (RFLP) patterns, deletion of spacers 1 to 34 in the direct repeat region (Beijing spoligotype), and insertion of IS6110 in the genomic dnaA-dnaN locus. In this study the suitability and comparability of these three genetic markers to identify members of the Beijing lineage were evaluated. In a well-characterized collection of 1,020 M. tuberculosis isolates representative of the IS6110 RFLP genotypes found in The Netherlands, strains of two clades had spoligotypes characteristic of the Beijing lineage. A set of 19 Beijing reference RFLP patterns was selected to retrieve all Beijing strains from the Dutch database. These reference patterns gave a sensitivity of 98.1% and a specificity of 99.7% for identifying Beijing strains (defined by spoligotyping) in an international database of 1,084 strains. The usefulness of the reference patterns was also assessed with large DNA fingerprint databases in two other European countries and for identification of strains from the W lineage found in the United States. A standardized definition for the identification of M. tuberculosis strains belonging to the Beijing/W lineage, as described in this work, will facilitate further studies on the spread and characterization of this widespread genotype family of M. tuberculosis strains.
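For readers less familiar with the two performance measures quoted in this abstract, the following is a minimal sketch of how sensitivity and specificity are computed from classification counts. The counts used here are hypothetical and chosen only for illustration; the abstract does not report the underlying numbers.

```python
def sensitivity_specificity(tp, fn, tn, fp):
    """Return (sensitivity, specificity) from confusion-matrix counts.

    tp: reference-positive strains that were retrieved
    fn: reference-positive strains that were missed
    tn: reference-negative strains correctly left out
    fp: reference-negative strains wrongly retrieved
    """
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return sensitivity, specificity


# Hypothetical counts for illustration only, not taken from the study.
sens, spec = sensitivity_specificity(tp=90, fn=10, tn=180, fp=20)
print(f"sensitivity={sens:.3f} specificity={spec:.3f}")
```

Here the reference standard plays the role that spoligotyping plays in the abstract, and the test under evaluation corresponds to matching against the reference RFLP patterns.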
Wang, Han; Harrison, Sandy P; Prentice, Iain C; Yang, Yanzheng; Bai, Fan; Togashi, Henrique F; Wang, Meng; Zhou, Shuangxi; Ni, Jian
2018-02-01
Plant functional traits provide information about adaptations to climate and environmental conditions, and can be used to explore the existence of alternative plant strategies within ecosystems. Trait data are also increasingly being used to provide parameter estimates for vegetation models. Here we present a new database of plant functional traits from China. Most global climate and vegetation types can be found in China, and thus the database is relevant for global modeling. The China Plant Trait Database contains information on morphometric, physical, chemical, and photosynthetic traits from 122 sites spanning the range from boreal to tropical, and from deserts and steppes through woodlands and forests, including montane vegetation. Data collection at each site was based either on sampling the dominant species or on a stratified sampling of each ecosystem layer. The database contains information on 1,215 unique species, though many species have been sampled at multiple sites. The original field identifications have been taxonomically standardized to the Flora of China. Similarly, derived photosynthetic traits, such as electron-transport and carboxylation capacities, were calculated using a standardized method. To facilitate trait-environment analyses, the database also contains detailed climate and vegetation information for each site. The data set is released under a Creative Commons BY license. When using the data set, we kindly request that you cite this article, recognizing the hard work that went into collecting the data and the authors' willingness to make it publicly available. © 2017 by the Ecological Society of America.
Reading sky and seeing a cloud: On the relevance of events for perceptual simulation.
Ostarek, Markus; Vigliocco, Gabriella
2017-04-01
Previous research has shown that processing words with an up/down association (e.g., bird, foot) can influence the subsequent identification of visual targets in congruent locations (at the top/bottom of the screen). However, as facilitation and interference were found under similar conditions, the nature of the underlying mechanisms remained unclear. We propose that word comprehension relies on the perceptual simulation of a prototypical event involving the entity denoted by a word in order to provide a general account of the different findings. In 3 experiments, participants had to discriminate between 2 target pictures appearing at the top or the bottom of the screen by pressing the left versus right button. Immediately before the targets appeared, they saw an up/down word belonging to the target's event, an up/down word unrelated to the target, or a spatially neutral control word. Prime words belonging to the target event facilitated identification of targets at a stimulus onset asynchrony (SOA) of 250 ms (Experiment 1), but only when presented in the vertical location where they are typically seen, indicating that targets were integrated in the simulations activated by the prime words. Moreover, at the same SOA, there was a robust facilitation effect for targets appearing in their typical location regardless of the prime type. However, when words were presented for 100 ms (Experiment 2) or 800 ms (Experiment 3), only a location nonspecific priming effect was found, suggesting that the visual system was not activated. Implications for theories of semantic processing are discussed. (PsycINFO Database Record (c) 2017 APA, all rights reserved).
Automated classification of radiology reports to facilitate retrospective study in radiology.
Zhou, Yihua; Amundson, Per K; Yu, Fang; Kessler, Marcus M; Benzinger, Tammie L S; Wippold, Franz J
2014-12-01
Retrospective research is an important tool in radiology. Identifying imaging examinations appropriate for a given research question from the unstructured radiology reports is extremely useful, but labor-intensive. Using the machine learning text-mining methods implemented in LingPipe [1], we evaluated the performance of the dynamic language model (DLM) and the Naïve Bayesian (NB) classifiers in classifying radiology reports to facilitate identification of radiological examinations for research projects. The training dataset consisted of 14,325 sentences from 11,432 radiology reports randomly selected from a database of 5,104,594 reports in all disciplines of radiology. The training sentences were categorized manually into six categories (Positive, Differential, Post Treatment, Negative, Normal, and History). A 10-fold cross-validation [2] was used to evaluate the performance of the models, which were tested in classification of radiology reports for cases of sellar or suprasellar masses and colloid cysts. The average accuracies for the DLM and NB classifiers were 88.5% with 95% confidence interval (CI) of 1.9% and 85.9% with 95% CI of 2.0%, respectively. The DLM performed slightly better and was used to classify 1,397 radiology reports containing the keywords "sellar or suprasellar mass", or "colloid cyst". The DLM model produced an accuracy of 88.2% with 95% CI of 2.1% for 959 reports that contain "sellar or suprasellar mass" and an accuracy of 86.3% with 95% CI of 2.5% for 437 reports of "colloid cyst". We conclude that automated classification of radiology reports using machine learning techniques can effectively facilitate the identification of cases suitable for retrospective research.
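The evaluation design described in this abstract (10-fold cross-validation, with accuracy reported as a mean plus a 95% confidence interval) can be sketched as follows. This is only an illustration under stated assumptions: the fold assignment by stride and the normal-approximation interval are choices made here, not details given in the abstract, and the classifier itself (LingPipe's DLM or NB) is not reproduced.

```python
import math
import statistics


def k_fold_indices(n_items, k=10):
    """Yield (train, test) index lists for k roughly equal folds.

    Assumption: items are assigned to folds by stride; any shuffling
    of the data is left to the caller.
    """
    folds = [list(range(i, n_items, k)) for i in range(k)]
    for i, test in enumerate(folds):
        train = [j for m, fold in enumerate(folds) if m != i for j in fold]
        yield train, test


def mean_with_ci(fold_accuracies, z=1.96):
    """Mean accuracy and normal-approximation 95% half-width across folds."""
    mean = statistics.mean(fold_accuracies)
    spread = statistics.stdev(fold_accuracies)
    half_width = z * spread / math.sqrt(len(fold_accuracies))
    return mean, half_width


# Each labelled sentence lands in the test fold exactly once.
splits = list(k_fold_indices(25, k=10))

# Hypothetical per-fold accuracies, for illustration only.
mean_acc, ci = mean_with_ci([0.8, 0.9])
print(f"accuracy = {mean_acc:.3f} +/- {ci:.3f}")
```

In use, a classifier would be trained on each `train` index list and scored on the matching `test` list, and the ten resulting accuracies passed to `mean_with_ci`.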
Vilar, Santiago; Harpaz, Rave; Chase, Herbert S; Costanzi, Stefano; Rabadan, Raul
2011-01-01
Background: Adverse drug events (ADE) cause considerable harm to patients, and consequently their detection is critical for patient safety. The US Food and Drug Administration maintains an adverse event reporting system (AERS) to facilitate the detection of ADE in drugs. Various data mining approaches have been developed that use AERS to detect signals identifying associations between drugs and ADE. The signals must then be monitored further by domain experts, which is a time-consuming task. Objective: To develop a new methodology that combines existing data mining algorithms with chemical information by analysis of molecular fingerprints to enhance initial ADE signals generated from AERS, and to provide a decision support mechanism to facilitate the identification of novel adverse events. Results: The method achieved a significant improvement in precision in identifying known ADE, and a more than twofold signal enhancement when applied to the ADE rhabdomyolysis. The simplicity of the method assists in highlighting the etiology of the ADE by identifying structurally similar drugs. A set of drugs with strong evidence from both AERS and molecular fingerprint-based modeling is constructed for further analysis. Conclusion: The results demonstrate that the proposed methodology could be used as a pharmacovigilance decision support tool to facilitate ADE detection. PMID:21946238
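The abstract does not say which similarity measure is applied to the molecular fingerprints. As a hedged illustration of the general idea, the sketch below uses the Tanimoto coefficient, a common choice for fingerprint comparison, and scores a candidate drug by its closest structural match among drugs already known to cause an ADE. The fingerprints, drug names, and the `structural_support` helper are all hypothetical.

```python
def tanimoto(fp_a, fp_b):
    """Tanimoto coefficient for fingerprints given as sets of on-bit indices."""
    a, b = set(fp_a), set(fp_b)
    union = a | b
    if not union:
        return 0.0
    return len(a & b) / len(union)


def structural_support(drug, fingerprints, known_ade_drugs):
    """Highest similarity between `drug` and any drug known to cause the ADE.

    Hypothetical helper: a high value suggests the data-mining signal is
    backed by structural resemblance to an established offender.
    """
    return max(
        (tanimoto(fingerprints[drug], fingerprints[other])
         for other in known_ade_drugs if other != drug),
        default=0.0,
    )


# Toy fingerprints (sets of on-bit positions), invented for illustration.
toy_fps = {
    "drug_x": {1, 2, 3},
    "known_1": {2, 3, 4},
    "known_2": {7, 8, 9},
}
support = structural_support("drug_x", toy_fps, ["known_1", "known_2"])
print(support)
```

A combined ranking would then weigh this structural score against the disproportionality signal from the reporting database; how the two are combined in the published method is not stated in the abstract.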
HAEdb: a novel interactive, locus-specific mutation database for the C1 inhibitor gene.
Kalmár, Lajos; Hegedüs, Tamás; Farkas, Henriette; Nagy, Melinda; Tordai, Attila
2005-01-01
Hereditary angioneurotic edema (HAE) is an autosomal dominant disorder characterized by episodic local subcutaneous and submucosal edema and is caused by the deficiency of the activated C1 esterase inhibitor protein (C1-INH or C1INH; approved gene symbol SERPING1). Published C1-INH mutations are represented in large universal databases (e.g., OMIM, HGMD), but these databases update their data rather infrequently, they are not interactive, and they do not allow searches according to different criteria. The HAEdb, a C1-INH gene mutation database (http://hae.biomembrane.hu), was created to meet the following goals: 1) help the comprehensive collection of information on genetic alterations of the C1-INH gene; 2) create a database in which data can be searched and compared according to several flexible criteria; and 3) provide additional help in new mutation identification. The website uses MySQL, an open-source, multithreaded, relational database management system. The user-friendly graphical interface was written in the PHP web programming language. The website consists of two main parts, the freely browsable search function and the password-protected data deposition function. Mutations of the C1-INH gene are divided into two groups: gross mutations involving DNA fragments >1 kb, and micro mutations encompassing all non-gross mutations. Several attributes (e.g., affected exon, molecular consequence, family history) are collected for each mutation in a standardized form. This database may facilitate future comprehensive analyses of C1-INH mutations and also provide regular help for molecular diagnostic testing of HAE patients in different centers.
Flanagan, Keith; Cockell, Simon; Harwood, Colin; Hallinan, Jennifer; Nakjang, Sirintra; Lawry, Beth; Wipat, Anil
2014-06-30
The rapid and cost-effective identification of bacterial species is crucial, especially for clinical diagnosis and treatment. Peptide aptamers have been shown to be valuable for use as a component of novel, direct detection methods. These small peptides have a number of advantages over antibodies, including greater specificity and longer shelf life. These properties facilitate their use as the detector components of biosensor devices. However, the identification of suitable aptamer targets for particular groups of organisms is challenging. We present a semi-automated processing pipeline for the identification of candidate aptamer targets from whole bacterial genome sequences. The pipeline can be configured to search for protein sequence fragments that uniquely identify a set of strains of interest. The system is also capable of identifying additional organisms that may be of interest due to their possession of protein fragments in common with the initial set. Through the use of Cloud computing technology and distributed databases, our system is capable of scaling with the rapidly growing genome repositories, and consequently of keeping the resulting data sets up-to-date. The system described is also more generically applicable to the discovery of specific targets for other diagnostic approaches such as DNA probes, PCR primers and antibodies.
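The core search step described in this abstract, finding protein sequence fragments that identify a chosen set of strains and no others, can be illustrated with a small set-based sketch. Fixed-length fragments (k-mers), the fragment length, and the toy sequences are assumptions made here; the abstract does not describe the pipeline's actual algorithm or its cloud and database layers.

```python
def fragments(protein, k):
    """All length-k substrings of one protein sequence."""
    return {protein[i:i + k] for i in range(len(protein) - k + 1)}


def strain_fragments(proteins, k):
    """Union of fragments over every protein of one strain."""
    found = set()
    for protein in proteins:
        found |= fragments(protein, k)
    return found


def unique_shared_fragments(target_strains, other_strains, k=8):
    """Fragments present in every target strain and absent from all others."""
    if not target_strains:
        return set()
    shared = set.intersection(
        *(strain_fragments(proteins, k) for proteins in target_strains)
    )
    for proteins in other_strains:
        shared -= strain_fragments(proteins, k)
    return shared


# Toy proteomes (one short protein per strain), invented for illustration.
targets = [["MKTAYIAK"], ["XXMKTAYZ"]]
others = [["QKTAYQ"]]
candidates = unique_shared_fragments(targets, others, k=4)
print(sorted(candidates))
```

Reversing the question (which non-target strains share a fragment with the target set) gives the second capability the abstract mentions, identifying additional organisms of interest.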
Christner, Martin; Trusch, Maria; Rohde, Holger; Kwiatkowski, Marcel; Schlüter, Hartmut; Wolters, Manuel; Aepfelbacher, Martin; Hentschke, Moritz
2014-01-01
In 2011 northern Germany experienced a large outbreak of Shiga-Toxigenic Escherichia coli O104:H4. The large number of samples sent to microbiology laboratories for epidemiological assessment highlighted the importance of fast and inexpensive typing procedures. We have therefore evaluated the applicability of a MALDI-TOF mass spectrometry-based strategy for outbreak strain identification. Specific peaks in the outbreak strain's spectrum were identified by comparative analysis of archived pre-outbreak spectra that had been acquired for routine species-level identification. Proteins underlying these discriminatory peaks were identified by liquid chromatography tandem mass spectrometry and validated against publicly available databases. The resulting typing scheme was evaluated against PCR genotyping with 294 E. coli isolates from clinical samples collected during the outbreak. Comparative spectrum analysis revealed two characteristic peaks at m/z 6711 and m/z 10883. The underlying proteins were found to be of low prevalence among genome sequenced E. coli strains. Marker peak detection correctly classified 292 of 293 study isolates, including all 104 outbreak isolates. MALDI-TOF mass spectrometry allowed for reliable outbreak strain identification during a large outbreak of Shiga-Toxigenic E. coli. The applied typing strategy could probably be adapted to other typing tasks and might facilitate epidemiological surveys as part of the routine pathogen identification workflow.
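Marker peak detection of the kind described in this abstract can be sketched as a simple tolerance check on a peak list. The two m/z values come from the abstract; the tolerance window and the rule that both peaks must be present are assumptions made for illustration, since the abstract does not state the decision rule.

```python
# Marker peaks reported in the abstract (m/z).
MARKER_MZ = (6711.0, 10883.0)


def has_peak(peak_list, target_mz, tolerance):
    """True if any observed peak lies within +/- tolerance of target_mz."""
    return any(abs(mz - target_mz) <= tolerance for mz in peak_list)


def looks_like_outbreak_strain(peak_list, tolerance=5.0):
    """Assumed rule: flag a spectrum only when both marker peaks are found.

    The tolerance of 5 m/z units is a placeholder, not a published value.
    """
    return all(has_peak(peak_list, mz, tolerance) for mz in MARKER_MZ)


# Hypothetical peak lists for illustration.
print(looks_like_outbreak_strain([4365.3, 6710.2, 9000.0, 10884.1]))
print(looks_like_outbreak_strain([4365.3, 6711.0, 9000.0]))
```

In practice the tolerance would be set from the mass accuracy of the instrument and validated against a reference method, as the study did with PCR genotyping.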
Development of a conceptual integrated traffic safety problem identification database
DOT National Transportation Integrated Search
1999-12-01
The project conceptualized a traffic safety risk management information system and statistical database for improved problem-driver identification, countermeasure development, and resource allocation. The California Department of Motor Vehicles Drive...
9 CFR 55.25 - Animal identification.
Code of Federal Regulations, 2011 CFR
2011-01-01
... Database. The second animal identification must be unique for the individual animal within the herd and also must be linked to that animal and herd in the CWD National Database. (Approved by the Office of...
9 CFR 55.25 - Animal identification.
Code of Federal Regulations, 2012 CFR
2012-01-01
... Database. The second animal identification must be unique for the individual animal within the herd and also must be linked to that animal and herd in the CWD National Database. (Approved by the Office of...
9 CFR 55.25 - Animal identification.
Code of Federal Regulations, 2010 CFR
2010-01-01
... Database. The second animal identification must be unique for the individual animal within the herd and also must be linked to that animal and herd in the CWD National Database. (Approved by the Office of...
Seng, Piseth; Drancourt, Michel; Gouriet, Frédérique; La Scola, Bernard; Fournier, Pierre-Edouard; Rolain, Jean Marc; Raoult, Didier
2009-08-15
Matrix-assisted laser desorption ionization time-of-flight (MALDI-TOF) mass spectrometry accurately identifies both selected bacteria and bacteria in select clinical situations. It has not been evaluated for routine use in the clinic. We prospectively analyzed routine MALDI-TOF mass spectrometry identification in parallel with conventional phenotypic identification of bacteria regardless of phylum or source of isolation. Discrepancies were resolved by 16S ribosomal RNA and rpoB gene sequence-based molecular identification. Colonies (4 spots per isolate directly deposited on the MALDI-TOF plate) were analyzed using an Autoflex II Bruker Daltonik mass spectrometer. Peptidic spectra were compared with the Bruker BioTyper database, version 2.0, and the identification score was noted. Delays and costs of identification were measured. Of 1660 bacterial isolates analyzed, 95.4% were correctly identified by MALDI-TOF mass spectrometry; 84.1% were identified at the species level, and 11.3% were identified at the genus level. In most cases, absence of identification (2.8% of isolates) and erroneous identification (1.7% of isolates) were due to improper database entries. Accurate MALDI-TOF mass spectrometry identification was significantly correlated with having 10 reference spectra in the database (P=.01). The mean time required for MALDI-TOF mass spectrometry identification of 1 isolate was 6 minutes, at an estimated 22%-32% of the cost of current methods of identification. MALDI-TOF mass spectrometry is a cost-effective, accurate method for routine identification of bacterial isolates in <1 h using a database comprising ≥10 reference spectra per bacterial species and a 1.9 identification score (Bruker system). It may replace Gram staining and biochemical identification in the near future.
FunRich proteomics software analysis, let the fun begin!
Benito-Martin, Alberto; Peinado, Héctor
2015-08-01
Protein MS analysis is the preferred method for unbiased protein identification. It is normally applied to a large number of both small-scale and high-throughput studies. However, user-friendly computational tools for protein analysis are still needed. In this issue, Mathivanan and colleagues (Proteomics 2015, 15, 2597-2601) report the development of FunRich software, an open-access software that facilitates the analysis of proteomics data, providing tools for functional enrichment and interaction network analysis of genes and proteins. FunRich is a reinterpretation of proteomic software, a standalone tool combining ease of use with customizable databases, free access, and graphical representations. © 2015 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
NASA Astrophysics Data System (ADS)
Song, Yang; Laskay, Ünige A.; Vilcins, Inger-Marie E.; Barbour, Alan G.; Wysocki, Vicki H.
2015-11-01
Ticks are vectors for disease transmission because they are indiscriminate in their feeding on multiple vertebrate hosts, transmitting pathogens between their hosts. Identifying the hosts on which ticks have fed is important for disease prevention and intervention. We have previously shown that hemoglobin (Hb) remnants from a host on which a tick fed can be used to reveal the host's identity. For the present research, blood was collected from 33 bird species that are common in the U.S. as hosts for ticks but that have unknown Hb sequences. A top-down-assisted bottom-up mass spectrometry approach with a customized searching database, based on variability in known bird hemoglobin sequences, has been devised to facilitate fast and complete sequencing of hemoglobin from birds with unknown sequences. These hemoglobin sequences will be added to a hemoglobin database and used for tick host identification. The general approach has the potential to sequence any set of homologous proteins completely in a rapid manner.
Singh, Anushikha; Dutta, Malay Kishore; Sharma, Dilip Kumar
2016-10-01
Identification of fundus images during transmission and storage in databases for tele-ophthalmology applications is an important issue in the modern era. The proposed work presents a novel, accurate method for generating a unique identification code for fundus images in tele-ophthalmology applications and database storage. Unlike existing methods of steganography and watermarking, this method does not tamper with the medical image, as nothing is embedded in this approach and there is no loss of medical information. The segmented blood vessel pattern near the optic disc is strategically combined with the patient ID to generate a unique identification code for the digital fundus image. The proposed method of medical image identification is tested on the publicly available DRIVE and MESSIDOR databases of fundus images, and the results are encouraging. Experimental results indicate the uniqueness of the identification code and lossless recovery of patient identity from the unique identification code for integrity verification of fundus images. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.
miRToolsGallery: a tag-based and rankable microRNA bioinformatics resources database portal
Chen, Liang; Heikkinen, Liisa; Wang, ChangLiang; Yang, Yang; Knott, K Emily
2018-01-01
Hundreds of bioinformatics tools have been developed for microRNA (miRNA) investigations, including those used for identification, target prediction, structure and expression profile analysis. However, finding the correct tool for a specific application requires the tedious and laborious process of locating, downloading, testing and validating the appropriate tool from a group of nearly a thousand. In order to facilitate this process, we developed a novel database portal named miRToolsGallery. We constructed the portal by manually curating > 950 miRNA analysis tools and resources. In the portal, locating the appropriate tool is expedited because entries are searchable, filterable and rankable. The ranking feature is vital for quickly distinguishing the more useful tools from the obscure ones. Tools are ranked via different criteria including the PageRank algorithm, date of publication, number of citations, average of votes and number of publications. miRToolsGallery provides links and data for the comprehensive collection of currently available miRNA tools with a ranking function which can be adjusted using different criteria according to specific requirements. Database URL: http://www.mirtoolsgallery.org PMID:29688355
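Of the ranking criteria listed in this abstract, PageRank is the only one that is an algorithm rather than a simple count. The sketch below shows the standard power-iteration form on a tiny link graph. What the portal actually treats as nodes and links is not stated in the abstract, so the graph, the damping factor of 0.85, and the iteration count are assumptions for illustration.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Power-iteration PageRank.

    links: dict mapping each node to the list of nodes it points to.
    Nodes with no outgoing links spread their rank evenly over all nodes.
    """
    nodes = set(links) | {t for targets in links.values() for t in targets}
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        new_rank = {node: (1.0 - damping) / n for node in nodes}
        for node in nodes:
            targets = links.get(node, [])
            if targets:
                share = damping * rank[node] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                share = damping * rank[node] / n
                for target in nodes:
                    new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical link graph between three tool pages.
toy_links = {"tool_a": ["tool_c"], "tool_b": ["tool_c"], "tool_c": ["tool_a"]}
ranks = pagerank(toy_links)
ordered = sorted(ranks, key=ranks.get, reverse=True)
print(ordered)
```

A portal could sort on this score alone or let the user switch to another criterion such as citation count, which matches the adjustable ranking the abstract describes.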
dbCPG: A web resource for cancer predisposition genes.
Wei, Ran; Yao, Yao; Yang, Wu; Zheng, Chun-Hou; Zhao, Min; Xia, Junfeng
2016-06-21
Cancer predisposition genes (CPGs) are genes in which inherited mutations confer highly or moderately increased risks of developing cancer. Identification of these genes and understanding the biological mechanisms that underlie them is crucial for the prevention, early diagnosis, and optimized management of cancer. Over the past decades, great efforts have been made to identify CPGs through multiple strategies. However, information on these CPGs and their molecular functions is scattered. To address this issue and provide a comprehensive resource for researchers, we developed the Cancer Predisposition Gene Database (dbCPG, Database URL: http://bioinfo.ahu.edu.cn:8080/dbCPG/index.jsp), the first literature-based gene resource for exploring human CPGs. It contains 827 human (724 protein-coding, 23 non-coding, and 80 unknown type genes), 637 rat, and 658 mouse CPGs. Furthermore, data mining was performed to gain insights into the CPG data, including functional annotation, gene prioritization, network analysis of prioritized genes and overlap analysis across multiple cancer types. A user-friendly web interface with multiple browse, search, and upload functions was also developed to facilitate access to the latest information on CPGs. Taken together, the dbCPG database provides a comprehensive data resource for further studies of cancer predisposition genes.
Shah, Eric D; Fisch, Brandon M A; Arceci, Robert J; Buckley, Jonathan D; Reaman, Gregory H; Sorensen, Poul H; Triche, Timothy J; Reynolds, C Patrick
2014-05-01
Academic laboratories are developing increasingly large amounts of data that describe the genomic landscape and gene expression patterns of various types of cancers. Such data can potentially identify novel oncology molecular targets in cancer types that may not be the primary focus of a drug sponsor's initial research for an investigational new drug. Obtaining preclinical data that point toward the potential for a given molecularly targeted agent, or a novel combination of agents, requires knowledge of drugs currently in development in both the academic and commercial sectors. We have developed the DrugPath database (http://www.drugpath.org) as a comprehensive, free-of-charge resource for academic investigators to identify agents being developed in academia or industry that may act against molecular targets of interest. DrugPath data on molecular targets overlay the Michigan Molecular Interactions (http://mimi.ncibi.org) gene-gene interaction map to facilitate identification of related agents in the same pathway. The database catalogs 2,081 drug development programs representing 751 drug sponsors and 722 molecular and genetic targets. DrugPath should assist investigators in identifying and obtaining drugs acting on specific molecular targets for biological and preclinical therapeutic studies.
Palm-Vein Classification Based on Principal Orientation Features
Zhou, Yujia; Liu, Yaqin; Feng, Qianjin; Yang, Feng; Huang, Jing; Nie, Yixiao
2014-01-01
Personal recognition using palm–vein patterns has emerged as a promising alternative for human recognition because of its uniqueness, stability, live body identification, flexibility, and difficulty to cheat. With the expanding application of palm–vein pattern recognition, the corresponding growth of the database has resulted in a long response time. To shorten the response time of identification, this paper proposes a simple and useful classification for palm–vein identification based on principal direction features. In the registration process, the Gaussian-Radon transform is adopted to extract the orientation matrix and then compute the principal direction of a palm–vein image based on the orientation matrix. The database can be classified into six bins based on the value of the principal direction. In the identification process, the principal direction of the test sample is first extracted to ascertain the corresponding bin. One-by-one matching with the training samples is then performed in the bin. To improve recognition efficiency while maintaining better recognition accuracy, two neighborhood bins of the corresponding bin are continuously searched to identify the input palm–vein image. Evaluation experiments are conducted on three different databases, namely, PolyU, CASIA, and the database of this study. Experimental results show that the searching range of one test sample in PolyU, CASIA and our database by the proposed method for palm–vein identification can be reduced to 14.29%, 14.50%, and 14.28%, with retrieval accuracy of 96.67%, 96.00%, and 97.71%, respectively. With 10,000 training samples in the database, the execution time of the identification process by the traditional method is 18.56 s, while that by the proposed approach is 3.16 s. The experimental results confirm that the proposed approach is more efficient than the traditional method, especially for a large database. PMID:25383715
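The retrieval strategy in this abstract (assign each template to one of six bins by principal direction, then match only within the query's bin and its two neighboring bins) can be sketched as below. The angular range of 0 to 180 degrees, equal-width bins, and wrap-around neighbors are assumptions made here; the abstract does not specify them, and the Gaussian-Radon feature extraction and the matching step are not reproduced.

```python
N_BINS = 6


def bin_of(direction_deg):
    """Map a principal direction (degrees) to one of six equal-width bins.

    Assumption: directions are orientations in [0, 180), so 180 wraps to 0.
    """
    return int((direction_deg % 180.0) // (180.0 / N_BINS))


def build_index(templates):
    """Registration: group (template_id, direction_deg) pairs by bin."""
    index = {b: [] for b in range(N_BINS)}
    for template_id, direction_deg in templates:
        index[bin_of(direction_deg)].append(template_id)
    return index


def candidates(index, query_direction_deg):
    """Identification: the query's own bin first, then its two neighbors."""
    b = bin_of(query_direction_deg)
    search_order = [b, (b - 1) % N_BINS, (b + 1) % N_BINS]
    return [tid for k in search_order for tid in index[k]]


# Hypothetical enrolled templates: (id, principal direction in degrees).
enrolled = [("a", 10.0), ("b", 40.0), ("c", 100.0), ("d", 170.0)]
index = build_index(enrolled)
shortlist = candidates(index, 5.0)
print(shortlist)
```

One-by-one matching would then run only over `shortlist`, which is how the approach shrinks the search range relative to scanning every enrolled template.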
Suchard, Marc A; Zorych, Ivan; Simpson, Shawn E; Schuemie, Martijn J; Ryan, Patrick B; Madigan, David
2013-10-01
The self-controlled case series (SCCS) offers potential as a statistical method for risk identification involving medical products from large-scale observational healthcare data. However, analytic design choices remain in encoding the longitudinal health records into the SCCS framework, and its risk identification performance across real-world databases is unknown. Our objective was to evaluate the performance of SCCS and its design choices as a tool for risk identification in observational healthcare data. We examined the risk identification performance of SCCS across five design choices using 399 drug-health outcome pairs in five real observational databases (four administrative claims and one electronic health records). In these databases, the pairs involve 165 positive controls and 234 negative controls. We also consider several synthetic databases with known relative risks between drug-outcome pairs. We evaluate risk identification performance through estimating the area under the receiver operating characteristic curve (AUC) and bias and coverage probability in the synthetic examples. The SCCS achieves strong predictive performance. Twelve of the twenty health outcome-database scenarios return AUCs >0.75 across all drugs. Including all adverse events instead of just the first per patient and applying a multivariate adjustment for concomitant drug use are the most important design choices. However, the SCCS as applied here returns relative risk point-estimates biased towards the null value of 1 with low coverage probability. The SCCS recently extended to apply a multivariate adjustment for concomitant drug use offers promise as a statistical tool for risk identification in large-scale observational healthcare databases. Poor estimator calibration dampens enthusiasm, but ongoing work should correct this shortcoming.
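The AUC used in this abstract measures how well a method's effect estimates separate positive controls from negative controls. A minimal sketch of the rank-based (Mann-Whitney) form follows; the scores are hypothetical relative-risk estimates invented for illustration, and nothing here reproduces the SCCS model itself.

```python
def auc(positive_scores, negative_scores):
    """Probability that a random positive control outscores a random negative.

    Ties count as half a win, which gives the usual rank-based AUC.
    """
    wins = 0.0
    for p in positive_scores:
        for n in negative_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positive_scores) * len(negative_scores))


# Hypothetical estimates for three positive and two negative control pairs.
value = auc([2.0, 1.5, 0.9], [1.0, 0.8])
print(round(value, 4))
```

An AUC of 0.5 corresponds to no discrimination and 1.0 to perfect separation, which is the scale on which the abstract's threshold of 0.75 should be read.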
IRIS: A database application system for diseases identification using FTIR spectroscopy
NASA Astrophysics Data System (ADS)
Arshad, Ahmad Zulhilmi; Munajat, Yusof; Ibrahim, Raja Kamarulzaman Raja; Mahmood, Nasrul Humaimi
2015-05-01
Infrared Information on Diseases Identification System (IRIS) is an application for disease identification and analysis using Fourier transform infrared (FTIR) spectroscopy. As a preliminary step, secondary data extracted from various recognized research and scientific papers were gathered and combined into a single database, IRIS, for the purpose of our study. The database is intended for examining the fingerprint differences between normal and diseased cells or tissues. With the implementation of this application, it is hoped that disease identification using FTIR spectroscopy will become more reliable and may assist physicians, pathologists, and researchers in diagnosing particular types of disease efficiently.
9 CFR 81.2 - Identification of deer, elk, and moose in interstate commerce.
Code of Federal Regulations, 2011 CFR
2011-01-01
... is linked to that animal in the CWD National Database. The second animal identification must be... CWD National Database. (Approved by the Office of Management and Budget under control number 0579-0237) ...
9 CFR 81.2 - Identification of deer, elk, and moose in interstate commerce.
Code of Federal Regulations, 2010 CFR
2010-01-01
... is linked to that animal in the CWD National Database. The second animal identification must be... CWD National Database. (Approved by the Office of Management and Budget under control number 0579-0237) ...
Federal Register 2010, 2011, 2012, 2013, 2014
2013-01-31
... Boating Accident Report Database AGENCY: Coast Guard, DHS. ACTION: Rule; information collection approval... Identification System, and Boating Accident Report Database rule became effective on April 27, 2012. Under the...
9 CFR 81.2 - Identification of deer, elk, and moose in interstate commerce.
Code of Federal Regulations, 2012 CFR
2012-01-01
... is linked to that animal in the CWD National Database. The second animal identification must be... CWD National Database. (Approved by the Office of Management and Budget under control number 0579-0237) ...
Taverna, Constanza Giselle; Mazza, Mariana; Bueno, Nadia Soledad; Alvarez, Christian; Amigot, Susana; Andreani, Mariana; Azula, Natalia; Barrios, Rubén; Fernández, Norma; Fox, Barbara; Guelfand, Liliana; Maldonado, Ivana; Murisengo, Omar Alejandro; Relloso, Silvia; Vivot, Matias; Davel, Graciela
2018-05-11
Matrix-assisted laser desorption/ionization-time of flight mass spectrometry (MALDI-TOF MS) has revolutionized the identification of microorganisms in clinical laboratories because it is rapid, relatively simple to use, accurate, and applicable to a wide range of microorganisms. Several studies have demonstrated the utility of this technique in the identification of yeasts; however, its performance is usually improved by the extension of the database. Here we developed an in-house database of 143 strains belonging to 42 yeast species in the MALDI Biotyper platform, and we validated the extended database with 388 regional strains and 15 reference strains belonging to 55 yeast species. We also performed an intra- and interlaboratory study to assess reproducibility and analyzed the use of the cutoff values of 1.700 and 2.000 for correct identification at the species level. The creation of an in-house database that extended the manufacturer's database was successful in that no incorrect identification was introduced. The best performance was observed by using the extended database and a cutoff value of 1.700, with a sensitivity of 0.94 and specificity of 0.96. The reproducibility study proved useful for detecting deviations and could be used for external quality control. The extended database was able to differentiate closely related species and it has potential in distinguishing the molecular genotypes of Cryptococcus neoformans and Cryptococcus gattii.
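To make the cutoff comparison above concrete, the sketch below evaluates sensitivity and specificity of a score cutoff over a set of identification results. The abstract does not spell out how true and false positives were counted, so the definitions here are an assumption: a result is "accepted" when its score reaches the cutoff, sensitivity is the share of correct identifications that are accepted, and specificity is the share of incorrect identifications that are rejected. The data layout is hypothetical.

```python
def sens_spec(results, cutoff):
    """Sensitivity and specificity of accepting results with score >= cutoff.

    results: list of (score, is_correct) pairs, where is_correct says whether
    the top database hit matched the reference identification.
    """
    tp = sum(1 for score, ok in results if ok and score >= cutoff)
    fn = sum(1 for score, ok in results if ok and score < cutoff)
    tn = sum(1 for score, ok in results if not ok and score < cutoff)
    fp = sum(1 for score, ok in results if not ok and score >= cutoff)
    return tp / (tp + fn), tn / (tn + fp)
```

Calling `sens_spec(results, 1.7)` and `sens_spec(results, 2.0)` on the same results shows the usual trade-off: raising the cutoff rejects more incorrect hits but also more correct ones.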
Identification of Enzyme Genes Using Chemical Structure Alignments of Substrate-Product Pairs.
Moriya, Yuki; Yamada, Takuji; Okuda, Shujiro; Nakagawa, Zenichi; Kotera, Masaaki; Tokimatsu, Toshiaki; Kanehisa, Minoru; Goto, Susumu
2016-03-28
Although there are several databases that contain data on many metabolites and reactions in biochemical pathways, there is still a big gap in the numbers between experimentally identified enzymes and metabolites. It is supposed that many catalytic enzyme genes are still unknown. Although there are previous studies that estimate the number of candidate enzyme genes, these studies required some additional information aside from the structures of metabolites such as gene expression and order in the genome. In this study, we developed a novel method to identify a candidate enzyme gene of a reaction using the chemical structures of the substrate-product pair (reactant pair). The proposed method is based on a search for similar reactant pairs in a reference database and offers ortholog groups that possibly mediate the given reaction. We applied the proposed method to two experimentally validated reactions. As a result, we confirmed that the histidine transaminase was correctly identified. Although our method could not directly identify the asparagine oxo-acid transaminase, we successfully found the paralog gene most similar to the correct enzyme gene. We also applied our method to infer candidate enzyme genes in the mesaconate pathway. The advantage of our method lies in the prediction of possible genes for orphan enzyme reactions where any associated gene sequences are not determined yet. We believe that this approach will facilitate experimental identification of genes for orphan enzymes.
Barrett, Aileen; Galvin, Rose; Steinert, Yvonne; Scherpbier, Albert; O'Shaughnessy, Ann; Horgan, Mary; Horsley, Tanya
2016-12-01
The extent to which workplace-based assessment (WBA) can be used as a facilitator of change among trainee doctors has not been established; this is particularly important in the case of underperforming trainees. The aim of this review is to examine the use of WBA in identifying and remediating performance among this cohort. Following publication of a review protocol a comprehensive search of eight databases took place to identify relevant articles published prior to November 2015. All screening, data extraction and analysis procedures were performed in duplicate or with quality checks and necessary consensus methods throughout. Given the study-level heterogeneity, a descriptive synthesis approach informed the study analysis. Twenty studies met the inclusion criteria. The use of WBA within the context of remediation is not supported within the existing literature. The identification of underperformance is not supported by the use of stand-alone, single-assessor WBA events although specific areas of underperformance may be identified. Multisource feedback (MSF) tools may facilitate identification of underperformance. The extent to which WBA can be used to detect and manage underperformance in postgraduate trainees is unclear although evidence to date suggests that multirater assessments (i.e. MSF) may be of more use than single-rater judgments (e.g. mini-clinical evaluation exercise).
The Porifera Ontology (PORO): enhancing sponge systematics with an anatomy ontology.
Thacker, Robert W; Díaz, Maria Cristina; Kerner, Adeline; Vignes-Lebbe, Régine; Segerdell, Erik; Haendel, Melissa A; Mungall, Christopher J
2014-01-01
Porifera (sponges) are ancient basal metazoans that lack organs. They provide insight into key evolutionary transitions, such as the emergence of multicellularity and the nervous system. In addition, their ability to synthesize unusual compounds offers potential biotechnical applications. However, much of the knowledge of these organisms has not previously been codified in a machine-readable way using modern web standards. The Porifera Ontology is intended as a standardized coding system for sponge anatomical features currently used in systematics. The ontology is available from http://purl.obolibrary.org/obo/poro.owl, or from the project homepage http://porifera-ontology.googlecode.com/. The version referred to in this manuscript is permanently available from http://purl.obolibrary.org/obo/poro/releases/2014-03-06/. By standardizing character representations, we hope to facilitate more rapid description and identification of sponge taxa, to allow integration with other evolutionary database systems, and to perform character mapping across the major clades of sponges to better understand the evolution of morphological features. Future applications of the ontology will focus on creating (1) ontology-based species descriptions; (2) taxonomic keys that use the nested terms of the ontology to more quickly facilitate species identifications; and (3) methods to map anatomical characters onto molecular phylogenies of sponges. In addition to modern taxa, the ontology is being extended to include features of fossil taxa.
Vlek, Anneloes; Kolecka, Anna; Khayhan, Kantarawee; Theelen, Bart; Groenewald, Marizeth; Boel, Edwin
2014-01-01
An interlaboratory study using matrix-assisted laser desorption ionization–time of flight mass spectrometry (MALDI-TOF MS) to determine the identification of clinically important yeasts (n = 35) was performed at 11 clinical centers, one company, and one reference center using the Bruker Daltonics MALDI Biotyper system. The optimal cutoff for the MALDI-TOF MS score was investigated using receiver operating characteristic (ROC) curve analyses. The percentages of correct identifications were compared for different sample preparation methods and different databases. Logistic regression analysis was performed to analyze the association between the number of spectra in the database and the percentage of strains that were correctly identified. A total of 5,460 MALDI-TOF MS results were obtained. Using all results, the area under the ROC curve was 0.95 (95% confidence interval [CI], 0.94 to 0.96). With a sensitivity of 0.84 and a specificity of 0.97, a cutoff value of 1.7 was considered optimal. The overall percentage of correct identifications (formic acid-ethanol extraction method, score ≥ 1.7) was 61.5% when the commercial Bruker Daltonics database (BDAL) was used, and it increased to 86.8% by using an extended BDAL supplemented with a Centraalbureau voor Schimmelcultures (CBS)-KNAW Fungal Biodiversity Centre in-house database (BDAL+CBS in-house). A greater number of main spectra (MSP) in the database was associated with a higher percentage of correct identifications (odds ratio [OR], 1.10; 95% CI, 1.05 to 1.15; P < 0.01). The results from the direct transfer method ranged from 0% to 82.9% correct identifications, with the results of the top four centers ranging from 71.4% to 82.9% correct identifications. This study supports the use of a cutoff value of 1.7 for the identification of yeasts using MALDI-TOF MS. The inclusion of enough isolates of the same species in the database can enhance the proportion of correctly identified strains. 
Further optimization of the preparation methods, especially of the direct transfer method, may contribute to improved diagnosis of yeast-related infections. PMID:24920782
Introducing the Forensic Research/Reference on Genetics knowledge base, FROG-kb.
Rajeevan, Haseena; Soundararajan, Usha; Pakstis, Andrew J; Kidd, Kenneth K
2012-09-01
Online tools and databases based on multi-allelic short tandem repeat polymorphisms (STRPs) are actively used in forensic teaching, research, and investigations. The Fst value of each CODIS marker tends to be low across the populations of the world and most populations typically have all the common STRP alleles present diminishing the ability of these systems to discriminate ethnicity. Recently, considerable research is being conducted on single nucleotide polymorphisms (SNPs) to be considered for human identification and description. However, online tools and databases that can be used for forensic research and investigation are limited. The back end DBMS (Database Management System) for FROG-kb is Oracle version 10. The front end is implemented with specific code using technologies such as Java, Java Servlet, JSP, JQuery, and GoogleCharts. We present an open access web application, FROG-kb (Forensic Research/Reference on Genetics-knowledge base, http://frog.med.yale.edu), that is useful for teaching and research relevant to forensics and can serve as a tool facilitating forensic practice. The underlying data for FROG-kb are provided by the already extensively used and referenced ALlele FREquency Database, ALFRED (http://alfred.med.yale.edu). In addition to displaying data in an organized manner, computational tools that use the underlying allele frequencies with user-provided data are implemented in FROG-kb. These tools are organized by the different published SNP/marker panels available. This web tool currently has implemented general functions possible for two types of SNP panels, individual identification and ancestry inference, and a prediction function specific to a phenotype informative panel for eye color. The current online version of FROG-kb already provides new and useful functionality. 
We expect FROG-kb to grow and expand in capabilities and welcome input from the forensic community in identifying datasets and functionalities that will be most helpful and useful. Thus, the structure and functionality of FROG-kb will be revised in an ongoing process of improvement. This paper describes the state as of early June 2012.
Prakash, Peralam Yegneswaran; Irinyi, Laszlo; Halliday, Catriona; Chen, Sharon; Robert, Vincent
2017-01-01
The increase in public online databases dedicated to fungal identification is noteworthy. This can be attributed to improved access to molecular approaches to characterize fungi, as well as to delineate species within specific fungal groups in the last 2 decades, leading to an ever-increasing complexity of taxonomic assortments and nomenclatural reassignments. Thus, well-curated fungal databases with substantial accurate sequence data play a pivotal role for further research and diagnostics in the field of mycology. This minireview aims to provide an overview of currently available online databases for the taxonomy and identification of human and animal-pathogenic fungi and calls for the establishment of a cloud-based dynamic data network platform. PMID:28179406
Pattin, Kristine A.; Moore, Jason H.
2009-01-01
One of the central goals of human genetics is the identification of loci with alleles or genotypes that confer increased susceptibility. The availability of dense maps of single-nucleotide polymorphisms (SNPs) along with high-throughput genotyping technologies has set the stage for routine genome-wide association studies that are expected to significantly improve our ability to identify susceptibility loci. Before this promise can be realized, there are some significant challenges that need to be addressed. We address here the challenge of detecting epistasis or gene-gene interactions in genome-wide association studies. Discovering epistatic interactions in high dimensional datasets remains a challenge due to the computational complexity resulting from the analysis of all possible combinations of SNPs. One potential way to overcome the computational burden of a genome-wide epistasis analysis would be to devise a logical way to prioritize the many SNPs in a dataset so that the data may be analyzed more efficiently and yet still retain important biological information. One of the strongest demonstrations of the functional relationship between genes is protein-protein interaction. Thus, it is plausible that the expert knowledge extracted from protein interaction databases may allow for a more efficient analysis of genome-wide studies as well as facilitate the biological interpretation of the data. In this review we will discuss the challenges of detecting epistasis in genome-wide genetic studies and the means by which we propose to apply expert knowledge extracted from protein interaction databases to facilitate this process. We explore some of the fundamentals of protein interactions and the databases that are publicly available. PMID:18551320
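The computational burden mentioned above, and the idea of using protein interaction data to prioritize SNP pairs, can be illustrated with a short sketch. It is an illustration under stated assumptions, not the reviewers' method: the SNP-to-gene mapping, the interaction list, and the function names are hypothetical.

```python
from math import comb


def n_pairwise_tests(n_snps):
    """Number of distinct SNP pairs in an exhaustive two-locus scan."""
    return comb(n_snps, 2)


def prioritized_pairs(snp_to_gene, interacting_gene_pairs):
    """Keep only SNP pairs whose genes encode proteins listed as interacting.

    snp_to_gene: dict mapping SNP id -> gene symbol (hypothetical annotation).
    interacting_gene_pairs: iterable of (gene_a, gene_b) tuples taken from a
    protein interaction database.
    """
    interacting = {frozenset(pair) for pair in interacting_gene_pairs}
    snps = sorted(snp_to_gene)
    return [
        (a, b)
        for i, a in enumerate(snps)
        for b in snps[i + 1:]
        if frozenset((snp_to_gene[a], snp_to_gene[b])) in interacting
    ]
```

For an assumed panel of 500,000 SNPs, `n_pairwise_tests` gives roughly 1.25 x 10^11 pairs, which is the scale that motivates filtering pairs before testing them.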
DOE Office of Scientific and Technical Information (OSTI.GOV)
Webb-Robertson, Bobbie-Jo M.
Accurate identification of peptides is a current challenge in mass spectrometry (MS) based proteomics. The standard approach uses a search routine to compare tandem mass spectra to a database of peptides associated with the target organism. These database search routines yield multiple metrics associated with the quality of the mapping of the experimental spectrum to the theoretical spectrum of a peptide. The structure of these results makes separating correct from false identifications difficult and has created a false identification problem. Statistical confidence scores are an approach to battle this false positive problem that has led to significant improvements in peptide identification. We have shown that machine learning, specifically support vector machine (SVM), is an effective approach to separating true peptide identifications from false ones. The SVM-based peptide statistical scoring method transforms a peptide into a vector representation based on database search metrics to train and validate the SVM. In practice, following the database search routine, a peptide is denoted in its vector representation and the SVM generates a single statistical score that is then used to classify presence or absence in the sample.
Agustini, Bruna Carla; Silva, Luciano Paulino; Bloch, Carlos; Bonfim, Tania M B; da Silva, Gildo Almeida
2014-06-01
Yeast identification using traditional methods, which employ morphological, physiological, and biochemical characteristics, can be considered a hard task, as it requires experienced microbiologists and rigorous control of culture conditions, variations in which could lead to different outcomes. Considering clinical or industrial applications, the fast and accurate identification of microorganisms is a growing demand. Hence, molecular biology approaches have been extensively used and, more recently, protein profiling using matrix-assisted laser desorption/ionization time-of-flight mass spectrometry (MALDI-TOF MS) has proved to be an even more efficient tool for taxonomic purposes. Nonetheless, with regard to mass spectrometry, the data available for the differentiation of yeast species for industrial purposes are limited, and commercially available reference databases comprise almost exclusively clinical microorganisms. In this context, studies focusing on environmental isolates are required to extend the existing databases. The development of a supplementary database and the assessment of a commercial database for taxonomic identification of environmental yeasts are the aims of this study. We challenged MALDI-TOF MS to create protein profiles for 845 yeast strains isolated from grape must, and 67.7 % of the strains were successfully identified according to the previously available manufacturer database. The remaining 32.3 % of the strains were not identified due to the absence of a reference spectrum. After matching the correct taxon for these strains by using molecular biology approaches, the spectra of the missing species were added to a supplementary database. This new library was able to accurately identify, at the first attempt, species that MALDI-TOF MS had previously left unidentified, proving it is a powerful tool for the identification of environmental yeasts.
Nilsson, R. Henrik; Taylor, Andy F. S.; Adams, Rachel I.; Baschien, Christiane; Johan Bengtsson-Palme; Cangren, Patrik; Coleine, Claudia; Heide-Marie Daniel; Glassman, Sydney I.; Hirooka, Yuuri; Irinyi, Laszlo; Reda Iršėnaitė; Pedro M. Martin-Sanchez; Meyer, Wieland; Seung-Yoon Oh; Jose Paulo Sampaio; Seifert, Keith A.; Sklenář, Frantisek; Dirk Stubbe; Suh, Sung-Oui; Summerbell, Richard; Svantesson, Sten; Martin Unterseher; Cobus M. Visagie; Weiss, Michael; Woudenberg, Joyce HC; Christian Wurzbacher; den Wyngaert, Silke Van; Yilmaz, Neriman; Andrey Yurkov; Kõljalg, Urmas; Abarenkov, Kessy
2018-01-01
Recent DNA-based studies have shown that the built environment is surprisingly rich in fungi. These indoor fungi – whether transient visitors or more persistent residents – may hold clues to the rising levels of human allergies and other medical and building-related health problems observed globally. The taxonomic identity of these fungi is crucial in such pursuits. Molecular identification of the built mycobiome is no trivial undertaking, however, given the large number of unidentified, misidentified, and technically compromised fungal sequences in public sequence databases. In addition, the sequence metadata required to make informed taxonomic decisions – such as country and host/substrate of collection – are often lacking even from reference and ex-type sequences. Here we report on a taxonomic annotation workshop (April 10–11, 2017) organized at the James Hutton Institute/University of Aberdeen (UK) to facilitate reproducible studies of the built mycobiome. The 32 participants went through public fungal ITS barcode sequences related to the built mycobiome for taxonomic and nomenclatural correctness, technical quality, and metadata availability. A total of 19,508 changes – including 4,783 name changes, 14,121 metadata annotations, and the removal of 99 technically compromised sequences – were implemented in the UNITE database for molecular identification of fungi (https://unite.ut.ee/) and shared with a range of other databases and downstream resources. Among the genera that saw the largest number of changes were Penicillium, Talaromyces, Cladosporium, Acremonium, and Alternaria, all of them of significant importance in both culture-based and culture-independent surveys of the built environment. PMID:29559822
USE OF EXISTING DATABASES FOR THE PURPOSE OF HAZARD IDENTIFICATION: AN EXAMPLE
Keywords: existing databases, hazard identification, cancer mortality, birth malformations
Background: Associations between adverse health effects and environmental exposures are difficult to study, because exposures may be widespread, low-dose in nature, and common thro...
Soetens, Oriane; De Bel, Annelies; Echahidi, Fedoua; Vancutsem, Ellen; Vandoorslaer, Kristof; Piérard, Denis
2012-01-01
The performance of matrix-assisted laser desorption–ionization time of flight mass spectrometry (MALDI-TOF MS) for species identification of Prevotella was evaluated and compared with 16S rRNA gene sequencing. Using a Bruker database, 62.7% of the 102 clinical isolates were identified to the species level and 73.5% to the genus level. Extension of the commercial database improved these figures to, respectively, 83.3% and 89.2%. MALDI-TOF MS identification of Prevotella is reliable but needs a more extensive database. PMID:22301022
Kaduk, James A.
1996-01-01
The crystallographic databases are powerful and cost-effective tools for solving materials identification problems, both individually and in combination. Examples of the conventional and unconventional use of the databases in solving practical problems involving organic, coordination, and inorganic compounds are provided. The creation and use of fully-relational versions of the Powder Diffraction File and NIST Crystal Data are described. PMID:27805165
ILDgenDB: integrated genetic knowledge resource for interstitial lung diseases (ILDs).
Mishra, Smriti; Shah, Mohammad I; Sarkar, Malay; Asati, Nimisha; Rout, Chittaranjan
2018-01-01
Interstitial lung diseases (ILDs) are a diverse group of ∼200 acute and chronic pulmonary disorders that are characterized by variable amounts of inflammation, fibrosis and architectural distortion with substantial morbidity and mortality. Inaccurate and delayed diagnoses increase the risk, especially in developing countries. Studies have indicated the significant roles of genetic elements in ILDs pathogenesis. Therefore, the first genetic knowledge resource, ILDgenDB, has been developed with an objective to provide ILDs genetic data and their integrated analyses for the better understanding of disease pathogenesis and identification of diagnostics-based biomarkers. This resource contains literature-curated disease candidate genes (DCGs) enriched with various regulatory elements that have been generated using an integrated bioinformatics workflow of databases searches, literature-mining and DCGs-microRNA (miRNAs)-single nucleotide polymorphisms (SNPs) association analyses. To provide statistical significance to disease-gene association, ILD-specificity index and hypergeometric test scores were also incorporated. Association analyses of miRNAs, SNPs and pathways responsible for the pathogenesis of different sub-classes of ILDs were also incorporated. Manually verified 299 DCGs and their significant associations with 1932 SNPs, 2966 miRNAs and 9170 miR-polymorphisms were also provided. Furthermore, 216 literature-mined and proposed biomarkers were identified. The ILDgenDB resource provides user-friendly browsing and extensive query-based information retrieval systems. Additionally, this resource also facilitates graphical view of predicted DCGs-SNPs/miRNAs and literature associated DCGs-ILDs interactions for each ILD to facilitate efficient data interpretation. Outcomes of analyses suggested the significant involvement of immune system and defense mechanisms in ILDs pathogenesis. This resource may potentially facilitate genetic-based disease monitoring and diagnosis. Database URL: http://14.139.240.55/ildgendb/index.php.
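The hypergeometric test named in this abstract can be sketched as an upper-tail probability. The abstract does not say how the test was set up in ILDgenDB, so the framing below (genes as the population, disease-associated genes as "successes", a gene set of interest as the draw) and all parameter names are assumptions for illustration only.

```python
from math import comb


def hypergeom_upper_tail(k, population, successes, draws):
    """P(X >= k) for X ~ Hypergeometric(population, successes, draws).

    population: total number of genes considered.
    successes:  number of those genes associated with the disease.
    draws:      size of the gene set being tested.
    k:          observed overlap between the gene set and the disease genes.
    """
    upper = min(successes, draws)
    favourable = sum(
        comb(successes, i) * comb(population - successes, draws - i)
        for i in range(k, upper + 1)
    )
    return favourable / comb(population, draws)
```

A small tail probability indicates that an overlap at least as large as the observed one would be unlikely if the gene set had been drawn at random from the population.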
Zhang, Jia-yu; Wang, Zi-jian; Li, Yun; Liu, Ying; Cai, Wei; Li, Chen; Lu, Jian-qiu; Qiao, Yan-jiang
2016-01-15
The analytical methodologies for evaluating multi-component systems in traditional Chinese medicines (TCMs) have been inadequate or unacceptable. As a result, the lack of clarity about these multiple components hinders the sufficient interpretation of their bioactivities. In this paper, an ultra-high-performance liquid chromatography coupled with linear ion trap-Orbitrap (UPLC-LTQ-Orbitrap)-based strategy focused on the comprehensive identification of TCM sequential constituents was developed. The strategy was characterized by molecular design, multiple ion monitoring (MIM), targeted database hits and mass spectral trees similarity filter (MTSF), and, further, isomer discrimination. It was successfully applied in the HRMS data-acquisition and processing of chlorogenic acids (CGAs) in Flos Lonicerae Japonicae (FLJ), and a total of 115 chromatographic peaks attributed to 18 categories were characterized, allowing a comprehensive revelation of CGAs in FLJ for the first time. This demonstrated that MIM based on molecular design could improve the efficiency to trigger MS/MS fragmentation reactions. Targeted database hits and MTSF searching greatly facilitated the processing of extremely large information data. Besides, the introduction of diagnostic product ions (DPIs) discrimination, ClogP analysis, and molecular simulation raised the efficiency and accuracy of characterizing sequential constituents, especially positional and geometric isomers. In conclusion, the results expanded our understanding of CGAs in FLJ, and the strategy could be exemplary for future research on the comprehensive identification of sequential constituents in TCMs. Meanwhile, it may offer a novel approach to analyzing sequential constituents, and is promising for quality control and evaluation of TCMs.
Species identification of corynebacteria by cellular fatty acid analysis.
Van den Velde, Sandra; Lagrou, Katrien; Desmet, Koen; Wauters, Georges; Verhaegen, Jan
2006-02-01
We evaluated the usefulness of cellular fatty acid analysis for the identification of corynebacteria. To this end, 219 well-characterized strains belonging to 21 Corynebacterium species were analyzed with the Sherlock System of MIDI (Newark, DE). Most Corynebacterium species have a qualitatively different fatty acid profile. Corynebacterium coyleae (subgroup 1), Corynebacterium riegelii, Corynebacterium simulans, and Corynebacterium imitans differ only quantitatively. Corynebacterium afermentans afermentans and C. coyleae (subgroup 2) have profiles that are similar both qualitatively and quantitatively. The commercially available database (CLIN 40, MIDI) identified only one third of the 219 strains correctly at the species level. We created a new database with these 219 strains. This new database was tested with 34 clinical isolates and could identify 29 strains correctly. Strains that remained unidentified were 2 Corynebacterium aurimucosum (not included in our database), 1 C. afermentans afermentans, and 2 Corynebacterium pseudodiphtheriticum. Cellular fatty acid analysis with a self-created database can be used for the identification and differentiation of corynebacteria.
Tran, Trung T; Bollineni, Ravi C; Strozynski, Margarita; Koehler, Christian J; Thiede, Bernd
2017-07-07
Alternative splicing is a mechanism in eukaryotes by which different forms of mRNAs are generated from the same gene. Identification of alternative splice variants requires the identification of peptides specific for alternative splice forms. For this purpose, we generated a human database that contains only unique tryptic peptides specific for alternative splice forms from Swiss-Prot entries. Using this database allows an easy access to splice variant-specific peptide sequences that match to MS data. Furthermore, we combined this database without alternative splice variant-1-specific peptides with human Swiss-Prot. This combined database can be used as a general database for searching of LC-MS data. LC-MS data derived from in-solution digests of two different cell lines (LNCaP, HeLa) and phosphoproteomics studies were analyzed using these two databases. Several nonalternative splice variant-1-specific peptides were found in both cell lines, and some of them seemed to be cell-line-specific. Control and apoptotic phosphoproteomes from Jurkat T cells revealed several nonalternative splice variant-1-specific peptides, and some of them showed clear quantitative differences between the two states.
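The idea of a database holding only tryptic peptides that are specific to one splice form can be sketched as below. This is a simplified illustration, not the authors' pipeline: it assumes the common trypsin rule (cleave after K or R unless the next residue is P), no missed cleavages, and a hypothetical minimum peptide length; the isoform names and sequences passed in are the caller's own.

```python
import re


def tryptic_peptides(sequence, min_len=6):
    """In-silico trypsin digest: cleave after K or R, but not before P.

    Returns the set of fully cleaved peptides of at least min_len residues.
    """
    pieces = re.split(r"(?<=[KR])(?!P)", sequence)
    return {p for p in pieces if len(p) >= min_len}


def isoform_specific(isoforms, target):
    """Peptides of `target` that occur in no other isoform of the same gene.

    isoforms: dict mapping isoform name -> protein sequence.
    """
    others = set()
    for name, seq in isoforms.items():
        if name != target:
            others |= tryptic_peptides(seq)
    return tryptic_peptides(isoforms[target]) - others
```

A peptide returned by `isoform_specific` is evidence for that splice form when it is matched in MS data, because no other listed isoform can produce it under the same digestion rule.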
Jabbarian, Lea J; Zwakman, Marieke; van der Heide, Agnes; Kars, Marijke C; Janssen, Daisy J A; van Delden, Johannes J; Rietjens, Judith A C; Korfage, Ida J
2018-03-01
Advance care planning (ACP) supports patients in identifying and documenting their preferences and discussing them in a timely manner with their relatives and healthcare professionals (HCPs). Since the British Thoracic Society encourages ACP in chronic respiratory disease, the objective was to systematically review ACP practice in chronic respiratory disease, attitudes of patients and HCPs and barriers and facilitators related to engagement in ACP. We systematically searched 12 electronic databases for empirical studies on ACP in adults with chronic respiratory diseases. Identified studies underwent full review and data extraction. Of 2509 studies, 21 were eligible: 10 were quantitative studies. Although a majority of patients were interested in engaging in ACP, ACP was rarely carried out. Many HCPs acknowledged the importance of ACP but were hesitant to initiate it. Barriers to engagement in ACP were the complex disease course of patients with chronic respiratory diseases, HCPs' concern about taking away patients' hopes and lack of continuity of care. The identification of trigger points and training of HCPs on how to communicate sensitive topics were identified as facilitators to engagement in ACP. In conclusion, ACP is surprisingly uncommon in chronic respiratory disease, possibly due to the complex disease course of chronic respiratory diseases and ambivalence of both patients and HCPs to engage in ACP. Providing patients with information about their disease can help meet their needs. Additionally, support of HCPs through identification of trigger points, training and system-related changes can facilitate engagement in ACP. CRD42016039787.
Li, Honglan; Joh, Yoon Sung; Kim, Hyunwoo; Paek, Eunok; Lee, Sang-Won; Hwang, Kyu-Baek
2016-12-22
Proteogenomics is a promising approach for various tasks ranging from gene annotation to cancer research. Databases for proteogenomic searches are often constructed by adding peptide sequences inferred from genomic or transcriptomic evidence to reference protein sequences. Such inflation of databases has the potential to identify novel peptides. However, it also raises concerns about sensitive and reliable peptide identification. Spurious peptides included in target databases may result in an underestimated false discovery rate (FDR). On the other hand, inflation of decoy databases could decrease the sensitivity of peptide identification due to the increased number of high-scoring random hits. Although several studies have addressed these issues, widely applicable guidelines for sensitive and reliable proteogenomic search have hardly been available. To systematically evaluate the effect of database inflation in proteogenomic searches, we constructed a variety of real and simulated proteogenomic databases for yeast and human tandem mass spectrometry (MS/MS) data, respectively. Against these databases, we tested two popular database search tools with various approaches to search result validation: the target-decoy search strategy (with and without a refined scoring metric) and a mixture model-based method. The effect of separate filtering of known and novel peptides was also examined. The results from real and simulated proteogenomic searches confirmed that separate filtering increases the sensitivity and reliability of proteogenomic search. However, no one method consistently identified the largest (or the smallest) number of novel peptides from real proteogenomic searches. We propose to use a set of search result validation methods with separate filtering, for sensitive and reliable identification of peptides in proteogenomic search.
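The separate-filtering idea described in this abstract can be illustrated with a minimal sketch. It assumes a simple target-decoy estimate (decoys divided by targets above a score cutoff) and a toy tuple layout for peptide-spectrum matches; the function names, the data layout and the scores are hypothetical and are not taken from the study or from any search tool.

```python
from typing import List, Tuple

# Hypothetical PSM layout: (score, is_decoy, is_novel); higher scores are better.
PSM = Tuple[float, bool, bool]

def accepted_targets(psms: List[PSM], max_fdr: float) -> List[PSM]:
    """Keep target PSMs down to the lowest cutoff whose estimated FDR <= max_fdr.

    The FDR at a cutoff is estimated as (#decoys / #targets) among all PSMs
    scoring at or above that cutoff, a common simple target-decoy estimate.
    """
    ranked = sorted(psms, key=lambda p: p[0], reverse=True)
    targets = decoys = 0
    keep = 0  # number of top-ranked PSMs that pass the threshold
    for i, (_, is_decoy, _) in enumerate(ranked, start=1):
        if is_decoy:
            decoys += 1
        else:
            targets += 1
        if targets > 0 and decoys / targets <= max_fdr:
            keep = i
    return [p for p in ranked[:keep] if not p[1]]

def separate_filtering(psms: List[PSM], max_fdr: float):
    """Estimate the FDR for known and novel peptides independently."""
    known = [p for p in psms if not p[2]]
    novel = [p for p in psms if p[2]]
    return accepted_targets(known, max_fdr), accepted_targets(novel, max_fdr)

# Toy data: many confident known hits, few novel hits close to the decoys.
psms = [
    (10.0, False, False), (9.0, False, False), (8.0, False, False),
    (2.0, True, False),
    (7.0, False, True), (6.0, True, True), (5.0, False, True),
]
known_ok, novel_ok = separate_filtering(psms, max_fdr=0.25)
combined_ok = accepted_targets(psms, max_fdr=0.25)
```

In this toy set, combined filtering lets the low-scoring novel hit pass because the confident known hits dilute the decoy ratio, whereas separate filtering rejects it.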
Genome-wide association as a means to understanding the mammary gland
USDA-ARS's Scientific Manuscript database
Next-generation sequencing and related technologies have facilitated the creation of enormous public databases that catalogue genomic variation. These databases have facilitated a variety of approaches to discover new genes that regulate normal biology as well as disease. Genome wide association (...
Fischer, Guido; Braun, Silvia; Thissen, Ralf; Dott, Wolfgang
2006-01-01
Identification of microfungi is time-consuming due to cultivation and microscopic examination and can be influenced by the interpretation of the macro- and micro-morphological characters observed. Fungal conidia contain mycotoxins that may be present in bioaerosols and thus the capacity for production of mycotoxins (and allergens) needs to be investigated to create a basis for reliable risk assessment in environmental and occupational hygiene. The present investigation aimed to create a simple but sophisticated method for the preparation of samples and the identification of airborne fungi by FT-IR spectroscopy. The method was suited to reproducibly differentiate Aspergillus and Penicillium species on the generic, the species, and the strain level. There are strong indications that strains of one taxon differing in metabolite production can be reliably distinguished by FT-IR spectroscopy (e.g. Aspergillus parasiticus). On the other hand, species from different taxa being similar in secondary metabolite production showed comparably higher similarities. The results obtained here can serve as a basis for the development of a database for species identification and strain characterization of microfungi. The method presented here will improve and facilitate the risk assessment in case of bioaerosol exposure, as strains with different physiological properties (e.g. toxic, non-toxic) could be differentiated. Moreover, it has the potential to significantly improve the identification of microfungi in various fields of applied microbiological research, e.g. high throughput screening in view of specific physiological properties, biodiversity studies, inventories in environmental microbiology, and quality control measures.
How effective are DNA barcodes in the identification of African rainforest trees?
Parmentier, Ingrid; Duminil, Jérôme; Kuzmina, Maria; Philippe, Morgane; Thomas, Duncan W; Kenfack, David; Chuyong, George B; Cruaud, Corinne; Hardy, Olivier J
2013-01-01
DNA barcoding of rain forest trees could potentially help biologists identify species and discover new ones. However, DNA barcodes cannot always distinguish between closely related species, and the size and completeness of barcode databases are key parameters for their successful application. We test the ability of rbcL, matK and trnH-psbA plastid DNA markers to identify rain forest trees at two sites in Atlantic central Africa under the assumption that a database is exhaustive in terms of species content, but not necessarily in terms of haplotype diversity within species. We assess the accuracy of identification to species or genus using a genetic distance matrix between samples either based on a global multiple sequence alignment (GD) or on a basic local alignment search tool (BLAST). Where a local database is available (within a 50 ha plot), barcoding was generally reliable for genus identification (95-100% success), but less for species identification (71-88%). Using a single marker, best results for species identification were obtained with trnH-psbA. There was a significant decrease of barcoding success in species-rich clades. When the local database was used to identify the genus of trees from another region and did include all genera from the query individuals but not all species, genus identification success decreased to 84-90%. The GD method performed best but a global multiple sequence alignment is not applicable on trnH-psbA. Barcoding is a useful tool to assign unidentified African rain forest trees to a genus, but identification to a species is less reliable, especially in species-rich clades, even using an exhaustive local database. Combining two markers improves the accuracy of species identification but it would only marginally improve genus identification. Finally, we highlight some limitations of the BLAST algorithm as currently implemented and suggest possible improvements for barcoding applications.
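As a rough illustration of distance-based barcode identification, the sketch below assigns a query to the reference with the smallest uncorrected p-distance over pre-aligned sequences. This is an assumption-laden simplification: the study's actual distance measure, alignment procedure and tie handling are not described in the abstract, and the sequences and names here are invented.

```python
from typing import Dict

def p_distance(a: str, b: str) -> float:
    """Proportion of differing sites between two aligned sequences.

    Sites where either sequence has a gap ('-') are ignored.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    compared = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not compared:
        return 1.0  # nothing comparable: treat as maximally distant
    return sum(x != y for x, y in compared) / len(compared)

def identify(query: str, references: Dict[str, str]) -> str:
    """Return the name of the reference closest to the query (ties: first wins)."""
    return min(references, key=lambda name: p_distance(query, references[name]))

# Hypothetical reference library of aligned barcode fragments.
references = {"Species A": "ACGTACGT", "Species B": "ACGTTTTT"}
best = identify("ACGTACGA", references)
```

A real pipeline would also need a rule for ambiguous best matches, which is one reason species-level success drops in species-rich clades.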
How Effective Are DNA Barcodes in the Identification of African Rainforest Trees?
Parmentier, Ingrid; Duminil, Jérôme; Kuzmina, Maria; Philippe, Morgane; Thomas, Duncan W.; Kenfack, David; Chuyong, George B.; Cruaud, Corinne; Hardy, Olivier J.
2013-01-01
Background DNA barcoding of rain forest trees could potentially help biologists identify species and discover new ones. However, DNA barcodes cannot always distinguish between closely related species, and the size and completeness of barcode databases are key parameters for their successful application. We test the ability of rbcL, matK and trnH-psbA plastid DNA markers to identify rain forest trees at two sites in Atlantic central Africa under the assumption that a database is exhaustive in terms of species content, but not necessarily in terms of haplotype diversity within species. Methodology/Principal Findings We assess the accuracy of identification to species or genus using a genetic distance matrix between samples either based on a global multiple sequence alignment (GD) or on a basic local alignment search tool (BLAST). Where a local database is available (within a 50 ha plot), barcoding was generally reliable for genus identification (95–100% success), but less for species identification (71–88%). Using a single marker, best results for species identification were obtained with trnH-psbA. There was a significant decrease of barcoding success in species-rich clades. When the local database was used to identify the genus of trees from another region and did include all genera from the query individuals but not all species, genus identification success decreased to 84–90%. The GD method performed best but a global multiple sequence alignment is not applicable on trnH-psbA. Conclusions/Significance Barcoding is a useful tool to assign unidentified African rain forest trees to a genus, but identification to a species is less reliable, especially in species-rich clades, even using an exhaustive local database. Combining two markers improves the accuracy of species identification but it would only marginally improve genus identification. Finally, we highlight some limitations of the BLAST algorithm as currently implemented and suggest possible improvements for barcoding applications. PMID:23565134
Prakash, Peralam Yegneswaran; Irinyi, Laszlo; Halliday, Catriona; Chen, Sharon; Robert, Vincent; Meyer, Wieland
2017-04-01
The increase in public online databases dedicated to fungal identification is noteworthy. This can be attributed to improved access to molecular approaches to characterize fungi, as well as to delineate species within specific fungal groups in the last 2 decades, leading to an ever-increasing complexity of taxonomic assortments and nomenclatural reassignments. Thus, well-curated fungal databases with substantial accurate sequence data play a pivotal role for further research and diagnostics in the field of mycology. This minireview aims to provide an overview of currently available online databases for the taxonomy and identification of human and animal-pathogenic fungi and calls for the establishment of a cloud-based dynamic data network platform. Copyright © 2017 American Society for Microbiology.
37 CFR 1.105 - Requirements for information.
Code of Federal Regulations, 2010 CFR
2010-07-01
... databases: The existence of any particularly relevant commercial database known to any of the inventors that... improvement, identification of what is being improved. (vii) In use: Identification of any use of the claimed... the use. (viii) Technical information known to applicant. Technical information known to applicant...
Ruppitsch, W; Stöger, A; Indra, A; Grif, K; Schabereiter-Gurtner, C; Hirschl, A; Allerberger, F
2007-03-01
In a bioterrorism event a rapid tool is needed to identify relevant dangerous bacteria. The aim of the study was to assess the usefulness of partial 16S rRNA gene sequence analysis and the suitability of diverse databases for identifying dangerous bacterial pathogens. For rapid identification purposes a 500-bp fragment of the 16S rRNA gene of 28 isolates comprising Bacillus anthracis, Brucella melitensis, Burkholderia mallei, Burkholderia pseudomallei, Francisella tularensis, Yersinia pestis, and eight genus-related and unrelated control strains was amplified and sequenced. The obtained sequence data were submitted to three public and two commercial sequence databases for species identification. The most frequent reason for incorrect identification was the lack of the respective 16S rRNA gene sequences in the database. Sequence analysis of a 500-bp 16S rDNA fragment allows the rapid identification of dangerous bacterial species. However, for discrimination of closely related species sequencing of the entire 16S rRNA gene, additional sequencing of the 23S rRNA gene or sequencing of the 16S-23S rRNA intergenic spacer is essential. This work provides comprehensive information on the suitability of partial 16S rDNA analysis and diverse databases for rapid and accurate identification of dangerous bacterial pathogens.
Christner, Martin; Trusch, Maria; Rohde, Holger; Kwiatkowski, Marcel; Schlüter, Hartmut; Wolters, Manuel; Aepfelbacher, Martin; Hentschke, Moritz
2014-01-01
Background In 2011 northern Germany experienced a large outbreak of Shiga-Toxigenic Escherichia coli O104:H4. The large amount of samples sent to microbiology laboratories for epidemiological assessment highlighted the importance of fast and inexpensive typing procedures. We have therefore evaluated the applicability of a MALDI-TOF mass spectrometry based strategy for outbreak strain identification. Methods Specific peaks in the outbreak strain’s spectrum were identified by comparative analysis of archived pre-outbreak spectra that had been acquired for routine species-level identification. Proteins underlying these discriminatory peaks were identified by liquid chromatography tandem mass spectrometry and validated against publicly available databases. The resulting typing scheme was evaluated against PCR genotyping with 294 E. coli isolates from clinical samples collected during the outbreak. Results Comparative spectrum analysis revealed two characteristic peaks at m/z 6711 and m/z 10883. The underlying proteins were found to be of low prevalence among genome sequenced E. coli strains. Marker peak detection correctly classified 292 of 293 study isolates, including all 104 outbreak isolates. Conclusions MALDI-TOF mass spectrometry allowed for reliable outbreak strain identification during a large outbreak of Shiga-Toxigenic E. coli. The applied typing strategy could probably be adapted to other typing tasks and might facilitate epidemiological surveys as part of the routine pathogen identification workflow. PMID:25003758
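The marker-peak rule reported in this abstract can be sketched as a simple tolerance check on a peak list. The two m/z values come from the abstract; the tolerance of 5 m/z units and the requirement that both peaks be present are assumptions made for illustration, not parameters reported by the authors.

```python
from typing import Iterable

# Characteristic peaks reported in the abstract.
MARKER_MZ = (6711.0, 10883.0)

def has_marker_peaks(peaks: Iterable[float], tol: float = 5.0) -> bool:
    """Return True if every marker m/z has a detected peak within +/- tol.

    `tol` is a hypothetical matching window; a real instrument workflow
    would derive it from calibration accuracy.
    """
    peak_list = list(peaks)
    return all(any(abs(p - m) <= tol for p in peak_list) for m in MARKER_MZ)

# Invented peak lists for two isolates.
outbreak_like = has_marker_peaks([4365.2, 6712.1, 10881.5])
other_strain = has_marker_peaks([6711.0, 9060.0])
```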
NASA Astrophysics Data System (ADS)
Leveuf, Louis; Navrátil, Libor; Le Saux, Vincent; Marco, Yann; Olhagaray, Jérôme; Leclercq, Sylvain
2018-01-01
A constitutive model for the cyclic behaviour of short carbon fibre-reinforced thermoplastics for aeronautical applications is proposed. First, an extended experimental database is generated in order to highlight the specificities of the studied material. This database is composed of complex tests and is used to design a relevant constitutive model able to capture the cyclic behaviour of the material. A general 3D formulation of the model is then proposed, and an identification strategy is defined to identify its parameters. Finally, the identification is validated by comparing the model's predictions with tests that were not used for the identification. An excellent agreement between the numerical results and the experimental data is observed, revealing the capabilities of the model.
An online database for informing ecological network models: http://kelpforest.ucsc.edu.
Beas-Luna, Rodrigo; Novak, Mark; Carr, Mark H; Tinker, Martin T; Black, August; Caselle, Jennifer E; Hoban, Michael; Malone, Dan; Iles, Alison
2014-01-01
Ecological network models and analyses are recognized as valuable tools for understanding the dynamics and resiliency of ecosystems, and for informing ecosystem-based approaches to management. However, few databases exist that can provide the life history, demographic and species interaction information necessary to parameterize ecological network models. Faced with the difficulty of synthesizing the information required to construct models for kelp forest ecosystems along the West Coast of North America, we developed an online database (http://kelpforest.ucsc.edu/) to facilitate the collation and dissemination of such information. Many of the database's attributes are novel yet the structure is applicable and adaptable to other ecosystem modeling efforts. Information for each taxonomic unit includes stage-specific life history, demography, and body-size allometries. Species interactions include trophic, competitive, facilitative, and parasitic forms. Each data entry is temporally and spatially explicit. The online data entry interface allows researchers anywhere to contribute and access information. Quality control is facilitated by attributing each entry to unique contributor identities and source citations. The database has proven useful as an archive of species and ecosystem-specific information in the development of several ecological network models, for informing management actions, and for education purposes (e.g., undergraduate and graduate training). To facilitate adaptation of the database by other researchers for other ecosystems, the code and technical details on how to customize this database and apply it to other ecosystems are freely available and located at the following link (https://github.com/kelpforest-cameo/databaseui).
An Online Database for Informing Ecological Network Models: http://kelpforest.ucsc.edu
Beas-Luna, Rodrigo; Novak, Mark; Carr, Mark H.; Tinker, Martin T.; Black, August; Caselle, Jennifer E.; Hoban, Michael; Malone, Dan; Iles, Alison
2014-01-01
Ecological network models and analyses are recognized as valuable tools for understanding the dynamics and resiliency of ecosystems, and for informing ecosystem-based approaches to management. However, few databases exist that can provide the life history, demographic and species interaction information necessary to parameterize ecological network models. Faced with the difficulty of synthesizing the information required to construct models for kelp forest ecosystems along the West Coast of North America, we developed an online database (http://kelpforest.ucsc.edu/) to facilitate the collation and dissemination of such information. Many of the database's attributes are novel yet the structure is applicable and adaptable to other ecosystem modeling efforts. Information for each taxonomic unit includes stage-specific life history, demography, and body-size allometries. Species interactions include trophic, competitive, facilitative, and parasitic forms. Each data entry is temporally and spatially explicit. The online data entry interface allows researchers anywhere to contribute and access information. Quality control is facilitated by attributing each entry to unique contributor identities and source citations. The database has proven useful as an archive of species and ecosystem-specific information in the development of several ecological network models, for informing management actions, and for education purposes (e.g., undergraduate and graduate training). To facilitate adaptation of the database by other researchers for other ecosystems, the code and technical details on how to customize this database and apply it to other ecosystems are freely available and located at the following link (https://github.com/kelpforest-cameo/databaseui). PMID:25343723
An online database for informing ecological network models: http://kelpforest.ucsc.edu
Beas-Luna, Rodrigo; Tinker, M. Tim; Novak, Mark; Carr, Mark H.; Black, August; Caselle, Jennifer E.; Hoban, Michael; Malone, Dan; Iles, Alison C.
2014-01-01
Ecological network models and analyses are recognized as valuable tools for understanding the dynamics and resiliency of ecosystems, and for informing ecosystem-based approaches to management. However, few databases exist that can provide the life history, demographic and species interaction information necessary to parameterize ecological network models. Faced with the difficulty of synthesizing the information required to construct models for kelp forest ecosystems along the West Coast of North America, we developed an online database (http://kelpforest.ucsc.edu/) to facilitate the collation and dissemination of such information. Many of the database's attributes are novel yet the structure is applicable and adaptable to other ecosystem modeling efforts. Information for each taxonomic unit includes stage-specific life history, demography, and body-size allometries. Species interactions include trophic, competitive, facilitative, and parasitic forms. Each data entry is temporally and spatially explicit. The online data entry interface allows researchers anywhere to contribute and access information. Quality control is facilitated by attributing each entry to unique contributor identities and source citations. The database has proven useful as an archive of species and ecosystem-specific information in the development of several ecological network models, for informing management actions, and for education purposes (e.g., undergraduate and graduate training). To facilitate adaptation of the database by other researchers for other ecosystems, the code and technical details on how to customize this database and apply it to other ecosystems are freely available and located at the following link (https://github.com/kelpforest-cameo/databaseui).
The Effects of Signal Erosion and Core Genome Reduction on the Identification of Diagnostic Markers
2016-09-20
diagnostics for the identification of bacterial pathogens. To do this effectively, genomics databases must be comprehensive to identify the...diverse B. pseudomallei/mallei strains were sequenced, assembled, and deposited in public databases (Supplemental Table 1); these genomes were...combined with 160 B. pseudomallei/mallei genome assemblies already in public databases. Most of the genomes (n=779) in this study were
Cameron, M; Perry, J; Middleton, J R; Chaffer, M; Lewis, J; Keefe, G P
2018-01-01
This study evaluated MALDI-TOF mass spectrometry and a custom reference spectra expanded database for the identification of bovine-associated coagulase-negative staphylococci (CNS). A total of 861 CNS isolates were used in the study, covering 21 different CNS species. The majority of the isolates were previously identified by rpoB gene sequencing (n = 804) and the remainder were identified by sequencing of hsp60 (n = 56) and tuf (n = 1). The genotypic identification was considered the gold standard identification. Using a direct transfer protocol and the existing commercial database, MALDI-TOF mass spectrometry showed a typeability of 96.5% (831/861) and an accuracy of 99.2% (824/831). Using a custom reference spectra expanded database, which included an additional 13 in-house created reference spectra, isolates were identified by MALDI-TOF mass spectrometry with 99.2% (854/861) typeability and 99.4% (849/854) accuracy. Overall, MALDI-TOF mass spectrometry using the direct transfer method was shown to be a highly reliable tool for the identification of bovine-associated CNS. Copyright © 2018 American Dairy Science Association. Published by Elsevier Inc. All rights reserved.
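The typeability and accuracy figures in this abstract follow directly from the reported counts, as the short sketch below shows. The function name is hypothetical; the definitions (typeability as identified over total, accuracy as correct over identified) are inferred from the denominators given in the abstract.

```python
from typing import Tuple

def typeability_and_accuracy(total: int, identified: int, correct: int) -> Tuple[float, float]:
    """Return (typeability %, accuracy %) rounded to one decimal place.

    typeability = isolates that received an identification / all isolates
    accuracy    = identifications matching the gold standard / identified isolates
    """
    if not (0 <= correct <= identified <= total) or total == 0:
        raise ValueError("expected 0 <= correct <= identified <= total and total > 0")
    typeability = 100.0 * identified / total
    accuracy = 100.0 * correct / identified if identified else 0.0
    return round(typeability, 1), round(accuracy, 1)

# Counts reported in the abstract.
commercial_db = typeability_and_accuracy(total=861, identified=831, correct=824)
expanded_db = typeability_and_accuracy(total=861, identified=854, correct=849)
```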
Zooplankton community analysis in the Changjiang River estuary by single-gene-targeted metagenomics
NASA Astrophysics Data System (ADS)
Cheng, Fangping; Wang, Minxiao; Li, Chaolun; Sun, Song
2014-07-01
DNA barcoding provides accurate identification of zooplankton species through all life stages. Single-gene-targeted metagenomic analysis based on DNA barcode databases can facilitate long-term monitoring of zooplankton communities. With the help of the available zooplankton databases, the zooplankton community of the Changjiang (Yangtze) River estuary was studied using a single-gene-targeted metagenomic method to estimate the species richness of this community. A total of 856 mitochondrial cytochrome oxidase subunit 1 (cox1) gene sequences were determined. The environmental barcodes were clustered into 70 molecular operational taxonomic units (MOTUs). Forty-two MOTUs matched barcoded marine organisms with more than 90% similarity and were assigned to either the species (similarity>96%) or genus level (similarity<96%). Sibling species could also be distinguished. Many species that were overlooked by morphological methods were identified by molecular methods, especially gelatinous zooplankton and merozooplankton that were likely sampled at different life history phases. Zooplankton community structures differed significantly among all of the samples. The MOTU spatial distributions were influenced by the ecological habits of the corresponding species. In conclusion, single-gene-targeted metagenomic analysis is a useful tool for zooplankton studies, with which specimens from all life history stages can be identified quickly and effectively with a comprehensive database.
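The similarity thresholds quoted in this abstract amount to a small decision rule, sketched below. The abstract does not say how a match of exactly 96% or exactly 90% was treated, so the boundary handling here is an assumption, as is the function name.

```python
def assign_rank(similarity_pct: float) -> str:
    """Map a best-hit similarity (in percent) to a taxonomic assignment level.

    Thresholds follow the abstract: >96% species, >90% up to 96% genus.
    Treating exactly 96% as genus and exactly 90% as unassigned is an assumption.
    """
    if similarity_pct > 96.0:
        return "species"
    if similarity_pct > 90.0:
        return "genus"
    return "unassigned"

# Invented best-hit similarities for three MOTUs.
levels = [assign_rank(s) for s in (99.2, 93.5, 85.0)]
```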
dbCPG: A web resource for cancer predisposition genes
Wei, Ran; Yao, Yao; Yang, Wu; Zheng, Chun-Hou; Zhao, Min; Xia, Junfeng
2016-01-01
Cancer predisposition genes (CPGs) are genes in which inherited mutations confer highly or moderately increased risks of developing cancer. Identification of these genes and understanding the biological mechanisms that underlie them is crucial for the prevention, early diagnosis, and optimized management of cancer. Over the past decades, great efforts have been made to identify CPGs through multiple strategies. However, information on these CPGs and their molecular functions is scattered. To address this issue and provide a comprehensive resource for researchers, we developed the Cancer Predisposition Gene Database (dbCPG, Database URL: http://bioinfo.ahu.edu.cn:8080/dbCPG/index.jsp), the first literature-based gene resource for exploring human CPGs. It contains 827 human (724 protein-coding, 23 non-coding, and 80 unknown type genes), 637 rat, and 658 mouse CPGs. Furthermore, data mining was performed to gain insights into the CPG data, including functional annotation, gene prioritization, network analysis of prioritized genes and overlap analysis across multiple cancer types. A user-friendly web interface with multiple browse, search, and upload functions was also developed to facilitate access to the latest information on CPGs. Taken together, the dbCPG database provides a comprehensive data resource for further studies of cancer predisposition genes. PMID:27192119
SilkPathDB: a comprehensive resource for the study of silkworm pathogens
Pan, Guo-Qing; Vossbrinck, Charles R.; Xu, Jin-Shan; Li, Chun-Feng; Chen, Jie; Long, Meng-Xian; Yang, Ming; Xu, Xiao-Fei; Xu, Chen; Debrunner-Vossbrinck, Bettina A.
2017-01-01
Silkworm pathogens have heavily impeded the development of the sericultural industry and play important roles in lepidopteran ecology, and some of them are used as biological insecticides. Rapid advances in studies on the omics of silkworm pathogens have produced a large amount of data, which need to be brought together centrally in a coherent and systematic manner. This will facilitate the reuse of these data for further analysis. We have collected genomic data for 86 silkworm pathogens from 4 taxa (fungi, microsporidia, bacteria and viruses) and from 4 lepidopteran hosts, and developed the open-access Silkworm Pathogen Database (SilkPathDB) to make this information readily available. The implementation of SilkPathDB involves integrating Drupal and GBrowse as a graphic interface for a Chado relational database which houses all of the datasets involved. The genomes have been assembled and annotated for comparative purposes and allow the search and analysis of homologous sequences, transposable elements, protein subcellular locations, including secreted proteins, and gene ontology. We believe that the SilkPathDB will aid researchers in the identification of silkworm parasites and in understanding the mechanisms of silkworm infections and the developmental ecology of silkworm parasites (gene expression) and their hosts. Database URL: http://silkpathdb.swu.edu.cn PMID:28365723
He, Ying; Chang, Tsung C; Li, Haijing; Shi, Gongyi; Tang, Yi-Wei
2011-07-01
More than 20 species of Legionella have been identified in relation to human infections. Rapid detection and identification of Legionella isolates is clinically useful to differentiate between infection and contamination and to determine treatment regimens. We explored the use of matrix-assisted laser desorption ionization time-of-flight mass spectrometry (MALDI-TOF MS) Biotyper system (Bruker Daltonik GmbH, Bremen, Germany) for the identification of Legionella species. The MALDI MS spectra were generated and compared with the Biotyper database, which includes 25 Legionella strains covering 22 species and four Legionella pneumophila serogroups. A total of 83 blind-coded Legionella strains, consisting of 54 reference and 29 clinical strains, were analyzed in the study. Overall, the Biotyper system correctly identified 51 (61.4%) of all strains and isolates to the species level. For species included in the Biotyper database, the method identified 51 (86.4%) strains out of 59 Legionella strains to the correct species level, including 24 (100%) L. pneumophila and 27 (77.1%) non-L. pneumophila strains. The remaining 24 Legionella strains, belonging to species not covered by the Biotyper database, were either identified to the Legionella genus level or had no reliable identification. The Biotyper system produces constant and reproducible MALDI MS spectra for Legionella strains and can be used for rapid and accurate Legionella identification. More Legionella strains, especially the non-L. pneumophila strains, need to be included in the current Biotyper database to cover varieties of Legionella species and to increase identification accuracy.
Target-Pathogen: a structural bioinformatic approach to prioritize drug targets in pathogens
Sosa, Ezequiel J; Burguener, Germán; Lanzarotti, Esteban; Radusky, Leandro; Pardo, Agustín M; Marti, Marcelo
2018-01-01
Available genomic data for pathogens has created new opportunities for drug discovery and development to fight them, including new resistant and multiresistant strains. In particular, structural data must be integrated with both gene information and experimental results. In this sense, there is a lack of an online resource that allows genome-wide data consolidation from diverse sources together with thorough bioinformatic analysis that allows easy filtering and scoring for fast target selection for drug discovery. Here, we present Target-Pathogen database (http://target.sbg.qb.fcen.uba.ar/patho), designed and developed as an online resource that allows the integration and weighting of protein information such as: function, metabolic role, off-targeting, structural properties including druggability, essentiality and omic experiments, to facilitate the identification and prioritization of candidate drug targets in pathogens. We include in the database 10 genomes of some of the most relevant microorganisms for human health (Mycobacterium tuberculosis, Mycobacterium leprae, Klebsiella pneumoniae, Plasmodium vivax, Toxoplasma gondii, Leishmania major, Wolbachia bancrofti, Trypanosoma brucei, Shigella dysenteriae and Schistosoma mansoni) and show its applicability. New genomes can be uploaded upon request. PMID:29106651
Pham, Nikki T.; Wei, Tong; Schackwitz, Wendy S.; Lipzen, Anna M.; Duong, Phat Q.; Jones, Kyle C.; Ruan, Deling; Bauer, Diane; Peng, Yi; Schmutz, Jeremy
2017-01-01
The availability of a whole-genome sequenced mutant population and the cataloging of mutations of each line at a single-nucleotide resolution facilitate functional genomic analysis. To this end, we generated and sequenced a fast-neutron-induced mutant population in the model rice cultivar Kitaake (Oryza sativa ssp japonica), which completes its life cycle in 9 weeks. We sequenced 1504 mutant lines at 45-fold coverage and identified 91,513 mutations affecting 32,307 genes, i.e., 58% of all rice genes. We detected an average of 61 mutations per line. Mutation types include single-base substitutions, deletions, insertions, inversions, translocations, and tandem duplications. We observed a high proportion of loss-of-function mutations. We identified an inversion affecting a single gene as the causative mutation for the short-grain phenotype in one mutant line. This result reveals the usefulness of the resource for efficient, cost-effective identification of genes conferring specific phenotypes. To facilitate public access to this genetic resource, we established an open access database called KitBase that provides access to sequence data and seed stocks. This population complements other available mutant collections and gene-editing technologies. This work demonstrates how inexpensive next-generation sequencing can be applied to generate a high-density catalog of mutations. PMID:28576844
Tripathi, Anita; Goswami, Kavita; Sanan-Mishra, Neeti
2015-01-01
microRNAs (miRs) are a class of 21–24 nucleotide long non-coding RNAs responsible for regulating the expression of associated genes mainly by cleavage or translational inhibition of the target transcripts. With this characteristic of silencing, miRs act as an important component in regulation of plant responses in various stress conditions. In recent years, with drastic changes in environmental and soil conditions, different types of stress have emerged as a major challenge for plant growth and productivity. The identification and profiling of miRs has itself been a challenge for researchers given their small size and the large number of probable sequences in the genome. Application of computational approaches has expedited the process of identification of miRs and their expression profiling in different conditions. The development of High-Throughput Sequencing (HTS) techniques has facilitated access to the global profiles of miRs for understanding their mode of action in plants. The introduction of various bioinformatics databases and tools has revolutionized the study of miRs and other small RNAs. This review focuses on the role of bioinformatics approaches in the identification and study of the regulatory roles of plant miRs in the adaptive response to stresses. PMID:26578966
Aryanto, K Y E; Broekema, A; Langenhuysen, R G A; Oudkerk, M; van Ooijen, P M A
2015-05-01
To develop and test a fast and easy rule-based web environment with optional de-identification of imaging data to facilitate data distribution within a hospital environment. A web interface was built using Hypertext Preprocessor (PHP), an open source scripting language for web development, and Java with SQL Server to handle the database. The system allows for the selection of patient data and for de-identifying these when necessary. Using the services provided by the RSNA Clinical Trial Processor (CTP), the selected images were pushed to the appropriate services using a protocol based on the module created for the associated task. Five pipelines, each performing a different task, were set up in the server. In a 75-month period, more than 2,000,000 images were transferred and de-identified in a proper manner, while 20,000,000 images were moved from one node to another without de-identification. While maintaining a high level of security and stability, the proposed system is easy to set up, integrates well with our clinical and research practice, and provides a fast and accurate vendor-neutral process for transferring, de-identifying, and storing DICOM images. Its ability to run different de-identification processes in parallel pipelines is a major advantage in both clinical and research settings.
Barriers and enablers that influence sustainable interprofessional education: a literature review.
Lawlis, Tanya Rechael; Anson, Judith; Greenfield, David
2014-07-01
The effective incorporation of interprofessional education (IPE) within health professional curricula requires synchronised and systematic collaboration between and within the various stakeholders. Higher education institutions, as primary health education providers, have the capacity to advocate and facilitate this collaboration. However, due to the diversity of stakeholders, facilitating the pedagogical change can be challenging and complex, and brings a degree of uncertainty and resistance. This review, through an analysis of the barriers and enablers, investigates the involvement of stakeholders in higher education IPE through three primary stakeholder levels: Government and Professional, Institutional and Individual. A review of eight primary databases using 21 search terms resulted in 40 papers for review. While the barriers to IPE are widely reported within the higher education IPE literature, little is documented about the enablers of IPE. Similarly, the specific identification and importance of enablers for IPE sustainability and the dual nature of some barriers and enablers have not been previously reported. An analysis of the barriers and enablers of IPE across the different stakeholder levels reveals five key "fundamental elements" critical to achieving sustainable IPE in higher education curricula.
Evaluation of Automated Yeast Identification System
NASA Technical Reports Server (NTRS)
McGinnis, M. R.
1996-01-01
One hundred and nine teleomorphic and anamorphic yeast isolates representing approximately 30 taxa were used to evaluate the accuracy of the Biolog yeast identification system. Isolates derived from nomenclatural types, together with environmental and clinical isolates of known identity, were tested in the Biolog system. Of the isolates tested, 81 were in the Biolog database. The system correctly identified 40, incorrectly identified 29, and was unable to identify 12. Of the 28 isolates not in the database, 18 were given names, whereas 10 were not. The Biolog yeast identification system is inadequate for the identification of yeasts originating from the environment during space program activities.
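The counts reported in this abstract can be turned into rates with a few lines of arithmetic. The sketch below uses only the figures quoted above; the variable names and the choice of denominators are illustrative assumptions, not part of the original evaluation.

```python
# Counts quoted in the abstract above.
total_isolates = 109
in_database = 81
correct, incorrect, unidentified = 40, 29, 12
not_in_database = 28
named_anyway, left_unnamed = 18, 10

# The reported counts should be internally consistent.
assert correct + incorrect + unidentified == in_database
assert named_anyway + left_unnamed == not_in_database
assert in_database + not_in_database == total_isolates


def pct(part, whole):
    """Percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


# Rates among isolates that the database could, in principle, identify.
correct_rate = pct(correct, in_database)
misidentified_rate = pct(incorrect, in_database)
# Isolates absent from the database that still received a name.
spurious_name_rate = pct(named_anyway, not_in_database)

print(correct_rate, misidentified_rate, spurious_name_rate)
```

On these figures, roughly half of the isolates represented in the database were identified correctly, which is consistent with the abstract's conclusion that the system is inadequate for this use.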
DNA barcoding as a tool for coral reef conservation
NASA Astrophysics Data System (ADS)
Neigel, J.; Domingo, A.; Stake, J.
2007-09-01
DNA Barcoding (DBC) is a method for taxonomic identification of animals that is based entirely on the 5' portion of the mitochondrial gene, cytochrome oxidase subunit I (COI-5). It can be especially useful for identification of larval forms or incomplete specimens lacking diagnostic morphological characters. DBC can also facilitate the discovery of species and the definition of “molecular taxonomic units” in problematic groups. However, DBC is not a panacea for coral reef taxonomy. In two of the most ecologically important groups on coral reefs, the Anthozoa and Porifera, COI-5 sequences have diverged too little to be diagnostic for all species. Other problems for DBC include paraphyly in mitochondrial gene trees and lack of differentiation between hybrids and their maternal ancestors. DBC also depends on the availability of databases of COI-5 sequences, which are still in early stages of development. A global effort to barcode all fish species has demonstrated the importance of large-scale coordination and is yielding promising results. Whether or not COI-5 by itself is sufficient for species assignments has become a contentious question; it is generally advantageous to use sequences from multiple loci.
Geovisualization of Local and Regional Migration Using Web-mined Demographics
NASA Astrophysics Data System (ADS)
Schuermann, R. T.; Chow, T. E.
2014-11-01
The intent of this research was to augment and facilitate analyses that gauge the feasibility of using web-mined demographics to study the spatio-temporal dynamics of migration. As a case study, we explored the spatio-temporal dynamics of Vietnamese Americans (VA) in Texas through geovisualization of mined demographic microdata from the World Wide Web. Based on string matching across all demographic attributes, including full name, address, date of birth, age and phone number, multiple records of the same entity (i.e. person) over time were resolved and reconciled into a database. Migration trajectories were geovisualized through animated sprites by connecting the different addresses associated with the same person and segmenting the trajectory into small fragments. Intra-metropolitan migration patterns appeared at the local scale within many metropolitan areas. At the scale of the metropolitan area, varying degrees of immigration and emigration manifest different types of migration clusters. This paper presents a methodology incorporating GIS methods and cartographic design to produce geovisualization animation, enabling the cognitive identification of migration patterns at multiple scales. Identification of spatio-temporal patterns often stimulates further research to better understand the phenomenon and enhance subsequent modeling.
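The record-resolution step described above (string matching across attributes, then connecting a person's addresses over time) might be sketched as follows. This is a minimal illustration, not the authors' implementation: the field names, the subset of attributes compared, the 0.85 threshold, the greedy grouping, and the sample records are all invented for the example.

```python
from difflib import SequenceMatcher

# Attributes compared between records (a hypothetical subset; the address is
# excluded here because it is the attribute expected to change on a move).
FIELDS = ("name", "dob", "phone")


def similarity(a, b):
    """Case-insensitive string similarity in [0, 1]."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def same_person(r1, r2, threshold=0.85):
    """Treat two records as one entity if mean field similarity is high."""
    scores = [similarity(r1[f], r2[f]) for f in FIELDS]
    return sum(scores) / len(scores) >= threshold


def resolve(records, threshold=0.85):
    """Greedily group records that match the first record of a group."""
    groups = []
    for rec in records:
        for group in groups:
            if same_person(group[0], rec, threshold):
                group.append(rec)
                break
        else:
            groups.append([rec])
    return groups


def trajectory(group):
    """Addresses of one entity in time order, without consecutive repeats."""
    path = []
    for rec in sorted(group, key=lambda r: r["year"]):
        if not path or path[-1] != rec["address"]:
            path.append(rec["address"])
    return path


# Invented sample records, for illustration only.
records = [
    {"name": "Anh Nguyen", "dob": "1980-02-14", "phone": "512-555-0101",
     "address": "12 Oak St, Austin, TX", "year": 2005},
    {"name": "Anh T. Nguyen", "dob": "1980-02-14", "phone": "512-555-0101",
     "address": "98 Elm Ave, Houston, TX", "year": 2010},
    {"name": "Binh Tran", "dob": "1975-09-30", "phone": "713-555-0199",
     "address": "4 Pine Rd, Dallas, TX", "year": 2008},
]

entities = resolve(records)
for group in entities:
    print(group[0]["name"], "->", trajectory(group))
```

Each consecutive pair of addresses in a trajectory corresponds to one migration segment that could then be drawn or animated on a map.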
Ghosh, Soma; Prava, Jyoti; Samal, Himanshu Bhusan; Suar, Mrutyunjay; Mahapatra, Rajani Kanta
2014-06-01
The increasing emergence of antibiotic-resistant pathogenic microorganisms is one of the biggest challenges in disease management. In the present study, comparative genomics, metabolic pathway analysis and additional parameters were used for the identification of 94 non-homologous essential proteins in the Staphylococcus aureus genome. Further study prioritized 19 proteins as vaccine candidates, whereas a druggability study reported 34 proteins suitable as drug targets. Enzymes from peptidoglycan biosynthesis and folate biosynthesis were identified as candidates for drug development. Furthermore, bacterial secretory proteins and a few hypothetical proteins identified in our analysis fulfill the criteria for vaccine candidates. As a case study, we built a homology model of one of the potential drug targets, MurA ligase, using MODELLER (9v12) software. The model was further selected for an in silico docking study with inhibitors from the DrugBank database. Results from this study could facilitate selection of proteins for entry into drug design and vaccine production pipelines. Copyright © 2014 Elsevier B.V. All rights reserved.
Martinez-Urtaza, Jaime; Lozano-Leon, Antonio; Viña-Feas, Alejandro; de Novoa, Jacobo; Garcia-Martin, Oscar
2006-02-01
Genetic differences in clinical and environmental strains of Vibrio parahaemolyticus have been widely used as criteria in identifying pathogenic isolates. However, few studies have been carried out to assess the differences in biochemical characteristics of V. parahaemolyticus isolates from human and environmental sources. We compared the biochemical profiles obtained by the characterization of V. parahaemolyticus isolates from human infections and the marine environment using the API 20E system. Environmental and clinical isolates showed significant differences in the gelatin and arabinose tests. Additionally, clinical isolates were correctly identified according to the API 20E profile using 0.85% NaCl diluent, but they presented nonspecific profiles with 2% NaCl diluent. In contrast, use of 2% NaCl diluent facilitated correct identification of the environmental isolates. Clinical isolates showed significant differences in up to five biochemical tests with respect to the API 20E database. The API 20E system is widely used in routine identification of bacteria in clinical laboratories, and this discrepancy in an important number of biochemical tests may lead to misidentification of V. parahaemolyticus infection.
Winsor, Geoffrey L; Van Rossum, Thea; Lo, Raymond; Khaira, Bhavjinder; Whiteside, Matthew D; Hancock, Robert E W; Brinkman, Fiona S L
2009-01-01
Pseudomonas aeruginosa is a well-studied opportunistic pathogen that is particularly known for its intrinsic antimicrobial resistance, diverse metabolic capacity, and its ability to cause life threatening infections in cystic fibrosis patients. The Pseudomonas Genome Database (http://www.pseudomonas.com) was originally developed as a resource for peer-reviewed, continually updated annotation for the Pseudomonas aeruginosa PAO1 reference strain genome. In order to facilitate cross-strain and cross-species genome comparisons with other Pseudomonas species of importance, we have now expanded the database capabilities to include all Pseudomonas species, and have developed or incorporated methods to facilitate high quality comparative genomics. The database contains robust assessment of orthologs, a novel ortholog clustering method, and incorporates five views of the data at the sequence and annotation levels (Gbrowse, Mauve and custom views) to facilitate genome comparisons. A choice of simple and more flexible user-friendly Boolean search features allows researchers to search and compare annotations or sequences within or between genomes. Other features include more accurate protein subcellular localization predictions and a user-friendly, Boolean searchable log file of updates for the reference strain PAO1. This database aims to continue to provide a high quality, annotated genome resource for the research community and is available under an open source license.
Non-targeted analysis (NTA) workflows in high-resolution mass spectrometry require mechanisms for compound identification. One strategy for tentative identification is the use of online chemical databases such as ChemSpider. Databases like this use molecular formulae and monois...
Speech identification in noise: Contribution of temporal, spectral, and visual speech cues.
Kim, Jeesun; Davis, Chris; Groot, Christopher
2009-12-01
This study investigated the degree to which two types of reduced auditory signals (cochlear implant simulations) and visual speech cues combined for speech identification. The auditory speech stimuli were filtered to have only amplitude envelope cues or both amplitude envelope and spectral cues and were presented with/without visual speech. In Experiment 1, IEEE sentences were presented in quiet and noise. For in-quiet presentation, speech identification was enhanced by the addition of both spectral and visual speech cues. Due to a ceiling effect, the degree to which these effects combined could not be determined. In noise, these facilitation effects were more marked and were additive. Experiment 2 examined consonant and vowel identification in the context of CVC or VCV syllables presented in noise. For consonants, both spectral and visual speech cues facilitated identification and these effects were additive. For vowels, the effect of combined cues was underadditive, with the effect of spectral cues reduced when presented with visual speech cues. Analysis indicated that without visual speech, spectral cues facilitated the transmission of place information and vowel height, whereas with visual speech, they facilitated lip rounding, with little impact on the transmission of place information.
Diway, Bibian; Khoo, Eyen
2017-01-01
The development of timber tracking methods based on genetic markers can provide scientific evidence to verify the origin of timber products and fulfill the growing requirement for sustainable forestry practices. In this study, the origin of an important Dark Red Meranti wood, Shorea platyclados, was studied by using a combination of seven chloroplast DNA markers and 15 short tandem repeat (STR) markers. A total of 27 natural populations of S. platyclados were sampled throughout Malaysia to establish population level and individual level identification databases. A haplotype map was generated from chloroplast DNA sequencing for population identification, resulting in 29 multilocus haplotypes, based on 39 informative intraspecific variable sites. Subsequently, a DNA profiling database was developed from 15 STRs allowing for individual identification in Malaysia. Cluster analysis divided the 27 populations into two genetic clusters, corresponding to the regions of Eastern and Western Malaysia. The conservativeness tests showed that the Malaysia database is conservative after removal of bias from population subdivision and sampling effects. Independent self-assignment tests correctly assigned individuals to the database in an overall 60.60−94.95% of cases for identified populations, and in 98.99−99.23% of cases for identified regions. Both the chloroplast DNA database and the STRs appear to be useful for tracking timber originating in Malaysia. Hence, this DNA-based method could serve as an effective additional tool in the existing forensic timber identification system for ensuring the sustainable management of this species into the future. PMID:28430826
Semi Automated Land Cover Layer Updating Process Utilizing Spectral Analysis and GIS Data Fusion
NASA Astrophysics Data System (ADS)
Cohen, L.; Keinan, E.; Yaniv, M.; Tal, Y.; Felus, A.; Regev, R.
2018-04-01
Technological improvements made in recent years in mass data gathering and analysis have influenced the traditional methods of forming and updating the national topographic database. They have brought a significant increase in the number of use cases and in the demand for detailed geo-information. Processes intended to offer an alternative to traditional data collection methods have been developed in many National Mapping and Cadaster Agencies. There has been significant progress in semi-automated methodologies aiming to facilitate updating of a national topographic geodatabase. Their implementation is expected to allow a considerable reduction of updating costs and operation times. Our previous activity focused on automatic building extraction (Keinan, Zilberstein et al, 2015). Before semi-automatic updating methods, it was common for interpreter identification to be as detailed as possible in order to obtain the most reliable database. When using semi-automatic updating methodologies, the ability to insert knowledge based on human insight is limited. Therefore, our motivation was to reduce the resulting gap by allowing end-users to add their own data inputs to the basic geometric database. In this article, we present a simple land cover database updating method that combines insights extracted from the analyzed image with given spatial data in vector layers. The main stages of the method are multispectral image segmentation and supervised classification, together with geometric fusion of the given vector data, while keeping shape editing work to a minimum. All coding was done utilizing open source software components.
Piehowski, Paul D; Petyuk, Vladislav A; Sandoval, John D; Burnum, Kristin E; Kiebel, Gary R; Monroe, Matthew E; Anderson, Gordon A; Camp, David G; Smith, Richard D
2013-03-01
For bottom-up proteomics, there is a wide variety of database-searching algorithms in use for matching peptide sequences to tandem MS spectra. Likewise, there are numerous strategies being employed to produce a confident list of peptide identifications from the different search algorithm outputs. Here we introduce a grid-search approach for determining optimal database filtering criteria in shotgun proteomics data analyses that is easily adaptable to any search. Systematic Trial and Error Parameter Selection--referred to as STEPS--utilizes user-defined parameter ranges to test a wide array of parameter combinations to arrive at an optimal "parameter set" for data filtering, thus maximizing confident identifications. The benefits of this approach in terms of numbers of true-positive identifications are demonstrated using datasets derived from immunoaffinity-depleted blood serum and a bacterial cell lysate, two common proteomics sample types. © 2013 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
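The general idea of a grid search over filtering parameters can be illustrated with a short sketch. This is not the STEPS implementation: the two parameters (a minimum search score and a mass-error tolerance in ppm), the decoy-based false discovery rate estimate used as the confidence criterion, and the toy peptide-spectrum matches are all assumptions made for the example.

```python
from itertools import product


def filter_psms(psms, min_score, max_ppm):
    """Keep matches that pass both (hypothetical) filter criteria."""
    return [p for p in psms
            if p["score"] >= min_score and abs(p["ppm"]) <= max_ppm]


def estimate_fdr(passing):
    """Decoy-based FDR estimate (assumed criterion): decoys / targets."""
    decoys = sum(p["decoy"] for p in passing)
    targets = len(passing) - decoys
    fdr = decoys / targets if targets else 1.0
    return fdr, targets


def grid_search(psms, score_range, ppm_range, max_fdr=0.01):
    """Try every parameter combination; keep the one that yields the most
    target identifications while staying under the FDR ceiling."""
    best = None
    for min_score, max_ppm in product(score_range, ppm_range):
        fdr, targets = estimate_fdr(filter_psms(psms, min_score, max_ppm))
        if fdr <= max_fdr and (best is None or targets > best["targets"]):
            best = {"min_score": min_score, "max_ppm": max_ppm,
                    "targets": targets, "fdr": fdr}
    return best


# Toy peptide-spectrum matches, invented for illustration.
psms = [
    {"score": 3.2, "ppm": 1.0, "decoy": False},
    {"score": 2.9, "ppm": -2.5, "decoy": False},
    {"score": 2.4, "ppm": 4.0, "decoy": False},
    {"score": 2.1, "ppm": 8.5, "decoy": True},
    {"score": 1.8, "ppm": -1.5, "decoy": False},
    {"score": 1.6, "ppm": 9.0, "decoy": True},
    {"score": 1.2, "ppm": -7.0, "decoy": False},
]

best = grid_search(psms,
                   score_range=[1.0, 1.5, 2.0, 2.5],
                   ppm_range=[3.0, 5.0, 10.0])
print(best)
```

With real data the grid would typically span more parameters and finer steps, and ties between equally good combinations would need an explicit rule; here the first combination found is kept.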
Pitfalls of Establishing DNA Barcoding Systems in Protists: The Cryptophyceae as a Test Case
Hoef-Emden, Kerstin
2012-01-01
A DNA barcode is a preferably short and highly variable region of DNA supposed to facilitate a rapid identification of species. In many protistan lineages, a lack of species-specific morphological characters hampers an identification of species by light or electron microscopy, and difficulties in performing mating experiments in laboratory cultures also do not allow for an identification of biological species. Thus, testing candidate barcode markers as well as establishment of accurately working species identification systems are more challenging than in multicellular organisms. In cryptic species complexes the performance of a potential barcode marker cannot be monitored using morphological characters as a feedback, but an inappropriate choice of DNA region may result in artifactual species trees for several reasons. Therefore a priori knowledge of the systematics of a group is required. In addition to identification of known species, methods for an automatic delimitation of species with DNA barcodes have been proposed. The Cryptophyceae provide a mixture of systematically well characterized as well as badly characterized groups and are used in this study to test the suitability of some of the methods for protists. As a species identification method, the performance of BLAST in searches against badly to well-sampled reference databases has been tested with COI-5P and 5′-partial LSU rDNA (domains A to D of the nuclear LSU rRNA gene). In addition, the performance of two different methods for automatic species delimitation, fixed thresholds of genetic divergence and the general mixed Yule-coalescent model (GMYC), has been examined. The study demonstrates some pitfalls of barcoding methods that have to be taken care of. Also a best-practice approach towards establishing a DNA barcode system in protists is proposed. PMID:22970104
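Delimitation by a fixed threshold of genetic divergence, one of the two approaches named above, can be illustrated with a small sketch. The uncorrected p-distance, the single-linkage grouping, the 10% threshold, and the three toy sequences are assumptions chosen for the example; the study itself may have used different distance measures and thresholds.

```python
def p_distance(a, b):
    """Uncorrected proportion of differing sites between aligned sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def delimit(seqs, threshold):
    """Group sequences whose pairwise distance is at or below the threshold
    (single linkage, implemented with a small union-find)."""
    names = list(seqs)
    parent = {n: n for n in names}

    def find(n):
        while parent[n] != n:
            parent[n] = parent[parent[n]]  # path compression
            n = parent[n]
        return n

    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if p_distance(seqs[a], seqs[b]) <= threshold:
                parent[find(a)] = find(b)

    groups = {}
    for n in names:
        groups.setdefault(find(n), []).append(n)
    return sorted(sorted(g) for g in groups.values())


# Toy aligned barcode fragments, invented for illustration.
barcodes = {
    "iso1": "ACGTACGTACGTACGTACGT",
    "iso2": "ACGTACGTACGTACGTACGA",  # one difference from iso1
    "iso3": "TTGTACCTACGAACGTTCGT",  # five differences from iso1
}

clusters = delimit(barcodes, threshold=0.10)
print(clusters)
```

The outcome depends entirely on the chosen threshold, which is one reason a fixed cut-off can mislead in groups where within-species and between-species divergence overlap.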
9 CFR 81.2 - Identification of deer, elk, and moose in interstate commerce.
Code of Federal Regulations, 2013 CFR
2013-01-01
... is linked to that animal in the CWD National Database or in an approved State database. The second... that animal and herd in the CWD National Database or in an approved State database. (Approved by the...
9 CFR 81.2 - Identification of deer, elk, and moose in interstate commerce.
Code of Federal Regulations, 2014 CFR
2014-01-01
... is linked to that animal in the CWD National Database or in an approved State database. The second... that animal and herd in the CWD National Database or in an approved State database. (Approved by the...
Wang, Shirley V; Schneeweiss, Sebastian; Berger, Marc L; Brown, Jeffrey; de Vries, Frank; Douglas, Ian; Gagne, Joshua J; Gini, Rosa; Klungel, Olaf; Mullins, C Daniel; Nguyen, Michael D; Rassen, Jeremy A; Smeeth, Liam; Sturkenboom, Miriam
2017-09-01
Defining a study population and creating an analytic dataset from longitudinal healthcare databases involves many decisions. Our objective was to catalogue scientific decisions underpinning study execution that should be reported to facilitate replication and enable assessment of validity of studies conducted in large healthcare databases. We reviewed key investigator decisions required to operate a sample of macros and software tools designed to create and analyze analytic cohorts from longitudinal streams of healthcare data. A panel of academic, regulatory, and industry experts in healthcare database analytics discussed and added to this list. Evidence generated from large healthcare encounter and reimbursement databases is increasingly being sought by decision-makers. Varied terminology is used around the world for the same concepts. Agreeing on terminology and which parameters from a large catalogue are the most essential to report for replicable research would improve transparency and facilitate assessment of validity. At a minimum, reporting for a database study should provide clarity regarding operational definitions for key temporal anchors and their relation to each other when creating the analytic dataset, accompanied by an attrition table and a design diagram. A substantial improvement in reproducibility, rigor and confidence in real world evidence generated from healthcare databases could be achieved with greater transparency about operational study parameters used to create analytic datasets from longitudinal healthcare databases. © 2017 The Authors. Pharmacoepidemiology & Drug Safety Published by John Wiley & Sons Ltd.
Planning the data transition of a VLDB: a case study
NASA Astrophysics Data System (ADS)
Finken, Shirley J.
1997-02-01
This paper describes the technical and programmatic plans for moving and checking certain data from the IDentification Automated Services (IDAS) system to the new Interstate Identification Index/Federal Bureau of Investigation (III/FBI) Segment database--one of the three components of the Integrated Automated Fingerprint Identification System (IAFIS) being developed by the Federal Bureau of Investigation, Criminal Justice Information Services Division. Transitioning IDAS to III/FBI includes putting the data into an entirely new target database structure (i.e. from IBM VSAM files to ORACLE7 RDBMS tables). Only four IDAS files were transitioned (CCN, CCR, CCA, and CRS), but their total estimated size is at 500 Gb of data. Transitioning of this Very Large Database is planned as two processes.
Irinyi, Laszlo; Serena, Carolina; Garcia-Hermoso, Dea; Arabatzis, Michael; Desnos-Ollivier, Marie; Vu, Duong; Cardinali, Gianluigi; Arthur, Ian; Normand, Anne-Cécile; Giraldo, Alejandra; da Cunha, Keith Cassia; Sandoval-Denis, Marcelo; Hendrickx, Marijke; Nishikaku, Angela Satie; de Azevedo Melo, Analy Salles; Merseguel, Karina Bellinghausen; Khan, Aziza; Parente Rocha, Juliana Alves; Sampaio, Paula; da Silva Briones, Marcelo Ribeiro; e Ferreira, Renata Carmona; de Medeiros Muniz, Mauro; Castañón-Olivares, Laura Rosio; Estrada-Barcenas, Daniel; Cassagne, Carole; Mary, Charles; Duan, Shu Yao; Kong, Fanrong; Sun, Annie Ying; Zeng, Xianyu; Zhao, Zuotao; Gantois, Nausicaa; Botterel, Françoise; Robbertse, Barbara; Schoch, Conrad; Gams, Walter; Ellis, David; Halliday, Catriona; Chen, Sharon; Sorrell, Tania C; Piarroux, Renaud; Colombo, Arnaldo L; Pais, Célia; de Hoog, Sybren; Zancopé-Oliveira, Rosely Maria; Taylor, Maria Lucia; Toriello, Conchita; de Almeida Soares, Célia Maria; Delhaes, Laurence; Stubbe, Dirk; Dromer, Françoise; Ranque, Stéphane; Guarro, Josep; Cano-Lira, Jose F; Robert, Vincent; Velegraki, Aristea; Meyer, Wieland
2015-05-01
Human and animal fungal pathogens are a growing threat worldwide leading to emerging infections and creating new risks for established ones. There is a growing need for a rapid and accurate identification of pathogens to enable early diagnosis and targeted antifungal therapy. Morphological and biochemical identification methods are time-consuming and require trained experts. Alternatively, molecular methods, such as DNA barcoding, a powerful and easy tool for rapid monophasic identification, offer a practical approach for species identification and are less demanding in terms of taxonomical expertise. However, its widespread use is still limited by a lack of quality-controlled reference databases and the evolving recognition and definition of new fungal species/complexes. An international consortium of medical mycology laboratories was formed aiming to establish a quality controlled ITS database under the umbrella of the ISHAM working group on "DNA barcoding of human and animal pathogenic fungi." A new database, containing 2800 ITS sequences representing 421 fungal species, providing the medical community with a freely accessible tool at http://www.isham.org/ and http://its.mycologylab.org/ to rapidly and reliably identify most agents of mycoses, was established. The generated sequences included in the new database were used to evaluate the variation and overall utility of the ITS region for the identification of pathogenic fungi at intra- and interspecies level. The average intraspecies variation ranged from 0 to 2.25%. This highlighted selected pathogenic fungal species, such as the dermatophytes and emerging yeast, for which additional molecular methods/genetic markers are required for their reliable identification from clinical and veterinary specimens. © The Author 2015. Published by Oxford University Press on behalf of The International Society for Human and Animal Mycology. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.
The Impact of Online Bibliographic Databases on Teaching and Research in Political Science.
ERIC Educational Resources Information Center
Reichel, Mary
The availability of online bibliographic databases greatly facilitates literature searching in political science. The advantages to searching databases online include combination of concepts, comprehensiveness, multiple database searching, free-text searching, currency, current awareness services, document delivery service, and convenience.…
Lee, Wonmok; Kim, Myungsook; Yong, Dongeun; Jeong, Seok Hoon; Lee, Kyungwon; Chong, Yunsop
2015-01-01
By conventional methods, the identification of anaerobic bacteria is more time consuming and requires more expertise than the identification of aerobic bacteria. Although the matrix-assisted laser desorption ionization time-of-flight mass spectrometry (MALDI-TOF MS) systems are relatively less studied, they have been reported to be a promising method for the identification of anaerobes. We evaluated the performance of the VITEK MS in vitro diagnostic (IVD; 1.1 database; bioMérieux, France) in the identification of anaerobes. We used 274 anaerobic bacteria isolated from various clinical specimens. The results for the identification of the bacteria by VITEK MS were compared to those obtained by phenotypic methods and 16S rRNA gene sequencing. Among the 249 isolates included in the IVD database, the VITEK MS correctly identified 209 (83.9%) isolates to the species level and an additional 18 (7.2%) at the genus level. In particular, the VITEK MS correctly identified clinically relevant and frequently isolated anaerobic bacteria to the species level. The remaining 22 isolates (8.8%) were either not identified or misidentified. The VITEK MS could not identify the 25 isolates absent from the IVD database to the species level. The VITEK MS showed reliable identifications for clinically relevant anaerobic bacteria.
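The percentages in this abstract follow directly from the reported counts, as the short check below shows. Variable names are illustrative; the combined genus-or-better figure is derived here and is not stated in the abstract.

```python
# Counts quoted in the abstract above.
total_isolates = 274
in_ivd_database = 249
species_level, genus_only, failed = 209, 18, 22

# The three outcome groups should account for every isolate in the database.
assert species_level + genus_only + failed == in_ivd_database

# Isolates tested but absent from the IVD database.
absent_from_database = total_isolates - in_ivd_database


def pct(n):
    """Percentage of database-covered isolates, one decimal place."""
    return round(100 * n / in_ivd_database, 1)


species_pct = pct(species_level)
genus_pct = pct(genus_only)
failed_pct = pct(failed)
# Derived figure: identified to at least genus level.
genus_or_better_pct = pct(species_level + genus_only)

print(species_pct, genus_pct, failed_pct, genus_or_better_pct)
```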
Phytophthora-ID.org: A sequence-based Phytophthora identification tool
N.J. Grünwald; F.N. Martin; M.M. Larsen; C.M. Sullivan; C.M. Press; M.D. Coffey; E.M. Hansen; J.L. Parke
2010-01-01
Contemporary species identification relies strongly on sequence-based identification, yet resources for identification of many fungal and oomycete pathogens are rare. We developed two web-based, searchable databases for rapid identification of Phytophthora spp. based on sequencing of the internal transcribed spacer (ITS) or the cytochrome oxidase...
Person identification in irregular cardiac conditions using electrocardiogram signals.
Sidek, Khairul Azami; Khalil, Ibrahim
2011-01-01
This paper presents a person identification mechanism in irregular cardiac conditions using ECG signals. A total of 30 subjects were used in the study from three different public ECG databases containing various abnormal heart conditions: the Paroxysmal Atrial Fibrillation Prediction Challenge database (AFPDB), the MIT-BIH Supraventricular Arrhythmia database (SVDB) and the T-Wave Alternans Challenge database (TWADB). Cross correlation (CC) was used as the biometric matching algorithm with defined threshold values to evaluate the performance. In order to measure the efficiency of this simple yet effective matching algorithm, two biometric performance metrics were used: false acceptance rate (FAR) and false reject rate (FRR). Our experimental results suggest that ECG based biometric identification with irregular cardiac condition gives a higher recognition rate of different ECG signals when tested for three different abnormal cardiac databases, yielding false acceptance rates (FAR) of 2%, 3% and 2% and false reject rates (FRR) of 1%, 2% and 0% for AFPDB, SVDB and TWADB respectively. These results also indicate the existence of salient biometric characteristics in the ECG morphology within the QRS complex that tend to differentiate individuals.
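Threshold-based matching with cross correlation, and the two error rates used to evaluate it, can be sketched as follows. The abstract does not specify the normalization, the lag handling, or the threshold, so the zero-mean normalization, the search over all lags, the 0.9 threshold, and the toy signals and scores below are assumptions for illustration only.

```python
import math


def max_norm_xcorr(x, y):
    """Peak of the zero-mean, normalized cross-correlation over all lags."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    xc = [v - mx for v in x]
    yc = [v - my for v in y]
    denom = math.sqrt(sum(v * v for v in xc) * sum(v * v for v in yc))
    if denom == 0:
        return 0.0
    n, m = len(xc), len(yc)
    best = -1.0
    for lag in range(-(m - 1), n):
        # Overlapping samples for this lag: x[i] pairs with y[i - lag].
        s = sum(xc[i] * yc[i - lag]
                for i in range(max(0, lag), min(n, m + lag)))
        best = max(best, s / denom)
    return best


def far_frr(genuine_scores, impostor_scores, threshold):
    """FAR: impostor comparisons accepted; FRR: genuine comparisons rejected."""
    far = sum(s >= threshold for s in impostor_scores) / len(impostor_scores)
    frr = sum(s < threshold for s in genuine_scores) / len(genuine_scores)
    return far, frr


# Toy beat templates, invented for illustration.
template = [0, 0, 1, 5, 1, 0, 0, 0]
shifted = [0, 0, 0, 1, 5, 1, 0, 0]  # same shape, one sample later

print(max_norm_xcorr(template, template))
print(max_norm_xcorr(template, shifted))

# Hypothetical similarity scores from genuine and impostor comparisons.
far, frr = far_frr(genuine_scores=[0.97, 0.92, 0.80],
                   impostor_scores=[0.30, 0.91, 0.10, 0.45],
                   threshold=0.9)
print(far, frr)
```

Raising the threshold lowers FAR at the cost of a higher FRR, so the two rates are usually reported together for a stated operating point.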
Missing persons-missing data: the need to collect antemortem dental records of missing persons.
Blau, Soren; Hill, Anthony; Briggs, Christopher A; Cordner, Stephen M
2006-03-01
The subject of missing persons is of great concern to the community with numerous associated emotional, financial, and health costs. This paper examines the forensic medical issues raised by the delayed identification of individuals classified as "missing" and highlights the importance of including dental data in the investigation of missing persons. Focusing on Australia, the current approaches employed in missing persons investigations are outlined. Of particular significance is the fact that each of the eight Australian states and territories has its own Missing Persons Unit that operates within distinct state and territory legislation. Consequently, there is a lack of uniformity within Australia about the legal and procedural framework within which investigations of missing persons are conducted, and the interaction of that framework with coronial law procedures. One of the main investigative problems in missing persons investigations is the lack of forensic medical, particularly odontological, input. Forensic odontology has been employed in numerous cases in Australia where identity is unknown or uncertain because of remains being skeletonized, incinerated, or partly burnt. The routine employment of the forensic odontologist to assist in missing person inquiries has, however, been ignored. The failure to routinely employ forensic odontology in missing persons inquiries has resulted in numerous delays in identification. Three Australian cases are presented where the investigation of individuals whose identity was uncertain or unknown was prolonged due to the failure to utilize the appropriate (and available) dental resources. In light of the outcomes of these cases, we suggest that a national missing persons dental records database be established for future missing persons investigations. Such a database could be easily managed between a coronial system and a forensic medical institute.
In Australia, a national missing persons dental records database could be incorporated into the National Coroners Information System (NCIS) managed, on behalf of Australia's Coroners, by the Victorian Institute of Forensic Medicine. The existence of the NCIS would ensure operational collaboration in the implementation of the system and cost savings to Australian policing agencies involved in missing person inquiries. The implementation of such a database would facilitate timely and efficient reconciliation of clinical and postmortem dental records and have subsequent social and financial benefits.
Introducing the Forensic Research/Reference on Genetics knowledge base, FROG-kb
2012-01-01
Background Online tools and databases based on multi-allelic short tandem repeat polymorphisms (STRPs) are actively used in forensic teaching, research, and investigations. The Fst value of each CODIS marker tends to be low across the populations of the world and most populations typically have all the common STRP alleles present diminishing the ability of these systems to discriminate ethnicity. Recently, considerable research is being conducted on single nucleotide polymorphisms (SNPs) to be considered for human identification and description. However, online tools and databases that can be used for forensic research and investigation are limited. Methods The back end DBMS (Database Management System) for FROG-kb is Oracle version 10. The front end is implemented with specific code using technologies such as Java, Java Servlet, JSP, JQuery, and GoogleCharts. Results We present an open access web application, FROG-kb (Forensic Research/Reference on Genetics-knowledge base, http://frog.med.yale.edu), that is useful for teaching and research relevant to forensics and can serve as a tool facilitating forensic practice. The underlying data for FROG-kb are provided by the already extensively used and referenced ALlele FREquency Database, ALFRED (http://alfred.med.yale.edu). In addition to displaying data in an organized manner, computational tools that use the underlying allele frequencies with user-provided data are implemented in FROG-kb. These tools are organized by the different published SNP/marker panels available. This web tool currently has implemented general functions possible for two types of SNP panels, individual identification and ancestry inference, and a prediction function specific to a phenotype informative panel for eye color. Conclusion The current online version of FROG-kb already provides new and useful functionality. 
We expect FROG-kb to grow and expand in capabilities and welcome input from the forensic community in identifying datasets and functionalities that will be most helpful and useful. Thus, the structure and functionality of FROG-kb will be revised in an ongoing process of improvement. This paper describes the state as of early June 2012. PMID:22938150
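The kind of calculation that an ancestry inference tool built on population allele frequencies might perform can be sketched as follows. This is a minimal illustration only, assuming Hardy-Weinberg equilibrium within each population and independence between SNPs. The population names and allele frequencies are invented for the example and are not ALFRED data, and the function names are hypothetical rather than part of FROG-kb.

```python
from math import prod

def genotype_probability(ref_freq: float, ref_copies: int) -> float:
    """Hardy-Weinberg probability of a biallelic SNP genotype.

    ref_freq   -- frequency of the reference allele in the population
    ref_copies -- copies of the reference allele in the genotype (0, 1 or 2)
    """
    p, q = ref_freq, 1.0 - ref_freq
    return {2: p * p, 1: 2 * p * q, 0: q * q}[ref_copies]

def profile_likelihood(ref_freqs, genotypes) -> float:
    """Likelihood of a multi-SNP profile, assuming independent SNPs."""
    return prod(genotype_probability(f, g) for f, g in zip(ref_freqs, genotypes))

# Invented reference-allele frequencies for three SNPs in two populations.
populations = {
    "PopA": [0.9, 0.2, 0.5],
    "PopB": [0.1, 0.8, 0.5],
}
profile = [2, 0, 1]  # user-provided genotype, as reference-allele counts

likelihoods = {name: profile_likelihood(freqs, profile)
               for name, freqs in populations.items()}
most_likely = max(likelihoods, key=likelihoods.get)
print(likelihoods, most_likely)
```

Ranking populations by likelihood in this way only compares the populations supplied; it says nothing about populations absent from the reference data.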
LCGbase: A Comprehensive Database for Lineage-Based Co-regulated Genes.
Wang, Dapeng; Zhang, Yubin; Fan, Zhonghua; Liu, Guiming; Yu, Jun
2012-01-01
Animal genes of different lineages, such as vertebrates and arthropods, are well-organized and blended into dynamic chromosomal structures that represent a primary regulatory mechanism for body development and cellular differentiation. The majority of genes in a genome are actually clustered, and these clusters are evolutionarily stable to different extents and biologically meaningful when evaluated among genomes within and across lineages. Until now, many questions concerning gene organization, such as what is the minimal number of genes in a cluster and what is the driving force leading to gene co-regulation, remain to be addressed. Here, we provide a user-friendly database, LCGbase (a comprehensive database for lineage-based co-regulated genes), hosting information on evolutionary dynamics of gene clustering and ordering within animal kingdoms in two different lineages: vertebrates and arthropods. The database is constructed on a web-based Linux-Apache-MySQL-PHP framework with an effective interactive user-inquiry service. Compared to other gene annotation databases with similar purposes, our database has three clear advantages. First, our database is inclusive, including all high-quality genome assemblies of vertebrates and representative arthropod species. Second, it is human-centric since we map all gene clusters from other genomes in an order of lineage-ranks (such as primates, mammals, warm-blooded, and reptiles) onto the human genome and start the database from well-defined gene pairs (a minimal cluster where the two adjacent genes are oriented as co-directional, convergent, or divergent pairs) to large gene clusters. Furthermore, users can search for any adjacent genes and their detailed annotations. Third, the database provides flexible parameter definitions, such as the distance of transcription start sites between two adjacent genes, which is extendable to genes flanking the cluster across species.
We also provide useful tools for sequence alignment, gene ontology (GO) annotation, promoter identification, gene expression (co-expression), and evolutionary analysis. This database not only provides a way to define lineage-specific and species-specific gene clusters but also facilitates future studies on gene co-regulation, epigenetic control of gene expression (DNA methylation and histone marks), and chromosomal structures in a context of gene clusters and species evolution. LCGbase is freely available at http://lcgbase.big.ac.cn/LCGbase.
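The gene-pair definitions above (co-directional, convergent, divergent) and the transcription start site (TSS) distance parameter can be illustrated with a small sketch. It assumes each gene is described by start and end coordinates plus a strand, and that the TSS is the start coordinate on the plus strand and the end coordinate on the minus strand. The `Gene` class, the function names and the coordinates are hypothetical and do not reflect the LCGbase schema.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Gene:
    name: str
    start: int   # leftmost coordinate on the chromosome
    end: int     # rightmost coordinate on the chromosome
    strand: str  # "+" or "-"

    @property
    def tss(self) -> int:
        """Transcription start site: left end on '+', right end on '-'."""
        return self.start if self.strand == "+" else self.end

def classify_pair(left: Gene, right: Gene) -> str:
    """Orientation of two adjacent genes, with `left` upstream on the chromosome."""
    if left.strand == right.strand:
        return "co-directional"
    # '+' followed by '-' means the genes are transcribed towards each other.
    return "convergent" if left.strand == "+" else "divergent"

def adjacent_pairs(genes):
    """Yield (left, right, orientation, TSS distance) for neighbouring genes."""
    ordered = sorted(genes, key=lambda g: g.start)
    for left, right in zip(ordered, ordered[1:]):
        yield left.name, right.name, classify_pair(left, right), abs(right.tss - left.tss)

# Invented coordinates for three genes on one chromosome.
genes = [Gene("g1", 100, 500, "+"), Gene("g2", 800, 1200, "-"), Gene("g3", 1500, 2000, "+")]
pairs = list(adjacent_pairs(genes))
print(pairs)
```

A maximum TSS distance could then be applied as a filter on the last field to decide which pairs count as a minimal cluster.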
False discovery rates in spectral identification.
Jeong, Kyowon; Kim, Sangtae; Bandeira, Nuno
2012-01-01
Automated database search engines are one of the fundamental engines of high-throughput proteomics enabling daily identifications of hundreds of thousands of peptides and proteins from tandem mass (MS/MS) spectrometry data. Nevertheless, this automation also makes it humanly impossible to manually validate the vast lists of resulting identifications from such high-throughput searches. This challenge is usually addressed by using a Target-Decoy Approach (TDA) to impose an empirical False Discovery Rate (FDR) at a pre-determined threshold x% with the expectation that at most x% of the returned identifications would be false positives. But despite the fundamental importance of FDR estimates in ensuring the utility of large lists of identifications, there is surprisingly little consensus on exactly how TDA should be applied to minimize the chances of biased FDR estimates. In fact, since less rigorous TDA/FDR estimates tend to result in more identifications (at higher 'true' FDR), there is often little incentive to enforce strict TDA/FDR procedures in studies where the major metric of success is the size of the list of identifications and there are no follow up studies imposing hard cost constraints on the number of reported false positives. Here we address the problem of the accuracy of TDA estimates of empirical FDR. Using MS/MS spectra from samples where we were able to define a factual FDR estimator of 'true' FDR we evaluate several popular variants of the TDA procedure in a variety of database search contexts. We show that the fraction of false identifications can sometimes be over 10× higher than reported and may be unavoidably high for certain types of searches. In addition, we further report that the two-pass search strategy seems the most promising database search strategy. 
While unavoidably constrained by the particulars of any specific evaluation dataset, our observations support a series of recommendations towards maximizing the number of resulting identifications while controlling database searches with robust and reproducible TDA estimation of empirical FDR.
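One common form of the target-decoy calculation can be sketched as follows. This is a minimal sketch, assuming a concatenated target-decoy search, one best match per spectrum, distinct scores, and the simple estimate FDR = decoys / targets among matches above a score threshold. It is only one of the TDA variants the abstract alludes to and is not presented as the procedure the authors recommend. The scores and the function name are invented.

```python
def fdr_threshold(psms, max_fdr):
    """Find the most permissive score cutoff whose estimated FDR is acceptable.

    psms    -- iterable of (score, is_decoy) pairs, higher score = better match
    max_fdr -- acceptable decoys/targets ratio, e.g. 0.01 for 1%
    Returns (score_cutoff, accepted_targets), or (None, 0) if nothing passes.
    """
    ranked = sorted(psms, key=lambda item: item[0], reverse=True)
    targets = decoys = 0
    best = (None, 0)
    for score, is_decoy in ranked:
        if is_decoy:
            decoys += 1
        else:
            targets += 1
        # Decoy hits above the cutoff estimate the number of false target hits.
        if targets and decoys / targets <= max_fdr:
            best = (score, targets)
    return best

# Invented peptide-spectrum match scores; True marks a decoy hit.
psms = [(9.1, False), (8.7, False), (8.2, False), (7.9, False), (7.5, True),
        (7.1, False), (6.8, False), (6.0, True), (5.5, False), (5.1, True)]
print(fdr_threshold(psms, 0.2))
```

Because the estimate depends on how decoys are generated and searched, the same spectra can yield different accepted lists under different TDA variants, which is the bias the abstract is concerned with.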
Ng, Kevin Kit Siong; Lee, Soon Leong; Tnah, Lee Hong; Nurul-Farhanah, Zakaria; Ng, Chin Hong; Lee, Chai Ting; Tani, Naoki; Diway, Bibian; Lai, Pei Sing; Khoo, Eyen
2016-07-01
Illegal logging and smuggling of Gonystylus bancanus (Thymelaeaceae) poses a serious threat to this fragile valuable peat swamp timber species. Using G. bancanus as a case study, DNA markers were used to develop identification databases at the species, population and individual level. The species level database for Gonystylus comprised of an rDNA (ITS2) and two cpDNA (trnH-psbA and trnL) markers based on a 20 Gonystylus species database. When concatenated, taxonomic species recognition was achieved with a resolution of 90% (18 out of the 20 species). In addition, based on 17 natural populations of G. bancanus throughout West (Peninsular Malaysia) and East (Sabah and Sarawak) Malaysia, population and individual identification databases were developed using cpDNA and STR markers respectively. A haplotype distribution map for Malaysia was generated using six cpDNA markers, resulting in 12 unique multilocus haplotypes, from 24 informative intraspecific variable sites. These unique haplotypes suggest a clear genetic structuring of West and East regions. A simulation procedure based on the composition of the samples was used to test whether a suspected sample conformed to a given regional origin. Overall, the observed type I and II errors of the databases showed good concordance with the predicted 5% threshold which indicates that the databases were useful in revealing provenance and establishing conformity of samples from West and East Malaysia. Sixteen STRs were used to develop the DNA profiling databases for individual identification. Bayesian clustering analyses divided the 17 populations into two main genetic clusters, corresponding to the regions of West and East Malaysia. Population substructuring (K=2) was observed within each region. After removal of bias resulting from sampling effects and population subdivision, conservativeness tests showed that the West and East Malaysia databases were conservative. 
This suggests that both databases can be used independently for random match probability estimation within respective regions. The reliability of the databases was further determined by independent self-assignment tests based on the likelihood of each individual's multilocus genotype occurring in each identified population, genetic cluster and region with an average percentage of correctly assigned individuals of 54.80%, 99.60% and 100% respectively. Thus, after appropriate validation, the genetic identification databases developed for G. bancanus in this study could support forensic applications and help safeguard this valuable species into the future. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.
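A random match probability of the kind mentioned above can be sketched as a product of per-locus genotype probabilities. The sketch assumes independent loci and uses the widely used subpopulation correction with coancestry coefficient theta (Balding-Nichols form), which reduces to the Hardy-Weinberg values p² and 2pq when theta is 0. The allele frequencies, the theta value and the function names are invented for illustration and are not taken from the G. bancanus databases.

```python
def genotype_match_probability(p, q=None, theta=0.0):
    """Match probability for one STR locus.

    p, q  -- allele frequencies; pass q=None for a homozygous genotype
    theta -- coancestry coefficient correcting for population subdivision
    """
    denom = (1 + theta) * (1 + 2 * theta)
    if q is None:  # homozygote
        return (2 * theta + (1 - theta) * p) * (3 * theta + (1 - theta) * p) / denom
    # heterozygote
    return 2 * (theta + (1 - theta) * p) * (theta + (1 - theta) * q) / denom

def random_match_probability(profile, theta=0.0):
    """Multiply per-locus probabilities, assuming independent loci."""
    rmp = 1.0
    for p, q in profile:
        rmp *= genotype_match_probability(p, q, theta=theta)
    return rmp

# Invented three-locus profile: (p, q) for heterozygotes, (p, None) for homozygotes.
profile = [(0.2, 0.3), (0.1, None), (0.25, 0.05)]
rmp_hwe = random_match_probability(profile)
rmp_theta = random_match_probability(profile, theta=0.03)
print(rmp_hwe, rmp_theta)
```

For this profile the theta-corrected value is larger than the uncorrected one, that is, more conservative from the point of view of the evidence.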
Sleiman, Sue; Halliday, Catriona L.; Chapman, Belinda; Brown, Mitchell; Nitschke, Joanne; Lau, Anna F.
2016-01-01
We developed an Australian database for the identification of Aspergillus, Scedosporium, and Fusarium species (n = 28) by matrix-assisted laser desorption ionization−time of flight mass spectrometry (MALDI-TOF MS). In a challenge against 117 isolates, species identification significantly improved when the in-house-built database was combined with the Bruker Filamentous Fungi Library compared with that for the Bruker library alone (Aspergillus, 93% versus 69%; Fusarium, 84% versus 42%; and Scedosporium, 94% versus 18%, respectively). PMID:27252460
Evaluation of the Biolog MicroStation system for yeast identification
NASA Technical Reports Server (NTRS)
McGinnis, M. R.; Molina, T. C.; Pierson, D. L.; Mishra, S. K.
1996-01-01
One hundred and fifty-nine isolates representing 16 genera and 53 species of yeasts were processed with the Biolog MicroStation System for yeast identification. Thirteen genera and 38 species were included in the Biolog database. For these 129 isolates, correct identifications to the species level were 13.2, 39.5 and 48.8% after 24, 48 and 72 hours incubation at 30 degrees C, respectively. Three genera and 15 species which were not included in the Biolog database were also tested. Of the 30 isolates studied, 16.7, 53.3 and 56.7% of the isolates were given incorrect names from the system's database after 24, 48 and 72 h incubation at 30 degrees C, respectively. The remaining isolates of this group were not identified.
Facilitating Science Discoveries from NED Today and in the 2020s
NASA Astrophysics Data System (ADS)
Mazzarella, Joseph M.; NED Team
2018-06-01
I will review recent developments, work in progress, and major challenges that lie ahead as we enhance the capabilities of the NASA/IPAC Extragalactic Database (NED) to facilitate and accelerate multi-wavelength research on objects beyond our Milky Way galaxy. The recent fusion of data for over 470 million sources from the 2MASS Point Source Catalog and approximately 750 million sources from the AllWISE Source Catalog (next up) with redshifts from the SDSS and other data in NED is increasing the holdings to over a billion distinct objects with cross-identifications, providing a rich resource for multi-wavelength research. Combining data across such large surveys, as well as integrating data from over 110,000 smaller but scientifically important catalogs and journal articles, presents many challenges including the need to update the computing infrastructure and re-tool production and operations on a regular basis. Integration of the Firefly toolkit into the new user interface is ushering in a new phase of interactive data visualization in NED, with features and capabilities familiar to users of IRSA and the emerging LSST science user interface. Graphical characterizations of NED content and estimates of completeness in different sky and spectral regions are also being developed. A newly implemented service that follows the Table Access Protocol (TAP) enables astronomers to issue queries to the NED object directory using Astronomical Data Query Language (ADQL), a standard shared with the NASA mission archives and other virtual observatories around the world. A brief review will be given of new science capabilities under development and planned for 2019-2020, as well as initiatives underway involving deployment of a parallel database, cloud technologies, machine learning, and first steps in bringing analysis capabilities close to the database in collaboration with IRSA.
I will close with some questions for the community to consider in helping us plan future science capabilities and directions for NED in the 2020s.
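A TAP query of the kind described above is an HTTP request carrying an ADQL string. The sketch below only builds the request URL for a synchronous query, using the standard TAP parameters `REQUEST`, `LANG`, `FORMAT` and `QUERY`. The base URL, the table name and the `ra`/`dec` column names are placeholders chosen for illustration, not the actual NED service address or schema, and no network call is made.

```python
from urllib.parse import urlencode

TAP_BASE = "https://example.org/tap"  # placeholder, not the real NED endpoint

def cone_search_adql(table, ra_deg, dec_deg, radius_deg, top=10):
    """ADQL selecting rows within a circle on the sky.

    Assumes the table exposes position columns named `ra` and `dec` (degrees).
    """
    return (
        f"SELECT TOP {top} * FROM {table} "
        f"WHERE CONTAINS(POINT('ICRS', ra, dec), "
        f"CIRCLE('ICRS', {ra_deg}, {dec_deg}, {radius_deg})) = 1"
    )

def build_sync_query_url(adql, base=TAP_BASE, fmt="votable"):
    """URL for a synchronous TAP query; urlencode escapes the ADQL text."""
    params = {"REQUEST": "doQuery", "LANG": "ADQL", "FORMAT": fmt, "QUERY": adql}
    return f"{base}/sync?{urlencode(params)}"

# Hypothetical table name; coordinates are roughly those of M31.
adql = cone_search_adql("hypothetical_objects", 10.68, 41.27, 0.1)
print(build_sync_query_url(adql))
```

Long-running queries would normally go to the asynchronous endpoint of the same protocol instead; the table and column names to use should be read from the service's own schema listing.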
[Integrated DNA barcoding database for identifying Chinese animal medicine].
Shi, Lin-Chun; Yao, Hui; Xie, Li-Fang; Zhu, Ying-Jie; Song, Jing-Yuan; Zhang, Hui; Chen, Shi-Lin
2014-06-01
In order to construct an integrated DNA barcoding database for identifying Chinese animal medicine, the authors and their collaborators have carried out extensive research on identifying Chinese animal medicines using DNA barcoding technology. Sequences from GenBank have been analyzed simultaneously. Three different methods, BLAST, barcoding gap and tree building, have been used to confirm the reliability of barcode records in the database. The integrated DNA barcoding database for identifying Chinese animal medicine has been constructed from three different parts: specimen, sequence and literature information. This database contains about 800 animal medicines together with their adulterants and closely related species. Unknown specimens can be identified by pasting their sequence record into the window on the ID page of the species identification system for traditional Chinese medicine (www.tcmbarcode.cn). The integrated DNA barcoding database for identifying Chinese animal medicine is of significant importance for animal species identification, rare and endangered species conservation and sustainable utilization of animal resources.
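Of the three checks named above, the barcoding gap is the simplest to illustrate: a marker shows a gap when the smallest distance between species exceeds the largest distance within any species. The sketch below assumes pre-aligned sequences of equal length and uses the uncorrected p-distance (the fraction of differing sites). The sequences and species labels are toy data, and the function names are hypothetical rather than taken from the database's software.

```python
from itertools import combinations

def p_distance(seq_a: str, seq_b: str) -> float:
    """Fraction of aligned positions at which two sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(a != b for a, b in zip(seq_a, seq_b)) / len(seq_a)

def barcoding_gap(records):
    """Summarize within- and between-species distances.

    records -- list of (species, aligned_sequence) pairs
    Returns (max_intraspecific, min_interspecific, gap_present).
    """
    intra, inter = [], []
    for (sp1, s1), (sp2, s2) in combinations(records, 2):
        (intra if sp1 == sp2 else inter).append(p_distance(s1, s2))
    max_intra = max(intra, default=0.0)
    min_inter = min(inter, default=float("inf"))
    return max_intra, min_inter, min_inter > max_intra

# Toy 10-bp "barcodes" for two species, two specimens each.
records = [
    ("species_A", "ACGTACGTAC"),
    ("species_A", "ACGTACGTAT"),
    ("species_B", "TCGAACGGAC"),
    ("species_B", "TCGAACGGAT"),
]
print(barcoding_gap(records))
```

Real barcode analyses typically use a substitution model such as K2P rather than the raw p-distance; the raw distance is used here only to keep the example short.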
ERIC Educational Resources Information Center
Bjerregaard, Kirstien; Haslam, S. Alexander; Morton, Thomas
2016-01-01
Worldwide, organizations are keen to ensure that they achieve a performance return from the large investment they make in employee training. This study examines the way in which workgroup identification facilitates trainees' motivation to transfer learning into workplace performance. A 2 × 2 longitudinal study evaluated the effects of a new…
Developing a national strategy to prevent dementia: Leon Thal Symposium 2009.
Khachaturian, Zaven S; Barnes, Deborah; Einstein, Richard; Johnson, Sterling; Lee, Virginia; Roses, Allen; Sager, Mark A; Shankle, William R; Snyder, Peter J; Petersen, Ronald C; Schellenberg, Gerard; Trojanowski, John; Aisen, Paul; Albert, Marilyn S; Breitner, John C S; Buckholtz, Neil; Carrillo, Maria; Ferris, Steven; Greenberg, Barry D; Grundman, Michael; Khachaturian, Ara S; Kuller, Lewis H; Lopez, Oscar L; Maruff, Paul; Mohs, Richard C; Morrison-Bogorad, Marcelle; Phelps, Creighton; Reiman, Eric; Sabbagh, Marwan; Sano, Mary; Schneider, Lon S; Siemers, Eric; Tariot, Pierre; Touchon, Jacques; Vellas, Bruno; Bain, Lisa J
2010-03-01
Among the major impediments to the design of clinical trials for the prevention of Alzheimer's disease (AD), the most critical is the lack of validated biomarkers, assessment tools, and algorithms that would facilitate identification of asymptomatic individuals with elevated risk who might be recruited as study volunteers. Thus, the Leon Thal Symposium 2009 (LTS'09), on October 27-28, 2009 in Las Vegas, Nevada, was convened to explore strategies to surmount the barriers in designing a multisite, comparative study to evaluate and validate various approaches for detecting and selecting asymptomatic people at risk for cognitive disorders/dementia. The deliberations of LTS'09 included presentations and reviews of different approaches (algorithms, biomarkers, or measures) for identifying asymptomatic individuals at elevated risk for AD who would be candidates for longitudinal or prevention studies. The key nested recommendations of LTS'09 included: (1) establishment of a National Database for Longitudinal Studies as a shared research core resource; (2) launch of a large collaborative study that will compare multiple screening approaches and biomarkers to determine the best method for identifying asymptomatic people at risk for AD; (3) initiation of a Global Database that extends the concept of the National Database for Longitudinal Studies for longitudinal studies beyond the United States; and (4) development of an educational campaign that will address public misconceptions about AD and promote healthy brain aging. 2010. Published by Elsevier Inc.
Proteomic analysis of the Theileria annulata schizont
Witschi, M.; Xia, D.; Sanderson, S.; Baumgartner, M.; Wastling, J.M.; Dobbelaere, D.A.E.
2013-01-01
The apicomplexan parasite, Theileria annulata, is the causative agent of tropical theileriosis, a devastating lymphoproliferative disease of cattle. The schizont stage transforms bovine leukocytes and provides an intriguing model to study host/pathogen interactions. The genome of T. annulata has been sequenced and transcriptomic data are rapidly accumulating. In contrast, little is known about the proteome of the schizont, the pathogenic, transforming life cycle stage of the parasite. Using one-dimensional (1-D) gel LC-MS/MS, a proteomic analysis of purified T. annulata schizonts was carried out. In whole parasite lysates, 645 proteins were identified. Proteins with transmembrane domains (TMDs) were under-represented and no proteins with more than four TMDs could be detected. To tackle this problem, Triton X-114 treatment was applied, which facilitates the extraction of membrane proteins, followed by 1-D gel LC-MS/MS. This resulted in the identification of an additional 153 proteins. Half of those had one or more TMD and 30 proteins with more than four TMDs were identified. This demonstrates that Triton X-114 treatment can provide a valuable additional tool for the identification of new membrane proteins in proteomic studies. With two exceptions, all proteins involved in glycolysis and the citric acid cycle were identified. For at least 29% of identified proteins, the corresponding transcripts were not present in the existing expressed sequence tag databases. The proteomics data were integrated into the publicly accessible database resource at EuPathDB (www.eupathdb.org) so that mass spectrometry-based protein expression evidence for T. annulata can be queried alongside transcriptional and other genomics data available for these parasites. PMID:23178997
Gasc, Cyrielle; Constantin, Antony; Jaziri, Faouzi; Peyret, Pierre
2017-01-01
The detection and identification of bacterial pathogens involved in acts of bio- and agroterrorism are essential to avoid pathogen dispersal in the environment and propagation within the population. Conventional molecular methods, such as PCR amplification, DNA microarrays or shotgun sequencing, are subject to various limitations when assessing environmental samples, which can lead to inaccurate findings. We developed a hybridization capture strategy that uses a set of oligonucleotide probes to target and enrich biomarkers of interest in environmental samples. Here, we present Oligonucleotide Capture Probes for Pathogen Identification Database (OCaPPI-Db), an online capture probe database containing a set of 1,685 oligonucleotide probes allowing for the detection and identification of 30 biothreat agents up to the species level. This probe set can be used in its entirety as a comprehensive diagnostic tool or can be restricted to a set of probes targeting a specific pathogen or virulence factor according to the user's needs. Database URL: http://ocappidb.uca.works. © The Author(s) 2017. Published by Oxford University Press.
Bai, Haihong; Pan, Yiting; Guo, Cong; Zhao, Xinyuan; Shen, Bingquan; Wang, Xinghe; Liu, Zeyuan; Cheng, Yuanguo; Qin, Weijie; Qian, Xiaohong
2017-08-15
Protein N-glycosylation is one of the most important post-translational modifications, participating in many key biological and pathological processes. Large-scale and precise identification of N-glycosylated proteins and peptides is especially beneficial for understanding their biological functions and for discovery of new clinical biomarkers and therapeutic drug targets. However, protein N-glycosylation is microheterogeneous and low abundant in living organisms, therefore specific enrichment of N-glycosylated proteins/peptides before mass spectrometry analysis is a prerequisite. In this work, we developed a new type of polymer hybrid graphene oxide (GO) by in situ growth of hydrazide-functionalized hydrophilic polymer chains on the GO surface (GO-PAAH) for selective N-glycopeptide enrichment and identification by mass spectrometry. The densely attached and low steric hindrance hydrazide groups as well as the highly hydrophilic nature of GO-PAAH facilitate N-glycopeptide enrichment by the combination of hydrazide capturing and HILIC interaction. Taking advantage of the unique features of GO-PAAH, all of the three N-glycopeptides of bovine fetuin were successfully enriched and identified with significantly enhanced signal intensities from a digest mixture of bovine fetuin and bovine serum albumin at a mass ratio of 1:100, demonstrating the excellent enrichment selectivity of GO-PAAH. Furthermore, a total of 507 N-glycosylation sites and 480 N-glycopeptides in 232 N-glycoproteins were enriched and identified from 10μL of human serum by three replicates using this novel enrichment material, which is nearly two times higher than the commercial hydrazide resin based method (280 N-glycosylation sites, 261 N-glycopeptides and 144 N-glycoproteins in three experiments). 
Among the identified, 95 N-glycosylation sites were not reported in the Uniprot database, and 106 N-glycoproteins were disease related in the Nextprot database, indicating the potential of this new enrichment material in global mapping of protein N-glycosylation. Copyright © 2017 Elsevier B.V. All rights reserved.
Li, Guotian; Jain, Rashmi; Chern, Mawsheng; Pham, Nikki T; Martin, Joel A; Wei, Tong; Schackwitz, Wendy S; Lipzen, Anna M; Duong, Phat Q; Jones, Kyle C; Jiang, Liangrong; Ruan, Deling; Bauer, Diane; Peng, Yi; Barry, Kerrie W; Schmutz, Jeremy; Ronald, Pamela C
2017-06-01
The availability of a whole-genome sequenced mutant population and the cataloging of mutations of each line at a single-nucleotide resolution facilitate functional genomic analysis. To this end, we generated and sequenced a fast-neutron-induced mutant population in the model rice cultivar Kitaake (Oryza sativa ssp japonica), which completes its life cycle in 9 weeks. We sequenced 1504 mutant lines at 45-fold coverage and identified 91,513 mutations affecting 32,307 genes, i.e., 58% of all rice genes. We detected an average of 61 mutations per line. Mutation types include single-base substitutions, deletions, insertions, inversions, translocations, and tandem duplications. We observed a high proportion of loss-of-function mutations. We identified an inversion affecting a single gene as the causative mutation for the short-grain phenotype in one mutant line. This result reveals the usefulness of the resource for efficient, cost-effective identification of genes conferring specific phenotypes. To facilitate public access to this genetic resource, we established an open access database called KitBase that provides access to sequence data and seed stocks. This population complements other available mutant collections and gene-editing technologies. This work demonstrates how inexpensive next-generation sequencing can be applied to generate a high-density catalog of mutations. © 2017 American Society of Plant Biologists. All rights reserved.
Li, Guotian; Jain, Rashmi; Chern, Mawsheng; ...
2017-06-02
The availability of a whole-genome sequenced mutant population and the cataloging of mutations of each line at a single-nucleotide resolution facilitate functional genomic analysis. To this end, we generated and sequenced a fast-neutron-induced mutant population in the model rice cultivar Kitaake (Oryza sativa ssp japonica), which completes its life cycle in 9 weeks. We sequenced 1504 mutant lines at 45-fold coverage and identified 91,513 mutations affecting 32,307 genes, i.e., 58% of all rice genes. We detected an average of 61 mutations per line. Mutation types include single-base substitutions, deletions, insertions, inversions, translocations, and tandem duplications. We observed a high proportion of loss-of-function mutations. We identified an inversion affecting a single gene as the causative mutation for the short-grain phenotype in one mutant line. This result reveals the usefulness of the resource for efficient, cost-effective identification of genes conferring specific phenotypes. To facilitate public access to this genetic resource, we established an open access database called KitBase that provides access to sequence data and seed stocks. This population complements other available mutant collections and gene-editing technologies. In conclusion, this work demonstrates how inexpensive next-generation sequencing can be applied to generate a high-density catalog of mutations.
Costa, Fabrizio; Alba, Rob; Schouten, Henk; Soglio, Valeria; Gianfranceschi, Luca; Serra, Sara; Musacchi, Stefano; Sansavini, Silviero; Costa, Guglielmo; Fei, Zhangjun; Giovannoni, James
2010-10-25
Fruit development, maturation and ripening consist of a complex series of biochemical and physiological changes that in climacteric fruits, including apple and tomato, are coordinated by the gaseous hormone ethylene. These changes lead to final fruit quality and understanding of the functional machinery underlying these processes is of both biological and practical importance. To date many reports have been made on the analysis of gene expression in apple. In this study we focused our investigation on the role of ethylene during apple maturation, specifically comparing transcriptomics of normal ripening with changes resulting from application of the hormone receptor competitor 1-methylcyclopropene. To gain insight into the molecular process regulating ripening in apple, and to compare to tomato (model species for ripening studies), we utilized both homologous and heterologous (tomato) microarray to profile transcriptome dynamics of genes involved in fruit development and ripening, emphasizing those which are ethylene regulated. The use of both types of microarrays facilitated transcriptome comparison between apple and tomato (for the latter using data previously published and available at the TED: tomato expression database) and highlighted genes conserved during ripening of both species, which in turn represent a foundation for further comparative genomic studies. The cross-species analysis had the secondary aim of examining the efficiency of heterologous (specifically tomato) microarray hybridization for candidate gene identification as related to the ripening process. The resulting transcriptomics data revealed coordinated gene expression during fruit ripening of a subset of ripening-related and ethylene responsive genes, further facilitating the analysis of ethylene response during fruit maturation and ripening.
Our combined strategy based on microarray hybridization enabled transcriptome characterization during normal climacteric apple ripening, as well as definition of ethylene-dependent transcriptome changes. Comparison with tomato fruit maturation and ethylene responsive transcriptome activity facilitated identification of putative conserved orthologous ripening-related genes, which serve as an initial set of candidates for assessing conservation of gene activity across genomes of fruit bearing plant species.
Use of big data in drug development for precision medicine
Kim, Rosa S.; Goossens, Nicolas; Hoshida, Yujin
2016-01-01
Summary Drug development has been a costly and lengthy process with an extremely low success rate and lack of consideration of individual diversity in drug response and toxicity. Over the past decade, an alternative “big data” approach has been expanding at an unprecedented pace based on the development of electronic databases of chemical substances, disease gene/protein targets, functional readouts, and clinical information covering inter-individual genetic variations and toxicities. This paradigm shift has enabled systematic, high-throughput, and accelerated identification of novel drugs or repurposed indications of existing drugs for pathogenic molecular aberrations specifically present in each individual patient. The exploding interest from the information technology and direct-to-consumer genetic testing industries has been further facilitating the use of big data to achieve personalized Precision Medicine. Here we overview currently available resources and discuss future prospects. PMID:27430024
Hierarchical content-based image retrieval by dynamic indexing and guided search
NASA Astrophysics Data System (ADS)
You, Jane; Cheung, King H.; Liu, James; Guo, Linong
2003-12-01
This paper presents a new approach to content-based image retrieval by using dynamic indexing and guided search in a hierarchical structure, and extending data mining and data warehousing techniques. The proposed algorithms include: a wavelet-based scheme for multiple image feature extraction, the extension of a conventional data warehouse and an image database to an image data warehouse for dynamic image indexing, an image data schema for hierarchical image representation and dynamic image indexing, a statistically based feature selection scheme to achieve flexible similarity measures, and a feature component code to facilitate query processing and guide the search for the best matching. A series of case studies are reported, which include a wavelet-based image color hierarchy, classification of satellite images, tropical cyclone pattern recognition, and personal identification using multi-level palmprint and face features.
Galisson, Frederic; Mahrouche, Louiza; Courcelles, Mathieu; Bonneil, Eric; Meloche, Sylvain; Chelbi-Alix, Mounira K.; Thibault, Pierre
2011-01-01
The small ubiquitin-related modifier (SUMO) is a small group of proteins that are reversibly attached to protein substrates to modify their functions. The large scale identification of protein SUMOylation and their modification sites in mammalian cells represents a significant challenge because of the relatively small number of in vivo substrates and the dynamic nature of this modification. We report here a novel proteomics approach to selectively enrich and identify SUMO conjugates from human cells. We stably expressed different SUMO paralogs in HEK293 cells, each containing a His6 tag and a strategically located tryptic cleavage site at the C terminus to facilitate the recovery and identification of SUMOylated peptides by affinity enrichment and mass spectrometry. Tryptic peptides with short SUMO remnants offer significant advantages in large scale SUMOylome experiments including the generation of paralog-specific fragment ions following CID and ETD activation, and the identification of modified peptides using conventional database search engines such as Mascot. We identified 205 unique protein substrates together with 17 precise SUMOylation sites present in 12 SUMO protein conjugates including three new sites (Lys-380, Lys-400, and Lys-497) on the protein promyelocytic leukemia. Label-free quantitative proteomics analyses on purified nuclear extracts from untreated and arsenic trioxide-treated cells revealed that all identified SUMOylated sites of promyelocytic leukemia were differentially SUMOylated upon stimulation. PMID:21098080
Seiwert, Bettina; Golan-Rozen, Naama; Weidauer, Cindy; Riemenschneider, Christina; Chefetz, Benny; Hadar, Yitzhak; Reemtsma, Thorsten
2015-10-20
Transformation products (TPs) of environmental pollutants must be identified to understand biodegradation processes and reaction mechanisms and to assess the efficiency of treatment processes. The combination of oxidation by an electrochemical cell (EC) with analysis by liquid chromatography-high-resolution mass spectrometry (LC-HRMS) is a rapid approach for the determination and identification of TPs generated by natural microbial processes. Electrochemically generated TPs of the recalcitrant pharmaceutical carbamazepine (CBZ) were used for a target screening for TPs formed by the white-rot fungus Pleurotus ostreatus. EC with LC-HRMS facilitates detection and identification of TPs because the product spectrum is not superimposed with biogenic metabolites and elevated substrate concentrations can be used. A group of 10 TPs formed in the microbial process were detected by target screening for molecular ions, and another 4 were detected by screening on the basis of characteristic fragment ions. Three of these TPs have never been reported before. For CBZ, EC with LC-HRMS was found to be more effective than software tools in defining targets for the screening and faster than nontarget screening alone in TP identification. EC with LC-HRMS may be used to feed MS databases with spectra of possible TPs of larger numbers of environmental contaminants for an efficient target screening.
de la Bastide, Paul Y; Leung, Wai Lam; Hintz, William E
2015-01-01
The ITS region of the rDNA gene was compared for Saprolegnia spp. in order to improve our understanding of nucleotide sequence variability within and between species of this genus, determine species composition in Canadian fin fish aquaculture facilities, and to assess the utility of ITS sequence variability in genetic marker development. From a collection of more than 400 field isolates, ITS region nucleotide sequences were studied and it was determined that there was sufficient consistent inter-specific variation to support the designation of species identity based on ITS sequence data. This non-subjective approach to species identification does not rely upon transient morphological features. Phylogenetic analyses comparing our ITS sequences and species designations with data from previous studies generally supported the clade scheme of Diéguez-Uribeondo et al. (2007) and found agreement with the molecular taxonomic cluster system of Sandoval-Sierra et al. (2014). Our Canadian ITS sequence collection will thus contribute to the public database and assist the clarification of Saprolegnia spp. taxonomy. The analysis of ITS region sequence variability facilitated genus- and species-level identification of unknown samples from aquaculture facilities and provided useful information on species composition. A unique ITS-RFLP for the identification of S. parasitica was also described. Copyright © 2014 The British Mycological Society. Published by Elsevier Ltd. All rights reserved.
Heinrich, Andreas; Güttler, Felix; Wendt, Sebastian; Schenkl, Sebastian; Hubig, Michael; Wagner, Rebecca; Mall, Gita; Teichgräber, Ulf
2018-06-18
In forensic odontology the comparison between antemortem and postmortem panoramic radiographs (PRs) is a reliable method for person identification. The purpose of this study was to improve and automate identification of unknown people by comparison between antemortem and postmortem PR using computer vision. The study includes 43 467 PRs from 24 545 patients (46 % females/54 % males). All PRs were filtered and evaluated with Matlab R2014b including the Image Processing and Computer Vision System toolboxes. The matching process used the SURF feature to find the corresponding points between two PRs (unknown person and database entry) out of the whole database. From 40 randomly selected persons, 34 persons (85 %) could be reliably identified by corresponding PR matching points between an already existing scan in the database and the most recent PR. The systematic matching yielded a maximum of 259 points for a successful identification between two different PRs of the same person and a maximum of 12 corresponding matching points for other non-identical persons in the database. Hence 12 matching points are the threshold for reliable assignment. Operating with an automatic PR system and computer vision could be a successful and reliable tool for identification purposes. The applied method distinguishes itself by virtue of its fast and reliable identification of persons by PR. This identification method is suitable even if dental characteristics were removed or added in the past. The system seems to be robust for large amounts of data.
· Computer vision allows an automated antemortem and postmortem comparison of panoramic radiographs (PRs) for person identification.
· The present method is able to find identical matching partners among huge datasets (big data) in a short computing time.
· The identification method is suitable even if dental characteristics were removed or added.
Heinrich A, Güttler F, Wendt S et al. Forensic Odontology: Automatic Identification of Persons Comparing Antemortem and Postmortem Panoramic Radiographs Using Computer Vision. Fortschr Röntgenstr 2018; DOI: 10.1055/a-0632-4744. © Georg Thieme Verlag KG Stuttgart · New York.
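The decision rule in the forensic odontology abstract above (non-identical persons produced at most 12 corresponding points, so 12 is the threshold) can be illustrated with a minimal Python sketch. It does not reproduce the Matlab SURF matching itself. It assumes the per-entry match counts are already available, assumes that "reliable" means strictly more than 12 points, and uses hypothetical entry IDs and counts.

```python
from typing import Dict, Optional, Tuple

# Threshold from the abstract: non-identical persons yielded at most 12
# corresponding points. Requiring strictly more than 12 is an assumption.
MATCH_THRESHOLD = 12


def identify(match_counts: Dict[str, int],
             threshold: int = MATCH_THRESHOLD) -> Optional[Tuple[str, int]]:
    """Return (entry_id, count) for the best database entry, or None.

    match_counts maps a hypothetical database entry ID to the number of
    corresponding feature points found against the query radiograph.
    """
    if not match_counts:
        return None
    # Pick the entry with the most corresponding points.
    best_id = max(match_counts, key=match_counts.get)
    best_count = match_counts[best_id]
    # Accept the match only if it clears the threshold.
    return (best_id, best_count) if best_count > threshold else None


# Hypothetical counts, for illustration only.
example = {"PR-0001": 7, "PR-0002": 259, "PR-0003": 12}
result = identify(example)
print(result)  # ('PR-0002', 259)
```

Returning None when no entry clears the threshold keeps "no reliable match" distinct from a weak best guess, which matters when the unknown person is not in the database at all.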
21 CFR 830.310 - Information required for unique device identification.
Code of Federal Regulations, 2014 CFR
2014-04-01
... 21 Food and Drugs 8 2014-04-01 2014-04-01 false Information required for unique device identification. 830.310 Section 830.310 Food and Drugs FOOD AND DRUG ADMINISTRATION, DEPARTMENT OF HEALTH AND... Identification Database § 830.310 Information required for unique device identification. The contact for device...
Preparing a collection of radiology examinations for distribution and retrieval.
Demner-Fushman, Dina; Kohli, Marc D; Rosenman, Marc B; Shooshan, Sonya E; Rodriguez, Laritza; Antani, Sameer; Thoma, George R; McDonald, Clement J
2016-03-01
Clinical documents made available for secondary use play an increasingly important role in discovery of clinical knowledge, development of research methods, and education. An important step in facilitating secondary use of clinical document collections is easy access to descriptions and samples that represent the content of the collections. This paper presents an approach to developing a collection of radiology examinations, including both the images and radiologist narrative reports, and making them publicly available in a searchable database. The authors collected 3996 radiology reports from the Indiana Network for Patient Care and 8121 associated images from the hospitals' picture archiving systems. The images and reports were de-identified automatically and then the automatic de-identification was manually verified. The authors coded the key findings of the reports and empirically assessed the benefits of manual coding on retrieval. The automatic de-identification of the narrative was aggressive and achieved 100% precision at the cost of rendering a few findings uninterpretable. Automatic de-identification of images was not quite as perfect. Images for two of 3996 patients (0.05%) showed protected health information. Manual encoding of findings improved retrieval precision. Stringent de-identification methods can remove all identifiers from text radiology reports. DICOM de-identification of images does not remove all identifying information and needs special attention to images scanned from film. Adding manual coding to the radiologist narrative reports significantly improved relevancy of the retrieved clinical documents. The de-identified Indiana chest X-ray collection is available for searching and downloading from the National Library of Medicine (http://openi.nlm.nih.gov/). Published by Oxford University Press on behalf of the American Medical Informatics Association 2015. This work is written by US Government employees and is in the public domain in the US.
HMDB 4.0: the human metabolome database for 2018
Feunang, Yannick Djoumbou; Marcu, Ana; Guo, An Chi; Liang, Kevin; Vázquez-Fresno, Rosa; Sajed, Tanvir; Johnson, Daniel; Li, Carin; Karu, Naama; Sayeeda, Zinat; Lo, Elvis; Assempour, Nazanin; Berjanskii, Mark; Singhal, Sandeep; Arndt, David; Liang, Yonjie; Badran, Hasan; Grant, Jason; Serra-Cayuela, Arnau; Liu, Yifeng; Mandal, Rupa; Neveu, Vanessa; Pon, Allison; Knox, Craig; Wilson, Michael; Manach, Claudine; Scalbert, Augustin
2018-01-01
The Human Metabolome Database or HMDB (www.hmdb.ca) is a web-enabled metabolomic database containing comprehensive information about human metabolites along with their biological roles, physiological concentrations, disease associations, chemical reactions, metabolic pathways, and reference spectra. First described in 2007, the HMDB is now considered the standard metabolomic resource for human metabolic studies. Over the past decade the HMDB has continued to grow and evolve in response to emerging needs for metabolomics researchers and continuing changes in web standards. This year's update, HMDB 4.0, represents the most significant upgrade to the database in its history. For instance, the number of fully annotated metabolites has increased by nearly threefold, the number of experimental spectra has grown by almost fourfold and the number of illustrated metabolic pathways has grown by a factor of almost 60. Significant improvements have also been made to the HMDB’s chemical taxonomy, chemical ontology, spectral viewing, and spectral/text searching tools. A great deal of brand new data has also been added to HMDB 4.0. This includes large quantities of predicted MS/MS and GC–MS reference spectral data as well as predicted (physiologically feasible) metabolite structures to facilitate novel metabolite identification. Additional information on metabolite-SNP interactions and the influence of drugs on metabolite levels (pharmacometabolomics) has also been added. Many other important improvements in the content, the interface, and the performance of the HMDB website have been made and these should greatly enhance its ease of use and its potential applications in nutrition, biochemistry, clinical chemistry, clinical genetics, medicine, and metabolomics science. PMID:29140435
Visibiome: an efficient microbiome search engine based on a scalable, distributed architecture.
Azman, Syafiq Kamarul; Anwar, Muhammad Zohaib; Henschel, Andreas
2017-07-24
Given the current influx of 16S rRNA profiles of microbiota samples, it is conceivable that large numbers of them will eventually be available for search, comparison and contextualization with respect to novel samples. This process facilitates the identification of similar compositional features in microbiota elsewhere and therefore can help to understand driving factors for microbial community assembly. We present Visibiome, a microbiome search engine that can perform exhaustive, phylogeny-based similarity search and contextualization of user-provided samples against a comprehensive dataset of 16S rRNA profiles from diverse environments, while tackling several computational challenges. In order to scale to high demands, we developed a distributed system that combines web framework technology, task queueing and scheduling, cloud computing and a dedicated database server. To further ensure speed and efficiency, we have deployed Nearest Neighbor search algorithms, capable of sublinear searches in high-dimensional metric spaces, in combination with an optimized Earth Mover Distance based implementation of weighted UniFrac. The search also incorporates pairwise (adaptive) rarefaction and, optionally, 16S rRNA copy number correction. The result of a query microbiome sample is the contextualization against a comprehensive database of microbiome samples from a diverse range of environments, visualized through a rich set of interactive figures and diagrams, including barchart-based compositional comparisons and ranking of the closest matches in the database. Visibiome is a convenient, scalable and efficient framework to search microbiomes against a comprehensive database of environmental samples. The search engine leverages a popular but computationally expensive, phylogeny-based distance metric, while providing numerous advantages over the current state-of-the-art tool.
Sakurai, Tetsuya; Kondou, Youichi; Akiyama, Kenji; Kurotani, Atsushi; Higuchi, Mieko; Ichikawa, Takanari; Kuroda, Hirofumi; Kusano, Miyako; Mori, Masaki; Saitou, Tsutomu; Sakakibara, Hitoshi; Sugano, Shoji; Suzuki, Makoto; Takahashi, Hideki; Takahashi, Shinya; Takatsuji, Hiroshi; Yokotani, Naoki; Yoshizumi, Takeshi; Saito, Kazuki; Shinozaki, Kazuo; Oda, Kenji; Hirochika, Hirohiko; Matsui, Minami
2011-01-01
Identification of gene function is important not only for basic research but also for applied science, especially with regard to improvements in crop production. For rapid and efficient elucidation of useful traits, we developed a system named FOX hunting (Full-length cDNA Over-eXpressor gene hunting) using full-length cDNAs (fl-cDNAs). A heterologous expression approach provides a solution for the high-throughput characterization of gene functions in agricultural plant species. Since fl-cDNAs contain all the information of functional mRNAs and proteins, we introduced rice fl-cDNAs into Arabidopsis plants for systematic gain-of-function mutation. We generated >30,000 independent Arabidopsis transgenic lines expressing rice fl-cDNAs (rice FOX Arabidopsis mutant lines). These rice FOX Arabidopsis lines were screened systematically for various criteria such as morphology, photosynthesis, UV resistance, element composition, plant hormone profile, metabolite profile/fingerprinting, bacterial resistance, and heat and salt tolerance. The information obtained from these screenings was compiled into a database named ‘RiceFOX’. This database contains around 18,000 records of rice FOX Arabidopsis lines and allows users to search against all the observed results, ranging from morphological to invisible traits. The number of searchable items is approximately 100; moreover, the rice FOX Arabidopsis lines can be searched by rice and Arabidopsis gene/protein identifiers, sequence similarity to the introduced rice fl-cDNA and traits. The RiceFOX database is available at http://ricefox.psc.riken.jp/. PMID:21186176
What Is New in Clinical Microbiology—Microbial Identification by MALDI-TOF Mass Spectrometry
Murray, Patrick R.
2012-01-01
Matrix-assisted laser desorption/ionization time-of-flight (MALDI-TOF) mass spectrometry (MS) offers the possibility of accurate, rapid, inexpensive identification of bacteria, fungi, and mycobacteria isolated in clinical microbiology laboratories. The procedures for preanalytic processing of organisms and analysis by MALDI-TOF MS are technically simple and reproducible, and commercial databases and interpretive algorithms are available for the identification of a wide spectrum of clinically significant organisms. Although only limited work has been reported on the use of this technique to identify molds, perform strain typing, or determine antibiotic susceptibility results, these are fruitful areas of promising research. As experience is gained with MALDI-TOF MS, it is expected that the databases will be expanded to resolve many of the current inadequate identifications (eg, no identification, genus-level identification) and algorithms for potential misidentification will be developed. The current lack of Food and Drug Administration approval of any MALDI-TOF MS system for organism identification limits widespread use in the United States. PMID:22795961
NASA Astrophysics Data System (ADS)
Rodríguez-Rodríguez, Cristina; Rimola, Albert; Alí-Torres, Jorge; Sodupe, Mariona; González-Duarte, Pilar
2011-01-01
The development of new strategies to find commercial molecules with promising biochemical features is a main target in the field of biomedicine chemistry. In this work we present an in silico-based protocol that allows the identification of commercial compounds with suitable metal-coordinating and pharmacokinetic properties to act as metal-ion chelators in metal-promoted neurodegenerative diseases (MpND). Selection of the chelating ligands is done by combining quantum chemical calculations with the search of commercial compounds on different databases via virtual screening. Starting from different designed molecular frameworks, which mainly constitute the binding site, the virtual screening on databases facilitates the identification of different commercial molecules that enclose such scaffolds and, by imposing a set of chemical and pharmacokinetic filters, obey some drug-like requirements mandatory to deal with MpND. The quantum mechanical calculations are useful to gauge the chelating properties of the selected candidate molecules by determining the structure of metal complexes and evaluating their stability constants. With the proposed strategy, commercial compounds containing N and S donor atoms in the binding sites and capable of crossing the blood-brain barrier (BBB) have been identified and their chelating properties analyzed.
Target-Pathogen: a structural bioinformatic approach to prioritize drug targets in pathogens.
Sosa, Ezequiel J; Burguener, Germán; Lanzarotti, Esteban; Defelipe, Lucas; Radusky, Leandro; Pardo, Agustín M; Marti, Marcelo; Turjanski, Adrián G; Fernández Do Porto, Darío
2018-01-04
Available genomic data for pathogens has created new opportunities for drug discovery and development to fight them, including new resistant and multiresistant strains. In particular, structural data must be integrated with both gene information and experimental results. In this sense, there is a lack of an online resource that allows genome-wide data consolidation from diverse sources together with thorough bioinformatic analysis that allows easy filtering and scoring for fast target selection for drug discovery. Here, we present Target-Pathogen database (http://target.sbg.qb.fcen.uba.ar/patho), designed and developed as an online resource that allows the integration and weighting of protein information such as: function, metabolic role, off-targeting, structural properties including druggability, essentiality and omic experiments, to facilitate the identification and prioritization of candidate drug targets in pathogens. We include in the database 10 genomes of some of the most relevant microorganisms for human health (Mycobacterium tuberculosis, Mycobacterium leprae, Klebsiella pneumoniae, Plasmodium vivax, Toxoplasma gondii, Leishmania major, Wolbachia bancrofti, Trypanosoma brucei, Shigella dysenteriae and Schistosoma mansoni) and show its applicability. New genomes can be uploaded upon request. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.
Borries, Carola; Sandel, Aaron A; Koenig, Andreas; Fernandez-Duque, Eduardo; Kamilar, Jason M; Amoroso, Caroline R; Barton, Robert A; Bray, Joel; Di Fiore, Anthony; Gilby, Ian C; Gordon, Adam D; Mundry, Roger; Port, Markus; Powell, Lauren E; Pusey, Anne E; Spriggs, Amanda; Nunn, Charles L
2016-09-01
Recent decades have seen rapid development of new analytical methods to investigate patterns of interspecific variation. Yet these cutting-edge statistical analyses often rely on data of questionable origin, varying accuracy, and weak comparability, which seem to have reduced the reproducibility of studies. It is time to improve the transparency of comparative data while also making these improved data more widely available. We, the authors, met to discuss how transparency, usability, and reproducibility of comparative data can best be achieved. We propose four guiding principles: 1) data identification with explicit operational definitions and complete descriptions of methods; 2) inclusion of metadata that capture key characteristics of the data, such as sample size, geographic coordinates, and nutrient availability (for example, captive versus wild animals); 3) documentation of the original reference for each datum; and 4) facilitation of effective interactions with the data via user friendly and transparent interfaces. We urge reviewers, editors, publishers, database developers and users, funding agencies, researchers publishing their primary data, and those performing comparative analyses to embrace these standards to increase the transparency, usability, and reproducibility of comparative studies. © 2016 Wiley Periodicals, Inc.
Federal Register 2010, 2011, 2012, 2013, 2014
2013-05-16
... Excluded Parties Listing System (EPLS) databases into the System for Award Management (SAM) database. DATES... combined the functional capabilities of the CCR, ORCA, and EPLS procurement systems into the SAM database... identification number and the type of organization from the System for Award Management database. 0 3. Revise the...
Sleiman, Sue; Halliday, Catriona L; Chapman, Belinda; Brown, Mitchell; Nitschke, Joanne; Lau, Anna F; Chen, Sharon C-A
2016-08-01
We developed an Australian database for the identification of Aspergillus, Scedosporium, and Fusarium species (n = 28) by matrix-assisted laser desorption ionization-time of flight mass spectrometry (MALDI-TOF MS). In a challenge against 117 isolates, species identification significantly improved when the in-house-built database was combined with the Bruker Filamentous Fungi Library compared with that for the Bruker library alone (Aspergillus, 93% versus 69%; Fusarium, 84% versus 42%; and Scedosporium, 94% versus 18%, respectively). Copyright © 2016, American Society for Microbiology. All Rights Reserved.
McMullen, Allison R; Wallace, Meghan A; Pincus, David H; Wilkey, Kathy; Burnham, C A
2016-08-01
Invasive fungal infections have a high rate of morbidity and mortality, and accurate identification is necessary to guide appropriate antifungal therapy. With the increasing incidence of invasive disease attributed to filamentous fungi, rapid and accurate species-level identification of these pathogens is necessary. Traditional methods for identification of filamentous fungi can be slow and may lack resolution. Matrix-assisted laser desorption ionization-time of flight mass spectrometry (MALDI-TOF MS) has emerged as a rapid and accurate method for identification of bacteria and yeasts, but a paucity of data exists on the performance characteristics of this method for identification of filamentous fungi. The objective of our study was to evaluate the accuracy of the Vitek MS for mold identification. A total of 319 mold isolates representing 43 genera recovered from clinical specimens were evaluated. Of these isolates, 213 (66.8%) were correctly identified using the Vitek MS Knowledge Base, version 3.0 database. When a modified SARAMIS (Spectral Archive and Microbial Identification System) database was used to augment the version 3.0 Knowledge Base, 245 (76.8%) isolates were correctly identified. Unidentified isolates were subcultured for repeat testing; 71/319 (22.3%) remained unidentified. Of the unidentified isolates, 69 were not in the database. Only 3 (0.9%) isolates were misidentified by MALDI-TOF MS (including Aspergillus amoenus [n = 2] and Aspergillus calidoustus [n = 1]) although 10 (3.1%) of the original phenotypic identifications were not correct. In addition, this methodology was able to accurately identify 133/144 (93.6%) Aspergillus sp. isolates to the species level. MALDI-TOF MS has the potential to expedite mold identification, and misidentifications are rare. Copyright © 2016, American Society for Microbiology. All Rights Reserved.
Sugawara, Ryota; Yamada, Sayumi; Tu, Zhihao; Sugawara, Akiko; Suzuki, Kousuke; Hoshiba, Toshihiro; Eisaka, Sadao; Yamaguchi, Akihiro
2016-08-31
Mushrooms are a favourite natural food in many countries. However, some wild species cause food poisoning, sometimes lethal, due to misidentification caused by confusing fruiting bodies similar to those of edible species. The morphological inspection of mycelia, spores and fruiting bodies has traditionally been used for the identification of mushrooms. More recently, DNA sequencing analysis has been successfully applied to mushrooms and to many other species. This study focuses on a simpler and more rapid methodology for the identification of wild mushrooms via protein profiling based on matrix-assisted laser desorption/ionization time-of-flight mass spectrometry (MALDI-TOF MS). A preliminary study using 6 commercially available cultivated mushrooms suggested that a more reproducible spectrum was obtained from a portion of the cap than from the stem of a fruiting body by the extraction of proteins with a formic acid-acetonitrile mixture (1 + 1). We used 157 wild mushroom-fruiting bodies collected in the centre of Hokkaido from June to November 2014. Sequencing analysis of a portion of the ribosomal RNA gene provided 134 identifications of mushrooms by genus or species; however, 23 samples remained unknown: 10 unknown species that had a lower concordance rate of the nucleotide sequences in a BLAST search (less than 97%) and 13 samples that had unidentifiable poor or mixed sequencing signals. MALDI-TOF MS analysis yielded a reproducible spectrum (frequency of matching score ≥ 2.0 was ≥6 spectra from 12 spectra measurements) for 114 of 157 samples. Profiling scores that matched each other within the database gave correct species identification (with scores of ≥2.0) for 110 samples (96%). An in-house prepared database was constructed from 106 independent species, except for overlapping identifications. We used 48 wild mushrooms that were collected in autumn 2015 to validate the in-house database. As a result, 21 mushrooms were identified at the species level with scores ≥2.0 and 5 mushrooms at the genus level with scores ≥1.7, although the signals of 2 mushrooms were insufficient for analysis. The remaining 20 samples were recognized as "unreliable identification" with scores <1.7. Subsequent DNA analysis confirmed that the correct species or genus identifications were achieved by MALDI-TOF MS for the 26 former samples, whereas the 18 mushrooms with poorly matched scores were species that were not included in the database. Thus, the proposed MALDI-TOF MS coupled with our database could be a powerful tool for the rapid and reliable identification of mushrooms; however, continuous updating of the database is necessary to enrich it with more abundant species. Copyright © 2016 Elsevier B.V. All rights reserved.
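The score cut-offs quoted in the mushroom abstract above (species level at ≥2.0, genus level at ≥1.7, "unreliable identification" below 1.7) amount to a simple three-way classification. The following Python sketch shows that rule. The function name and the example scores are hypothetical, and the cut-offs are taken only from the abstract, not from any vendor documentation.

```python
from collections import Counter

# Cut-offs as quoted in the abstract.
SPECIES_CUTOFF = 2.0
GENUS_CUTOFF = 1.7


def interpret_score(score: float) -> str:
    """Map a MALDI-TOF MS matching score to an identification level."""
    if score >= SPECIES_CUTOFF:
        return "species"
    if score >= GENUS_CUTOFF:
        return "genus"
    return "unreliable"


# Hypothetical matching scores, for illustration only.
scores = [2.31, 2.05, 1.85, 1.69, 1.20]
tally = Counter(interpret_score(s) for s in scores)
print(dict(tally))
```

Tallying the levels over a validation set gives the kind of breakdown the abstract reports (for example, 21 species-level, 5 genus-level and 20 unreliable results among the 2015 samples).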
Using the Proteomics Identifications Database (PRIDE).
Martens, Lennart; Jones, Phil; Côté, Richard
2008-03-01
The Proteomics Identifications Database (PRIDE) is a public data repository designed to store, disseminate, and analyze mass spectrometry based proteomics datasets. The PRIDE database can accommodate any level of detailed metadata about the submitted results, which can be queried, explored, viewed, or downloaded via the PRIDE Web interface. The PRIDE database also provides a simple, yet powerful, access control mechanism that fully supports confidential peer-reviewing of data related to a manuscript, ensuring that these results remain invisible to the general public while allowing referees and journal editors anonymized access to the data. This unit describes in detail the functionality that PRIDE provides with regards to searching, viewing, and comparing the available data, as well as different options for submitting data to PRIDE.
Vidal-Acuña, M Reyes; Ruiz-Pérez de Pipaón, Maite; Torres-Sánchez, María José; Aznar, Javier
2017-12-08
An expanded library of matrix-assisted laser desorption ionization time-of-flight mass spectrometry (MALDI-TOF MS) has been constructed using the spectra generated from 42 clinical isolates and 11 reference strains, including 23 different species from 8 sections (16 cryptic plus 7 noncryptic species). Out of a total of 379 strains of Aspergillus isolated from clinical samples, 179 strains were selected to be identified by sequencing of beta-tubulin or calmodulin genes. Protein spectra of 53 strains, cultured in liquid medium, were used to construct an in-house reference database in the MALDI-TOF MS. One hundred ninety strains (179 clinical isolates previously identified by sequencing and the 11 reference strains), cultured on solid medium, were blindly analyzed by the MALDI-TOF MS technology to validate the generated in-house reference database. A 100% correlation was obtained with both identification methods, gene sequencing and MALDI-TOF MS, and no discordant identification was obtained. The HUVR database provided species level (score of ≥2.0) identification in 165 isolates (86.84%) and for the remaining 25 (13.16%) a genus level identification (score between 1.7 and 2.0) was obtained. The routine MALDI-TOF MS analysis with the new database was then challenged with 200 Aspergillus clinical isolates grown on solid medium in a prospective evaluation. A species identification was obtained in 191 strains (95.5%), and only nine strains (4.5%) could not be identified at the species level. Among the 200 strains, A. tubingensis was the only cryptic species identified. We demonstrated the feasibility and usefulness of the new HUVR database in MALDI-TOF MS by the use of a standardized procedure for the identification of Aspergillus clinical isolates, including cryptic species, grown either on solid or liquid media. © The Author 2017. Published by Oxford University Press on behalf of The International Society for Human and Animal Mycology. All rights reserved.
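The score cut-offs quoted in the Aspergillus abstract above (species level at a score of ≥2.0, genus level between 1.7 and 2.0) can be illustrated with a minimal sketch. The function names, the handling of a score of exactly 1.7, and the "unreliable" label for lower scores are assumptions made for illustration; they are not taken from the study or from the instrument software.

```python
def classify_score(score: float) -> str:
    """Map an identification score to a level using the quoted cut-offs.

    Assumption: 1.7 is treated as inclusive for the genus level, and any
    score below 1.7 is labelled "unreliable" (a hypothetical label).
    """
    if score >= 2.0:
        return "species"
    if score >= 1.7:
        return "genus"
    return "unreliable"


def summarize(scores):
    """Return {level: (count, percentage)} for a list of scores."""
    counts = {"species": 0, "genus": 0, "unreliable": 0}
    for s in scores:
        counts[classify_score(s)] += 1
    total = len(scores)
    if total == 0:
        return {level: (0, 0.0) for level in counts}
    return {level: (n, round(100.0 * n / total, 2)) for level, n in counts.items()}
```

Feeding 165 scores at or above 2.0 and 25 scores in the 1.7 to 2.0 band into `summarize` gives the same 86.84% and 13.16% split reported for the 190 validation strains.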
Liao, Wenta; Draper, William M
2013-02-21
The mass-to-structure or MTS Search Engine is an Access 2010 database containing theoretical molecular mass information for 19,438 compounds assembled from common sources such as the Merck Index, pesticide and pharmaceutical compilations, and chemical catalogues. This database, which contains no experimental mass spectral data, was developed as an aid to identification of compounds in atmospheric pressure ionization (API)-LC-MS. This paper describes a powerful upgrade to this database, a fully integrated utility for filtering or ranking candidates based on isotope ratios and patterns. The new MTS Search Engine is applied here to the identification of volatile and semivolatile compounds including pesticides, nitrosoamines and other pollutants. Methane and isobutane chemical ionization (CI) GC-MS spectra were obtained from unit mass resolution mass spectrometers to determine MH(+) masses and isotope ratios. Isotopes were measured accurately with errors of <4% and <6%, respectively, for A + 1 and A + 2 peaks. Deconvolution of interfering isotope clusters (e.g., M(+) and [M - H](+)) was required for accurate determination of the A + 1 isotope in halogenated compounds. Integrating the isotope data greatly improved the speed and accuracy of the database identifications. The database accurately identified unknowns from isobutane CI spectra in 100% of cases where as many as 40 candidates satisfied the mass tolerance. The paper describes the development and basic operation of the new MTS Search Engine and details performance testing with over 50 model compounds.
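The filtering idea in the MTS Search Engine abstract above (select candidates within a mass tolerance, then rank them by how well their isotope ratios match) might look roughly like the following. This is a sketch under stated assumptions: the `Candidate` fields, the default tolerance, and the summed absolute error are hypothetical choices, not the scoring used by the actual Access database.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    name: str
    mh_mass: float  # theoretical MH+ mass
    a1: float       # theoretical A + 1 abundance, % of the A peak
    a2: float       # theoretical A + 2 abundance, % of the A peak


def rank_candidates(candidates, obs_mass, obs_a1, obs_a2, mass_tol=0.5):
    """Keep candidates within mass_tol of obs_mass, best isotope match first.

    Assumption: the ranking score is the summed absolute difference between
    observed and theoretical A + 1 and A + 2 abundances.
    """
    hits = [c for c in candidates if abs(c.mh_mass - obs_mass) <= mass_tol]

    def isotope_error(c):
        return abs(c.a1 - obs_a1) + abs(c.a2 - obs_a2)

    return sorted(hits, key=isotope_error)
```

With made-up candidates, one carrying a large A + 2 abundance (as a chlorinated compound would) is ranked first when the observed spectrum also shows a large A + 2 peak, even though another candidate lies equally close in mass.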
HTSFinder: Powerful Pipeline of DNA Signature Discovery by Parallel and Distributed Computing
Karimi, Ramin; Hajdu, Andras
2016-01-01
Comprehensive effort for low-cost sequencing in the past few years has led to the growth of complete genome databases. In parallel with this effort, and in response to a strong need, fast and cost-effective methods and applications have been developed to accelerate sequence analysis. Identification is the very first step of this task. Due to the difficulties, high costs, and computational challenges of alignment-based approaches, an alternative universal identification method is highly required. As an alignment-free approach, DNA signatures have provided new opportunities for the rapid identification of species. In this paper, we present an effective pipeline HTSFinder (high-throughput signature finder) with a corresponding k-mer generator GkmerG (genome k-mers generator). Using this pipeline, we determine the frequency of k-mers from the available complete genome databases for the detection of extensive DNA signatures in a reasonably short time. Our application can detect both unique and common signatures in the arbitrarily selected target and nontarget databases. Hadoop and MapReduce as parallel and distributed computing tools with commodity hardware are used in this pipeline. This approach brings the power of high-performance computing into the ordinary desktop personal computers for discovering DNA signatures in large databases such as bacterial genomes. The considerable number of unique and common DNA signatures detected in the target database offers opportunities to improve the identification process not only for polymerase chain reaction and microarray assays but also for more complex scenarios such as metagenomics and next-generation sequencing analysis. PMID:26884678
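The k-mer frequency idea described in the HTSFinder abstract above can be sketched on a single machine as follows. This is not the HTSFinder or GkmerG code, and it does not use Hadoop or MapReduce. The definitions used here are assumptions for illustration: a "unique" signature is a k-mer found in exactly one target sequence, a "common" signature is a k-mer found in every target sequence, and both must be absent from all nontarget sequences.

```python
from collections import Counter


def kmers(seq, k):
    """Return the set of distinct k-mers in a sequence."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def find_signatures(targets, nontargets, k):
    """Return (unique, common) k-mer signature sets.

    Assumed definitions: unique = present in exactly one target sequence,
    common = present in all target sequences, both absent from nontargets.
    """
    freq = Counter()
    for seq in targets:
        freq.update(kmers(seq, k))  # each target counts a k-mer at most once

    excluded = set()
    for seq in nontargets:
        excluded |= kmers(seq, k)

    unique = {m for m, n in freq.items() if n == 1 and m not in excluded}
    common = {m for m, n in freq.items() if n == len(targets) and m not in excluded}
    return unique, common
```

The per-sequence counting step is the part that maps naturally onto a map and reduce pair in a distributed setting; the toy version keeps everything in memory and is only practical for short sequences.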
NASA Astrophysics Data System (ADS)
S. Al-Kaltakchi, Musab T.; Woo, Wai L.; Dlay, Satnam; Chambers, Jonathon A.
2017-12-01
In this study, a speaker identification system is considered consisting of a feature extraction stage which utilizes both power normalized cepstral coefficients (PNCCs) and Mel frequency cepstral coefficients (MFCC). Normalization is applied by employing cepstral mean and variance normalization (CMVN) and feature warping (FW), together with acoustic modeling using a Gaussian mixture model-universal background model (GMM-UBM). The main contributions are comprehensive evaluations of the effect of both additive white Gaussian noise (AWGN) and non-stationary noise (NSN) (with and without a G.712 type handset) upon identification performance. In particular, three NSN types with varying signal to noise ratios (SNRs) were tested corresponding to street traffic, a bus interior, and a crowded talking environment. The performance evaluation also considered the effect of late fusion techniques based on score fusion, namely, mean, maximum, and linear weighted sum fusion. The databases employed were TIMIT, SITW, and NIST 2008; and 120 speakers were selected from each database to yield 3600 speech utterances. As recommendations from the study, mean fusion is found to yield overall best performance in terms of speaker identification accuracy (SIA) with noisy speech, whereas linear weighted sum fusion is overall best for original database recordings.
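The three late-fusion rules named in the speaker identification abstract above (mean, maximum, and linear weighted sum of scores) can be written compactly. The sketch below assumes two score dictionaries keyed by speaker, for example one from a PNCC-based system and one from an MFCC-based system; the function names, the dictionary layout, and the default weight are hypothetical and not taken from the study.

```python
def fuse(scores_a, scores_b, method="mean", weight=0.5):
    """Fuse two per-speaker score dictionaries with one of three rules.

    Assumption: both dictionaries share the same speaker keys and the
    scores are already on comparable scales.
    """
    fused = {}
    for speaker, a in scores_a.items():
        b = scores_b[speaker]
        if method == "mean":
            fused[speaker] = (a + b) / 2.0
        elif method == "max":
            fused[speaker] = max(a, b)
        elif method == "weighted":
            fused[speaker] = weight * a + (1.0 - weight) * b
        else:
            raise ValueError(f"unknown fusion method: {method}")
    return fused


def identify(fused):
    """Return the speaker with the highest fused score."""
    return max(fused, key=fused.get)
```

The choice of rule can change the decision: with the same two score sets, mean fusion may favour a speaker that both systems rate moderately, while maximum fusion favours a speaker that one system rates very highly.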
Murugaiyan, J; Ahrholdt, J; Kowbel, V; Roesler, U
2012-05-01
The possibility of using matrix-assisted laser desorption ionization time-of-flight mass spectrometry (MALDI-TOF MS) for rapid identification of pathogenic and non-pathogenic species of the genus Prototheca has been recently demonstrated. A unique reference database of MALDI-TOF MS profiles for type and reference strains of the six generally accepted Prototheca species was established. The database quality was reinforced after the acquisition of 27 spectra for selected Prototheca strains, with three biological and technical replicates for each of 18 type and reference strains of Prototheca and four strains of Chlorella. This provides reproducible and unique spectra covering a wide m/z range (2000-20 000 Da) for each of the strains used in the present study. The reproducibility of the spectra was further confirmed by employing composite correlation index calculation and main spectra library (MSP) dendrogram creation, available with MALDI Biotyper software. The MSP dendrograms obtained were comparable with the 18S rDNA sequence-based dendrograms. These reference spectra were successfully added to the Bruker database, and the efficiency of identification was evaluated by cross-reference-based and unknown Prototheca identification. It is proposed that the addition of further strains would reinforce the reference spectra library for rapid identification of Prototheca strains to the genus and species/genotype level. © 2011 The Authors. Clinical Microbiology and Infection © 2011 European Society of Clinical Microbiology and Infectious Diseases.
Buckwalter, S. P.; Olson, S. L.; Connelly, B. J.; Lucas, B. C.; Rodning, A. A.; Walchak, R. C.; Deml, S. M.; Wohlfiel, S. L.
2015-01-01
The value of matrix-assisted laser desorption ionization−time of flight mass spectrometry (MALDI-TOF MS) for the identification of bacteria and yeasts is well documented in the literature. Its utility for the identification of mycobacteria and Nocardia spp. has also been reported in a limited scope. In this work, we report the specificity of MALDI-TOF MS for the identification of 162 Mycobacterium species and subspecies, 53 Nocardia species, and 13 genera (totaling 43 species) of other aerobic actinomycetes using both the MALDI-TOF MS manufacturer's supplied database(s) and a custom database generated in our laboratory. The performance of a simplified processing and extraction procedure was also evaluated, and, similar to the results in an earlier literature report, our viability studies confirmed the ability of this process to inactivate Mycobacterium tuberculosis prior to analysis. Following library construction and the specificity study, the performance of MALDI-TOF MS was directly compared with that of 16S rRNA gene sequencing for the evaluation of 297 mycobacteria isolates, 148 Nocardia species isolates, and 61 other aerobic actinomycetes isolates under routine clinical laboratory working conditions over a 6-month period. MALDI-TOF MS is a valuable tool for the identification of these groups of organisms. Limitations in the databases and in the ability of MALDI-TOF MS to rapidly identify slowly growing mycobacteria are discussed. PMID:26637381
Wang, Qi; Zhao, Xiao-Juan; Wang, Zi-Wei; Liu, Li; Wei, Yong-Xin; Han, Xiao; Zeng, Jing; Liao, Wan-Jin
2017-08-01
Rapid and precise identification of Cronobacter species is important for foodborne pathogen detection; however, commercial biochemical methods can only identify Cronobacter strains to genus level in most cases. To evaluate the power of mass spectrometry based on matrix-assisted laser desorption/ionization time-of-flight (MALDI-TOF MS) for Cronobacter species identification, 51 Cronobacter strains (eight reference and 43 wild strains) were identified by both MALDI-TOF MS and 16S rRNA gene sequencing. Biotyper RTC provided by Bruker identified all eight reference and 43 wild strains as Cronobacter species, which demonstrated the power of MALDI-TOF MS to identify Cronobacter strains to genus level. However, using Bruker's database (6903 main spectra products) and Biotyper software, the MALDI-TOF MS analysis could not identify the investigated strains to species level. When MALDI-TOF MS analysis was performed using the combined in-house Cronobacter database and Bruker's database, bin setting, and unweighted pair group method with arithmetic mean (UPGMA) clustering, all 51 strains were clearly identified into six Cronobacter species and the identification accuracy increased from 60% to 100%. We demonstrated that MALDI-TOF MS was reliable and easy to use for Cronobacter species identification and highlighted the importance of establishing a reliable database and improving the current data analysis methods by integrating the bin setting and UPGMA clustering. Copyright © 2017. Published by Elsevier B.V.
Li, Yuan Hang; Tottenham, Nim
2013-04-01
A growing literature suggests that the self-face is involved in processing the facial expressions of others. The authors experimentally activated self-face representations to assess its effects on the recognition of dynamically emerging facial expressions of others. They exposed participants to videos of either their own faces (self-face prime) or faces of others (nonself-face prime) prior to a facial expression judgment task. Their results show that experimentally activating self-face representations results in earlier recognition of dynamically emerging facial expression. As a group, participants in the self-face prime condition recognized expressions earlier (when less affective perceptual information was available) compared to participants in the nonself-face prime condition. There were individual differences in performance, such that poorer expression identification was associated with higher autism traits (in this neurocognitively healthy sample). However, when randomized into the self-face prime condition, participants with high autism traits performed as well as those with low autism traits. Taken together, these data suggest that the ability to recognize facial expressions in others is linked with the internal representations of our own faces. PsycINFO Database Record (c) 2013 APA, all rights reserved.
Fishes of the Cusiana River (Meta River basin, Colombia), with an identification key to its species
Urbano-Bonilla, Alexander; Ballen, Gustavo A.; Herrera-R, Guido A.; Jhon Zamudio; Herrera-Collazos, Edgar E.; DoNascimiento, Carlos; Saúl Prada-Pedreros; Maldonado-Ocampo, Javier A.
2018-01-01
The Cusiana River sub-basin has been identified as a priority conservation area in the Orinoco region in Colombia due to its high species diversity. This study presents an updated checklist and identification key for fishes of the Cusiana River sub-basin. The checklist was assembled through direct examination of specimens deposited in the main Colombian ichthyological collections. A total of 2020 lots from 167 different localities from the Cusiana River sub-basin were examined and ranged from 153 to 2970 m in elevation. The highest number of records came from the piedmont region (1091, 54.0 %), followed by the Llanos (878, 43.5 %) and Andean (51, 2.5 %). A total of 241 species distributed in 9 orders, 40 families, and 158 genera were found. The fish species richness observed (241) represents 77.7 % of the 314 estimated species (95 % CI=276.1–394.8). The use of databases to develop lists of fish species is not entirely reliable; therefore taxonomic verification of specimens in collections is essential. The results will facilitate comparisons with other sub-basins of the Orinoquia, which are not categorized as areas of importance for conservation in Colombia. PMID:29416408
Applying the Team Identification-Social Psychological Health Model to Older Sport Fans
ERIC Educational Resources Information Center
Wann, Daniel L.; Rogers, Kelly; Dooley, Keith; Foley, Mary
2011-01-01
According to the Team Identification-Social Psychological Health Model (Wann, 2006b), team identification and social psychological health should be positively correlated because identification leads to important social connections which, in turn, facilitate well-being. Although past research substantiates the hypothesized positive relationship…
Teaching Tip: Active Learning via a Sample Database: The Case of Microsoft's Adventure Works
ERIC Educational Resources Information Center
Mitri, Michel
2015-01-01
This paper describes the use and benefits of Microsoft's Adventure Works (AW) database to teach advanced database skills in a hands-on, realistic environment. Database management and querying skills are a key element of a robust information systems curriculum, and active learning is an important way to develop these skills. To facilitate active…
21 CFR 830.300 - Devices subject to device identification data submission requirements.
Code of Federal Regulations, 2014 CFR
2014-04-01
... 21 Food and Drugs 8 2014-04-01 2014-04-01 false Devices subject to device identification data submission requirements. 830.300 Section 830.300 Food and Drugs FOOD AND DRUG ADMINISTRATION, DEPARTMENT OF... Identification Database § 830.300 Devices subject to device identification data submission requirements. (a) In...
21 CFR 830.330 - Times for submission of unique device identification information.
Code of Federal Regulations, 2014 CFR
2014-04-01
... 21 Food and Drugs 8 2014-04-01 2014-04-01 false Times for submission of unique device identification information. 830.330 Section 830.330 Food and Drugs FOOD AND DRUG ADMINISTRATION, DEPARTMENT OF... Identification Database § 830.330 Times for submission of unique device identification information. (a) The...
A database application for wilderness character monitoring
Ashley Adams; Peter Landres; Simon Kingston
2012-01-01
The National Park Service (NPS) Wilderness Stewardship Division, in collaboration with the Aldo Leopold Wilderness Research Institute and the NPS Inventory and Monitoring Program, developed a database application to facilitate tracking and trend reporting in wilderness character. The Wilderness Character Monitoring Database allows consistent, scientifically based...
DOE Office of Scientific and Technical Information (OSTI.GOV)
Calm, J.M.
The Refrigerant Database consolidates and facilitates access to information to assist industry in developing equipment using alternative refrigerants. The underlying purpose is to accelerate phase out of chemical compounds of environmental concern.
Code of Federal Regulations, 2012 CFR
2012-01-01
... Treasury's database to facilitate their participation in the competitive procurement process for OCC contracts. This database is used by OCC procurement staff to identify firms to be solicited for OCC...
Code of Federal Regulations, 2010 CFR
2010-01-01
... Treasury's database to facilitate their participation in the competitive procurement process for OCC contracts. This database is used by OCC procurement staff to identify firms to be solicited for OCC...
Becker, P.; Gabriel, F.; Cassagne, C.; Accoceberry, I.; Gari-Toussaint, M.; Hasseine, L.; De Geyter, D.; Pierard, D.; Surmont, I.; Djenad, F.; Donnadieu, J. L.; Piarroux, M.; Hendrickx, M.; Piarroux, R.
2017-01-01
Matrix-assisted laser desorption ionization–time of flight (MALDI-TOF) mass spectrometry has emerged as a reliable technique to identify molds involved in human diseases, including dermatophytes, provided that exhaustive reference databases are available. This study assessed an online identification application based on original algorithms and an extensive in-house reference database comprising 11,851 spectra (938 fungal species and 246 fungal genera). Validation criteria were established using an initial panel of 422 molds, including dermatophytes, previously identified via DNA sequencing (126 species). The application was further assessed using a separate panel of 501 cultured clinical isolates (88 mold taxa including dermatophytes) derived from five hospital laboratories. A total of 438 (87.35%) isolates were correctly identified at the species level, while 26 (5.22%) were assigned to the correct genus but the wrong species and 37 (7.43%) were not identified, since the defined threshold of 20 was not reached. The use of the Bruker Daltonics database included in the MALDI Biotyper software resulted in a much higher rate of unidentified isolates (39.76 and 74.30% using the score thresholds 1.7 and 2.0, respectively). Moreover, the identification delay of the online application remained compatible with real-time online queries (0.15 s per spectrum), and the application was faster than identifications using the MALDI Biotyper software. This is the first study to assess an online identification system based on MALDI-TOF spectrum analysis. We have successfully applied this approach to identify molds, including dermatophytes, for which diversity is insufficiently represented in commercial databases. This free-access application is available to medical mycologists to improve fungal identification. PMID:28637907
Normand, A C; Becker, P; Gabriel, F; Cassagne, C; Accoceberry, I; Gari-Toussaint, M; Hasseine, L; De Geyter, D; Pierard, D; Surmont, I; Djenad, F; Donnadieu, J L; Piarroux, M; Ranque, S; Hendrickx, M; Piarroux, R
2017-09-01
Matrix-assisted laser desorption ionization-time of flight (MALDI-TOF) mass spectrometry has emerged as a reliable technique to identify molds involved in human diseases, including dermatophytes, provided that exhaustive reference databases are available. This study assessed an online identification application based on original algorithms and an extensive in-house reference database comprising 11,851 spectra (938 fungal species and 246 fungal genera). Validation criteria were established using an initial panel of 422 molds, including dermatophytes, previously identified via DNA sequencing (126 species). The application was further assessed using a separate panel of 501 cultured clinical isolates (88 mold taxa including dermatophytes) derived from five hospital laboratories. A total of 438 (87.35%) isolates were correctly identified at the species level, while 26 (5.22%) were assigned to the correct genus but the wrong species and 37 (7.43%) were not identified, since the defined threshold of 20 was not reached. The use of the Bruker Daltonics database included in the MALDI Biotyper software resulted in a much higher rate of unidentified isolates (39.76 and 74.30% using the score thresholds 1.7 and 2.0, respectively). Moreover, the identification delay of the online application remained compatible with real-time online queries (0.15 s per spectrum), and the application was faster than identifications using the MALDI Biotyper software. This is the first study to assess an online identification system based on MALDI-TOF spectrum analysis. We have successfully applied this approach to identify molds, including dermatophytes, for which diversity is insufficiently represented in commercial databases. This free-access application is available to medical mycologists to improve fungal identification. Copyright © 2017 American Society for Microbiology.
Caillet, P; Oberlin, P; Monnet, E; Guillon-Grammatico, L; Métral, P; Belhassen, M; Denier, P; Banaei-Bouchareb, L; Viprey, M; Biau, D; Schott, A-M
2017-10-01
Osteoporotic hip fractures (OHF) are associated with significant morbidity and mortality. The French medico-administrative database (SNIIRAM) offers an interesting opportunity to improve the management of OHF. However, the validity of studies conducted with this database relies heavily on the quality of the algorithm used to detect OHF. The aim of the REDSIAM network is to facilitate the use of the SNIIRAM database. The main objective of this study was to present and discuss several OHF-detection algorithms that could be used with this database. A non-systematic literature search was performed. The Medline database was explored during the period January 2005-August 2016. Furthermore, a snowball search was then carried out from the articles included and field experts were contacted. The extraction was conducted using the chart developed by the REDSIAM network's "Methodology" task force. The ICD-10 codes used to detect OHF are mainly S72.0, S72.1, and S72.2. The performance of these algorithms is at best partially validated. Complementary use of medical and surgical procedure codes would affect their performance. Finally, few studies described how they dealt with fractures of non-osteoporotic origin, re-hospitalization, and potential contralateral fracture cases. Authors in the literature encourage the use of ICD-10 codes S72.0 to S72.2 to develop algorithms for OHF detection. These are the codes most frequently used for OHF in France. Depending on the study objectives, other ICD10 codes and medical and surgical procedures could be usefully discussed for inclusion in the algorithm. Detection and management of duplicates and non-osteoporotic fractures should be considered in the process. Finally, when a study is based on such an algorithm, all these points should be precisely described in the publication. Copyright © 2017 Elsevier Masson SAS. All rights reserved.
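A detection algorithm of the kind discussed in the osteoporotic hip fracture abstract above could start from the ICD-10 codes it names (S72.0, S72.1, S72.2). The sketch below is illustrative only: the record layout (`patient_id`, `admission_date`, `diagnoses`) is hypothetical and is not the SNIIRAM schema, and keeping only the first qualifying stay per patient is one assumed way of handling re-hospitalizations, not a validated rule.

```python
# ICD-10 code prefixes named in the abstract, stored without the dot.
OHF_PREFIXES = ("S720", "S721", "S722")


def normalize(code):
    """Upper-case a code and drop dots and spaces, e.g. 's72.10' -> 'S7210'."""
    return code.replace(".", "").strip().upper()


def is_hip_fracture(code):
    """True if the code falls under S72.0, S72.1 or S72.2."""
    return normalize(code).startswith(OHF_PREFIXES)


def detect_index_stays(stays):
    """Return the first qualifying stay per patient, in admission order.

    Assumption: each stay is a dict with hypothetical keys 'patient_id',
    'admission_date' (ISO 'YYYY-MM-DD' string) and 'diagnoses' (list of codes).
    """
    seen = set()
    index_stays = []
    for stay in sorted(stays, key=lambda s: s["admission_date"]):
        if stay["patient_id"] in seen:
            continue  # later stays are treated as re-hospitalizations
        if any(is_hip_fracture(c) for c in stay["diagnoses"]):
            seen.add(stay["patient_id"])
            index_stays.append(stay)
    return index_stays
```

Dropping every later stay is a blunt rule: it would also discard a genuine contralateral fracture, and it does nothing to exclude fractures of non-osteoporotic origin, which are exactly the points the abstract says a published algorithm should describe.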
2010-01-01
Background: Quantitative models of biochemical and cellular systems are used to answer a variety of questions in the biological sciences. The number of published quantitative models is growing steadily thanks to increasing interest in the use of models as well as the development of improved software systems and the availability of better, cheaper computer hardware. To maximise the benefits of this growing body of models, the field needs centralised model repositories that will encourage, facilitate and promote model dissemination and reuse. Ideally, the models stored in these repositories should be extensively tested and encoded in community-supported and standardised formats. In addition, the models and their components should be cross-referenced with other resources in order to allow their unambiguous identification. Description: BioModels Database (http://www.ebi.ac.uk/biomodels/) is aimed at addressing exactly these needs. It is a freely-accessible online resource for storing, viewing, retrieving, and analysing published, peer-reviewed quantitative models of biochemical and cellular systems. The structure and behaviour of each simulation model distributed by BioModels Database are thoroughly checked; in addition, model elements are annotated with terms from controlled vocabularies as well as linked to relevant data resources. Models can be examined online or downloaded in various formats. Reaction network diagrams generated from the models are also available in several formats. BioModels Database also provides features such as online simulation and the extraction of components from large scale models into smaller submodels. Finally, the system provides a range of web services that external software systems can use to access up-to-date data from the database. Conclusions: BioModels Database has become a recognised reference resource for systems biology. It is being used by the community in a variety of ways; for example, it is used to benchmark different simulation systems, and to study the clustering of models based upon their annotations. Model deposition to the database today is advised by several publishers of scientific journals. The models in BioModels Database are freely distributed and reusable; the underlying software infrastructure is also available from SourceForge (https://sourceforge.net/projects/biomodels/) under the GNU General Public License. PMID:20587024
Zhao, Xinjie; Zeng, Zhongda; Chen, Aiming; Lu, Xin; Zhao, Chunxia; Hu, Chunxiu; Zhou, Lina; Liu, Xinyu; Wang, Xiaolin; Hou, Xiaoli; Ye, Yaorui; Xu, Guowang
2018-05-29
Identification of metabolites is an essential step in metabolomics studies for interpreting the regulatory mechanisms of pathological and physiological processes. However, it remains a major challenge in LC-MSn-based studies because of the complexity of mass spectrometry, the chemical diversity of metabolites, and the lack of standards databases. In this work, a comprehensive strategy is developed for accurate, batch metabolite identification in non-targeted metabolomics studies. First, a well-defined procedure was applied to generate reliable, standardized LC-MS2 data, including tR, MS1 and MS2 information, under a standard operating procedure (SOP). An in-house database of about 2000 metabolites was constructed and used to identify metabolites in non-targeted metabolic profiling through retention time calibration using internal standards, precursor ion alignment and ion fusion, auto-MS2 information extraction and selection, and database batch searching and scoring. As an application example, a pooled serum sample was analyzed to demonstrate the strategy, and 202 metabolites were identified in the positive ion mode. The results show that the strategy is useful for LC-MSn-based non-targeted metabolomics studies.
Magnette, Amandine; Huang, Te-Din; Renzi, Francesco; Bogaerts, Pierre; Cornelis, Guy R; Glupczynski, Youri
2016-01-01
Capnocytophaga canimorsus and Capnocytophaga cynodegmi can be transmitted from dogs or cats and cause serious human infections. We aimed to evaluate the ability of matrix-assisted laser desorption ionization-time of flight mass spectrometry (MALDI-TOF MS) to identify these two Capnocytophaga species. Ninety-four C. canimorsus and 10 C. cynodegmi isolates identified by 16S rRNA gene sequencing were analyzed. Using the MALDI BioTyper database, correct identification was achieved for only 16 of 94 (17%) C. canimorsus and all 10 C. cynodegmi strains, according to the manufacturer's log score specifications. Following the establishment of a complementary homemade reference database by addition of 51 C. canimorsus and 8 C. cynodegmi mass spectra, MALDI-TOF MS provided reliable identification to the species level for 100% of the 45 blind-coded Capnocytophaga isolates tested. MALDI-TOF MS can accurately identify C. canimorsus and C. cynodegmi using an enriched database and thus constitutes a valuable diagnostic tool in the clinical laboratory. Copyright © 2016 Elsevier Inc. All rights reserved.
Indigenous species barcode database improves the identification of zooplankton
Yang, Jianghua; Zhang, Wanwan; Sun, Jingying; Xie, Yuwei; Zhang, Yimin; Burton, G. Allen; Yu, Hongxia
2017-01-01
Incompleteness and inaccuracy of DNA barcode databases is considered an important hindrance to the use of metabarcoding in biodiversity analysis of zooplankton at the species-level. Species barcoding by Sanger sequencing is inefficient for organisms with small body sizes, such as zooplankton. Here mitochondrial cytochrome c oxidase I (COI) fragment barcodes from 910 freshwater zooplankton specimens (87 morphospecies) were recovered by a high-throughput sequencing platform, Ion Torrent PGM. Intraspecific divergence of most zooplanktons was < 5%, except Branchionus leydign (Rotifer, 14.3%), Trichocerca elongate (Rotifer, 11.5%), Lecane bulla (Rotifer, 15.9%), Synchaeta oblonga (Rotifer, 5.95%) and Schmackeria forbesi (Copepod, 6.5%). Metabarcoding data of 28 environmental samples from Lake Tai were annotated by both an indigenous database and NCBI Genbank database. The indigenous database improved the taxonomic assignment of metabarcoding of zooplankton. Most zooplankton (81%) with barcode sequences in the indigenous database were identified by metabarcoding monitoring. Furthermore, the frequency and distribution of zooplankton were also consistent between metabarcoding and morphology identification. Overall, the indigenous database improved the taxonomic assignment of zooplankton. PMID:28977035
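The intraspecific divergence figures in the zooplankton barcode abstract above (most species below 5%, a few well above) can be illustrated with a simple pairwise comparison of aligned barcode sequences. The sketch uses the uncorrected p-distance, which is a simplification swapped in here for illustration; the abstract does not state which distance model the study used, and the function names are hypothetical.

```python
from itertools import combinations


def p_distance(a, b):
    """Proportion of differing sites between two aligned, equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    if not a:
        raise ValueError("sequences must not be empty")
    diffs = sum(1 for x, y in zip(a, b) if x != y)
    return diffs / len(a)


def max_intraspecific_divergence(seqs):
    """Largest pairwise p-distance among the sequences of one species."""
    if len(seqs) < 2:
        return 0.0
    return max(p_distance(a, b) for a, b in combinations(seqs, 2))
```

A species whose maximum value exceeds 0.05 would fall in the group the abstract lists as exceptions, which is a common prompt to check for cryptic species or misidentified specimens.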
Assigning statistical significance to proteotypic peptides via database searches
Alves, Gelio; Ogurtsov, Aleksey Y.; Yu, Yi-Kuo
2011-01-01
Querying MS/MS spectra against a database containing only proteotypic peptides reduces data analysis time due to reduction of database size. Despite the speed advantage, this search strategy is challenged by issues of statistical significance and coverage. The former requires separating systematically significant identifications from less confident identifications, while the latter arises when the underlying peptide is not present, due to single amino acid polymorphisms (SAPs) or post-translational modifications (PTMs), in the proteotypic peptide libraries searched. To address both issues simultaneously, we have extended RAId’s knowledge database to include proteotypic information, utilized RAId’s statistical strategy to assign statistical significance to proteotypic peptides, and modified RAId’s programs to allow for consideration of proteotypic information during database searches. The extended database alleviates the coverage problem since all annotated modifications, even those occurring within proteotypic peptides, may be considered. Taking into account the likelihoods of observation, the statistical strategy of RAId provides accurate E-value assignments regardless of whether a candidate peptide is proteotypic or not. The advantage of including proteotypic information is evidenced by its superior retrieval performance when compared to regular database searches. PMID:21055489
USDA-ARS's Scientific Manuscript database
The use of swine in biomedical research has increased dramatically in the last decade. Diverse genomic- and proteomic databases have been developed to facilitate research using human and rodent models. Current porcine gene databases, however, lack the robust annotation to study pig models that are...
Computational tools for exploring sequence databases as a resource for antimicrobial peptides.
Porto, W F; Pires, A S; Franco, O L
Data mining has been recognized by many researchers as a hot topic in different areas. In the post-genomic era, the growing number of sequences deposited in databases has been the reason why these databases have become a resource for novel biological information. In recent years, the identification of antimicrobial peptides (AMPs) in databases has gained attention. The identification of unannotated AMPs has shed some light on the distribution and evolution of AMPs and, in some cases, indicated suitable candidates for developing novel antimicrobial agents. The data mining process has been performed mainly by local alignments and/or regular expressions. Nevertheless, for the identification of distant homologous sequences, other techniques such as antimicrobial activity prediction and molecular modelling are required. In this context, this review addresses the tools and techniques, and also their limitations, for mining AMPs from databases. These methods could be helpful not only for the development of novel AMPs, but also for other kinds of proteins, at a higher level of structural genomics. Moreover, solving the problem of unannotated proteins could bring immeasurable benefits to society, especially in the case of AMPs, which could be helpful for developing novel antimicrobial agents and combating resistant bacteria. Copyright © 2017 Elsevier Inc. All rights reserved.
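The regular-expression style of mining mentioned in this review can be sketched as follows. The cysteine-spacing pattern, the function name and the toy sequences are all hypothetical and do not correspond to a curated AMP family signature; this is only meant to show the shape of such a scan.

```python
import re

# Hypothetical cysteine-spacing pattern, for illustration only.
MOTIF = re.compile(r"C.{2,4}C.{3,9}CC")

def scan(records):
    """Return (name, start, matched_text) for the first motif hit in each sequence."""
    hits = []
    for name, seq in records.items():
        m = MOTIF.search(seq)
        if m:
            hits.append((name, m.start(), m.group()))
    return hits

# Toy protein sequences keyed by an invented identifier.
records = {
    "seq1": "MKTCAAACGGGGGCCLK",
    "seq2": "MKTAAAAGGGGG",
}
print(scan(records))
```

As the review notes, a pattern scan of this kind only finds sequences that keep the motif; distant homologues would need the other techniques it lists.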
21 CFR 801.57 - Discontinuation of legacy FDA identification numbers assigned to devices.
Code of Federal Regulations, 2014 CFR
2014-04-01
... 21 Food and Drugs 8 2014-04-01 2014-04-01 false Discontinuation of legacy FDA identification... Device Identification § 801.57 Discontinuation of legacy FDA identification numbers assigned to devices... been assigned an FDA labeler code to facilitate use of NHRIC or NDC numbers may continue to use that...
Games, Patrícia Dias; daSilva, Elói Quintas Gonçalves; Barbosa, Meire de Oliveira; Almeida-Souza, Hebréia Oliveira; Fontes, Patrícia Pereira; deMagalhães, Marcos Jorge; Pereira, Paulo Roberto Gomes; Prates, Maura Vianna; Franco, Gloria Regina; Faria-Campos, Alessandra; Campos, Sérgio Vale Aguiar; Baracat-Pereira, Maria Cristina
2016-12-15
Antimicrobial peptides from plants present mechanisms of action that are different from those of conventional defense agents. They are under-explored but have potential as commercial antimicrobials. Bell pepper leaves ('Magali R') are discarded after harvesting the fruit and are sources of bioactive peptides. This work reports the isolation by peptidomics tools, and the identification and partial characterization by computational tools, of an antimicrobial peptide from bell pepper leaves, and demonstrates the usefulness of records and in silico analysis for the study of plant peptides aimed at biotechnological uses. Aqueous extracts from leaves were enriched in peptides by salt fractionation and ultrafiltration. An antimicrobial peptide was isolated by tandem chromatographic procedures. Mass spectrometry, automated peptide sequencing and bioinformatics tools were used alternately for identification and partial characterization of the Hevein-like peptide, named HEV-CANN. The computational tools that assisted in the identification of the peptide included BlastP, PSI-Blast, ClustalOmega, PeptideCutter, and ProtParam; conventional protein databases (DB) such as Mascot, Protein-DB, GenBank-DB, RefSeq, Swiss-Prot, and UniProtKB; peptide-specific DB such as Amper, APD2, CAMP, LAMPs, and PhytAMP; other tools included in ExPASy for Proteomics; The Bioactive Peptide Databases, and The Pepper Genome Database. The HEV-CANN sequence presented 40 amino acid residues, 4258.8 Da, a theoretical pI-value of 8.78, and four disulfide bonds. It was stable, and it inhibited the growth of phytopathogenic bacteria and a fungus. HEV-CANN presented a chitin-binding domain in its sequence. There was a high identity and a positive alignment of the HEV-CANN sequence in various databases, but there was not a complete identity, suggesting that HEV-CANN may be produced by ribosomal synthesis, which is in accordance with its constitutive nature.
Computational tools for proteomics and databases are not adjusted for short sequences, which hampered HEV-CANN identification. The adjustment of statistical tests in large databases for proteins is an alternative to promote the significant identification of peptides. The development of specific DB for plant antimicrobial peptides, with information about peptide sequences, functional genomic data, structural motifs and domains of molecules, functional domains, and peptide-biomolecule interactions, is valuable and necessary.
Bushakra, Jill M; Lewers, Kim S; Staton, Margaret E; Zhebentyayeva, Tetyana; Saski, Christopher A
2015-10-26
Due to a relatively high level of codominant inheritance and transferability within and among taxonomic groups, simple sequence repeat (SSR) markers are important elements in comparative mapping and delineation of genomic regions associated with traits of economic importance. Expressed sequence tags (ESTs) are a source of SSRs that can be used to develop markers to facilitate plant breeding and for more basic research across genera and higher plant orders. Leaf and meristem tissue from 'Heritage' red raspberry (Rubus idaeus) and 'Bristol' black raspberry (R. occidentalis) were utilized for RNA extraction. After conversion to cDNA and library construction, ESTs were sequenced, quality verified, assembled and scanned for SSRs. Primers flanking the SSRs were designed and a subset tested for amplification, polymorphism and transferability across species. ESTs containing SSRs were functionally annotated using the GenBank non-redundant (nr) database and further classified using the gene ontology database. To accelerate development of EST-SSRs in the genus Rubus (Rosaceae), 1149 and 2358 cDNA sequences were generated from red raspberry and black raspberry, respectively. The cDNA sequences were screened using rigorous filtering criteria which resulted in the identification of 121 and 257 SSR loci for red and black raspberry, respectively. Primers were designed from the surrounding sequences resulting in 131 and 288 primer pairs, respectively, as some sequences contained more than one SSR locus. Sequence analysis revealed that the SSR-containing genes span a diversity of functions and share more sequence identity with strawberry genes than with other Rosaceous species. This resource of Rubus-specific, gene-derived markers will facilitate the construction of linkage maps composed of transferable markers for studying and manipulating important traits in this economically important genus.
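The step of scanning assembled ESTs for SSRs can be sketched with a regular expression per motif length. The abstract does not give the filtering criteria, so the minimum repeat counts below are assumed values, and `find_ssrs` is a hypothetical helper rather than the tool the authors used.

```python
import re

def find_ssrs(seq, min_repeats=None):
    """Return sorted (start, motif, repeat_count) tuples for perfect SSRs in seq."""
    # Assumed thresholds: minimum repeat count for each motif length.
    if min_repeats is None:
        min_repeats = {2: 6, 3: 4, 4: 3}
    seq = seq.upper()
    hits = []
    for k, n in sorted(min_repeats.items()):
        # A k-mer followed by at least n-1 further copies of itself.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, n - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            # Skip motifs that are themselves repeats of a shorter unit (e.g. "AA", "AGAG").
            if any(motif == motif[:d] * (k // d) for d in range(1, k) if k % d == 0):
                continue
            hits.append((m.start(), motif, len(m.group(0)) // k))
    return sorted(hits)

# Toy cDNA fragment containing an (AG)7 and a (CTT)4 repeat.
est = "GGC" + "AG" * 7 + "CCA" + "CTT" * 4 + "GA"
print(find_ssrs(est))
```

Primer design would then work from the sequence flanking each reported start position; that step is outside this sketch.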
Challenges in conducting post-authorisation safety studies (PASS): A vaccine manufacturer's view.
Cohet, Catherine; Rosillon, Dominique; Willame, Corinne; Haguinet, Francois; Marenne, Marie-Noëlle; Fontaine, Sandrine; Buyse, Hubert; Bauchau, Vincent; Baril, Laurence
2017-05-25
Post-authorisation safety studies (PASS) of vaccines assess or quantify the risk of adverse events following immunisation that were not identified or could not be estimated pre-licensure. The aim of this perspective paper is to describe the authors' experience in the design and conduct of twelve PASS that contributed to the evaluation of the benefit-risk of vaccines in real-world settings. We describe challenges and learnings from selected PASS of rotavirus, malaria, influenza, human papillomavirus and measles-mumps-rubella-varicella vaccines that assessed or identified potential or theoretical risks, which may lead to changes to risk management plans and/or to label updates. Study settings include the use of large healthcare databases and de novo data collection. PASS methodology is influenced by the background incidence of the outcome of interest, vaccine uptake, availability and quality of data sources, identification of the at-risk population and of suitable comparators, availability of validated case definitions, and the frequent need for case ascertainment in large databases. Challenges include the requirement for valid exposure and outcome data, identification of, and access to, adequate data sources, and mitigating limitations including bias and confounding. Assessing feasibility is becoming a key step to confirm that study objectives can be met in a timely manner. PASS provide critical information for regulators, public health agencies, vaccine manufacturers and ultimately, individuals. Collaborative approaches and synergistic efforts between vaccine manufacturers and key stakeholders, such as regulatory and public health agencies, are needed to facilitate access to data, and to drive optimal study design and implementation, with the aim of generating robust evidence. Copyright © 2017 GSK Biologicals SA. Published by Elsevier Ltd.. All rights reserved.
Privacy-preserving matching of similar patients.
Vatsalan, Dinusha; Christen, Peter
2016-02-01
The identification of similar entities represented by records in different databases has drawn considerable attention in many application areas, including in the health domain. One important type of entity matching application that is vital for quality healthcare analytics is the identification of similar patients, known as similar patient matching. A key component of identifying similar records is the calculation of similarity of the values in attributes (fields) between these records. Due to increasing privacy and confidentiality concerns, using the actual attribute values of patient records to identify similar records across different organizations is becoming non-trivial because the attributes in such records often contain highly sensitive information such as personal and medical details of patients. Therefore, the matching needs to be based on masked (encoded) values while being effective and efficient to allow matching of large databases. Bloom filter encoding has widely been used as an efficient masking technique for privacy-preserving matching of string and categorical values. However, no work on Bloom filter-based masking of numerical data, such as integer (e.g. age), floating point (e.g. body mass index), and modulus (numbers wrap around upon reaching a certain value, e.g. date and time), which are commonly required in the health domain, has been presented in the literature. We propose a framework with novel methods for masking numerical data using Bloom filters, thereby facilitating the calculation of similarities between records. We conduct an empirical study on publicly available real-world datasets which shows that our framework provides efficient masking and achieves similar matching accuracy compared to the matching of actual unencoded patient records. Copyright © 2015 Elsevier Inc. All rights reserved.
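The idea of comparing masked rather than raw values can be sketched with a small Bloom filter and the Dice coefficient. The neighbour-token scheme used here for numeric values is an illustrative assumption and not necessarily the authors' method; the filter length, number of hash functions, step and width are invented parameters, and modulus data (values that wrap around) is not handled.

```python
import hashlib

M = 256  # filter length in bits (assumed)
K = 4    # hash functions per token (assumed)

def bloom(tokens, m=M, k=K):
    """Return the set of bit positions set by hashing each token k times."""
    bits = set()
    for tok in tokens:
        for i in range(k):
            digest = hashlib.sha256(f"{i}:{tok}".encode()).hexdigest()
            bits.add(int(digest, 16) % m)
    return bits

def dice(a, b):
    """Dice coefficient of two sets of bit positions."""
    if not a and not b:
        return 1.0
    return 2 * len(a & b) / (len(a) + len(b))

def numeric_tokens(value, step=1, width=3):
    """Tokens for an integer value and its neighbours within +/- width * step."""
    return [str(value + j * step) for j in range(-width, width + 1)]

# Ages that are close share most neighbour tokens, so their filters overlap heavily.
age_40 = bloom(numeric_tokens(40))
age_41 = bloom(numeric_tokens(41))
age_60 = bloom(numeric_tokens(60))
print(dice(age_40, age_41), dice(age_40, age_60))
```

Only the bit positions would be exchanged between organizations in such a scheme; a production design would also need keyed hashing and an analysis of what the filters leak, which this sketch omits.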
Sharing and community curation of mass spectrometry data with GNPS
Nguyen, Don Duy; Watrous, Jeramie; Kapono, Clifford A; Luzzatto-Knaan, Tal; Porto, Carla; Bouslimani, Amina; Melnik, Alexey V; Meehan, Michael J; Liu, Wei-Ting; Crüsemann, Max; Boudreau, Paul D; Esquenazi, Eduardo; Sandoval-Calderón, Mario; Kersten, Roland D; Pace, Laura A; Quinn, Robert A; Duncan, Katherine R; Hsu, Cheng-Chih; Floros, Dimitrios J; Gavilan, Ronnie G; Kleigrewe, Karin; Northen, Trent; Dutton, Rachel J; Parrot, Delphine; Carlson, Erin E; Aigle, Bertrand; Michelsen, Charlotte F; Jelsbak, Lars; Sohlenkamp, Christian; Pevzner, Pavel; Edlund, Anna; McLean, Jeffrey; Piel, Jörn; Murphy, Brian T; Gerwick, Lena; Liaw, Chih-Chuang; Yang, Yu-Liang; Humpf, Hans-Ulrich; Maansson, Maria; Keyzers, Robert A; Sims, Amy C; Johnson, Andrew R.; Sidebottom, Ashley M; Sedio, Brian E; Klitgaard, Andreas; Larson, Charles B; P., Cristopher A Boya; Torres-Mendoza, Daniel; Gonzalez, David J; Silva, Denise B; Marques, Lucas M; Demarque, Daniel P; Pociute, Egle; O'Neill, Ellis C; Briand, Enora; Helfrich, Eric J. N.; Granatosky, Eve A; Glukhov, Evgenia; Ryffel, Florian; Houson, Hailey; Mohimani, Hosein; Kharbush, Jenan J; Zeng, Yi; Vorholt, Julia A; Kurita, Kenji L; Charusanti, Pep; McPhail, Kerry L; Nielsen, Kristian Fog; Vuong, Lisa; Elfeki, Maryam; Traxler, Matthew F; Engene, Niclas; Koyama, Nobuhiro; Vining, Oliver B; Baric, Ralph; Silva, Ricardo R; Mascuch, Samantha J; Tomasi, Sophie; Jenkins, Stefan; Macherla, Venkat; Hoffman, Thomas; Agarwal, Vinayak; Williams, Philip G; Dai, Jingqui; Neupane, Ram; Gurr, Joshua; Rodríguez, Andrés M. 
C.; Lamsa, Anne; Zhang, Chen; Dorrestein, Kathleen; Duggan, Brendan M; Almaliti, Jehad; Allard, Pierre-Marie; Phapale, Prasad; Nothias, Louis-Felix; Alexandrov, Theodore; Litaudon, Marc; Wolfender, Jean-Luc; Kyle, Jennifer E; Metz, Thomas O; Peryea, Tyler; Nguyen, Dac-Trung; VanLeer, Danielle; Shinn, Paul; Jadhav, Ajit; Müller, Rolf; Waters, Katrina M; Shi, Wenyuan; Liu, Xueting; Zhang, Lixin; Knight, Rob; Jensen, Paul R; Palsson, Bernhard O; Pogliano, Kit; Linington, Roger G; Gutiérrez, Marcelino; Lopes, Norberto P; Gerwick, William H; Moore, Bradley S; Dorrestein, Pieter C; Bandeira, Nuno
2017-01-01
The potential of the diverse chemistries present in natural products (NP) for biotechnology and medicine remains untapped because NP databases are not searchable with raw data and the NP community has no way to share data other than in published papers. Although mass spectrometry techniques are well-suited to high-throughput characterization of natural products, there is a pressing need for an infrastructure to enable sharing and curation of data. We present Global Natural Products Social molecular networking (GNPS, http://gnps.ucsd.edu), an open-access knowledge base for community wide organization and sharing of raw, processed or identified tandem mass (MS/MS) spectrometry data. In GNPS crowdsourced curation of freely available community-wide reference MS libraries will underpin improved annotations. Data-driven social-networking should facilitate identification of spectra and foster collaborations. We also introduce the concept of ‘living data’ through continuous reanalysis of deposited data. PMID:27504778
Chemical-Help Application for Classification and Identification of Stormwater Constituents
Granato, Gregory E.; Driskell, Timothy R.; Nunes, Catherine
2000-01-01
A computer application called Chemical Help was developed to facilitate review of reports for the National Highway Runoff Water-Quality Data and Methodology Synthesis (NDAMS). The application provides a tool to quickly find a proper classification for any constituent in the NDAMS review sheets. Chemical Help contents include the name of each water-quality property, constituent, or parameter, the section number within the NDAMS review sheet, the organizational levels within a classification hierarchy, the database number, and where appropriate, the chemical formula, the Chemical Abstract Service number, and a list of synonyms (for the organic chemicals). Therefore, Chemical Help provides information necessary to research available reference data for the water-quality properties and constituents of potential interest in stormwater studies. Chemical Help is implemented in the Microsoft help-system interface. (Computer files for the use and documentation of Chemical Help are included on an accompanying diskette.)
Web-based healthcare hand drawing management system.
Hsieh, Sheau-Ling; Weng, Yung-Ching; Chen, Chi-Huang; Hsu, Kai-Ping; Lin, Jeng-Wei; Lai, Feipei
2010-01-01
The paper addresses Medical Hand Drawing Management System architecture and implementation. In the system, we developed four modules: hand drawing management module; patient medical records query module; hand drawing editing and upload module; hand drawing query module. The system adapts windows-based applications and encompasses web pages by ASP.NET hosting mechanism under web services platforms. The hand drawings implemented as files are stored in a FTP server. The file names with associated data, e.g. patient identification, drawing physician, access rights, etc. are reposited in a database. The modules can be conveniently embedded, integrated into any system. Therefore, the system possesses the hand drawing features to support daily medical operations, effectively improve healthcare qualities as well. Moreover, the system includes the printing capability to achieve a complete, computerized medical document process. In summary, the system allows web-based applications to facilitate the graphic processes for healthcare operations.
Structure Elucidation of Unknown Metabolites in Metabolomics by Combined NMR and MS/MS Prediction
Boiteau, Rene M.; Hoyt, David W.; Nicora, Carrie D.; ...
2018-01-17
Here, we introduce a cheminformatics approach that combines highly selective and orthogonal structure elucidation parameters; accurate mass, MS/MS (MS2), and NMR in a single analysis platform to accurately identify unknown metabolites in untargeted studies. The approach starts with an unknown LC-MS feature, and then combines the experimental MS/MS and NMR information of the unknown to effectively filter the false positive candidate structures based on their predicted MS/MS and NMR spectra. We demonstrate the approach on a model mixture and then we identify an uncatalogued secondary metabolite in Arabidopsis thaliana. The NMR/MS2 approach is well suited for discovery of new metabolites in plant extracts, microbes, soils, dissolved organic matter, food extracts, biofuels, and biomedical samples, facilitating the identification of metabolites that are not present in experimental NMR and MS metabolomics databases.
Lieutaud, Philippe; Uversky, Alexey V.; Uversky, Vladimir N.; Longhi, Sonia
2016-01-01
ABSTRACT In the last 2 decades it has become increasingly evident that a large number of proteins are either fully or partially disordered. Intrinsically disordered proteins lack a stable 3D structure, are ubiquitous and fulfill essential biological functions. Their conformational heterogeneity is encoded in their amino acid sequences, thereby allowing intrinsically disordered proteins or regions to be recognized based on properties of these sequences. The identification of disordered regions facilitates the functional annotation of proteins and is instrumental for delineating boundaries of protein domains amenable to structural determination with X-ray crystallization. This article discusses a comprehensive selection of databases and methods currently employed to disseminate experimental and putative annotations of disorder, predict disorder and identify regions involved in induced folding. It also provides a set of detailed instructions that should be followed to perform computational analysis of disorder. PMID:28232901
Wang, Mingxun; Carver, Jeremy J; Phelan, Vanessa V; Sanchez, Laura M; Garg, Neha; Peng, Yao; Nguyen, Don Duy; Watrous, Jeramie; Kapono, Clifford A; Luzzatto-Knaan, Tal; Porto, Carla; Bouslimani, Amina; Melnik, Alexey V; Meehan, Michael J; Liu, Wei-Ting; Crüsemann, Max; Boudreau, Paul D; Esquenazi, Eduardo; Sandoval-Calderón, Mario; Kersten, Roland D; Pace, Laura A; Quinn, Robert A; Duncan, Katherine R; Hsu, Cheng-Chih; Floros, Dimitrios J; Gavilan, Ronnie G; Kleigrewe, Karin; Northen, Trent; Dutton, Rachel J; Parrot, Delphine; Carlson, Erin E; Aigle, Bertrand; Michelsen, Charlotte F; Jelsbak, Lars; Sohlenkamp, Christian; Pevzner, Pavel; Edlund, Anna; McLean, Jeffrey; Piel, Jörn; Murphy, Brian T; Gerwick, Lena; Liaw, Chih-Chuang; Yang, Yu-Liang; Humpf, Hans-Ulrich; Maansson, Maria; Keyzers, Robert A; Sims, Amy C; Johnson, Andrew R; Sidebottom, Ashley M; Sedio, Brian E; Klitgaard, Andreas; Larson, Charles B; P, Cristopher A Boya; Torres-Mendoza, Daniel; Gonzalez, David J; Silva, Denise B; Marques, Lucas M; Demarque, Daniel P; Pociute, Egle; O'Neill, Ellis C; Briand, Enora; Helfrich, Eric J N; Granatosky, Eve A; Glukhov, Evgenia; Ryffel, Florian; Houson, Hailey; Mohimani, Hosein; Kharbush, Jenan J; Zeng, Yi; Vorholt, Julia A; Kurita, Kenji L; Charusanti, Pep; McPhail, Kerry L; Nielsen, Kristian Fog; Vuong, Lisa; Elfeki, Maryam; Traxler, Matthew F; Engene, Niclas; Koyama, Nobuhiro; Vining, Oliver B; Baric, Ralph; Silva, Ricardo R; Mascuch, Samantha J; Tomasi, Sophie; Jenkins, Stefan; Macherla, Venkat; Hoffman, Thomas; Agarwal, Vinayak; Williams, Philip G; Dai, Jingqui; Neupane, Ram; Gurr, Joshua; Rodríguez, Andrés M C; Lamsa, Anne; Zhang, Chen; Dorrestein, Kathleen; Duggan, Brendan M; Almaliti, Jehad; Allard, Pierre-Marie; Phapale, Prasad; Nothias, Louis-Felix; Alexandrov, Theodore; Litaudon, Marc; Wolfender, Jean-Luc; Kyle, Jennifer E; Metz, Thomas O; Peryea, Tyler; Nguyen, Dac-Trung; VanLeer, Danielle; Shinn, Paul; Jadhav, Ajit; Müller, Rolf; Waters, Katrina M; Shi, 
Wenyuan; Liu, Xueting; Zhang, Lixin; Knight, Rob; Jensen, Paul R; Palsson, Bernhard O; Pogliano, Kit; Linington, Roger G; Gutiérrez, Marcelino; Lopes, Norberto P; Gerwick, William H; Moore, Bradley S; Dorrestein, Pieter C; Bandeira, Nuno
2016-08-09
The potential of the diverse chemistries present in natural products (NP) for biotechnology and medicine remains untapped because NP databases are not searchable with raw data and the NP community has no way to share data other than in published papers. Although mass spectrometry (MS) techniques are well-suited to high-throughput characterization of NP, there is a pressing need for an infrastructure to enable sharing and curation of data. We present Global Natural Products Social Molecular Networking (GNPS; http://gnps.ucsd.edu), an open-access knowledge base for community-wide organization and sharing of raw, processed or identified tandem mass (MS/MS) spectrometry data. In GNPS, crowdsourced curation of freely available community-wide reference MS libraries will underpin improved annotations. Data-driven social-networking should facilitate identification of spectra and foster collaborations. We also introduce the concept of 'living data' through continuous reanalysis of deposited data.
Structure Elucidation of Unknown Metabolites in Metabolomics by Combined NMR and MS/MS Prediction
DOE Office of Scientific and Technical Information (OSTI.GOV)
Boiteau, Rene M.; Hoyt, David W.; Nicora, Carrie D.
Here, we introduce a cheminformatics approach that combines highly selective and orthogonal structure elucidation parameters; accurate mass, MS/MS (MS2), and NMR in a single analysis platform to accurately identify unknown metabolites in untargeted studies. The approach starts with an unknown LC-MS feature, and then combines the experimental MS/MS and NMR information of the unknown to effectively filter the false positive candidate structures based on their predicted MS/MS and NMR spectra. We demonstrate the approach on a model mixture and then we identify an uncatalogued secondary metabolite in Arabidopsis thaliana. The NMR/MS2 approach is well suited for discovery of new metabolites in plant extracts, microbes, soils, dissolved organic matter, food extracts, biofuels, and biomedical samples, facilitating the identification of metabolites that are not present in experimental NMR and MS metabolomics databases.
Mathematics and online learning experiences: a gateway site for engineering students
NASA Astrophysics Data System (ADS)
Masouros, Spyridon D.; Alpay, Esat
2010-03-01
This paper focuses on the preliminary design of a multifaceted computer-based mathematics resource for undergraduate and pre-entry engineering students. Online maths resources, while attractive in their flexibility of delivery, have seen variable interest from students and teachers alike. Through student surveys and wide consultations, guidelines have been developed for effectively collating and integrating learning, support, application and diagnostic tools to produce an Engineer's Mathematics Gateway. Specific recommendations include: the development of a shared database of engineering discipline-specific problems and examples; the identification of, and resource development for, troublesome mathematics topics which encompass ideas of threshold concepts and mastery components; the use of motivational and promotional material to raise student interest in learning mathematics in an engineering context; the use of general and lecture-specific concept maps and matrices to identify the needs and relevance of mathematics to engineering topics; and further exploration of the facilitation of peer-based learning through online resources.
[The participation of seniors in volunteer activities: a systematic review].
Godbout, Elisabeth; Filiatrault, Johanne; Plante, Michelle
2012-02-01
Volunteer work can be a very significant form of social participation for seniors. It can also provide seniors with important physical and psychological health benefits. This explains why occupational therapists and other health care professionals, as well as community workers who are concerned with healthy aging, appeal to seniors to volunteer in health promotion and community support. However, the recruitment and ongoing involvement of seniors as volunteers is often challenging. A systematic review of the literature was undertaken to enlighten practitioners working in this domain. The objective was to identify factors that influence seniors' participation in volunteer work. Six bibliographic databases were searched using key words. A total of 27 relevant papers were retrieved and allowed the identification of a series of factors that could influence seniors' participation in volunteer work, namely personal factors, environmental factors, and occupational factors. This analysis leads to practical guidelines for facilitating the recruitment and maintenance of seniors' engagement in volunteer work.
Psychiatry's Role in the Management of Human Trafficking Victims: An Integrated Care Approach.
Gordon, Mollie; Salami, Temilola; Coverdale, John; Nguyen, Phuong T
2018-03-01
Human trafficking is an outrageous human rights violation with potentially devastating consequences to individuals and the public health. Victims are often underrecognized and there are few guidelines for how best to identify, care for, and safely reintegrate victims back into the community. The purpose of this paper is to propose a multifaceted, interdisciplinary, and interprofessional guideline for providing care and services to human trafficking victims. Databases such as PubMed and PsycINFO were searched for papers outlining human trafficking programs with a primary psychiatric focus. No integrated care models that provide decisional guidelines at different points of intervention for human trafficking patients and that highlight the important role of psychiatric consultation were found. Psychiatrists and psychologists are pivotal to an integrated care approach in health care settings. The provision of such a comprehensive and integrated model of care should facilitate the identification of victims, promote their recovery, and reduce the possibility of retraumatization.
Structure Elucidation of Unknown Metabolites in Metabolomics by Combined NMR and MS/MS Prediction
Hoyt, David W.; Nicora, Carrie D.; Kinmonth-Schultz, Hannah A.; Ward, Joy K.
2018-01-01
We introduce a cheminformatics approach that combines highly selective and orthogonal structure elucidation parameters; accurate mass, MS/MS (MS2), and NMR into a single analysis platform to accurately identify unknown metabolites in untargeted studies. The approach starts with an unknown LC-MS feature, and then combines the experimental MS/MS and NMR information of the unknown to effectively filter out the false positive candidate structures based on their predicted MS/MS and NMR spectra. We demonstrate the approach on a model mixture, and then we identify an uncatalogued secondary metabolite in Arabidopsis thaliana. The NMR/MS2 approach is well suited to the discovery of new metabolites in plant extracts, microbes, soils, dissolved organic matter, food extracts, biofuels, and biomedical samples, facilitating the identification of metabolites that are not present in experimental NMR and MS metabolomics databases. PMID:29342073
Intelligent communication assistant for databases
DOE Office of Scientific and Technical Information (OSTI.GOV)
Jakobson, G.; Shaked, V.; Rowley, S.
1983-01-01
An intelligent communication assistant for databases, called FRED (front end for databases) is explored. FRED is designed to facilitate access to database systems by users of varying levels of experience. FRED is a second generation of natural language front-ends for databases and intends to solve two critical interface problems existing between end-users and databases: connectivity and communication problems. The authors report their experiences in developing software for natural language query processing, dialog control, and knowledge representation, as well as the direction of future work. 10 references.
How to: identify non-tuberculous Mycobacterium species using MALDI-TOF mass spectrometry.
Alcaide, F; Amlerová, J; Bou, G; Ceyssens, P J; Coll, P; Corcoran, D; Fangous, M-S; González-Álvarez, I; Gorton, R; Greub, G; Hery-Arnaud, G; Hrábak, J; Ingebretsen, A; Lucey, B; Marekoviċ, I; Mediavilla-Gradolph, C; Monté, M R; O'Connor, J; O'Mahony, J; Opota, O; O'Reilly, B; Orth-Höller, D; Oviaño, M; Palacios, J J; Palop, B; Pranada, A B; Quiroga, L; Rodríguez-Temporal, D; Ruiz-Serrano, M J; Tudó, G; Van den Bossche, A; van Ingen, J; Rodriguez-Sanchez, B
2018-06-01
The implementation of MALDI-TOF MS for microorganism identification has changed the routine of the microbiology laboratories as we knew it. Most microorganisms can now be reliably identified within minutes using this inexpensive, user-friendly methodology. However, its application in the identification of mycobacteria isolates has been hampered by the structure of their cell wall. Improvements in the sample processing method and in the available database have proved key factors for the rapid and reliable identification of non-tuberculous mycobacteria isolates using MALDI-TOF MS. The main objective is to provide information about the proceedings for the identification of non-tuberculous isolates using MALDI-TOF MS and to review different sample processing methods, available databases, and the interpretation of the results. Results from relevant studies on the use of the available MALDI-TOF MS instruments, the implementation of innovative sample processing methods, or the implementation of improved databases are discussed. Insight about the methodology required for reliable identification of non-tuberculous mycobacteria and its implementation in the microbiology laboratory routine is provided. Microbiology laboratories where MALDI-TOF MS is available can benefit from its capacity to identify most clinically interesting non-tuberculous mycobacteria in a rapid, reliable, and inexpensive manner. Copyright © 2017 European Society of Clinical Microbiology and Infectious Diseases. Published by Elsevier Ltd. All rights reserved.
Federal Register 2010, 2011, 2012, 2013, 2014
2010-08-16
... Boating Accident Report Database AGENCY: Coast Guard, DHS. ACTION: Reopening of public comment period... Boating Accident Report Database. DATES: Comments and related material must either be submitted to our... Database that, collectively, are intended to improve recreational boating safety efforts, enhance law...
Bortolan, Giovanni
2015-01-01
Traditional means for identity validation (PIN codes, passwords), and physiological and behavioral biometric characteristics (fingerprint, iris, and speech) are susceptible to hacker attacks and/or falsification. This paper presents a method for person verification/identification based on correlation of present-to-previous limb ECG leads I (r I) and II (r II), of the first principal ECG component calculated from them (r PCA), and on linear and nonlinear combinations of r I, r II, and r PCA. For the verification task, the one-to-one scenario is applied and threshold values for r I, r II, and r PCA and their combinations are derived. The identification task assumes a one-to-many scenario, and the tested subject is identified according to the maximal correlation with a previously recorded ECG in a database. The population-based ECG-ILSA database of 540 patients (147 healthy subjects, 175 patients with cardiac diseases, and 218 with hypertension) has been considered. In addition, a common reference PTB dataset (14 healthy individuals) with a short time interval between the two acquisitions has been taken into account. The results on the ECG-ILSA database were satisfactory with healthy people, and there was not a significant decrease in nonhealthy patients, demonstrating the robustness of the proposed method. With the PTB database, the method provides an identification accuracy of 92.9% and a verification sensitivity and specificity of 100% and 89.9%. PMID:26568954
Jekova, Irena; Bortolan, Giovanni
2015-01-01
Traditional means of identity validation (PIN codes, passwords) and physiological and behavioral biometric characteristics (fingerprint, iris, and speech) are susceptible to hacker attacks and/or falsification. This paper presents a method for person verification/identification based on the correlation of present-to-previous limb ECG leads I (r I) and II (r II), the first principal ECG component calculated from them (r PCA), and linear and nonlinear combinations of r I, r II, and r PCA. For the verification task, the one-to-one scenario is applied, and threshold values for r I, r II, and r PCA and their combinations are derived. The identification task assumes a one-to-many scenario, and the tested subject is identified according to the maximal correlation with a previously recorded ECG in a database. The population-based ECG-ILSA database of 540 patients (147 healthy subjects, 175 patients with cardiac diseases, and 218 with hypertension) was considered. In addition, a common reference PTB dataset (14 healthy individuals) with a short time interval between the two acquisitions was taken into account. The results on the ECG-ILSA database were satisfactory for healthy people, and there was no significant decrease for nonhealthy patients, demonstrating the robustness of the proposed method. With the PTB database, the method provides an identification accuracy of 92.9% and a verification sensitivity and specificity of 100% and 89.9%, respectively.
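The one-to-one and one-to-many scenarios described in the abstract above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the toy "beats", and the 0.9 threshold are assumptions made for illustration only, and a plain Pearson correlation stands in for the lead-specific and PCA-based correlations the abstract mentions.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)


def identify(probe, enrolled):
    """One-to-many: return the enrolled ID whose stored record correlates best."""
    return max(enrolled, key=lambda subject_id: pearson(probe, enrolled[subject_id]))


def verify(probe, template, threshold=0.9):
    """One-to-one: accept the claimed identity if the correlation reaches the
    threshold. The value 0.9 is a placeholder, not a value from the paper."""
    return pearson(probe, template) >= threshold


# Hypothetical toy samples, not real ECG data.
enrolled = {
    "subject_a": [0.0, 0.1, 1.0, 0.2, 0.0, 0.3],
    "subject_b": [0.0, 0.4, 0.6, 0.9, 0.1, 0.0],
}
probe = [0.0, 0.12, 0.95, 0.25, 0.02, 0.28]

best_match = identify(probe, enrolled)
accepted = verify(probe, enrolled["subject_a"])
```

In this toy setup the probe is a slightly perturbed copy of the first enrolled record, so identification selects that subject and verification against it passes.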
Pandey, Ram Vinay; Pabinger, Stephan; Kriegner, Albert; Weinhäusel, Andreas
2017-07-01
Next-generation sequencing (NGS) has become a powerful and efficient tool for routine mutation screening in clinical research. As each NGS test yields hundreds of variants, the current challenge is to meaningfully interpret the data and select potential candidates. Analyzing each variant while manually investigating several relevant databases to collect specific information is a cumbersome and time-consuming process, and it requires expertise and familiarity with these databases. Thus, a tool that can seamlessly annotate variants with clinically relevant databases under one common interface would be of great help for variant annotation, cross-referencing, and visualization. This tool would allow variants to be processed in an automated and high-throughput manner and facilitate the investigation of variants in several genome browsers. Several analysis tools are available for raw sequencing-read processing and variant identification, but an automated variant filtering, annotation, cross-referencing, and visualization tool is still lacking. To fulfill these requirements, we developed DaMold, a Web-based, user-friendly tool that can filter and annotate variants and can access and compile information from 37 resources. It is easy to use, provides flexible input options, and accepts variants from NGS and Sanger sequencing as well as hotspots in VCF and BED formats. DaMold is available as an online application at http://damold.platomics.com/index.html, and as a Docker container and virtual machine at https://sourceforge.net/projects/damold/. © 2017 Wiley Periodicals, Inc.
Yang, Qi; Franco, Christopher M M; Sorokin, Shirley J; Zhang, Wei
2017-02-02
For sponges (phylum Porifera), there is no reliable molecular protocol available for species identification. To address this gap, we developed a multilocus-based Sponge Identification Protocol (SIP) validated by a sample of 37 sponge species belonging to 10 orders from South Australia. The universal barcode COI mtDNA, 28S rRNA gene (D3-D5), and the nuclear ITS1-5.8S-ITS2 region were evaluated for their suitability and capacity for sponge identification. The highest Bit Score was applied to infer the identity. The reliability of SIP was validated by phylogenetic analysis. The 28S rRNA gene and COI mtDNA performed better than the ITS region in classifying sponges at various taxonomic levels. A major limitation is that the databases are not well populated and possess low diversity, making it difficult to conduct the molecular identification protocol. The identification is also impacted by the accuracy of the morphological classification of the sponges whose sequences have been submitted to the database. Re-examination of the morphological identification further demonstrated and improved the reliability of sponge identification by SIP. Integrated with morphological identification, the multilocus-based SIP offers an improved protocol for more reliable and effective sponge identification, by coupling the accuracy of different DNA markers.
Yang, Qi; Franco, Christopher M. M.; Sorokin, Shirley J.; Zhang, Wei
2017-01-01
For sponges (phylum Porifera), there is no reliable molecular protocol available for species identification. To address this gap, we developed a multilocus-based Sponge Identification Protocol (SIP) validated by a sample of 37 sponge species belonging to 10 orders from South Australia. The universal barcode COI mtDNA, 28S rRNA gene (D3–D5), and the nuclear ITS1-5.8S-ITS2 region were evaluated for their suitability and capacity for sponge identification. The highest Bit Score was applied to infer the identity. The reliability of SIP was validated by phylogenetic analysis. The 28S rRNA gene and COI mtDNA performed better than the ITS region in classifying sponges at various taxonomic levels. A major limitation is that the databases are not well populated and possess low diversity, making it difficult to conduct the molecular identification protocol. The identification is also impacted by the accuracy of the morphological classification of the sponges whose sequences have been submitted to the database. Re-examination of the morphological identification further demonstrated and improved the reliability of sponge identification by SIP. Integrated with morphological identification, the multilocus-based SIP offers an improved protocol for more reliable and effective sponge identification, by coupling the accuracy of different DNA markers. PMID:28150727
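The rule "the highest Bit Score was applied to infer the identity" can be sketched as follows, assuming the search results are available in the default 12-column BLAST+ tabular layout (-outfmt 6). The sequence identifiers and scores below are hypothetical, and this is only an illustration of the selection rule, not the SIP pipeline itself.

```python
import csv
import io

# Default column order of BLAST+ tabular output (-outfmt 6).
COLUMNS = ["qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
           "qstart", "qend", "sstart", "send", "evalue", "bitscore"]


def best_hits(tabular_text):
    """Return {query_id: (subject_id, bitscore)} keeping the top bit score per query."""
    best = {}
    reader = csv.DictReader(io.StringIO(tabular_text),
                            fieldnames=COLUMNS, delimiter="\t")
    for row in reader:
        score = float(row["bitscore"])
        query = row["qseqid"]
        if query not in best or score > best[query][1]:
            best[query] = (row["sseqid"], score)
    return best


# Hypothetical hits for two query sequences (one per marker).
rows = [
    ("sponge01_COI", "ref_A", "98.5", "600", "9", "0", "1", "600", "1", "600", "0.0", "1050"),
    ("sponge01_COI", "ref_B", "91.0", "600", "54", "0", "1", "600", "1", "600", "1e-150", "780"),
    ("sponge02_28S", "ref_C", "95.2", "500", "24", "0", "1", "500", "1", "500", "0.0", "820"),
    ("sponge02_28S", "ref_D", "97.8", "500", "11", "0", "1", "500", "1", "500", "0.0", "905"),
]
example = "\n".join("\t".join(r) for r in rows)
hits = best_hits(example)
```

As the abstract notes, a top-scoring hit is only as reliable as the reference database behind it, so a result like this would still need to be checked against morphological identification.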
Identity change among smokers and ex-smokers: Findings from the ITC Netherlands Survey.
Meijer, Eline; van Laar, Colette; Gebhardt, Winifred A; Fokkema, Marjolein; van den Putte, Bas; Dijkstra, Arie; Fong, Geoffrey T; Willemsen, Marc C
2017-06-01
Successful smoking cessation appears to be facilitated by identity change, that is, when quitting or nonsmoking becomes part of smokers' and ex-smokers' self-concepts. The current longitudinal study is the first to examine how identity changes over time among smokers and ex-smokers and whether this can be predicted by socioeconomic status (SES) and psychosocial factors (i.e., attitude, perceived health damage, social norms, stigma, acceptance, self-evaluative emotions, health worries, expected social support). We examined identification with smoking (i.e., smoker self-identity) and quitting (i.e., quitter self-identity) among a large sample of smokers (n = 742) and ex-smokers (n = 201) in a cohort study with yearly measurements between 2009 and 2014. Latent growth curve modeling was used as an advanced statistical technique. As hypothesized, smokers perceived themselves more as smokers and less as quitters than do ex-smokers, and identification with smoking increased over time among smokers and decreased among ex-smokers. Furthermore, psychosocial factors predicted baseline identity and identity development. Socioeconomic status (SES) was particularly important. Specifically, lower SES smokers and lower SES ex-smokers identified more strongly with smoking, and smoker and quitter identities were more resistant to change among lower SES groups. Moreover, stronger proquitting social norms were associated with increasing quitter identities over time among smokers and ex-smokers and with decreasing smoker identities among ex-smokers. Predictors of identity differed between smokers and ex-smokers. Results suggest that SES and proquitting social norms should be taken into account when developing ways to facilitate identity change and, thereby, successful smoking cessation. (PsycINFO Database Record (c) 2017 APA, all rights reserved).
Wang, Yanli; Bryant, Stephen H.; Cheng, Tiejun; Wang, Jiyao; Gindulyte, Asta; Shoemaker, Benjamin A.; Thiessen, Paul A.; He, Siqian; Zhang, Jian
2017-01-01
PubChem's BioAssay database (https://pubchem.ncbi.nlm.nih.gov) has served as a public repository for small-molecule and RNAi screening data since 2004, providing open access to its data content to the community. PubChem accepts data submissions from researchers worldwide in academia, industry and government agencies. PubChem also collaborates with other chemical biology database stakeholders through data exchange. With over a decade of development effort, it has become an important information resource supporting drug discovery and chemical biology research. To facilitate data discovery, PubChem is integrated with all other databases at NCBI. In this work, we provide an update for the PubChem BioAssay database describing several recent developments, including added sources of research data, a redesigned BioAssay record page, a new BioAssay classification browser and new features in the Upload system facilitating data sharing. PMID:27899599
Goldberg, M; Carton, M; Gourmelen, J; Genreau, M; Montourcy, M; Le Got, S; Zins, M
2016-09-01
In France, the national health database (SNIIRAM) is an administrative health database that collects data on hospitalizations and healthcare consumption for more than 60 million people. Although it does not record behavioral and environmental data, these data have a major interest for epidemiology, surveillance and public health. One of the most interesting uses of SNIIRAM is its linkage with surveys collecting data directly from persons. Access to the SNIIRAM data is currently relatively limited, but in the near future changes in regulations will largely facilitate open access. However, it is a huge and complex database and there are some important methodological and technical difficulties for using it due to its volume and architecture. We are developing tools for facilitating the linkage of the Gazel and Constances cohorts to the SNIIRAM: interactive documentation on the SNIIRAM database, software for the verification of the completeness and validity of the data received from the SNIIRAM, methods for constructing indicators from the raw data in order to flag the presence of certain events (specific diagnosis, procedure, drug…), standard queries for producing a set of variables on a specific area (drugs, diagnoses during a hospital stay…). Moreover, the REDSIAM network recently set up aims to develop, evaluate and make available algorithms to identify pathologies in SNIIRAM. In order to fully benefit from the exceptional potential of the SNIIRAM database, it is essential to develop tools to facilitate its use. Copyright © 2016 Elsevier Masson SAS. All rights reserved.
Rhinoplasty perioperative database using a personal digital assistant.
Kotler, Howard S
2004-01-01
To construct a reliable, accurate, and easy-to-use handheld computer database that facilitates the point-of-care acquisition of perioperative text and image data specific to rhinoplasty. A user-modified database (Pendragon Forms [v.3.2]; Pendragon Software Corporation, Libertyville, Ill) and graphic image program (Tealpaint [v.4.87]; Tealpaint Software, San Rafael, Calif) were used to capture text and image data, respectively, on a Palm OS (v.4.11) handheld operating with 8 megabytes of memory. The handheld and desktop databases were maintained secure using PDASecure (v.2.0) and GoldSecure (v.3.0) (Trust Digital LLC, Fairfax, Va). The handheld data were then uploaded to a desktop database of either FileMaker Pro 5.0 (v.1) (FileMaker Inc, Santa Clara, Calif) or Microsoft Access 2000 (Microsoft Corp, Redmond, Wash). Patient data were collected from 15 patients undergoing rhinoplasty in a private practice outpatient ambulatory setting. Data integrity was assessed after 6 months' disk and hard drive storage. The handheld database was able to facilitate data collection and accurately record, transfer, and reliably maintain perioperative rhinoplasty data. Query capability allowed rapid search using a multitude of keyword search terms specific to the operative maneuvers performed in rhinoplasty. Handheld computer technology provides a method of reliably recording and storing perioperative rhinoplasty information. The handheld computer facilitates the reliable and accurate storage and query of perioperative data, assisting the retrospective review of one's own results and enhancement of surgical skills.
Validation of a common data model for active safety surveillance research
Ryan, Patrick B; Reich, Christian G; Hartzema, Abraham G; Stang, Paul E
2011-01-01
Objective: Systematic analysis of observational medical databases for active safety surveillance is hindered by the variation in data models and coding systems. Data analysts often find robust clinical data models difficult to understand and ill suited to support their analytic approaches. Further, some models do not facilitate the computations required for systematic analysis across many interventions and outcomes for large datasets. Translating the data from these idiosyncratic data models to a common data model (CDM) could facilitate both the analysts' understanding and the suitability for large-scale systematic analysis. In addition to facilitating analysis, a suitable CDM has to faithfully represent the source observational database. Before beginning to use the Observational Medical Outcomes Partnership (OMOP) CDM and a related dictionary of standardized terminologies for a study of large-scale systematic active safety surveillance, the authors validated the model's suitability for this use by example. Validation by example: To validate the OMOP CDM, the model was instantiated into a relational database, data from 10 different observational healthcare databases were loaded into separate instances, a comprehensive array of analytic methods that operate on the data model was created, and these methods were executed against the databases to measure performance. Conclusion: There was acceptable representation of the data from 10 observational databases in the OMOP CDM using the standardized terminologies selected, and a range of analytic methods was developed and executed with sufficient performance to be useful for active safety surveillance. PMID:22037893
Sharma, Rita; Cao, Peijian; Jung, Ki-Hong; Sharma, Manoj K.; Ronald, Pamela C.
2013-01-01
Glycoside hydrolases (GH) catalyze the hydrolysis of glycosidic bonds in cell wall polymers and can have major effects on cell wall architecture. Taking advantage of the massive datasets available in public databases, we have constructed a rice phylogenomic database of GHs (http://ricephylogenomics.ucdavis.edu/cellwalls/gh/). This database integrates multiple data types including the structural features, orthologous relationships, mutant availability, and gene expression patterns for each GH family in a phylogenomic context. The rice genome encodes 437 GH genes classified into 34 families. Based on pairwise comparison with eight dicot and four monocot genomes, we identified 138 GH genes that are highly diverged between monocots and dicots, 57 of which have diverged further in rice as compared with four monocot genomes scanned in this study. Chromosomal localization and expression analysis suggest a role for both whole-genome and localized gene duplications in expansion and diversification of GH families in rice. We examined the meta-profiles of expression patterns of GH genes in twenty different anatomical tissues of rice. Transcripts of 51 genes exhibit tissue or developmental stage-preferential expression, whereas, seventeen other genes preferentially accumulate in actively growing tissues. When queried in RiceNet, a probabilistic functional gene network that facilitates functional gene predictions, nine out of seventeen genes form a regulatory network with the well-characterized genes involved in biosynthesis of cell wall polymers including cellulose synthase and cellulose synthase-like genes of rice. Two-thirds of the GH genes in rice are up regulated in response to biotic and abiotic stress treatments indicating a role in stress adaptation. Our analyses identify potential GH targets for cell wall modification. PMID:23986771
HMDB 4.0: the human metabolome database for 2018.
Wishart, David S; Feunang, Yannick Djoumbou; Marcu, Ana; Guo, An Chi; Liang, Kevin; Vázquez-Fresno, Rosa; Sajed, Tanvir; Johnson, Daniel; Li, Carin; Karu, Naama; Sayeeda, Zinat; Lo, Elvis; Assempour, Nazanin; Berjanskii, Mark; Singhal, Sandeep; Arndt, David; Liang, Yonjie; Badran, Hasan; Grant, Jason; Serra-Cayuela, Arnau; Liu, Yifeng; Mandal, Rupa; Neveu, Vanessa; Pon, Allison; Knox, Craig; Wilson, Michael; Manach, Claudine; Scalbert, Augustin
2018-01-04
The Human Metabolome Database or HMDB (www.hmdb.ca) is a web-enabled metabolomic database containing comprehensive information about human metabolites along with their biological roles, physiological concentrations, disease associations, chemical reactions, metabolic pathways, and reference spectra. First described in 2007, the HMDB is now considered the standard metabolomic resource for human metabolic studies. Over the past decade the HMDB has continued to grow and evolve in response to emerging needs for metabolomics researchers and continuing changes in web standards. This year's update, HMDB 4.0, represents the most significant upgrade to the database in its history. For instance, the number of fully annotated metabolites has increased by nearly threefold, the number of experimental spectra has grown by almost fourfold and the number of illustrated metabolic pathways has grown by a factor of almost 60. Significant improvements have also been made to the HMDB's chemical taxonomy, chemical ontology, spectral viewing, and spectral/text searching tools. A great deal of brand new data has also been added to HMDB 4.0. This includes large quantities of predicted MS/MS and GC-MS reference spectral data as well as predicted (physiologically feasible) metabolite structures to facilitate novel metabolite identification. Additional information on metabolite-SNP interactions and the influence of drugs on metabolite levels (pharmacometabolomics) has also been added. Many other important improvements in the content, the interface, and the performance of the HMDB website have been made and these should greatly enhance its ease of use and its potential applications in nutrition, biochemistry, clinical chemistry, clinical genetics, medicine, and metabolomics science. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.
Positive facial affect facilitates the identification of famous faces.
Gallegos, Diana R; Tranel, Daniel
2005-06-01
Several convergent lines of evidence have suggested that the presence of an emotion signal in a visual stimulus can influence processing of that stimulus. In the current study, we picked up on this idea, and explored the hypothesis that the presence of an emotional facial expression (happiness) would facilitate the identification of familiar faces. We studied two groups of normal participants (overall N=54), and neurological patients with either left (n=8) or right (n=10) temporal lobectomies. Reaction times were measured while participants named familiar famous faces that had happy expressions or neutral expressions. In support of the hypothesis, naming was significantly faster for the happy faces, and this effect obtained in the normal participants and in both patient groups. In the patients with left temporal lobectomies, the effect size for this facilitation was large (d=0.87), suggesting that this manipulation might have practical implications for helping such patients compensate for the types of naming defects that often accompany their brain damage. Consistent with other recent work, our findings indicate that emotion can facilitate visual identification, perhaps via a modulatory influence of the amygdala on extrastriate cortex.
Visualization and Enabling Science at PO.DAAC
NASA Astrophysics Data System (ADS)
Tauer, E.; To, C.
2017-12-01
Facilitating the identification of appropriate data for scientific inquiry is important for efficient progress, but mechanisms for that identification vary, as does the effectiveness of those mechanisms. Appropriately crafted visualizations provide the means to quickly assess science data and scientific features, but providing the right visualization to the right application can present challenges. Even greater is the challenge of generating and/or re-constituting visualizations on the fly, particularly for large datasets. One avenue to mitigate the challenge is to arrive at an optimized intermediate data format that is tuned for rapid processing without sacrificing the provenance trace back to the original source data. This presentation will discuss the results of trading several current approaches towards an intermediate data format, and suggest a list of key attributes that will facilitate rapid visualization, and in the process, facilitate the identification of the right data for a given application.
50 CFR 660.150 - Mothership (MS) Coop Program.
Code of Federal Regulations, 2012 CFR
2012-10-01
... record in the NMFS permit database. The application will contain the basis of NMFS' calculation. The... registration as listed in the NMFS permit database, or in the identification of the mothership owner or...
50 CFR 660.150 - Mothership (MS) Coop Program.
Code of Federal Regulations, 2013 CFR
2013-10-01
... record in the NMFS permit database. The application will contain the basis of NMFS' calculation. The... registration as listed in the NMFS permit database, or in the identification of the mothership owner or...
FlavonoidSearch: A system for comprehensive flavonoid annotation by mass spectrometry.
Akimoto, Nayumi; Ara, Takeshi; Nakajima, Daisuke; Suda, Kunihiro; Ikeda, Chiaki; Takahashi, Shingo; Muneto, Reiko; Yamada, Manabu; Suzuki, Hideyuki; Shibata, Daisuke; Sakurai, Nozomu
2017-04-28
Currently, in mass spectrometry-based metabolomics, limited reference mass spectra are available for flavonoid identification. In the present study, a database of probable mass fragments for 6,867 known flavonoids (FsDatabase) was manually constructed based on new structure- and fragmentation-related rules using new heuristics to overcome flavonoid complexity. We developed the FlavonoidSearch system for flavonoid annotation, which consists of the FsDatabase and a computational tool (FsTool) to automatically search the FsDatabase using the mass spectra of metabolite peaks as queries. This system showed the highest identification accuracy for the flavonoid aglycone when compared to existing tools and revealed accurate discrimination between the flavonoid aglycone and other compounds. Sixteen new flavonoids were found from parsley, and the diversity of the flavonoid aglycone among different fruits and vegetables was investigated.
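Searching a fragment database with the peaks of a mass spectrum as the query can be sketched as below. This is only a generic peak-matching illustration and should not be read as the FsTool algorithm: the candidate names, fragment m/z values, tolerance, and scoring by matched-peak count are all assumptions.

```python
def count_matches(query_peaks, fragments, tol=0.01):
    """Count query peaks lying within `tol` (m/z units) of any stored fragment."""
    return sum(any(abs(peak - frag) <= tol for frag in fragments)
               for peak in query_peaks)


def rank_candidates(query_peaks, database, tol=0.01):
    """Rank candidates by matched-peak count, breaking ties by name."""
    scored = [(count_matches(query_peaks, frags, tol), name)
              for name, frags in database.items()]
    return sorted(scored, key=lambda pair: (-pair[0], pair[1]))


# Hypothetical fragment lists, invented for this example.
database = {
    "candidate_1": [153.02, 121.03, 287.06],
    "candidate_2": [153.02, 137.02, 271.06],
    "candidate_3": [165.05, 299.09],
}
query = [153.018, 137.024, 271.061, 99.5]

ranking = rank_candidates(query, database)
```

A real annotation tool would also weigh peak intensities and precursor mass; counting matches is simply the most compact way to show the query-against-database idea.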
Facilitating Collaboration, Knowledge Construction and Communication with Web-Enabled Databases.
ERIC Educational Resources Information Center
McNeil, Sara G.; Robin, Bernard R.
This paper presents an overview of World Wide Web-enabled databases that dynamically generate Web materials and focuses on the use of this technology to support collaboration, knowledge construction, and communication. Database applications have been used in classrooms to support learning activities for over a decade, but, although business and…
USDA-ARS's Scientific Manuscript database
Epidemiologic studies show inverse associations between flavonoid intake and chronic disease risk. However, a lack of comprehensive databases of the flavonoid content of foods has hindered efforts to fully characterize population intake. Using a newly released database of flavonoid values, we soug...
Kuhn, Stefan; Schlörer, Nils E
2015-08-01
nmrshiftdb2 supports with its laboratory information management system the integration of an electronic lab administration and management into academic NMR facilities. Also, it offers the setup of a local database, while full access to nmrshiftdb2's World Wide Web database is granted. This freely available system allows on the one hand the submission of orders for measurement, transfers recorded data automatically or manually, and enables download of spectra via web interface, as well as the integrated access to prediction, search, and assignment tools of the NMR database for lab users. On the other hand, for the staff and lab administration, flow of all orders can be supervised; administrative tools also include user and hardware management, a statistic functionality for accounting purposes, and a 'QuickCheck' function for assignment control, to facilitate quality control of assignments submitted to the (local) database. Laboratory information management system and database are based on a web interface as front end and are therefore independent of the operating system in use. Copyright © 2015 John Wiley & Sons, Ltd.
Barczyński, Marcin; Randolph, Gregory W; Cernea, Claudio R; Dralle, Henning; Dionigi, Gianlorenzo; Alesina, Piero F; Mihai, Radu; Finck, Camille; Lombardi, Davide; Hartl, Dana M; Miyauchi, Akira; Serpell, Jonathan; Snyder, Samuel; Volpi, Erivelto; Woodson, Gayle; Kraimps, Jean Louis; Hisham, Abdullah N
2013-09-01
Intraoperative neural monitoring (IONM) during thyroid surgery has gained widespread acceptance as an adjunct to the gold standard of visual identification of the recurrent laryngeal nerve (RLN). Contrary to routine dissection of the RLN, most surgeons tend to avoid rather than routinely expose and identify the external branch of the superior laryngeal nerve (EBSLN) during thyroidectomy or parathyroidectomy. IONM has the potential to be utilized for identification of the EBSLN and functional assessment of its integrity; therefore, IONM might contribute to voice preservation following thyroidectomy or parathyroidectomy. We reviewed the literature and the cumulative experience of the multidisciplinary International Neural Monitoring Study Group (INMSG) with IONM of the EBSLN. A systematic search of the MEDLINE database (from 1950 to the present) with predefined search terms (EBSLN, superior laryngeal nerve, stimulation, neuromonitoring, identification) was undertaken and supplemented by personal communication between members of the INMSG to identify relevant publications in the field. The hypothesis explored in this review is that the use of a standardized approach to the functional preservation of the EBSLN can be facilitated by application of IONM resulting in improved preservation of voice following thyroidectomy or parathyroidectomy. These guidelines are intended to improve the practice of neural monitoring of the EBSLN during thyroidectomy or parathyroidectomy and to optimize clinical utility of this technique based on available evidence and consensus of experts. 5 Copyright © 2013 The American Laryngological, Rhinological and Otological Society, Inc.
A database for coconut crop improvement.
Rajagopal, Velamoor; Manimekalai, Ramaswamy; Devakumar, Krishnamurthy; Rajesh; Karun, Anitha; Niral, Vittal; Gopal, Murali; Aziz, Shamina; Gunasekaran, Marimuthu; Kumar, Mundappurathe Ramesh; Chandrasekar, Arumugam
2005-12-08
Coconut crop improvement requires a number of biotechnology and bioinformatics tools. A database containing information on CG (coconut germplasm), CCI (coconut cultivar identification), CD (coconut disease), MIFSPC (microbial information systems in plantation crops) and VO (vegetable oils) is described. The database was developed using MySQL and PostgreSQL running on the Linux operating system. The database interface was developed in PHP, HTML and Java. The database is available at http://www.bioinfcpcri.org.
Databases as policy instruments. About extending networks as evidence-based policy.
de Bont, Antoinette; Stoevelaar, Herman; Bal, Roland
2007-12-07
This article seeks to identify the role of databases in health policy. Access to information and communication technologies has changed traditional relationships between the state and professionals, creating new systems of surveillance and control. As a result, databases may have a profound effect on controlling clinical practice. We conducted three case studies to reconstruct the development and use of databases as policy instruments. Each database was intended to be employed to control the use of one particular pharmaceutical in the Netherlands (growth hormone, antiretroviral drugs for HIV and Taxol, respectively). We studied the archives of the Dutch Health Insurance Board, conducted in-depth interviews with key informants and organized two focus groups, all focused on the use of databases both in policy circles and in clinical practice. Our results demonstrate that policy makers hardly used the databases, neither for cost control nor for quality assurance. Further analysis revealed that these databases facilitated self-regulation and quality assurance by (national) bodies of professionals, resulting in restrictive prescription behavior amongst physicians. The databases fulfill control functions that were formerly located within the policy realm. The databases facilitate collaboration between policy makers and physicians, since they enable quality assurance by professionals. Delegating regulatory authority downwards into a network of physicians who control the use of pharmaceuticals seems to be a good alternative for centralized control on the basis of monitoring data.
The NCBI BioCollections Database
Sharma, Shobha; Ciufo, Stacy; Starchenko, Elena; Darji, Dakshesh; Chlumsky, Larry; Karsch-Mizrachi, Ilene
2018-01-01
The rapidly growing set of GenBank submissions includes sequences that are derived from vouchered specimens. These are associated with culture collections, museums, herbaria and other natural history collections, both living and preserved. Correct identification of the specimens studied, along with a method to associate the sample with its institution, is critical to the outcome of related studies and analyses. The National Center for Biotechnology Information BioCollections Database was established to allow the association of specimen vouchers and related sequence records to their home institutions. This process also allows cross-linking from the home institution for quick identification of all records originating from each collection. Database URL: https://www.ncbi.nlm.nih.gov/biocollections PMID:29688360
2011-04-25
contract to assist the Afghan government in collecting and managing the biometric data for all of the ANSF. 5. The Electronic Payroll System (EPS... Identification cards numbers will be utilized as the common data fields for the various payroll, biometric, and personnel databases and systems. In addition to... data in MoI’s payroll, personnel, identification card/registration, and biometric databases and systems. 3. Take the following steps as part of all
Drainage identification analysis and mapping, phase 2.
DOT National Transportation Integrated Search
2017-01-01
Drainage Identification, Analysis and Mapping System (DIAMS) is a computerized database that captures and : stores relevant information associated with all aboveground and underground hydraulic structures belonging to : the New Jersey Department of T...
Yamamoto, Mikachi; Umeda, Yoshiko; Yo, Ayaka; Yamaura, Mariko; Makimura, Koichi
2014-02-01
Matrix-assisted laser desorption and ionization time-of-flight mass spectrometry (MALDI-TOF-MS) has been utilized for identification of various microorganisms. Malassezia species, including Malassezia restricta, which is associated with seborrheic dermatitis, has been difficult to identify by traditional means. This study was performed to develop a system for identification of Malassezia species with MALDI-TOF-MS and to investigate the incidence and variety of cutaneous Malassezia microbiota of 1-month-old infants using this technique. A Malassezia species-specific MALDI-TOF-MS database was developed from eight standard strains, and the availability of this system was assessed using 54 clinical strains isolated from the skin of 1-month-old infants. Clinical isolates were cultured initially on CHROMagar Malassezia growth medium, and the 28S ribosomal DNA (D1/D2) sequence was analyzed for confirmatory identification. Using this database, we detected and analyzed Malassezia species in 68% and 44% of infants with and without infantile seborrheic dermatitis, respectively. The results of MALDI-TOF-MS analysis were consistent with those of rDNA sequencing identification (100% accuracy rate). To our knowledge, this is the first report of a MALDI-TOF-MS database for major skin pathogenic Malassezia species. This system is an easy, rapid and reliable method for identification of Malassezia. © 2014 Japanese Dermatological Association.
Design considerations, architecture, and use of the Mini-Sentinel distributed data system.
Curtis, Lesley H; Weiner, Mark G; Boudreau, Denise M; Cooper, William O; Daniel, Gregory W; Nair, Vinit P; Raebel, Marsha A; Beaulieu, Nicolas U; Rosofsky, Robert; Woodworth, Tiffany S; Brown, Jeffrey S
2012-01-01
We describe the design, implementation, and use of a large, multiorganizational distributed database developed to support the Mini-Sentinel Pilot Program of the US Food and Drug Administration (FDA). As envisioned by the US FDA, this implementation will inform and facilitate the development of an active surveillance system for monitoring the safety of medical products (drugs, biologics, and devices) in the USA. A common data model was designed to address the priorities of the Mini-Sentinel Pilot and to leverage the experience and data of participating organizations and data partners. A review of existing common data models informed the process. Each participating organization designed a process to extract, transform, and load its source data, applying the common data model to create the Mini-Sentinel Distributed Database. Transformed data were characterized and evaluated using a series of programs developed centrally and executed locally by participating organizations. A secure communications portal was designed to facilitate queries of the Mini-Sentinel Distributed Database and transfer of confidential data, analytic tools were developed to facilitate rapid response to common questions, and distributed querying software was implemented to facilitate rapid querying of summary data. As of July 2011, information on 99,260,976 health plan members was included in the Mini-Sentinel Distributed Database. The database includes 316,009,067 person-years of observation time, with members contributing, on average, 27.0 months of observation time. All data partners have successfully executed distributed code and returned findings to the Mini-Sentinel Operations Center. This work demonstrates the feasibility of building a large, multiorganizational distributed data system in which organizations retain possession of their data that are used in an active surveillance system. Copyright © 2012 John Wiley & Sons, Ltd.
Early hazard identification of new chemicals is often difficult due to lack of data on the novel material for toxicity endpoints, including neurotoxicity. At present, there are no structure searchable neurotoxicity databases. A working group was formed to construct a database to...
A prototypic small molecule database for bronchoalveolar lavage-based metabolomics
NASA Astrophysics Data System (ADS)
Walmsley, Scott; Cruickshank-Quinn, Charmion; Quinn, Kevin; Zhang, Xing; Petrache, Irina; Bowler, Russell P.; Reisdorph, Richard; Reisdorph, Nichole
2018-04-01
The analysis of bronchoalveolar lavage fluid (BALF) using mass spectrometry-based metabolomics can provide insight into lung diseases, such as asthma. However, the important step of compound identification is hindered by the lack of a small molecule database that is specific for BALF. Here we describe prototypic, small molecule databases derived from human BALF samples (n=117). Human BALF was extracted into lipid and aqueous fractions and analyzed using liquid chromatography mass spectrometry. Following filtering to reduce contaminants and artifacts, the resulting BALF databases (BALF-DBs) contain 11,736 lipid and 658 aqueous compounds. Over 10% of these were found in 100% of samples. Testing the BALF-DBs using nested test sets produced a 99% match rate for lipids and 47% match rate for aqueous molecules. Searching an independent dataset resulted in 45% matching to the lipid BALF-DB compared to <25% when general databases are searched. The BALF-DBs are available for download from MetaboLights. Overall, the BALF-DBs can reduce false positives and improve confidence in compound identification compared to when general databases are used.
Padliya, Neerav D; Garrett, Wesley M; Campbell, Kimberly B; Tabb, David L; Cooper, Bret
2007-11-01
LC-MS/MS has demonstrated potential for detecting plant pathogens. Unlike PCR or ELISA, LC-MS/MS does not require pathogen-specific reagents for the detection of pathogen-specific proteins and peptides. However, the MS/MS approach we and others have explored does require a protein sequence reference database and database-search software to interpret tandem mass spectra. To evaluate the limitations of database composition on pathogen identification, we analyzed proteins from cultured Ustilago maydis, Phytophthora sojae, Fusarium graminearum, and Rhizoctonia solani by LC-MS/MS. When the search database did not contain sequences for a target pathogen, or contained sequences to related pathogens, target pathogen spectra were reliably matched to protein sequences from nontarget organisms, giving an illusion that proteins from nontarget organisms were identified. Our analysis demonstrates that when database-search software is used as part of the identification process, a paradox exists whereby additional sequences needed to detect a wide variety of possible organisms may lead to more cross-species protein matches and misidentification of pathogens.
OGRO: The Overview of functionally characterized Genes in Rice online database.
Yamamoto, Eiji; Yonemaru, Jun-Ichi; Yamamoto, Toshio; Yano, Masahiro
2012-12-01
The high-quality sequence information and rich bioinformatics tools available for rice have contributed to remarkable advances in functional genomics. To facilitate the application of gene function information to the study of natural variation in rice, we comprehensively searched for articles related to rice functional genomics and extracted information on functionally characterized genes. As of 31 March 2012, 702 functionally characterized genes were annotated. This number represents about 1.6% of the predicted loci in the Rice Annotation Project Database. The compiled gene information is organized to facilitate direct comparisons with quantitative trait locus (QTL) information in the Q-TARO database. Comparison of genomic locations between functionally characterized genes and the QTLs revealed that QTL clusters were often co-localized with high-density gene regions, and that the genes associated with the QTLs in these clusters were different genes, suggesting that these QTL clusters are likely to be explained by tightly linked but distinct genes. Information on the functionally characterized genes compiled during this study is now available in the Overview of Functionally Characterized Genes in Rice Online database (OGRO) on the Q-TARO website (http://qtaro.abr.affrc.go.jp/ogro). The database has two interfaces: a table containing gene information, and a genome viewer that allows users to compare the locations of QTLs and functionally characterized genes. OGRO on Q-TARO will facilitate a candidate-gene approach to identifying the genes responsible for QTLs. Because the QTL descriptions in Q-TARO contain information on agronomic traits, such comparisons will also facilitate the annotation of functionally characterized genes in terms of their effects on traits important for rice breeding. The increasing amount of information on rice gene function being generated from mutant panels and other types of studies will make the OGRO database even more valuable in the future.
Bashir, K; Blizard, B; Bosanquet, A; Bosanquet, N; Mann, A; Jenkins, R
2000-08-01
Facilitation uses personal contact between the facilitator and the professional to encourage good practice and better service organisation. The model has been applied to physical illness but not to psychiatric disorders. To determine if a non-specialist facilitator can improve the recognition, management, and outcome of psychiatric illness presenting to general practitioners (GPs). Six practices were visited over an 18-month period by a facilitator whose activities included providing guidelines and organising training initiatives. Six other practices acted as controls. Recognition (identification index of family doctors), management (psychotropic prescribing, psychological consultations with the GP, specialist mental health treatment, and the use of medical interventions and investigations), and patient outcome at four months were assessed before and after intervention. The mean identification index of facilitator GPs rose from 0.51 to 0.64 following intervention, while that of the control GPs fell from 0.67 to 0.59 (P = 0.046). The facilitator had no detectable effect on management or patient outcome. The facilitator improved recognition of psychiatric illness by GPs. Generic facilitators can be trained to take on a mental health role; however, the failure to achieve more fundamental changes in treatment and outcome implies that facilitator intervention requires development.
NASA Astrophysics Data System (ADS)
Agung, Muhammad Budi; Budiarsa, I. Made; Suwastika, I. Nengah
2017-02-01
Cocoa beans are one of Indonesia's main export commodities, but yields are still being degraded by pathogen and disease attack. Developing robust cacao plants that are genetically resistant to pathogens and disease is an ideal solution to this problem. The aim of this study was to identify, through in silico analysis, Theobroma cacao genes in the cacao genome database that are homologous to genes involved in the response to pathogen and disease attack in other plants. A basic information survey and gene identification were performed in GenBank and The Arabidopsis Information Resource database. The in silico analysis comprised protein BLAST, a homology test of each gene's protein candidates, and identification of homologous genes in the Cacao Genome Database using the "Theobroma cacao cv. Matina 1-6 v1.1" genome as the data source. The analysis found that the Thecc1EG011959t1 (EDS1), Thecc1EG006803t1 (EDS5), Thecc1EG013842t1 (ICS1), and Thecc1EG015614t1 (BG_PPAP) genes of the Cacao Genome Database are Theobroma cacao genes homologous to plant resistance genes, and are highly likely to have functions similar to those of their homologues.
Biological agents database in the armed forces.
Niemcewicz, Marcin; Kocik, Janusz; Bielecka, Anna; Wierciński, Michał
2014-10-01
Rapid detection and identification of the biological agent during both natural and deliberate outbreaks is crucial for implementing appropriate control measures and procedures to mitigate the spread of disease. Determination of pathogen etiology may not only support epidemiological investigation and the safety of human beings, but also enhance forensic efforts in pathogen tracing, collection of evidence and correct inference. The article presents the objectives of the Biological Agents Database, which was developed for the Ministry of National Defense of the Republic of Poland under the European Defence Agency framework. The Biological Agents Database is an electronic catalogue of genetic markers of highly dangerous pathogens and biological agents of weapons-of-mass-destruction concern, which provides full identification of biological threats emerging in Poland and in locations where Polish troops are active. The Biological Agents Database is a supportive tool used for tracing the origin of biological agents as well as for rapid identification of an agent causing a disease of unknown etiology. It also provides support in diagnosis, analysis, response and exchange of information between the institutions that use the information contained in it. Therefore, it can be used not only for military purposes, but also in a civilian environment.
Rai, Sauharda; Gurung, Dristy; Kaiser, Bonnie N; Sikkema, Kathleen J; Dhakal, Manoj; Bhardwaj, Anvita; Tergesen, Cori; Kohrt, Brandon A
2018-06-01
Service users' involvement as cofacilitators of mental health trainings is a nascent endeavor in low- and middle-income countries, and the role of families on service user participation in trainings has received limited attention. This study examined how caregivers perceive and facilitate service user's involvement in an antistigma program that was added to mental health Gap Action Program (mhGAP) trainings for primary care workers in Nepal. Service users were trained as cofacilitators for antistigma and mhGAP trainings delivered to primary care workers through the REducing Stigma among HealthcAre ProvidErs (RESHAPE) program. Key informant interviews (n = 17) were conducted with caregivers and service users in RESHAPE. Five themes emerged: (a) Caregivers' perceived benefits of service user involvement included reduced caregiver burden, learning new skills, and opportunities to develop support groups. (b) Caregivers' fear of worsening stigma impeded RESHAPE participation. (c) Lack of trust between caregivers and service users jeopardized participation, but it could be mitigated through family engagement with health workers. (d) Orientation provided to caregivers regarding RESHAPE needed greater attention, and when information was provided, it contributed to stigma reduction in families. (e) Time management impacted caregivers' ability to facilitate service user participation. Engagement with families allows for greater identification of motivational factors and barriers impacting optimal program performance. Caregiver involvement in all program elements should be considered best practice for service user-facilitated antistigma initiatives, and service users reluctant to include caregivers should be provided with health staff support to address barriers to including family. (PsycINFO Database Record (c) 2018 APA, all rights reserved).
Gleason, Robert A.; Tangen, Brian A.; Laubhan, Murray K.; Finocchiaro, Raymond G.; Stamm, John F.
2009-01-01
Long-term accumulation of salts in wetlands at Bowdoin National Wildlife Refuge (NWR), Mont., has raised concern among wetland managers that increasing salinity may threaten plant and invertebrate communities that provide important habitat and food resources for migratory waterfowl. Currently, the U.S. Fish and Wildlife Service (USFWS) is evaluating various water management strategies to help maintain suitable ranges of salinity to sustain plant and invertebrate resources of importance to wildlife. To support this evaluation, the USFWS requested that the U.S. Geological Survey (USGS) provide information on salinity ranges of water and soil for common plants and invertebrates on Bowdoin NWR lands. To address this need, we conducted a search of the literature on occurrences of plants and invertebrates in relation to salinity and pH of the water and soil. The compiled literature was used to (1) provide a general overview of salinity concepts, (2) document published tolerances and adaptations of biota to salinity, (3) develop databases that the USFWS can use to summarize the range of reported salinity values associated with plant and invertebrate taxa, and (4) perform database summaries that describe reported salinity ranges associated with plants and invertebrates at Bowdoin NWR. The purpose of this report is to synthesize information to facilitate a better understanding of the ecological relations between salinity and flora and fauna when developing wetland management strategies. A primary focus of this report is to provide information to help evaluate and address salinity issues at Bowdoin NWR; however, the accompanying databases, as well as concepts and information discussed, are applicable to other areas or refuges. The accompanying databases include salinity values reported for 411 plant taxa and 330 invertebrate taxa. 
The databases are available in Microsoft Excel version 2007 (http://pubs.usgs.gov/sir/2009/5098/downloads/databases_21april2009.xls) and contain 27 data fields that include variables such as taxonomic identification, values for salinity and pH, wetland classification, location of study, and source of data. The databases are not exhaustive of the literature and are biased toward wetland habitats located in the glaciated North-Central United States; however, the databases do encompass a diversity of biota commonly found in brackish and freshwater inland wetland habitats.
Raebel, Marsha A; Schmittdiel, Julie; Karter, Andrew J; Konieczny, Jennifer L; Steiner, John F
2013-08-01
To propose a unifying set of definitions for prescription adherence research utilizing electronic health record prescribing databases, prescription dispensing databases, and pharmacy claims databases and to provide a conceptual framework to operationalize these definitions consistently across studies. We reviewed recent literature to identify definitions in electronic database studies of prescription-filling patterns for chronic oral medications. We then develop a conceptual model and propose standardized terminology and definitions to describe prescription-filling behavior from electronic databases. The conceptual model we propose defines 2 separate constructs: medication adherence and persistence. We define primary and secondary adherence as distinct subtypes of adherence. Metrics for estimating secondary adherence are discussed and critiqued, including a newer metric (New Prescription Medication Gap measure) that enables estimation of both primary and secondary adherence. Terminology currently used in prescription adherence research employing electronic databases lacks consistency. We propose a clear, consistent, broadly applicable conceptual model and terminology for such studies. The model and definitions facilitate research utilizing electronic medication prescribing, dispensing, and/or claims databases and encompasses the entire continuum of prescription-filling behavior. Employing conceptually clear and consistent terminology to define medication adherence and persistence will facilitate future comparative effectiveness research and meta-analytic studies that utilize electronic prescription and dispensing records.
Chatzigeorgiou, Kalliopi-Stavroula; Sergentanis, Theodoros N.; Tsiodras, Sotirios; Hamodrakas, Stavros J.; Bagos, Pantelis G.
2011-01-01
Phoenix 100 and Vitek 2 (operating with the current colorimetric cards) are commonly used in hospital laboratories for rapid identification of microorganisms. The present meta-analysis aims to evaluate and compare their performance on Gram-positive and Gram-negative bacteria. The MEDLINE database was searched up to October 2010 for the retrieval of relevant articles. Pooled correct identification rates were derived from random-effects models, using the arcsine transformation. Separate analyses were conducted at the genus and species levels; subanalyses and meta-regression were undertaken to reveal meaningful system- and study-related modifiers. A total of 29 (6,635 isolates) and 19 (4,363 isolates) articles were eligible for Phoenix and colorimetric Vitek 2, respectively. No significant differences were observed between Phoenix and Vitek 2 either at the genus (97.70% versus 97.59%, P = 0.919) or the species (92.51% versus 88.77%, P = 0.149) level. Studies conducted with conventional comparator methods tended to report significantly better results compared to those using molecular reference techniques. Speciation of Staphylococcus aureus was significantly more accurate in comparison to coagulase-negative staphylococci by both Phoenix (99.78% versus 88.42%, P < 0.00001) and Vitek 2 (98.22% versus 91.89%, P = 0.043). Vitek 2 also reached higher correct identification rates for Gram-negative fermenters versus nonfermenters at the genus (99.60% versus 95.90%, P = 0.004) and the species (97.42% versus 84.85%, P = 0.003) level. In conclusion, the accuracy of both systems seems modified by underlying sample- and comparator method-related parameters. Future simultaneous assessment of the instruments against molecular comparator procedures may facilitate interpretation of the current observations. PMID:21752980
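The pooling step mentioned in the abstract above (random-effects models on arcsine-transformed rates) can be sketched as follows. This is only an illustration: it assumes the plain arcsine-square-root transform with variance 1/(4n) and a DerSimonian-Laird estimate of between-study variance. The abstract does not say which variant or software the authors used, and the function name `pool_arcsine` is hypothetical.

```python
import math


def pool_arcsine(events, totals):
    """Pool proportions via an arcsine transform and random-effects weights.

    events -- number of correct identifications per study
    totals -- number of isolates per study
    Assumption: DerSimonian-Laird between-study variance (not stated in the source).
    """
    # Transform each proportion: y = asin(sqrt(p)), with variance about 1 / (4 n).
    y = [math.asin(math.sqrt(e / n)) for e, n in zip(events, totals)]
    v = [1.0 / (4.0 * n) for n in totals]

    # Fixed-effect (inverse-variance) estimate, needed for Cochran's Q.
    w = [1.0 / vi for vi in v]
    y_fixed = sum(wi * yi for wi, yi in zip(w, y)) / sum(w)
    q = sum(wi * (yi - y_fixed) ** 2 for wi, yi in zip(w, y))

    # DerSimonian-Laird between-study variance, truncated at zero.
    k = len(y)
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - (k - 1)) / c) if c > 0 else 0.0

    # Random-effects weights and pooled estimate on the transformed scale.
    w_re = [1.0 / (vi + tau2) for vi in v]
    y_re = sum(wi * yi for wi, yi in zip(w_re, y)) / sum(w_re)

    # Back-transform to a proportion.
    return math.sin(y_re) ** 2
```

Calling `pool_arcsine([95, 80], [100, 100])` with these made-up counts returns a pooled rate between 0.80 and 0.95; the transform keeps the back-transformed estimate inside the interval from 0 to 1 even when individual rates are close to 100%.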
Shteynberg, David; Mendoza, Luis; Hoopmann, Michael R.; Sun, Zhi; Schmidt, Frank; Deutsch, Eric W.; Moritz, Robert L.
2016-01-01
Most shotgun proteomics data analysis workflows are based on the assumption that each fragment ion spectrum is explained by a single species of peptide ion isolated by the mass spectrometer; however, in reality mass spectrometers often isolate more than one peptide ion within the window of isolation that contributes to additional peptide fragment peaks in many spectra. We present a new tool called reSpect, implemented in the Trans-Proteomic Pipeline (TPP), that enables an iterative workflow whereby fragment ion peaks explained by a peptide ion identified in one round of sequence searching or spectral library search are attenuated based on the confidence of the identification, and then the altered spectrum is subjected to further rounds of searching. The reSpect tool is not implemented as a search engine, but rather as a post search engine processing step where only fragment ion intensities are altered. This enables the application of any search engine combination in the following iterations. Thus, reSpect is compatible with all other protein sequence database search engines as well as peptide spectral library search engines that are supported by the TPP. We show that while some datasets are highly amenable to chimeric spectrum identification and lead to additional peptide identification boosts of over 30% with as many as four different peptide ions identified per spectrum, datasets with narrow precursor ion selection only benefit from such processing at the level of a few percent. We demonstrate a technique that facilitates the determination of the degree to which a dataset would benefit from chimeric spectrum analysis. The reSpect tool is free and open source, provided within the TPP and available at the TPP website. PMID:26419769
NASA Astrophysics Data System (ADS)
Shteynberg, David; Mendoza, Luis; Hoopmann, Michael R.; Sun, Zhi; Schmidt, Frank; Deutsch, Eric W.; Moritz, Robert L.
2015-11-01
Most shotgun proteomics data analysis workflows are based on the assumption that each fragment ion spectrum is explained by a single species of peptide ion isolated by the mass spectrometer; however, in reality mass spectrometers often isolate more than one peptide ion within the window of isolation that contribute to additional peptide fragment peaks in many spectra. We present a new tool called reSpect, implemented in the Trans-Proteomic Pipeline (TPP), which enables an iterative workflow whereby fragment ion peaks explained by a peptide ion identified in one round of sequence searching or spectral library search are attenuated based on the confidence of the identification, and then the altered spectrum is subjected to further rounds of searching. The reSpect tool is not implemented as a search engine, but rather as a post-search engine processing step where only fragment ion intensities are altered. This enables the application of any search engine combination in the iterations that follow. Thus, reSpect is compatible with all other protein sequence database search engines as well as peptide spectral library search engines that are supported by the TPP. We show that while some datasets are highly amenable to chimeric spectrum identification and lead to additional peptide identification boosts of over 30% with as many as four different peptide ions identified per spectrum, datasets with narrow precursor ion selection only benefit from such processing at the level of a few percent. We demonstrate a technique that facilitates the determination of the degree to which a dataset would benefit from chimeric spectrum analysis. The reSpect tool is free and open source, provided within the TPP and available at the TPP website.
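The attenuation step described above can be illustrated with a short sketch. It assumes, purely for illustration, that a matched peak is scaled by (1 - confidence) and that peaks are matched within a fixed m/z tolerance; the abstract does not give the actual reSpect formula, and `attenuate_peaks` and `tol` are hypothetical names.

```python
def attenuate_peaks(spectrum, explained_mz, confidence, tol=0.02):
    """Return a copy of `spectrum` with already-explained peaks scaled down.

    spectrum     -- list of (mz, intensity) tuples
    explained_mz -- fragment m/z values predicted for the identified peptide
    confidence   -- identification probability in [0, 1]
    tol          -- hypothetical absolute m/z matching tolerance
    """
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence must be between 0 and 1")

    # Assumed rule: the more confident the identification, the more signal is removed.
    scale = 1.0 - confidence

    result = []
    for mz, intensity in spectrum:
        matched = any(abs(mz - ref) <= tol for ref in explained_mz)
        # Only intensities change; m/z values and unmatched peaks are untouched.
        result.append((mz, intensity * scale if matched else intensity))
    return result
```

The returned spectrum could then be passed to any search engine for another round, which mirrors the abstract's point that only fragment ion intensities are altered between iterations.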
A Community Data Model for Hydrologic Observations
NASA Astrophysics Data System (ADS)
Tarboton, D. G.; Horsburgh, J. S.; Zaslavsky, I.; Maidment, D. R.; Valentine, D.; Jennings, B.
2006-12-01
The CUAHSI Hydrologic Information System project is developing information technology infrastructure to support hydrologic science. Hydrologic information science involves the description of hydrologic environments in a consistent way, using data models for information integration. This includes a hydrologic observations data model for the storage and retrieval of hydrologic observations in a relational database designed to facilitate data retrieval for integrated analysis of information collected by multiple investigators. It is intended to provide a standard format to facilitate the effective sharing of information between investigators and to facilitate analysis of information within a single study area or hydrologic observatory, or across hydrologic observatories and regions. The observations data model is designed to store hydrologic observations and sufficient ancillary information (metadata) about the observations to allow them to be unambiguously interpreted and used and provide traceable heritage from raw measurements to usable information. The design is based on the premise that a relational database at the single observation level is most effective for providing querying capability and cross dimension data retrieval and analysis. This premise is being tested through the implementation of a prototype hydrologic observations database, and the development of web services for the retrieval of data from and ingestion of data into the database. These web services hosted by the San Diego Supercomputer Center make data in the database accessible both through a Hydrologic Data Access System portal and directly from applications software such as Excel, Matlab and ArcGIS that have Simple Object Access Protocol (SOAP) capability. This paper will (1) describe the data model; (2) demonstrate the capability for representing diverse data in the same database; and (3) demonstrate the use of the database from applications software for the performance of hydrologic analysis across different observation types.
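The idea of storing data at the single-observation level, with metadata in related tables, can be sketched with an in-memory SQLite database. The table and column names below are hypothetical simplifications and are not the actual CUAHSI schema; the inserted rows are made-up demo values.

```python
import sqlite3

# Hypothetical, simplified schema: one row per observation, metadata in side tables.
SCHEMA = """
CREATE TABLE sites (
    site_id   INTEGER PRIMARY KEY,
    name      TEXT NOT NULL,
    latitude  REAL,
    longitude REAL
);
CREATE TABLE variables (
    variable_id INTEGER PRIMARY KEY,
    name        TEXT NOT NULL,
    units       TEXT NOT NULL
);
CREATE TABLE observations (
    observation_id INTEGER PRIMARY KEY,
    site_id        INTEGER NOT NULL REFERENCES sites(site_id),
    variable_id    INTEGER NOT NULL REFERENCES variables(variable_id),
    observed_at    TEXT NOT NULL,
    value          REAL NOT NULL,
    source         TEXT
);
"""


def build_demo_db():
    """Create an in-memory database holding a few made-up observations."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.execute("INSERT INTO sites VALUES (1, 'Demo gauge', 41.7, -111.8)")
    conn.executemany(
        "INSERT INTO variables VALUES (?, ?, ?)",
        [(1, "streamflow", "m3/s"), (2, "water temperature", "degC")],
    )
    conn.executemany(
        "INSERT INTO observations "
        "(site_id, variable_id, observed_at, value, source) VALUES (?, ?, ?, ?, ?)",
        [
            (1, 1, "2006-01-01T00:00", 3.2, "investigator A"),
            (1, 2, "2006-01-01T00:00", 4.5, "investigator B"),
            (1, 1, "2006-01-01T01:00", 3.4, "investigator A"),
        ],
    )
    return conn


def values_at_site(conn, site_id):
    """Return (variable, units, time, value) rows across all variables at a site."""
    return conn.execute(
        """SELECT v.name, v.units, o.observed_at, o.value
           FROM observations o JOIN variables v USING (variable_id)
           WHERE o.site_id = ?
           ORDER BY o.observed_at, v.name""",
        (site_id,),
    ).fetchall()
```

Because each observation is its own row, a single query can pull different variables from different investigators at one site, which is the kind of cross-dimension retrieval the abstract describes.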
Analysis of high accuracy, quantitative proteomics data in the MaxQB database.
Schaab, Christoph; Geiger, Tamar; Stoehr, Gabriele; Cox, Juergen; Mann, Matthias
2012-03-01
MS-based proteomics generates rapidly increasing amounts of precise and quantitative information. Analysis of individual proteomic experiments has made great strides, but the crucial ability to compare and store information across different proteome measurements still presents many challenges. For example, it has been difficult to avoid contamination of databases with low quality peptide identifications, to control for the inflation in false positive identifications when combining data sets, and to integrate quantitative data. Although, for example, the contamination with low quality identifications has been addressed by joint analysis of deposited raw data in some public repositories, we reasoned that there should be a role for a database specifically designed for high resolution and quantitative data. Here we describe a novel database termed MaxQB that stores and displays collections of large proteomics projects and allows joint analysis and comparison. We demonstrate the analysis tools of MaxQB using proteome data of 11 different human cell lines and 28 mouse tissues. The database-wide false discovery rate is controlled by adjusting the project specific cutoff scores for the combined data sets. The 11 cell line proteomes together identify proteins expressed from more than half of all human genes. For each protein of interest, expression levels estimated by label-free quantification can be visualized across the cell lines. Similarly, the expression rank order and estimated amount of each protein within each proteome are plotted. We used MaxQB to calculate the signal reproducibility of the detected peptides for the same proteins across different proteomes. Spearman rank correlation between peptide intensity and detection probability of identified proteins was greater than 0.8 for 64% of the proteome, whereas a minority of proteins have negative correlation. 
This information can be used to pinpoint false protein identifications, independently of peptide database scores. The information contained in MaxQB, including high resolution fragment spectra, is accessible to the community via a user-friendly web interface at http://www.biochem.mpg.de/maxqb.
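The Spearman rank correlation used above to compare peptide intensity with detection probability can be computed as the Pearson correlation of ranks. The sketch below is a generic, dependency-free implementation and is not taken from MaxQB; the function names are hypothetical.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        # Extend j over the run of values tied with position i.
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the rank vectors."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    if sx == 0 or sy == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (sx * sy)
```

Under this reading, a protein whose peptides give a coefficient above 0.8 behaves as expected, while a negative coefficient would flag the identification for closer inspection.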
Environmental Chemistry Compound Identification Using High ...
There is a growing need for rapid chemical screening and prioritization to inform regulatory decision-making on thousands of chemicals in the environment. We have previously used high-resolution mass spectrometry to examine household vacuum dust samples using liquid chromatography time-of-flight mass spectrometry (LC-TOF/MS). Using a combination of exact mass, isotope distribution, and isotope spacing, molecular features were matched with a list of chemical formulas from the EPA’s Distributed Structure-Searchable Toxicity (DSSTox) database. This has further developed our understanding of how openly available chemical databases, together with the appropriate searches, could be used for the purpose of compound identification. We report here on the utility of the EPA’s iCSS Chemistry Dashboard for the purpose of compound identification using searches against a database of over 720,000 chemicals. We also examine the benefits of QSAR prediction for the purpose of retention time prediction to allow for alignment of both chromatographic and mass spectral properties. This abstract does not reflect U.S. EPA policy presentation at the Eastern Analytical Symposium.
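The exact-mass part of the matching described above can be sketched as a tolerance search over candidate formulas. This is a minimal illustration that ignores the isotope distribution and spacing criteria the abstract also mentions; the 5 ppm default and the function names are assumptions, and the candidate list would in practice come from a chemical database export.

```python
def ppm_error(observed, theoretical):
    """Signed mass error in parts per million."""
    return (observed - theoretical) / theoretical * 1e6


def match_by_mass(observed_mass, candidates, tol_ppm=5.0):
    """Return (formula, ppm_error) pairs within tolerance, best match first.

    candidates -- iterable of (formula, monoisotopic_mass) pairs
    tol_ppm    -- hypothetical tolerance; the appropriate value depends on the instrument
    """
    hits = []
    for formula, mass in candidates:
        err = ppm_error(observed_mass, mass)
        if abs(err) <= tol_ppm:
            hits.append((formula, err))
    # Smallest absolute error first.
    return sorted(hits, key=lambda hit: abs(hit[1]))
```

For example, an observed neutral mass of 194.0806 searched against caffeine (C8H10N4O2, 194.0804) and glucose (C6H12O6, 180.0634) returns only the caffeine formula, at roughly 1 ppm error.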
Code of Federal Regulations, 2010 CFR
2010-04-01
... USE DEVICES General Hospital and Personal Use Miscellaneous Devices § 880.6300 Implantable... identification code is used to access patient identity and corresponding health information stored in a database...
Rozo, Andrea Morales; Valencia, Fernando; Acosta, Alexis; Parra, Juan Luis
2014-01-01
The department of Antioquia, Colombia, lies in the northwestern corner of South America and provides a biogeographical link among divergent faunas, including Caribbean, Andean, Pacific and Amazonian. Information about the distribution of biodiversity in this area is of relevance for academic, practical and social purposes. This data paper describes the dataset containing all bird specimens deposited in the Colección de Ciencias Naturales del Museo Universitario de la Universidad de Antioquia (MUA). We curated all the information associated with the bird specimens, including the georeferences and taxonomy, and published the database through the Global Biodiversity Information Facility network. During this process we checked the species identification and existing georeferences and completed the information when possible. The collection holds 663 bird specimens collected between 1940 and 2011. Even though most specimens are from Antioquia (70%), the collection includes material from several other departments and one specimen from the United States. The collection holds specimens from three endemic and endangered species (Coeligena orina, Diglossa gloriosissima, and Hypopyrrhus pyrohypogaster), and includes localities poorly represented in other collections. The information contained in the collection has been used for biodiversity modeling, conservation planning and management, and we expect to further facilitate these activities by making it publicly available.
Kamatuka, Kenta; Hattori, Masahiro; Sugiyama, Tomoyasu
2016-12-01
RNA interference (RNAi) screening is extensively used in the field of reverse genetics. RNAi libraries constructed using random oligonucleotides have made this technology affordable. However, the new methodology requires exploration of the RNAi target gene information after screening because the RNAi library includes non-natural sequences that are not found in genes. Here, we developed a web-based tool to support RNAi screening. The system performs short hairpin RNA (shRNA) target prediction that is informed by comprehensive enquiry (SPICE). SPICE automates several tasks that are laborious but indispensable to evaluate the shRNAs obtained by RNAi screening. SPICE has four main functions: (i) sequence identification of shRNA in the input sequence (the sequence might be obtained by sequencing clones in the RNAi library), (ii) searching the target genes in the database, (iii) demonstrating biological information obtained from the database, and (iv) preparation of search result files that can be utilized in a local personal computer (PC). Using this system, we demonstrated that genes targeted by random oligonucleotide-derived shRNAs were not different from those targeted by organism-specific shRNA. The system facilitates RNAi screening, which requires sequence analysis after screening. The SPICE web application is available at http://www.spice.sugysun.org/.
SilkPathDB: a comprehensive resource for the study of silkworm pathogens.
Li, Tian; Pan, Guo-Qing; Vossbrinck, Charles R; Xu, Jin-Shan; Li, Chun-Feng; Chen, Jie; Long, Meng-Xian; Yang, Ming; Xu, Xiao-Fei; Xu, Chen; Debrunner-Vossbrinck, Bettina A; Zhou, Ze-Yang
2017-01-01
Silkworm pathogens have been heavily impeding the development of sericultural industry and play important roles in lepidopteran ecology, and some of which are used as biological insecticides. Rapid advances in studies on the omics of silkworm pathogens have produced a large amount of data, which need to be brought together centrally in a coherent and systematic manner. This will facilitate the reuse of these data for further analysis. We have collected genomic data for 86 silkworm pathogens from 4 taxa (fungi, microsporidia, bacteria and viruses) and from 4 lepidopteran hosts, and developed the open-access Silkworm Pathogen Database (SilkPathDB) to make this information readily available. The implementation of SilkPathDB involves integrating Drupal and GBrowse as a graphic interface for a Chado relational database which houses all of the datasets involved. The genomes have been assembled and annotated for comparative purposes and allow the search and analysis of homologous sequences, transposable elements, protein subcellular locations, including secreted proteins, and gene ontology. We believe that the SilkPathDB will aid researchers in the identification of silkworm parasites, understanding the mechanisms of silkworm infections, and the developmental ecology of silkworm parasites (gene expression) and their hosts. http://silkpathdb.swu.edu.cn. © The Author(s) 2017. Published by Oxford University Press.
Wang, Zheng; Zhou, Di; Jia, Zhenjun; Li, Luyao; Wu, Wei; Li, Chengtao; Hou, Yiping
2016-01-01
Short tandem repeats (STRs), scattered throughout the genome and having a higher mutation rate, are attractive for genetic applications such as forensic, anthropological and population genetics studies. STR profiling is now applied in various aspects of human identification in forensic investigations. This work describes the developmental validation of a novel and universal assay, the Huaxia Platinum System, which amplifies all markers in the expanded CODIS core loci and the Chinese National Database in a single PCR system. Developmental validation demonstrated that this novel assay is accurate, sensitive, reproducible and robust. No discordant calls were observed between the Huaxia Platinum System and other STR systems. Full genotypes could be obtained even with 250 pg of human DNA. Additionally, 402 unrelated individuals from 3 main ethnic groups of China (Han, Uygur and Tibetan) were genotyped to investigate the effectiveness of this novel assay. The CMP were 2.3094 × 10^−27, 4.3791 × 10^−28 and 6.9118 × 10^−27, respectively, and the CPE were 0.99999999939059, 0.99999999989653 and 0.99999999976386, respectively. These results suggest that the Huaxia Platinum System is polymorphic and informative, providing an efficient tool for national DNA databases and facilitating international data sharing. PMID:27498550
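For readers unfamiliar with the two summary statistics, the sketch below shows how a combined match probability (CMP) and a combined power of exclusion (CPE) are conventionally obtained from per-locus values, assuming the loci are independent. The per-locus numbers are hypothetical and are not taken from the study.

```python
from math import prod


def combined_match_probability(per_locus_mp):
    """CMP: product of per-locus match probabilities (independent loci assumed)."""
    return prod(per_locus_mp)


def combined_power_of_exclusion(per_locus_pe):
    """CPE: 1 minus the product of (1 - PE) over all loci."""
    return 1.0 - prod(1.0 - pe for pe in per_locus_pe)


# Hypothetical per-locus values for a three-locus panel.
match_probs = [0.10, 0.05, 0.08]
exclusion_powers = [0.50, 0.60, 0.55]

print(combined_match_probability(match_probs))
print(combined_power_of_exclusion(exclusion_powers))
```

Because both statistics are products over loci, adding more independent markers drives the CMP toward zero and the CPE toward one, which is why a panel of many loci yields values as extreme as those reported.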
Personal identification based on prescription eyewear.
Berg, Gregory E; Collins, Randall S
2007-03-01
This study presents a web-based tool that can be used to assist in identification of unknown individuals using spectacle prescriptions. Currently, when lens prescriptions are used in forensic identifications, investigators are constrained to a simple "match" or "no-match" judgment with an antemortem prescription. It is not possible to evaluate the strength of the conclusion, or rather, the potential or real error rates associated with the conclusion. Three databases totaling over 385,000 individual prescriptions are utilized in this study to allow forensic analysts to easily determine the strength of individuation of a spectacle match to antemortem records by calculating the frequency at which the observed prescription occurs in various U.S. populations. Optical refractive errors are explained, potential states and combinations of refractive errors are described, measuring lens corrections is discussed, and a detailed description of the databases is presented. The practical application of this system is demonstrated using two recent forensic identifications. This research provides a valuable personal identification tool that can be used in cases where eyeglass portions are recovered in forensic contexts.
A database for coconut crop improvement
Rajagopal, Velamoor; Manimekalai, Ramaswamy; Devakumar, Krishnamurthy; Rajesh; Karun, Anitha; Niral, Vittal; Gopal, Murali; Aziz, Shamina; Gunasekaran, Marimuthu; Kumar, Mundappurathe Ramesh; Chandrasekar, Arumugam
2005-01-01
Coconut crop improvement requires a number of biotechnology and bioinformatics tools. A database containing information on CG (coconut germplasm), CCI (coconut cultivar identification), CD (coconut disease), MIFSPC (microbial information systems in plantation crops) and VO (vegetable oils) is described. The database was developed using MySQL and PostgreSQL running in Linux operating system. The database interface is developed in PHP, HTML and JAVA. Availability http://www.bioinfcpcri.org PMID:17597858
Cloud, Joann L; Conville, Patricia S; Croft, Ann; Harmsen, Dag; Witebsky, Frank G; Carroll, Karen C
2004-02-01
Identification of clinically significant nocardiae to the species level is important in patient diagnosis and treatment. A study was performed to evaluate Nocardia species identification obtained by partial 16S ribosomal DNA (rDNA) sequencing by the MicroSeq 500 system with an expanded database. The expanded portion of the database was developed from partial 5' 16S rDNA sequences derived from 28 reference strains (from the American Type Culture Collection and the Japanese Collection of Microorganisms). The expanded MicroSeq 500 system was compared to (i) conventional identification obtained from a combination of growth characteristics with biochemical and drug susceptibility tests; (ii) molecular techniques involving restriction enzyme analysis (REA) of portions of the 16S rRNA and 65-kDa heat shock protein genes; and (iii) when necessary, sequencing of a 999-bp fragment of the 16S rRNA gene. An unknown isolate was identified as a particular species if the sequence obtained by partial 16S rDNA sequencing by the expanded MicroSeq 500 system was 99.0% similar to that of the reference strain. Ninety-four nocardiae representing 10 separate species were isolated from patient specimens and examined by using the three different methods. Sequencing of partial 16S rDNA by the expanded MicroSeq 500 system resulted in only 72% agreement with conventional methods for species identification and 90% agreement with the alternative molecular methods. Molecular methods for identification of Nocardia species provide more accurate and rapid results than the conventional methods using biochemical and susceptibility testing. With an expanded database, the MicroSeq 500 system for partial 16S rDNA was able to correctly identify the human pathogens N. brasiliensis, N. cyriacigeorgica, N. farcinica, N. nova, N. otitidiscaviarum, and N. veterana.
We discuss the initial design and application of the National Urban Database and Access Portal Tool (NUDAPT). This new project is sponsored by the USEPA and involves collaborations and contributions from many groups from federal and state agencies, and from private and academic i...
Performance of an open-source heart sound segmentation algorithm on eight independent databases.
Liu, Chengyu; Springer, David; Clifford, Gari D
2017-08-01
Heart sound segmentation is a prerequisite step for the automatic analysis of heart sound signals, facilitating the subsequent identification and classification of pathological events. Recently, hidden Markov model-based algorithms have received increased interest due to their robustness in processing noisy recordings. In this study we aim to evaluate the performance of the recently published logistic regression based hidden semi-Markov model (HSMM) heart sound segmentation method, by using a wider variety of independently acquired data of varying quality. Firstly, we constructed a systematic evaluation scheme based on a new collection of heart sound databases, which we assembled for the PhysioNet/CinC Challenge 2016. This collection includes a total of more than 120 000 s of heart sounds recorded from 1297 subjects (including both healthy subjects and cardiovascular patients) and comprises eight independent heart sound databases sourced from multiple independent research groups around the world. Then, the HSMM-based segmentation method was evaluated using the assembled eight databases. The common evaluation metrics of sensitivity, specificity, accuracy, as well as the F1 measure were used. In addition, the effect of varying the tolerance window for determining a correct segmentation was evaluated. The results confirm the high accuracy of the HSMM-based algorithm on a separate test dataset comprised of 102 306 heart sounds. Average F1 scores of 98.5% for segmenting S1 and systole intervals and 97.2% for segmenting S2 and diastole intervals were observed. The F1 score was shown to increase with an increase in the tolerance window size, as expected. The high segmentation accuracy of the HSMM-based algorithm on a large database confirmed the algorithm's effectiveness. 
The described evaluation framework, combined with the largest collection of open access heart sound data, provides essential resources for evaluators who need to test their algorithms with realistic data and share reproducible results.
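The role of the tolerance window in this kind of evaluation can be illustrated with a small sketch. It assumes the summary score is an F1-style measure (the harmonic mean of sensitivity and positive predictivity) and uses a simple greedy one-to-one matching rule; the abstract does not state the matching rule actually used, so both the rule and the function name are assumptions.

```python
def score_boundaries(reference, detected, tolerance):
    """Score detected boundary times (s) against reference times.

    A detection counts as correct if it lies within `tolerance` seconds of
    a reference boundary; each detection can be matched at most once.
    Returns (sensitivity, positive predictivity, F1).
    """
    unmatched = sorted(detected)
    true_pos = 0
    for ref in sorted(reference):
        # Closest detection that has not been matched yet, if any.
        best = min(unmatched, key=lambda d: abs(d - ref), default=None)
        if best is not None and abs(best - ref) <= tolerance:
            true_pos += 1
            unmatched.remove(best)
    false_neg = len(reference) - true_pos
    false_pos = len(unmatched)
    sens = true_pos / (true_pos + false_neg) if true_pos + false_neg else 0.0
    ppv = true_pos / (true_pos + false_pos) if true_pos + false_pos else 0.0
    f1 = 2 * sens * ppv / (sens + ppv) if sens + ppv else 0.0
    return sens, ppv, f1


# Hypothetical boundary times in seconds.
reference = [1.0, 2.0, 3.0]
detected = [1.02, 2.2, 3.01]
for tol in (0.1, 0.25):
    print(tol, score_boundaries(reference, detected, tol))
```

In this toy example the detection at 2.2 s is rejected with a 0.1 s window and accepted with a 0.25 s window, so the score rises as the window widens.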
HPIDB 2.0: a curated database for host–pathogen interactions
Ammari, Mais G.; Gresham, Cathy R.; McCarthy, Fiona M.; Nanduri, Bindu
2016-01-01
Identification and analysis of host–pathogen interactions (HPI) is essential to study infectious diseases. However, HPI data are sparse in existing molecular interaction databases, especially for agricultural host–pathogen systems. Therefore, resources that annotate, predict and display the HPI that underpin infectious diseases are critical for developing novel intervention strategies. HPIDB 2.0 (http://www.agbase.msstate.edu/hpi/main.html) is a resource for HPI data, and contains 45,238 manually curated entries in the current release. Since the first description of the database in 2010, multiple enhancements to HPIDB data and interface services were made that are described here. Notably, HPIDB 2.0 now provides targeted biocuration of molecular interaction data. As a member of the International Molecular Exchange consortium, annotations provided by HPIDB 2.0 curators meet community standards to provide detailed contextual experimental information and facilitate data sharing. Moreover, HPIDB 2.0 provides access to rapidly available community annotations that capture minimum molecular interaction information to address immediate researcher needs for HPI network analysis. In addition to curation, HPIDB 2.0 integrates HPI from existing external sources and contains tools to infer additional HPI where annotated data are scarce. Compared to other interaction databases, our data collection approach ensures HPIDB 2.0 users access the most comprehensive HPI data from a wide range of pathogens and their hosts (594 pathogen and 70 host species, as of February 2016). Improvements also include enhanced search capacity, addition of Gene Ontology functional information, and implementation of network visualization. The changes made to HPIDB 2.0 content and interface ensure that users, especially agricultural researchers, are able to easily access and analyse high quality, comprehensive HPI data. 
All HPIDB 2.0 data are updated regularly, are publicly available for direct download, and are disseminated to other molecular interaction resources. Database URL: http://www.agbase.msstate.edu/hpi/main.html PMID:27374121
43 CFR 11.25 - Preassessment screen-preliminary identification of resources potentially at risk.
Code of Federal Regulations, 2014 CFR
2014-10-01
... pathways. (1) The authorized official shall make a preliminary identification of potential exposure pathways to facilitate identification of resources at risk. (2) Factors to be considered in this... toxicological properties of the oil or hazardous substance. (3) Pathways to be considered shall include, as...
43 CFR 11.25 - Preassessment screen-preliminary identification of resources potentially at risk.
Code of Federal Regulations, 2013 CFR
2013-10-01
... pathways. (1) The authorized official shall make a preliminary identification of potential exposure pathways to facilitate identification of resources at risk. (2) Factors to be considered in this... toxicological properties of the oil or hazardous substance. (3) Pathways to be considered shall include, as...
43 CFR 11.25 - Preassessment screen-preliminary identification of resources potentially at risk.
Code of Federal Regulations, 2011 CFR
2011-10-01
... pathways. (1) The authorized official shall make a preliminary identification of potential exposure pathways to facilitate identification of resources at risk. (2) Factors to be considered in this... toxicological properties of the oil or hazardous substance. (3) Pathways to be considered shall include, as...
43 CFR 11.25 - Preassessment screen-preliminary identification of resources potentially at risk.
Code of Federal Regulations, 2010 CFR
2010-10-01
... pathways. (1) The authorized official shall make a preliminary identification of potential exposure pathways to facilitate identification of resources at risk. (2) Factors to be considered in this... toxicological properties of the oil or hazardous substance. (3) Pathways to be considered shall include, as...
43 CFR 11.25 - Preassessment screen-preliminary identification of resources potentially at risk.
Code of Federal Regulations, 2012 CFR
2012-10-01
... pathways. (1) The authorized official shall make a preliminary identification of potential exposure pathways to facilitate identification of resources at risk. (2) Factors to be considered in this... toxicological properties of the oil or hazardous substance. (3) Pathways to be considered shall include, as...
Rahi, Praveen; Prakash, Om; Shouche, Yogesh S.
2016-01-01
Matrix-assisted laser desorption/ionization time-of-flight mass-spectrometry (MALDI-TOF MS) based biotyping is an emerging technique for high-throughput and rapid microbial identification. Due to its relatively higher accuracy, comprehensive database of clinically important microorganisms and low cost compared to other microbial identification methods, MALDI-TOF MS has started replacing existing practices prevalent in clinical diagnosis. However, applicability of MALDI-TOF MS in the area of microbial ecology research is still limited, mainly due to the lack of data on non-clinical microorganisms. Intense research activities on cultivation of microbial diversity by conventional as well as by innovative and high-throughput methods have substantially increased the number of microbial species known today. This important area of research is in urgent need of rapid and reliable method(s) for characterization and de-replication of microorganisms from various ecosystems. MALDI-TOF MS based characterization, in our opinion, appears to be the most suitable technique for such studies. The reliability of MALDI-TOF MS based identification depends mainly on the accuracy and breadth of reference databases, which need continuous expansion and improvement. In this review, we propose a common strategy to generate a MALDI-TOF MS spectral database and advocate its sharing, and also discuss the role of MALDI-TOF MS based high-throughput microbial identification in microbial ecology studies. PMID:27625644
Emerging new strategies for successful metabolite identification in metabolomics
Bingol, Kerem; Bruschweiler-Li, Lei; Li, Dawei; Zhang, Bo; Xie, Mouzhe; Brüschweiler, Rafael
2016-01-01
This review discusses strategies for the identification of metabolites in complex biological mixtures, as encountered in metabolomics, which have emerged in the recent past. These include NMR database-assisted approaches for the identification of commonly known metabolites as well as novel combinations of NMR and MS analysis methods for the identification of unknown metabolites. The use of certain chemical additives to the NMR tube can permit identification of metabolites with specific physical chemical properties. PMID:26915807
VAS: A Vision Advisor System combining agents and object-oriented databases
NASA Technical Reports Server (NTRS)
Eilbert, James L.; Lim, William; Mendelsohn, Jay; Braun, Ron; Yearwood, Michael
1994-01-01
A model-based approach to identifying and finding the orientation of non-overlapping parts on a tray has been developed. The part models contain both exact and fuzzy descriptions of part features, and are stored in an object-oriented database. Full identification of the parts involves several interacting tasks, each of which is handled by a distinct agent. Using fuzzy information stored in the model allowed part features that were essentially at the noise level to be extracted and used for identification. This was done by focusing attention on the portion of the part where the feature must be found if the current hypothesis of the part ID is correct. In going from one set of parts to another, the only thing that needs to be changed is the database of part models. This work is part of an effort in developing a Vision Advisor System (VAS) that combines agents and object-oriented databases.
Schuemie, Martijn J; Mons, Barend; Weeber, Marc; Kors, Jan A
2007-06-01
Gene and protein name identification in text requires a dictionary approach to relate synonyms to the same gene or protein, and to link names to external databases. However, existing dictionaries are incomplete. We investigate two complementary methods for automatic generation of a comprehensive dictionary: combination of information from existing gene and protein databases and rule-based generation of spelling variations. Both methods have been reported in literature before, but have hitherto not been combined and evaluated systematically. We combined gene and protein names from several existing databases of four different organisms. The combined dictionaries showed a substantial increase in recall on three different test sets, as compared to any single database. Application of 23 spelling variation rules to the combined dictionaries further increased recall. However, many rules appeared to have no effect and some appear to have a detrimental effect on precision.
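Rule-based generation of spelling variations can be sketched as below. The abstract does not list the 23 rules that were evaluated, so the three rules shown here (hyphen handling, a hyphen before a trailing number, and lower-casing) are hypothetical examples of the general idea rather than the rules used in the study.

```python
import re


def spelling_variants(name):
    """Generate simple spelling variants of a gene or protein name."""
    variants = {name}

    # Hypothetical rule 1: a hyphen may also be written as a space or omitted.
    for v in list(variants):
        if "-" in v:
            variants.add(v.replace("-", " "))
            variants.add(v.replace("-", ""))

    # Hypothetical rule 2: insert a hyphen between a letter and a trailing number.
    for v in list(variants):
        variants.add(re.sub(r"(?<=[A-Za-z])(?=\d+$)", "-", v))

    # Hypothetical rule 3: matching is case-insensitive, so add lower-case forms.
    variants |= {v.lower() for v in variants}
    return variants


print(sorted(spelling_variants("IL-2")))
print(sorted(spelling_variants("p53")))
```

Each added rule enlarges the dictionary and can therefore raise recall, but a rule that produces strings colliding with ordinary words or other gene names lowers precision, which matches the trade-off the abstract reports.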
Carrara, Marta; Carozzi, Luca; Moss, Travis J; de Pasquale, Marco; Cerutti, Sergio; Lake, Douglas E; Moorman, J Randall; Ferrario, Manuela
2015-01-01
Identification of atrial fibrillation (AF) is a clinical imperative. Heartbeat interval time series are increasingly available from personal monitors, allowing new opportunity for AF diagnosis. Previously, we devised numerical algorithms for identification of normal sinus rhythm (NSR), AF, and SR with frequent ectopy using dynamical measures of heart rate. Here, we wished to validate them in the canonical MIT-BIH ECG databases. We tested algorithms on the NSR, AF and arrhythmia databases. When the databases were combined, the positive predictive value of the new algorithms exceeded 95% for NSR and AF, and was 40% for SR with ectopy. Further, dynamical measures did not distinguish atrial from ventricular ectopy. Inspection of individual 24hour records showed good correlation of observed and predicted rhythms. Heart rate dynamical measures are effective ingredients in numerical algorithms to classify cardiac rhythm from the heartbeat intervals time series alone. Copyright © 2015 Elsevier Inc. All rights reserved.
Genome image programs: visualization and interpretation of Escherichia coli microarray experiments.
Zimmer, Daniel P; Paliy, Oleg; Thomas, Brian; Gyaneshwar, Prasad; Kustu, Sydney
2004-08-01
We have developed programs to facilitate analysis of microarray data in Escherichia coli. They fall into two categories: manipulation of microarray images and identification of known biological relationships among lists of genes. A program in the first category arranges spots from glass-slide DNA microarrays according to their position in the E. coli genome and displays them compactly in genome order. The resulting genome image is presented in a web browser with an image map that allows the user to identify genes in the reordered image. Another program in the first category aligns genome images from two or more experiments. These images assist in visualizing regions of the genome with common transcriptional control. Such regions include multigene operons and clusters of operons, which are easily identified as strings of adjacent, similarly colored spots. The images are also useful for assessing the overall quality of experiments. The second category of programs includes a database and a number of tools for displaying biological information about many E. coli genes simultaneously rather than one gene at a time, which facilitates identifying relationships among them. These programs have accelerated and enhanced our interpretation of results from E. coli DNA microarray experiments. Examples are given. Copyright 2004 Genetics Society of America
Smartphone Analytics: Mobilizing the Lab into the Cloud for Omic-Scale Analyses.
Montenegro-Burke, J Rafael; Phommavongsay, Thiery; Aisporna, Aries E; Huan, Tao; Rinehart, Duane; Forsberg, Erica; Poole, Farris L; Thorgersen, Michael P; Adams, Michael W W; Krantz, Gregory; Fields, Matthew W; Northen, Trent R; Robbins, Paul D; Niedernhofer, Laura J; Lairson, Luke; Benton, H Paul; Siuzdak, Gary
2016-10-04
Active data screening is an integral part of many scientific activities, and mobile technologies have greatly facilitated this process by minimizing the reliance on large hardware instrumentation. In order to meet with the increasingly growing field of metabolomics and heavy workload of data processing, we designed the first remote metabolomic data screening platform for mobile devices. Two mobile applications (apps), XCMS Mobile and METLIN Mobile, facilitate access to XCMS and METLIN, which are the most important components in the computer-based XCMS Online platforms. These mobile apps allow for the visualization and analysis of metabolic data throughout the entire analytical process. Specifically, XCMS Mobile and METLIN Mobile provide the capabilities for remote monitoring of data processing, real time notifications for the data processing, visualization and interactive analysis of processed data (e.g., cloud plots, principal component analysis, box-plots, extracted ion chromatograms, and hierarchical cluster analysis), and database searching for metabolite identification. These apps, available on Apple iOS and Google Android operating systems, allow for the migration of metabolomic research onto mobile devices for better accessibility beyond direct instrument operation. The utility of XCMS Mobile and METLIN Mobile functionalities was developed and is demonstrated here through the metabolomic LC-MS analyses of stem cells, colon cancer, aging, and bacterial metabolism.
Smartphone Analytics: Mobilizing the Lab into the Cloud for Omic-Scale Analyses
2016-01-01
Active data screening is an integral part of many scientific activities, and mobile technologies have greatly facilitated this process by minimizing the reliance on large hardware instrumentation. In order to meet with the increasingly growing field of metabolomics and heavy workload of data processing, we designed the first remote metabolomic data screening platform for mobile devices. Two mobile applications (apps), XCMS Mobile and METLIN Mobile, facilitate access to XCMS and METLIN, which are the most important components in the computer-based XCMS Online platforms. These mobile apps allow for the visualization and analysis of metabolic data throughout the entire analytical process. Specifically, XCMS Mobile and METLIN Mobile provide the capabilities for remote monitoring of data processing, real time notifications for the data processing, visualization and interactive analysis of processed data (e.g., cloud plots, principal component analysis, box-plots, extracted ion chromatograms, and hierarchical cluster analysis), and database searching for metabolite identification. These apps, available on Apple iOS and Google Android operating systems, allow for the migration of metabolomic research onto mobile devices for better accessibility beyond direct instrument operation. The utility of XCMS Mobile and METLIN Mobile functionalities was developed and is demonstrated here through the metabolomic LC-MS analyses of stem cells, colon cancer, aging, and bacterial metabolism. PMID:27560777
Automated structural classification of lipids by machine learning.
Taylor, Ryan; Miller, Ryan H; Miller, Ryan D; Porter, Michael; Dalgleish, James; Prince, John T
2015-03-01
Modern lipidomics is largely dependent upon structural ontologies because of the great diversity exhibited in the lipidome, but no automated lipid classification exists to facilitate this partitioning. The size of the putative lipidome far exceeds the number currently classified, despite a decade of work. Automated classification would benefit ongoing classification efforts by decreasing the time needed and increasing the accuracy of classification while providing classifications for mass spectral identification algorithms. We introduce a tool that automates classification into the LIPID MAPS ontology of known lipids with >95% accuracy and novel lipids with 63% accuracy. The classification is based upon simple chemical characteristics and modern machine learning algorithms. The decision trees produced are intelligible and can be used to clarify implicit assumptions about the current LIPID MAPS classification scheme. These characteristics and decision trees are made available to facilitate alternative implementations. We also discovered many hundreds of lipids that are currently misclassified in the LIPID MAPS database, strongly underscoring the need for automated classification. Source code and chemical characteristic lists as SMARTS search strings are available under an open-source license at https://www.github.com/princelab/lipid_classifier. © The Author 2014. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
Smartphone Analytics: Mobilizing the Lab into the Cloud for Omic-Scale Analyses
Montenegro-Burke, J. Rafael; Phommavongsay, Thiery; Aisporna, Aries E.; ...
2016-08-25
Active data screening is an integral part of many scientific activities, and mobile technologies have greatly facilitated this process by minimizing the reliance on large hardware instrumentation. In order to meet with the increasingly growing field of metabolomics and heavy workload of data processing, we designed the first remote metabolomic data screening platform for mobile devices. Two mobile applications (apps), XCMS Mobile and METLIN Mobile, facilitate access to XCMS and METLIN, which are the most important components in the computer-based XCMS Online platforms. These mobile apps allow for the visualization and analysis of metabolic data throughout the entire analytical process. Specifically, XCMS Mobile and METLIN Mobile provide the capabilities for remote monitoring of data processing, real time notifications for the data processing, visualization and interactive analysis of processed data (e.g., cloud plots, principal component analysis, box-plots, extracted ion chromatograms, and hierarchical cluster analysis), and database searching for metabolite identification. These apps, available on Apple iOS and Google Android operating systems, allow for the migration of metabolomic research onto mobile devices for better accessibility beyond direct instrument operation. The utility of XCMS Mobile and METLIN Mobile functionalities was developed and is demonstrated here through the metabolomic LC-MS analyses of stem cells, colon cancer, aging, and bacterial metabolism.
Smartphone Analytics: Mobilizing the Lab into the Cloud for Omic-Scale Analyses
DOE Office of Scientific and Technical Information (OSTI.GOV)
Montenegro-Burke, J. Rafael; Phommavongsay, Thiery; Aisporna, Aries E.
Active data screening is an integral part of many scientific activities, and mobile technologies have greatly facilitated this process by minimizing the reliance on large hardware instrumentation. In order to meet with the increasingly growing field of metabolomics and heavy workload of data processing, we designed the first remote metabolomic data screening platform for mobile devices. Two mobile applications (apps), XCMS Mobile and METLIN Mobile, facilitate access to XCMS and METLIN, which are the most important components in the computer-based XCMS Online platforms. These mobile apps allow for the visualization and analysis of metabolic data throughout the entire analytical process. Specifically, XCMS Mobile and METLIN Mobile provide the capabilities for remote monitoring of data processing, real time notifications for the data processing, visualization and interactive analysis of processed data (e.g., cloud plots, principal component analysis, box-plots, extracted ion chromatograms, and hierarchical cluster analysis), and database searching for metabolite identification. These apps, available on Apple iOS and Google Android operating systems, allow for the migration of metabolomic research onto mobile devices for better accessibility beyond direct instrument operation. The utility of XCMS Mobile and METLIN Mobile functionalities was developed and is demonstrated here through the metabolomic LC-MS analyses of stem cells, colon cancer, aging, and bacterial metabolism.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Li, Guotian; Jain, Rashmi; Chern, Mawsheng
The availability of a whole-genome sequenced mutant population and the cataloging of mutations of each line at a single-nucleotide resolution facilitate functional genomic analysis. To this end, we generated and sequenced a fast-neutron-induced mutant population in the model rice cultivar Kitaake (Oryza sativa ssp japonica), which completes its life cycle in 9 weeks. We sequenced 1504 mutant lines at 45-fold coverage and identified 91,513 mutations affecting 32,307 genes, i.e., 58% of all rice genes. We detected an average of 61 mutations per line. Mutation types include single-base substitutions, deletions, insertions, inversions, translocations, and tandem duplications. We observed a high proportion of loss-of-function mutations. We identified an inversion affecting a single gene as the causative mutation for the short-grain phenotype in one mutant line. This result reveals the usefulness of the resource for efficient, cost-effective identification of genes conferring specific phenotypes. To facilitate public access to this genetic resource, we established an open access database called KitBase that provides access to sequence data and seed stocks. This population complements other available mutant collections and gene-editing technologies. In conclusion, this work demonstrates how inexpensive next-generation sequencing can be applied to generate a high-density catalog of mutations.
[Mapping of the key oncology indicators available in France].
Laanani, Moussa; Vongmany, Natalie; Lafay, Lionel; Cerf, Nicole Rasamimanana; Le Quellec-Nathan, Martine; Viguier, Jérôme; Bousquet, Philippe Jean
2014-01-01
Available data in the field of oncology in France are scattered due to the large number of available indicators and their sources. In order to facilitate identification and analysis of these indicators, the French National Cancer Institute (INCa) has mapped the main indicators available in oncology. Mapping was based on the needs of various categories of potential users. Standardized interviews were conducted face-to-face or by email among representatives to determine their needs and expectations. The underlying data sources were also identified: databases, national surveys, websites. A first selection of indicators was proposed in the report entitled "La situation du cancer en France en 2009" ("The state of cancer in France in 2009") and was expanded. Data collection concerning indicators was performed among INCa correspondents for each theme. Several themes were defined: epidemiology, prevention and risk factors, screening, medical demography, health care offer, living conditions, costs and expenses, research. Data were classified according to: geographical coverage, age, gender, type of cancer, occupational categories. This information was collected for each indicator selected and was made available via the cancer data website (http://lesdonnees.e-cancer.fr). The available oncology indicators are numerous and scattered. Mapping can be a useful tool to facilitate access to these indicators. It should be regularly updated to reflect the most recent data.
Nucleotide sequencing and identification of some wild mushrooms.
Das, Sudip Kumar; Mandal, Aninda; Datta, Animesh K; Gupta, Sudha; Paul, Rita; Saha, Aditi; Sengupta, Sonali; Dubey, Priyanka Kumari
2013-01-01
The rDNA-ITS (Ribosomal DNA Internal Transcribed Spacers) fragment of the genomic DNA of 8 wild edible mushrooms (collected from Eastern Chota Nagpur Plateau of West Bengal, India) was amplified using ITS1 (Internal Transcribed Spacers 1) and ITS2 primers and subjected to nucleotide sequence determination for identification of the mushrooms. The sequences were aligned using the ClustalW software program. The aligned sequences revealed identity (homology percentage from the GenBank database) of Amanita hemibapha [CN (Chota Nagpur) 1, % identity 99 (JX844716.1)], Amanita sp. [CN 2, % identity 98 (JX844763.1)], Astraeus hygrometricus [CN 3, % identity 87 (FJ536664.1)], Termitomyces sp. [CN 4, % identity 90 (JF746992.1)], Termitomyces sp. [CN 5, % identity 99 (GU001667.1)], T. microcarpus [CN 6, % identity 82 (EF421077.1)], Termitomyces sp. [CN 7, % identity 76 (JF746993.1)], and Volvariella volvacea [CN 8, % identity 100 (JN086680.1)]. Although only 4 of the 8 mushrooms could be identified to species level, the nucleotide sequences of the rest may be relevant to further characterization. A phylogenetic tree was constructed using the Neighbor-Joining method to show the interrelationships among the mushrooms. The determined nucleotide sequences may enrich the GenBank database, aiding molecular taxonomy and facilitating the domestication and characterization of these mushrooms for human benefit.
Radio Frequency Identification (RFID) technology and patient safety
Ajami, Sima; Rajabzadeh, Ahmad
2013-01-01
Background: Radio frequency identification (RFID) systems have been successfully applied in areas of manufacturing, supply chain, agriculture, transportation, healthcare, and services, to name a few. However, the various advantages and disadvantages reported in studies of the challenges facing RFID technology have been met with skepticism by managers of healthcare organizations. The aim of this study was to describe the role of RFID technology in improving patient safety and in increasing its impact in healthcare. Materials and Methods: This study was a non-systematic review. The literature search was conducted in February 2013 with the help of libraries, books, conference proceedings, PubMed databases, and search engines such as Google and Google Scholar, and covered material published between 2004 and 2013. We employed the following keywords and their combinations: RFID, healthcare, patient safety, medical errors, and medication errors, searching in title, keywords, abstract, and full text. Results: The preliminary search resulted in 68 articles. After a careful analysis of the content of each paper, a total of 33 papers were selected based on their relevance. Conclusion: RFID should be integrated with hospital information systems (HIS) and electronic health records (EHRs) and supported by clinical decision support systems (CDSS); this facilitates processes and reduces medical, medication, and diagnosis errors. PMID:24381626
Fan, R; Ling, P; Hao, C Y; Li, F P; Huang, L F; Wu, B D; Wu, H S
2015-10-19
Black pepper is a perennial climbing vine. It is widely cultivated because its berries can be utilized not only as a spice in food but also for medicinal use. This study aimed to construct a standardized, high-quality cDNA library to facilitate identification of new Piper hainanense transcripts. For this, 262 unigenes were used to generate raw reads. The average length of these 262 unigenes was 774.8 bp. Of these, 94 genes (35.9%) were newly identified, according to the NCBI protein database. Thus, identification of new genes may broaden the molecular knowledge of P. hainanense on the basis of Clusters of Orthologous Groups and Gene Ontology categories. In addition, certain basic genes linked to physiological processes were found, which can contribute to disease resistance and thereby to the breeding of black pepper. A total of 26 unigenes were found to contain SSR markers. Dinucleotide SSRs were the main repeat motif, accounting for 61.54%, followed by trinucleotide SSRs (23.07%). Eight primer pairs successfully amplified DNA fragments and detected significant amounts of polymorphism among twenty-one Piper germplasms. These results present novel sequence information for P. hainanense, which can serve as the foundation for further genetic research on this species.
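Detecting simple sequence repeats (SSRs) of the kind counted in this abstract can be sketched with a regular expression. This is a minimal illustration and not the authors' pipeline: the function name, the toy sequence, and the minimum repeat counts (6 for dinucleotide, 5 for trinucleotide motifs) are assumptions chosen for the example. As a side note, the reported shares are consistent with 16 dinucleotide and 6 trinucleotide SSRs out of 26, although the abstract does not state these counts.

```python
import re

def find_ssrs(seq, min_repeats=None):
    """Find perfect SSRs; return (motif_length, motif, start, repeat_count) tuples."""
    if min_repeats is None:
        # Assumed thresholds for illustration: motif length -> minimum repeat count.
        min_repeats = {2: 6, 3: 5}
    hits = []
    for k, n in min_repeats.items():
        # A k-base motif followed by at least n-1 further copies of itself.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, n - 1))
        for m in pattern.finditer(seq.upper()):
            motif = m.group(1)
            if len(set(motif)) == 1:
                continue  # skip homopolymer runs such as AAAAAA
            hits.append((k, motif, m.start(), len(m.group(0)) // k))
    return hits

# Toy sequence (invented): an (AG)7 repeat and a (CTT)5 repeat.
example = "TTGC" + "AG" * 7 + "CCGTA" + "CTT" * 5 + "GGA"
hits = find_ssrs(example)
for motif_len, motif, start, count in hits:
    print(f"{motif_len}-mer ({motif}){count} at position {start}")
```

Only perfect repeats are reported here; compound or interrupted SSRs would need additional logic.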
Radio Frequency Identification (RFID) technology and patient safety.
Ajami, Sima; Rajabzadeh, Ahmad
2013-09-01
Radio frequency identification (RFID) systems have been successfully applied in areas of manufacturing, supply chain, agriculture, transportation, healthcare, and services, to name a few. However, the various advantages and disadvantages reported in studies of the challenges facing RFID technology have been met with skepticism by managers of healthcare organizations. The aim of this study was to describe the role of RFID technology in improving patient safety and in increasing its impact in healthcare. This study was a non-systematic review. The literature search was conducted in February 2013 with the help of libraries, books, conference proceedings, PubMed databases, and search engines such as Google and Google Scholar, and covered material published between 2004 and 2013. We employed the following keywords and their combinations: RFID, healthcare, patient safety, medical errors, and medication errors, searching in title, keywords, abstract, and full text. The preliminary search resulted in 68 articles. After a careful analysis of the content of each paper, a total of 33 papers were selected based on their relevance. RFID should be integrated with hospital information systems (HIS) and electronic health records (EHRs) and supported by clinical decision support systems (CDSS); this facilitates processes and reduces medical, medication, and diagnosis errors.
Nascimento, M H S; Almeida, M S; Veira, M N S; Limeira Filho, D; Lima, R C; Barros, M C; Fraga, E C
2016-08-29
DNA barcoding is a useful complementary tool for use in traditional taxonomic studies due to its ability to detect cryptic species, and may be particularly efficient in the identification of fish species. The fish fauna of the Itapecuru River represents an important fishery resource in the Brazilian State of Maranhão, although it is currently suffering increasing degradation as a result of anthropogenic impacts. Therefore, DNA barcoding was used in the present study to identify fish species and establish a database of the rich freshwater fish fauna of Maranhão. A total of 440 specimens were analyzed, corresponding to 64 species belonging to 59 genera, 31 families, and 10 orders. Overall, 92.19% of these species could be identified by DNA barcoding, and were characterized by low levels (average 0.80%) of intra-specific divergence. However, five species (Anableps anableps, Gymnotus carapo, Sciades couma, Pseudauchenipterus nodosus, and Leporinus piau) presented values of mean genetic divergence above 3%, indicating the existence of cryptic diversity in these fishes. The DNA barcoding approach permitted the analysis of a large number of specimens and facilitated the discrimination and identification of closely related fish species in the Itapecuru Basin.
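The divergence screen described in this abstract (flagging species whose mean intraspecific divergence exceeds 3%) can be sketched as follows. This is a simplified illustration under stated assumptions: it uses the uncorrected p-distance, whereas the abstract does not say which distance model the authors used, and the function names and toy sequences are invented.

```python
from itertools import combinations

def p_distance(a, b):
    """Proportion of differing sites between two aligned sequences (gaps/N ignored)."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to equal length")
    compared = diffs = 0
    for x, y in zip(a.upper(), b.upper()):
        if x in "ACGT" and y in "ACGT":
            compared += 1
            diffs += x != y
    if compared == 0:
        raise ValueError("no comparable sites")
    return diffs / compared

def mean_intraspecific_divergence(seqs):
    """Mean pairwise p-distance among sequences of a single species."""
    pairs = list(combinations(seqs, 2))
    return sum(p_distance(a, b) for a, b in pairs) / len(pairs)

def flag_cryptic(species_to_seqs, threshold=0.03):
    """Names of species whose mean divergence exceeds the threshold (3% in the abstract)."""
    return sorted(
        name
        for name, seqs in species_to_seqs.items()
        if len(seqs) > 1 and mean_intraspecific_divergence(seqs) > threshold
    )

# Toy alignment (invented data, 20 sites per sequence).
toy = {
    "sp_uniform": ["ACGTACGTACGTACGTACGT", "ACGTACGTACGTACGTACGT"],
    "sp_divergent": ["ACGTACGTACGTACGTACGT", "ACGTACGTACGTACGTACGA"],
}
flagged = flag_cryptic(toy)
print(flagged)
```

A species flagged this way is only a candidate for cryptic diversity; real barcode fragments are several hundred sites long, and confirmation would need further taxonomic work.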
Wu, Yunke; Trepanowski, Nevada F; Molongoski, John J; Reagel, Peter F; Lingafelter, Steven W; Nadel, Hannah; Myers, Scott W; Ray, Ann M
2017-01-16
Global trade facilitates the inadvertent movement of insect pests and subsequent establishment of populations outside their native ranges. Despite phytosanitary measures, nonnative insects arrive at United States (U.S.) ports of entry as larvae in solid wood packaging material (SWPM). Identification of wood-boring larval insects is important for pest risk analysis and management, but is difficult beyond family level due to highly conserved morphology. Therefore, we integrated DNA barcoding and rearing of larvae to identify wood-boring insects in SWPM. From 2012 to 2015, we obtained larvae of 338 longhorned beetles (Cerambycidae) and 38 metallic wood boring beetles (Buprestidae) intercepted in SWPM associated with imported products at six U.S. ports. We identified 265 specimens to species or genus using DNA barcodes. Ninety-three larvae were reared to adults and identified morphologically. No conflict was found between the two approaches, which together identified 275 cerambycids (23 genera) and 16 buprestids (4 genera). Our integrated approach confirmed novel DNA barcodes for seven species (10 specimens) of woodborers not in public databases. This study demonstrates the utility of DNA barcoding as a tool for regulatory agencies. We provide important documentation of potential beetle pests that may cross country borders through the SWPM pathway.
Wu, Yunke; Trepanowski, Nevada F.; Molongoski, John J.; Reagel, Peter F.; Lingafelter, Steven W.; Nadel, Hannah; Myers, Scott W.; Ray, Ann M.
2017-01-01
Global trade facilitates the inadvertent movement of insect pests and subsequent establishment of populations outside their native ranges. Despite phytosanitary measures, nonnative insects arrive at United States (U.S.) ports of entry as larvae in solid wood packaging material (SWPM). Identification of wood-boring larval insects is important for pest risk analysis and management, but is difficult beyond family level due to highly conserved morphology. Therefore, we integrated DNA barcoding and rearing of larvae to identify wood-boring insects in SWPM. From 2012 to 2015, we obtained larvae of 338 longhorned beetles (Cerambycidae) and 38 metallic wood boring beetles (Buprestidae) intercepted in SWPM associated with imported products at six U.S. ports. We identified 265 specimens to species or genus using DNA barcodes. Ninety-three larvae were reared to adults and identified morphologically. No conflict was found between the two approaches, which together identified 275 cerambycids (23 genera) and 16 buprestids (4 genera). Our integrated approach confirmed novel DNA barcodes for seven species (10 specimens) of woodborers not in public databases. This study demonstrates the utility of DNA barcoding as a tool for regulatory agencies. We provide important documentation of potential beetle pests that may cross country borders through the SWPM pathway. PMID:28091577
Improving Recall Using Database Management Systems: A Learning Strategy.
ERIC Educational Resources Information Center
Jonassen, David H.
1986-01-01
Describes the use of microcomputer database management systems to facilitate the instructional uses of learning strategies relating to information processing skills, especially recall. Two learning strategies, cross-classification matrixing and node acquisition and integration, are highlighted. (Author/LRW)
ERIC Educational Resources Information Center
McGrew, Kevin; And Others
This research analyzes similarities and differences in how students with disabilities are identified in national databases, through examination of 19 national data collection programs in the U.S. Departments of Education, Commerce, Justice, and Health and Human Services, as well as databases from the National Science Foundation. The study found…
Code of Federal Regulations, 2013 CFR
2013-04-01
... ADMINISTRATION, DEPARTMENT OF HEALTH AND HUMAN SERVICES (CONTINUED) MEDICAL DEVICES GENERAL HOSPITAL AND PERSONAL... identification code is used to access patient identity and corresponding health information stored in a database...
Code of Federal Regulations, 2014 CFR
2014-04-01
... ADMINISTRATION, DEPARTMENT OF HEALTH AND HUMAN SERVICES (CONTINUED) MEDICAL DEVICES GENERAL HOSPITAL AND PERSONAL... identification code is used to access patient identity and corresponding health information stored in a database...
Code of Federal Regulations, 2012 CFR
2012-04-01
... ADMINISTRATION, DEPARTMENT OF HEALTH AND HUMAN SERVICES (CONTINUED) MEDICAL DEVICES GENERAL HOSPITAL AND PERSONAL... identification code is used to access patient identity and corresponding health information stored in a database...
Code of Federal Regulations, 2011 CFR
2011-04-01
... ADMINISTRATION, DEPARTMENT OF HEALTH AND HUMAN SERVICES (CONTINUED) MEDICAL DEVICES GENERAL HOSPITAL AND PERSONAL... identification code is used to access patient identity and corresponding health information stored in a database...
NASA Astrophysics Data System (ADS)
Wantuch, Andrew C.; Vita, Joshua A.; Jimenez, Edward S.; Bray, Iliana E.
2016-10-01
Despite object detection, recognition, and identification being very active areas of computer vision research, many of the available tools to aid in these processes are designed with only photographs in mind. Although some algorithms used specifically for feature detection and identification may not take explicit advantage of the colors available in the image, they still under-perform on radiographs, which are grayscale images. We are especially interested in the robustness of these algorithms, specifically their performance on a preexisting database of X-ray radiographs in compressed JPEG form, with multiple ways of describing pixel information. We will review various aspects of the performance of available feature detection and identification systems, including MATLAB's Computer Vision toolbox, VLFeat, and OpenCV, on our non-ideal database. In the process, we will explore possible reasons for the algorithms' lessened ability to detect and identify features from the X-ray radiographs.
Jones, Andrew R; Siepen, Jennifer A; Hubbard, Simon J; Paton, Norman W
2009-03-01
LC-MS experiments can generate large quantities of data, for which a variety of database search engines are available to make peptide and protein identifications. Decoy databases are becoming widely used to place statistical confidence in result sets, allowing the false discovery rate (FDR) to be estimated. Different search engines produce different identification sets so employing more than one search engine could result in an increased number of peptides (and proteins) being identified, if an appropriate mechanism for combining data can be defined. We have developed a search engine independent score, based on FDR, which allows peptide identifications from different search engines to be combined, called the FDR Score. The results demonstrate that the observed FDR is significantly different when analysing the set of identifications made by all three search engines, by each pair of search engines or by a single search engine. Our algorithm assigns identifications to groups according to the set of search engines that have made the identification, and re-assigns the score (combined FDR Score). The combined FDR Score can differentiate between correct and incorrect peptide identifications with high accuracy, allowing on average 35% more peptide identifications to be made at a fixed FDR than using a single search engine.
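The target-decoy idea behind this abstract can be illustrated with a short sketch. It is not the authors' FDR Score or their combination algorithm: it shows only a generic decoy-based FDR estimate (decoys divided by targets at each score threshold), a monotone q-value, and the grouping of identifications by the set of search engines that reported them. All names, scores, and engine labels are hypothetical.

```python
from collections import defaultdict

def fdr_by_score(psms):
    """psms: list of (score, is_decoy), higher score = better.
    Returns (score, is_decoy, fdr, q_value) tuples sorted by descending score."""
    ranked = sorted(psms, key=lambda p: p[0], reverse=True)
    targets = decoys = 0
    fdrs = []
    for _, is_decoy in ranked:
        if is_decoy:
            decoys += 1
        else:
            targets += 1
        fdrs.append(decoys / max(targets, 1))
    # q-value: lowest FDR reachable at this score or any more permissive threshold.
    q_values = []
    running_min = float("inf")
    for fdr in reversed(fdrs):
        running_min = min(running_min, fdr)
        q_values.append(running_min)
    q_values.reverse()
    return [(s, d, f, q) for (s, d), f, q in zip(ranked, fdrs, q_values)]

def accepted_targets(psms, max_fdr=0.01):
    """Scores of target identifications whose q-value is within max_fdr."""
    return [s for s, d, _, q in fdr_by_score(psms) if not d and q <= max_fdr]

def group_by_engine_set(identifications):
    """identifications: iterable of (peptide, engine).
    Groups peptides by the set of engines that identified them."""
    engines = defaultdict(set)
    for peptide, engine in identifications:
        engines[peptide].add(engine)
    groups = defaultdict(list)
    for peptide, engine_set in engines.items():
        groups[frozenset(engine_set)].append(peptide)
    return dict(groups)

# Invented example data.
psms = [(9.1, False), (8.7, False), (8.2, False), (7.9, True),
        (7.5, False), (6.8, False), (6.1, True), (5.9, False)]
strict = accepted_targets(psms, max_fdr=0.05)
groups = group_by_engine_set([("PEPA", "engine_a"), ("PEPA", "engine_b"),
                              ("PEPB", "engine_a")])
print(strict)
print(groups)
```

In the published method each engine-set group is rescored separately; here the grouping step is shown only to make that idea concrete.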
Machouart, Marie; Morio, Florent; Sabou, Marcela; Kauffmann-LaCroix, Catherine; Contet-Audonneau, Nelly; Candolfi, Ermanno; Letscher-Bru, Valérie
2016-01-01
The genus Malassezia comprises commensal yeasts on human skin. These yeasts are involved in superficial infections but are also isolated in deeper infections, such as fungemia, particularly in certain at-risk patients, such as neonates or patients with parenteral nutrition catheters. Very little is known about Malassezia epidemiology and virulence. This is due mainly to the difficulty of distinguishing species. Currently, species identification is based on morphological and biochemical characteristics. Only molecular biology techniques identify species with certainty, but they are time-consuming and expensive. The aim of this study was to develop and evaluate a matrix-assisted laser desorption ionization–time of flight (MALDI-TOF) database for identifying Malassezia species by mass spectrometry. Eighty-five Malassezia isolates from patients in three French university hospitals were investigated. Each strain was identified by internal transcribed spacer sequencing. Forty-five strains of the six species Malassezia furfur, M. sympodialis, M. slooffiae, M. globosa, M. restricta, and M. pachydermatis allowed the creation of a MALDI-TOF database. Forty other strains were used to test this database. All strains were identified by our Malassezia database with log scores of >2.0, according to the manufacturer's criteria. Repeatability and reproducibility tests showed a coefficient of variation of the log score values of <10%. In conclusion, our new Malassezia database allows easy, fast, and reliable identification of Malassezia species. Implementation of this database will contribute to a better, more rapid identification of Malassezia species and will be helpful in gaining a better understanding of their epidemiology. PMID:27795342
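The repeatability criterion mentioned in this abstract (coefficient of variation of log scores below 10%) is a one-line statistic. The sketch below uses invented replicate scores purely to show the calculation; the function name and data are assumptions, not values from the study.

```python
import statistics

def coefficient_of_variation(values):
    """Sample standard deviation as a percentage of the mean."""
    return statistics.stdev(values) / statistics.mean(values) * 100

# Invented replicate log scores for one strain.
replicate_log_scores = [2.21, 2.15, 2.30, 2.18, 2.25]
cv = coefficient_of_variation(replicate_log_scores)

print(f"CV = {cv:.2f}%")
print("all scores above 2.0:", all(s > 2.0 for s in replicate_log_scores))
```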
Denis, Julie; Machouart, Marie; Morio, Florent; Sabou, Marcela; Kauffmann-LaCroix, Catherine; Contet-Audonneau, Nelly; Candolfi, Ermanno; Letscher-Bru, Valérie
2017-01-01
The genus Malassezia comprises commensal yeasts on human skin. These yeasts are involved in superficial infections but are also isolated in deeper infections, such as fungemia, particularly in certain at-risk patients, such as neonates or patients with parenteral nutrition catheters. Very little is known about Malassezia epidemiology and virulence. This is due mainly to the difficulty of distinguishing species. Currently, species identification is based on morphological and biochemical characteristics. Only molecular biology techniques identify species with certainty, but they are time-consuming and expensive. The aim of this study was to develop and evaluate a matrix-assisted laser desorption ionization-time of flight (MALDI-TOF) database for identifying Malassezia species by mass spectrometry. Eighty-five Malassezia isolates from patients in three French university hospitals were investigated. Each strain was identified by internal transcribed spacer sequencing. Forty-five strains of the six species Malassezia furfur, M. sympodialis, M. slooffiae, M. globosa, M. restricta, and M. pachydermatis allowed the creation of a MALDI-TOF database. Forty other strains were used to test this database. All strains were identified by our Malassezia database with log scores of >2.0, according to the manufacturer's criteria. Repeatability and reproducibility tests showed a coefficient of variation of the log score values of <10%. In conclusion, our new Malassezia database allows easy, fast, and reliable identification of Malassezia species. Implementation of this database will contribute to a better, more rapid identification of Malassezia species and will be helpful in gaining a better understanding of their epidemiology. Copyright © 2016 Denis et al.
Achieving high confidence protein annotations in a sea of unknowns
NASA Astrophysics Data System (ADS)
Timmins-Schiffman, E.; May, D. H.; Noble, W. S.; Nunn, B. L.; Mikan, M.; Harvey, H. R.
2016-02-01
Increased sensitivity of mass spectrometry (MS) technology allows deep and broad insight into community functional analyses. Metaproteomics holds the promise to reveal functional responses of natural microbial communities, whereas metagenomics alone can only hint at potential functions. The complex datasets resulting from ocean MS have the potential to inform diverse realms of the biological, chemical, and physical ocean sciences, yet the extent of bacterial functional diversity and redundancy has not been fully explored. To take advantage of these impressive datasets, we need a clear bioinformatics pipeline for metaproteomics peptide identification and annotation with a database that can provide confident identifications. Researchers must consider whether it is sufficient to leverage the vast quantities of available ocean sequence data or if they must invest in site-specific metagenomic sequencing. We have sequenced, to our knowledge, the first western arctic metagenomes from the Bering Strait and the Chukchi Sea. We have addressed the long standing question: Is a metagenome required to accurately complete metaproteomics and assess the biological distribution of metabolic functions controlling nutrient acquisition in the ocean? Two different protein databases were constructed from 1) a site-specific metagenome and 2) subarctic/arctic groups available in NCBI's non-redundant database. Multiple proteomic search strategies were employed, against each individual database and against both databases combined, to determine the algorithm and approach that yielded the balance of high sensitivity and confident identification. Results yielded over 8200 confidently identified proteins. Our comparison of these results allows us to quantify the utility of investing resources in a metagenome versus using the constantly expanding and immediately available public databases for metaproteomic studies.
2010-01-01
Background Fruit development, maturation and ripening consist of a complex series of biochemical and physiological changes that in climacteric fruits, including apple and tomato, are coordinated by the gaseous hormone ethylene. These changes lead to final fruit quality and understanding of the functional machinery underlying these processes is of both biological and practical importance. To date many reports have been made on the analysis of gene expression in apple. In this study we focused our investigation on the role of ethylene during apple maturation, specifically comparing transcriptomics of normal ripening with changes resulting from application of the hormone receptor competitor 1-Methylcyclopropene. Results To gain insight into the molecular process regulating ripening in apple, and to compare to tomato (model species for ripening studies), we utilized both homologous and heterologous (tomato) microarrays to profile transcriptome dynamics of genes involved in fruit development and ripening, emphasizing those which are ethylene regulated. The use of both types of microarrays facilitated transcriptome comparison between apple and tomato (for the latter, using data previously published and available at the TED: tomato expression database) and highlighted genes conserved during ripening of both species, which in turn represent a foundation for further comparative genomic studies. The cross-species analysis had the secondary aim of examining the efficiency of heterologous (specifically tomato) microarray hybridization for candidate gene identification as related to the ripening process. The resulting transcriptomics data revealed coordinated gene expression during fruit ripening of a subset of ripening-related and ethylene responsive genes, further facilitating the analysis of ethylene response during fruit maturation and ripening.
Conclusion Our combined strategy based on microarray hybridization enabled transcriptome characterization during normal climacteric apple ripening, as well as definition of ethylene-dependent transcriptome changes. Comparison with tomato fruit maturation and ethylene responsive transcriptome activity facilitated identification of putative conserved orthologous ripening-related genes, which serve as an initial set of candidates for assessing conservation of gene activity across genomes of fruit bearing plant species. PMID:20973957
Designing a User Manual to Support an In-House Database.
ERIC Educational Resources Information Center
Kraft, Melissa A.; Pugh, W. Jean
1988-01-01
Describes the steps involved in designing a user manual for an in-house database. Topics covered include goal definition, target audience identification, production scheduling, design and production choices, testing and review, and updating of the manual. (CLB)
Molecular Identification and Databases in Fusarium
USDA-ARS's Scientific Manuscript database
DNA sequence-based methods for identifying pathogenic and mycotoxigenic Fusarium isolates have become the gold standard worldwide. Moreover, fusarial DNA sequence data are increasing rapidly in several web-accessible databases for comparative purposes. Unfortunately, the use of Basic Alignment Sea...
Kang, Lin; Li, Nan; Li, Ping; Zhou, Yang; Gao, Shan; Gao, Hongwei; Xin, Wenwen; Wang, Jinglin
2017-04-01
Salmonella can cause foodborne illnesses worldwide in humans and many animals. The current diagnostic gold standard for detecting Salmonella infection is microbiological culture followed by serological confirmation tests. However, these methods are complicated and time-consuming. Matrix-assisted laser desorption/ionization time-of-flight mass spectrometry (MALDI-TOF MS) analysis offers some advantages in rapid identification, for example, simple and fast sample preparation, fast and automated measurement, and robust and reliable identification to the genus and species levels, possibly even to the strain level. In this study, we established a reference database for species identification using whole-cell MALDI-TOF MS; the database consisted of 12 main spectra obtained from Salmonella culture collection strains belonging to seven serotypes. Eighty-two clinical isolates of Salmonella were identified using the established database, and partial 16S rDNA gene sequencing and serological methods were used for comparison. We found that MALDI-TOF mass spectrometry provided high accuracy in identification of Salmonella at the species level but was limited in its ability to type or subtype Salmonella serovars. We also tried to find serovar-specific biomarkers but failed. Our study demonstrated that (a) MALDI-TOF MS was suitable for identification of Salmonella at the species level with high accuracy, (b) the MALDI-TOF MS method presented in this study was not currently useful for serovar assignment of Salmonella because of its low agreement with the serological method, and (c) the MALDI-TOF MS method presented in this study was not suitable for subtyping S. typhimurium because of its low discriminatory ability.
Becker, Pierre T; de Bel, Annelies; Martiny, Delphine; Ranque, Stéphane; Piarroux, Renaud; Cassagne, Carole; Detandt, Monique; Hendrickx, Marijke
2014-11-01
The identification of filamentous fungi by matrix-assisted laser desorption/ionization time-of-flight mass spectrometry (MALDI-TOF MS) relies mainly on a robust and extensive database of reference spectra. To this end, a large in-house library containing 760 strains and representing 472 species was built and evaluated on 390 clinical isolates by comparing MALDI-TOF MS with the classical identification method based on morphological observations. The use of MALDI-TOF MS resulted in the correct identification of 95.4% of the isolates at species level, without considering LogScore values. Taking into account Bruker's cutoff value for reliability (LogScore >1.70), 85.6% of the isolates were correctly identified. For a number of isolates, microscopic identification was limited to the genus, resulting in only 61.5% of the isolates correctly identified at species level, while the correctness reached 94.6% at genus level. Using this extended in-house database, MALDI-TOF MS thus appears superior to morphology for obtaining a robust and accurate identification of filamentous fungi. A continuous extension of the library is however necessary to further improve its reliability. Indeed, 15 isolates were still not represented while an additional three isolates were not recognized, probably because of a lack of intraspecific variability of the corresponding species in the database. © The Author 2014. Published by Oxford University Press on behalf of The International Society for Human and Animal Mycology. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.
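The two rates reported in this abstract differ only in whether a reliability cutoff is applied to the LogScore. A minimal sketch of that distinction follows; the function name and the result list are invented for illustration, and only the 1.70 cutoff comes from the abstract.

```python
def identification_rates(results, cutoff=1.70):
    """results: list of (log_score, is_correct).
    Returns (rate ignoring score, rate counting only scores above cutoff)."""
    n = len(results)
    correct_any = sum(ok for _, ok in results)
    correct_reliable = sum(ok and score > cutoff for score, ok in results)
    return correct_any / n, correct_reliable / n

# Invented results: (LogScore, matches the reference identification?).
results = [(2.10, True), (1.85, True), (1.62, True), (1.40, False), (1.95, True)]
rate_any, rate_reliable = identification_rates(results)

print(f"correct regardless of score: {rate_any:.0%}")
print(f"correct with LogScore > 1.70: {rate_reliable:.0%}")
```

The gap between the two numbers corresponds to correct identifications that fall below the reliability threshold.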
Using Web Database Tools To Facilitate the Construction of Knowledge in Online Courses.
ERIC Educational Resources Information Center
McNeil, Sara G.; Robin, Bernard R.
This paper presents an overview of database tools that dynamically generate World Wide Web materials and focuses on the use of these tools to support research activities, as well as teaching and learning. Database applications have been used in classrooms to support learning activities for over a decade, but, although business and e-commerce have…
ERIC Educational Resources Information Center
Li, Rui; Liu, Min
2007-01-01
The purpose of this study is to examine the potential of using computer databases as cognitive tools to share learners' cognitive load and facilitate learning in a multimedia problem-based learning (PBL) environment designed for sixth graders. Two research questions were: (a) can the computer database tool share sixth-graders' cognitive load? and…
Evaluation of the Microbial Identification System for identification of clinically isolated yeasts.
Crist, A E; Johnson, L M; Burke, P J
1996-01-01
The Microbial Identification System (MIS; Microbial ID, Inc., Newark, Del.) was evaluated for the identification of 550 clinically isolated yeasts. The organisms evaluated were fresh clinical isolates identified by methods routinely used in our laboratory (API 20C and conventional methods) and included Candida albicans (n = 294), C. glabrata (n = 145), C. tropicalis (n = 58), C. parapsilosis (n = 33), and other yeasts (n = 20). In preparation for fatty acid analysis, yeasts were inoculated onto Sabouraud dextrose agar and incubated at 28 degrees C for 24 h. Yeasts were harvested, saponified, derivatized, and extracted, and fatty acid analysis was performed according to the manufacturer's instructions. Fatty acid profiles were analyzed, and computer identifications were made with the Yeast Clinical Library (database version 3.8). Of the 550 isolates tested, 374 (68.0%) were correctly identified to the species level, with 87 (15.8%) being incorrectly identified and 89 (16.2%) giving no identification. Repeat testing of isolates giving no identification resulted in an additional 18 isolates being correctly identified. This gave the MIS an overall identification rate of 71.3%. The most frequently misidentified yeast was C. glabrata, which was identified as Saccharomyces cerevisiae 32.4% of the time. On the basis of these results, the MIS, with its current database, does not appear suitable for the routine identification of clinically important yeasts. PMID:8880489
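The percentages in this abstract follow directly from the reported counts, which the short check below reproduces. The variable names are ours; all numbers are taken from the abstract.

```python
# Isolates per species group, as listed in the abstract.
isolates = {
    "C. albicans": 294,
    "C. glabrata": 145,
    "C. tropicalis": 58,
    "C. parapsilosis": 33,
    "other yeasts": 20,
}
total = sum(isolates.values())

correct_first_pass = 374
incorrect = 87
no_identification = 89
recovered_on_repeat = 18  # correctly identified after repeat testing

first_pass_rate = correct_first_pass / total
overall_rate = (correct_first_pass + recovered_on_repeat) / total

print(f"total isolates: {total}")
print(f"first-pass correct: {first_pass_rate:.1%}")
print(f"overall correct after repeat testing: {overall_rate:.1%}")
```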
Yang, Lawrence H; Wonpat-Borja, Ahtoy J
2012-08-01
Identifying factors that facilitate treatment for psychotic disorders among Chinese immigrants is crucial due to delayed treatment use. Identifying causal beliefs held by relatives that might predict identification of 'mental illness' as opposed to other 'indigenous labels' may promote more effective mental health service use. We examine what effects beliefs of 'physical causes' and other non-biomedical causal beliefs ('general social causes', and 'indigenous Chinese beliefs' or culture-specific epistemologies of illness) might have on mental illness identification. Forty-nine relatives of Chinese-immigrant consumers with psychosis were sampled. Higher endorsement of 'physical causes' was associated with mental illness labeling. However, among the non-biomedical causal beliefs, 'general social causes' demonstrated no relationship with mental illness identification, while endorsement of 'indigenous Chinese beliefs' showed a negative relationship. Effective treatment- and community-based psychoeducation, in addition to emphasizing biomedical models, might integrate indigenous Chinese epistemologies of illness to facilitate rapid identification of psychotic disorders and promote treatment use.
Wonpat-Borja, Ahtoy J.
2013-01-01
Identifying factors that facilitate treatment for psychotic disorders among Chinese immigrants is crucial due to delayed treatment use. Identifying causal beliefs held by relatives that might predict identification of ‘mental illness’ as opposed to other ‘indigenous labels’ may promote more effective mental health service use. We examine what effects beliefs of ‘physical causes’ and other non-biomedical causal beliefs (‘general social causes’, and ‘indigenous Chinese beliefs’ or culture-specific epistemologies of illness) might have on mental illness identification. Forty-nine relatives of Chinese-immigrant consumers with psychosis were sampled. Higher endorsement of ‘physical causes’ was associated with mental illness labeling. However, among the non-biomedical causal beliefs, ‘general social causes’ demonstrated no relationship with mental illness identification, while endorsement of ‘indigenous Chinese beliefs’ showed a negative relationship. Effective treatment- and community-based psychoeducation, in addition to emphasizing biomedical models, might integrate indigenous Chinese epistemologies of illness to facilitate rapid identification of psychotic disorders and promote treatment use. PMID:22075770
Gruszka, Damian; Marzec, Marek; Szarejko, Iwona
2012-06-14
The high level of conservation of genes that regulate DNA replication and repair indicates that they may serve as a source of information on the origin and evolution of the species and makes them a reliable system for the identification of cross-species homologs. Studies conducted to date have shed light on the processes of DNA replication and repair in bacteria, yeast and mammals. However, there is still much to be learned about the process of DNA damage repair in plants. These studies, which were conducted mainly using bioinformatics tools, enabled the list of genes that participate in various pathways of DNA repair in Arabidopsis thaliana (L.) Heynh to be outlined; however, information regarding these mechanisms in crop plants is still very limited. A similar functional approach is particularly difficult for species whose complete genomic sequences are still unavailable. One solution is to apply ESTs (Expressed Sequence Tags) as the basis for gene identification. For the construction of the barley EST DNA Replication and Repair Database (bEST-DRRD), presented here, the Arabidopsis nucleotide and protein sequences involved in DNA replication and repair were used to browse for and retrieve the deposited sequences, derived from four barley (Hordeum vulgare L.) sequence databases, including the "Barley Genome version 0.05" database (encompassing ca. 90% of barley coding sequences) and from two databases covering the complete genomes of two monocot models: Oryza sativa L. and Brachypodium distachyon L. in order to identify homologous genes. Sequences of the categorised Arabidopsis queries are used for browsing the repositories, which are located on the ViroBLAST platform. The bEST-DRRD is currently used in our project during the identification and validation of the barley genes involved in DNA repair.
The presented database provides information about the Arabidopsis genes involved in DNA replication and repair, their expression patterns and models of protein interactions. It was designed and established to provide an open-access tool for the identification of monocot homologs of known Arabidopsis genes that are responsible for DNA-related processes. The barley genes identified in the project are currently being analysed to validate their function.
NASA Astrophysics Data System (ADS)
Thorne, James H.; Girvetz, Evan H.; McCoy, Michael C.
2009-05-01
This study presents a GIS-based database framework used to assess aggregate terrestrial habitat impacts from multiple highway construction projects in California, USA. Transportation planners need such impact assessment tools to effectively address additive biological mitigation obligations. Such assessments can reduce costly delays due to protracted environmental review. This project incorporated the best available statewide natural resource data into early project planning and preliminary environmental assessments for single and multiple highway construction projects, and provides an assessment of the 10-year state-wide mitigation obligations for the California Department of Transportation. Incorporation of these assessments will facilitate early and more strategic identification of mitigation opportunities, for single-project and regional mitigation efforts. The data architecture format uses eight spatial scales: six nested watersheds, counties, and transportation planning districts, which were intersected. This resulted in 8058 map planning units statewide, which were used to summarize all subsequent analyses. Range maps and georeferenced locations of federally and state-listed plants and animals and a 55-class landcover map were spatially intersected with the planning units and the buffered spatial footprint of 967 funded projects. Projected impacts were summarized and output to the database. Queries written in the database can sum expected impacts and provide summaries by individual construction project, or by watershed, county, transportation district or highway. The data architecture allows easy incorporation of new information and results in a tool usable without GIS by a wide variety of agency biologists and planners. The data architecture format would be useful for other types of regional planning.
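The abstract states that queries written in the database can sum expected impacts by project, watershed, county, transportation district or highway. The sketch below illustrates that kind of grouped summary with Python's built-in sqlite3 module. The table name, column names and all values are hypothetical and are not taken from the study's database.

```python
import sqlite3

# In-memory database with a hypothetical impacts table (toy values only).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE impacts ("
    "project TEXT, county TEXT, watershed TEXT, habitat_ha REAL)"
)
rows = [
    ("P1", "County A", "Watershed 1", 2.5),
    ("P2", "County A", "Watershed 1", 1.5),
    ("P3", "County B", "Watershed 2", 4.0),
]
conn.executemany("INSERT INTO impacts VALUES (?, ?, ?, ?)", rows)


def impact_by(column: str) -> dict:
    """Sum projected habitat impact grouped by one summary unit."""
    allowed = {"project", "county", "watershed"}
    if column not in allowed:  # guard: the column name is interpolated into SQL
        raise ValueError(f"unsupported grouping column: {column}")
    query = f"SELECT {column}, SUM(habitat_ha) FROM impacts GROUP BY {column}"
    return dict(conn.execute(query).fetchall())


by_county = impact_by("county")
by_project = impact_by("project")
print(by_county)
```

Because the summaries are plain database queries, a planner could run them without any GIS software, which matches the usability point made in the abstract.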
RAId_DbS: Peptide Identification using Database Searches with Realistic Statistics
Alves, Gelio; Ogurtsov, Aleksey Y; Yu, Yi-Kuo
2007-01-01
Background: The key to mass-spectrometry-based proteomics is peptide identification. A major challenge in peptide identification is to obtain realistic E-values when assigning statistical significance to candidate peptides. Results: Using a simple scoring scheme, we propose a database search method with theoretically characterized statistics. Taking into account possible skewness in the random variable distribution and the effect of finite sampling, we provide a theoretical derivation for the tail of the score distribution. For every experimental spectrum examined, we collect the scores of peptides in the database, and find good agreement between the collected score statistics and our theoretical distribution. Using Student's t-tests, we quantify the degree of agreement between the theoretical distribution and the score statistics collected. The t-tests may be used to measure the reliability of reported statistics. When combined with the reported P-value for a peptide hit using a score distribution model, this new measure prevents exaggerated statistics. Another feature of RAId_DbS is its capability of detecting multiple co-eluted peptides. The peptide identification performance and statistical accuracy of RAId_DbS are assessed and compared with several other search tools. The executables and data related to RAId_DbS are freely available upon request. PMID:17961253
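For readers unfamiliar with E-values, the sketch below shows the commonly used relation between a per-peptide P-value and an E-value: the P-value multiplied by the number of candidate peptides scored against the spectrum. This is a general convention assumed here for illustration; the abstract does not spell out the exact formula used by RAId_DbS, and the function name is hypothetical.

```python
def e_value(p_value: float, n_candidates: int) -> float:
    """Expected number of random candidates scoring at least as well.

    Assumes the common convention E = P * N, where N is the number of
    database peptides compared against the spectrum.
    """
    if not 0.0 <= p_value <= 1.0:
        raise ValueError("p_value must lie in [0, 1]")
    if n_candidates < 0:
        raise ValueError("n_candidates must be non-negative")
    return p_value * n_candidates


# A hit with P = 1e-6 among 50,000 scored candidates.
example = e_value(1e-6, 50_000)
print(example)
```

Under this convention a small P-value can still correspond to a sizeable E-value when the candidate pool is large, which is why realistic P-values matter for the final significance assignment.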
Influenza research database: an integrated bioinformatics resource for influenza virus research
USDA-ARS's Scientific Manuscript database
The Influenza Research Database (IRD) is a U.S. National Institute of Allergy and Infectious Diseases (NIAID)-sponsored Bioinformatics Resource Center dedicated to providing bioinformatics support for influenza virus research. IRD facilitates the research and development of vaccines, diagnostics, an...
Peptide reranking with protein-peptide correspondence and precursor peak intensity information.
Yang, Chao; He, Zengyou; Yang, Can; Yu, Weichuan
2012-01-01
Searching tandem mass spectra against a protein database has been a mainstream method for peptide identification. Improving peptide identification results by ranking true Peptide-Spectrum Matches (PSMs) over their false counterparts has led to the development of various reranking algorithms. In peptide reranking, discriminative information is essential to distinguish true PSMs from false PSMs. Generally, most peptide reranking methods obtain discriminative information directly from database search scores or by training machine learning models. Information in the protein database and MS1 spectra (i.e., single stage MS spectra) is ignored. In this paper, we propose to use information in the protein database and MS1 spectra to rerank peptide identification results. To quantitatively analyze their effects on peptide reranking results, three peptide reranking methods are proposed: PPMRanker, PPIRanker, and MIRanker. PPMRanker only uses Protein-Peptide Map (PPM) information from the protein database, PPIRanker only uses Precursor Peak Intensity (PPI) information, and MIRanker employs both PPM information and PPI information. According to our experiments on a standard protein mixture data set, a human data set and a mouse data set, PPMRanker and MIRanker achieve better peptide reranking results than PeptideProphet, PeptideProphet+NSP (number of sibling peptides) and a score regularization method SRPI. The source code of PPMRanker, PPIRanker, and MIRanker and all supplementary documents are available at our website: http://bioinformatics.ust.hk/pepreranking/. Alternatively, these documents can also be downloaded from: http://sourceforge.net/projects/pepreranking/.
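To make the idea of protein-peptide correspondence concrete, the toy sketch below boosts a PSM's score when its peptide maps to a protein that is also supported by other candidate peptides. This is not the PPMRanker algorithm, which the abstract does not describe in detail. The function name, the additive bonus and the `weight` parameter are all assumptions made for illustration.

```python
from collections import defaultdict


def rerank(psms, peptide_to_proteins, weight=0.1):
    """Rerank PSMs given as (spectrum_id, peptide, search_score) tuples."""
    # Collect the distinct candidate peptides that support each protein.
    support = defaultdict(set)
    for _, peptide, _ in psms:
        for protein in peptide_to_proteins.get(peptide, ()):
            support[protein].add(peptide)

    def adjusted(psm):
        _, peptide, score = psm
        # Sibling peptides: other candidate peptides from the same protein.
        siblings = max(
            (len(support[p]) - 1 for p in peptide_to_proteins.get(peptide, ())),
            default=0,
        )
        return score + weight * siblings

    # sorted() is stable, so ties keep their original relative order.
    return sorted(psms, key=adjusted, reverse=True)


psms = [("s1", "PEPA", 1.00), ("s2", "PEPB", 0.95), ("s3", "PEPC", 0.95)]
mapping = {"PEPA": {"X"}, "PEPB": {"Y"}, "PEPC": {"Y"}}
ranked = rerank(psms, mapping)
print([spectrum for spectrum, _, _ in ranked])
```

In this toy input the two peptides that share protein Y support each other and move ahead of the isolated hit, even though its raw search score was slightly higher.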
Renard, Bernhard Y.; Xu, Buote; Kirchner, Marc; Zickmann, Franziska; Winter, Dominic; Korten, Simone; Brattig, Norbert W.; Tzur, Amit; Hamprecht, Fred A.; Steen, Hanno
2012-01-01
Currently, the reliable identification of peptides and proteins is only feasible when thoroughly annotated sequence databases are available. Although sequencing capacities continue to grow, many organisms remain without reliable, fully annotated reference genomes required for proteomic analyses. Standard database search algorithms fail to identify peptides that are not exactly contained in a protein database. De novo searches are generally hindered by their restricted reliability, and current error-tolerant search strategies are limited by global, heuristic tradeoffs between database and spectral information. We propose a Bayesian information criterion-driven error-tolerant peptide search (BICEPS) and offer an open source implementation based on this statistical criterion to automatically balance the information of each single spectrum and the database, while limiting the run time. We show that BICEPS performs as well as current database search algorithms when such algorithms are applied to sequenced organisms, whereas BICEPS only uses a remotely related organism database. For instance, we use a chicken instead of a human database corresponding to an evolutionary distance of more than 300 million years (International Chicken Genome Sequencing Consortium (2004) Sequence and comparative analysis of the chicken genome provide unique perspectives on vertebrate evolution. Nature 432, 695–716). We demonstrate the successful application to cross-species proteomics with a 33% increase in the number of identified proteins for a filarial nematode sample of Litomosoides sigmodontis. PMID:22493179
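For context, the standard Bayesian information criterion trades goodness of fit against model complexity. The sketch below shows only that textbook definition with made-up numbers. How BICEPS parameterizes sequence modifications and spectrum likelihoods is not described in the abstract, so the comparison of an exact and an error-tolerant match is purely hypothetical.

```python
import math


def bic(log_likelihood: float, n_params: int, n_obs: int) -> float:
    """Standard BIC: k * ln(n) - 2 * ln(L). Lower values are preferred."""
    if n_obs <= 0:
        raise ValueError("n_obs must be positive")
    return n_params * math.log(n_obs) - 2.0 * log_likelihood


# Hypothetical example: an error-tolerant match uses one extra parameter
# (for instance, one substitution) but explains the spectrum better.
exact = bic(log_likelihood=-42.0, n_params=1, n_obs=50)
tolerant = bic(log_likelihood=-35.0, n_params=2, n_obs=50)
preferred = "tolerant" if tolerant < exact else "exact"
print(round(exact, 3), round(tolerant, 3), preferred)
```

The penalty term grows with each added parameter, so an error-tolerant explanation is preferred only when its gain in likelihood outweighs the cost of the extra parameter.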
Carmo, Michele Picanço; Costa, Nayara Thais de Oliveira; Momensohn-Santos, Teresa Maria
2013-10-01
Introduction: For infants under 6 months, the literature recommends 1,000-Hz tympanometry, which has a greater sensitivity for the correct identification of middle ear disorders in this population. Objective: To systematically analyze national and international publications found in electronic databases that used tympanometry with 226-Hz and 1,000-Hz probe tones. Data Synthesis: Initially, we identified 36 articles in the SciELO database, 11 in the Latin American and Caribbean Literature on the Health Sciences (LILACS) database, 199 in MEDLINE, 0 in the Cochrane database, 16 in ISI Web of Knowledge, and 185 in the Scopus database. We excluded 433 articles because they did not fit the selection criteria, leaving 14 publications that were analyzed in their entirety. Conclusions: The 1,000-Hz tone test has greater sensitivity and specificity for the correct identification of tympanometric curve changes. However, it is necessary to clarify the doubts that still exist regarding the use of this test frequency. Improved methods for rating curves, standardization of normality criteria, and the types of curves found in infants should be addressed.
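The selection counts in this review can be checked with a few lines of Python. The dictionary keys mirror the databases named in the abstract, while the variable names are illustrative.

```python
# Articles identified per database, as reported in the abstract.
hits = {
    "SciELO": 36,
    "LILACS": 11,
    "MEDLINE": 199,
    "Cochrane": 0,
    "ISI Web of Knowledge": 16,
    "Scopus": 185,
}
excluded = 433  # articles that did not fit the selection criteria

identified = sum(hits.values())
included = identified - excluded
print(identified, included)
```

The six database counts total 447 records, and removing the 433 exclusions leaves the 14 publications analyzed in full.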
Carmo, Michele Picanço; Costa, Nayara Thais de Oliveira; Momensohn-Santos, Teresa Maria
2013-01-01
Introduction: For infants under 6 months, the literature recommends 1,000-Hz tympanometry, which has a greater sensitivity for the correct identification of middle ear disorders in this population. Objective: To systematically analyze national and international publications found in electronic databases that used tympanometry with 226-Hz and 1,000-Hz probe tones. Data Synthesis: Initially, we identified 36 articles in the SciELO database, 11 in the Latin American and Caribbean Literature on the Health Sciences (LILACS) database, 199 in MEDLINE, 0 in the Cochrane database, 16 in ISI Web of Knowledge, and 185 in the Scopus database. We excluded 433 articles because they did not fit the selection criteria, leaving 14 publications that were analyzed in their entirety. Conclusions: The 1,000-Hz tone test has greater sensitivity and specificity for the correct identification of tympanometric curve changes. However, it is necessary to clarify the doubts that still exist regarding the use of this test frequency. Improved methods for rating curves, standardization of normality criteria, and the types of curves found in infants should be addressed. PMID:25992044
Wang, Julia; Al-Ouran, Rami; Hu, Yanhui; Kim, Seon-Young; Wan, Ying-Wooi; Wangler, Michael F; Yamamoto, Shinya; Chao, Hsiao-Tuan; Comjean, Aram; Mohr, Stephanie E; Perrimon, Norbert; Liu, Zhandong; Bellen, Hugo J
2017-06-01
One major challenge encountered with interpreting human genetic variants is the limited understanding of the functional impact of genetic alterations on biological processes. Furthermore, there remains an unmet demand for an efficient survey of the wealth of information on human homologs in model organisms across numerous databases. To efficiently assess the large volume of publicly available information, it is important to provide a concise summary of the most relevant information in a rapid user-friendly format. To this end, we created MARRVEL (model organism aggregated resources for rare variant exploration). MARRVEL is a publicly available website that integrates information from six human genetic databases and seven model organism databases. For any given variant or gene, MARRVEL displays information from OMIM, ExAC, ClinVar, Geno2MP, DGV, and DECIPHER. Importantly, it curates model organism-specific databases to concurrently display a concise summary regarding the human gene homologs in budding and fission yeast, worm, fly, fish, mouse, and rat on a single webpage. Experiment-based information on tissue expression, protein subcellular localization, biological process, and molecular function for the human gene and homologs in the seven model organisms is arranged into a concise output. Hence, rather than visiting multiple separate databases for variant and gene analysis, users can obtain important information by searching once through MARRVEL. Altogether, MARRVEL dramatically improves efficiency and accessibility to data collection and facilitates analysis of human genes and variants by cross-disciplinary integration of 18 million records available in public databases to facilitate clinical diagnosis and basic research. Copyright © 2017 American Society of Human Genetics. Published by Elsevier Inc. All rights reserved.
2010-01-01
Background: Accurate identification is necessary to discriminate harmless environmental Yersinia species from the food-borne pathogens Yersinia enterocolitica and Yersinia pseudotuberculosis and from the group A bioterrorism plague agent Yersinia pestis. In order to circumvent the limitations of current phenotypic and PCR-based identification methods, we aimed to assess the usefulness of matrix-assisted laser desorption/ionization time-of-flight (MALDI-TOF) protein profiling for accurate and rapid identification of Yersinia species. As a first step, we built a database of 39 different Yersinia strains representing 12 different Yersinia species, including 13 Y. pestis isolates representative of the Antiqua, Medievalis and Orientalis biotypes. The organisms were deposited on the MALDI-TOF plate after appropriate ethanol-based inactivation, and a protein profile was obtained within 6 minutes for each of the Yersinia species. Results: When compared with a 3,025-profile database, every Yersinia species yielded a unique protein profile and was unambiguously identified. In the second step of analysis, environmental and clinical isolates of Y. pestis (n = 2) and Y. enterocolitica (n = 11) were compared to the database and correctly identified. In particular, Y. pestis was unambiguously identified at the species level, and MALDI-TOF was able to successfully differentiate the three biotypes. Conclusion: These data indicate that MALDI-TOF can be used as a rapid and accurate first-line method for the identification of Yersinia isolates. PMID:21073689
Ayyadurai, Saravanan; Flaudrops, Christophe; Raoult, Didier; Drancourt, Michel
2010-11-12
Accurate identification is necessary to discriminate harmless environmental Yersinia species from the food-borne pathogens Yersinia enterocolitica and Yersinia pseudotuberculosis and from the group A bioterrorism plague agent Yersinia pestis. In order to circumvent the limitations of current phenotypic and PCR-based identification methods, we aimed to assess the usefulness of matrix-assisted laser desorption/ionization time-of-flight (MALDI-TOF) protein profiling for accurate and rapid identification of Yersinia species. As a first step, we built a database of 39 different Yersinia strains representing 12 different Yersinia species, including 13 Y. pestis isolates representative of the Antiqua, Medievalis and Orientalis biotypes. The organisms were deposited on the MALDI-TOF plate after appropriate ethanol-based inactivation, and a protein profile was obtained within 6 minutes for each of the Yersinia species. When compared with a 3,025-profile database, every Yersinia species yielded a unique protein profile and was unambiguously identified. In the second step of analysis, environmental and clinical isolates of Y. pestis (n = 2) and Y. enterocolitica (n = 11) were compared to the database and correctly identified. In particular, Y. pestis was unambiguously identified at the species level, and MALDI-TOF was able to successfully differentiate the three biotypes. These data indicate that MALDI-TOF can be used as a rapid and accurate first-line method for the identification of Yersinia isolates.
ERIC Educational Resources Information Center
Hammonds, S. J.
1990-01-01
A technique for the numerical identification of bacteria using normalized likelihoods calculated from a probabilistic database is described, and the principles of the technique are explained. The listing of the computer program is included. Specimen results from the program, and examples of how they should be interpreted, are given. (KR)
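A minimal sketch of identification by normalized likelihoods is shown below, assuming the usual formulation: each taxon's likelihood is the product of the probabilities of the observed test results, and the likelihoods are then divided by their sum. The taxa, tests and probability values are hypothetical, and this is not the program listed in the article.

```python
from math import prod

# Hypothetical probabilistic database: P(test is positive | taxon).
# Values are kept strictly between 0 and 1 so no likelihood collapses to zero.
MATRIX = {
    "Taxon A": {"test1": 0.99, "test2": 0.01, "test3": 0.90},
    "Taxon B": {"test1": 0.01, "test2": 0.99, "test3": 0.50},
    "Taxon C": {"test1": 0.99, "test2": 0.99, "test3": 0.10},
}


def normalized_likelihoods(results):
    """Map each taxon to its likelihood divided by the sum over all taxa.

    `results` maps test name -> True (positive) or False (negative).
    """
    likelihoods = {
        taxon: prod(
            probs[test] if positive else 1.0 - probs[test]
            for test, positive in results.items()
        )
        for taxon, probs in MATRIX.items()
    }
    total = sum(likelihoods.values())
    return {taxon: value / total for taxon, value in likelihoods.items()}


scores = normalized_likelihoods({"test1": True, "test2": False, "test3": True})
best = max(scores, key=scores.get)
print(best, round(scores[best], 4))
```

In practice an identification is usually accepted only when the best normalized likelihood exceeds a chosen threshold; the threshold to use is a laboratory decision and is not specified here.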
DrugBank 4.0: shedding new light on drug metabolism
Law, Vivian; Knox, Craig; Djoumbou, Yannick; Jewison, Tim; Guo, An Chi; Liu, Yifeng; Maciejewski, Adam; Arndt, David; Wilson, Michael; Neveu, Vanessa; Tang, Alexandra; Gabriel, Geraldine; Ly, Carol; Adamjee, Sakina; Dame, Zerihun T.; Han, Beomsoo; Zhou, You; Wishart, David S.
2014-01-01
DrugBank (http://www.drugbank.ca) is a comprehensive online database containing extensive biochemical and pharmacological information about drugs, their mechanisms and their targets. Since it was first described in 2006, DrugBank has rapidly evolved, both in response to user requests and in response to changing trends in drug research and development. Previous versions of DrugBank have been widely used to facilitate drug and in silico drug target discovery. The latest update, DrugBank 4.0, has been further expanded to contain data on drug metabolism, absorption, distribution, metabolism, excretion and toxicity (ADMET) and other kinds of quantitative structure activity relationships (QSAR) information. These enhancements are intended to facilitate research in xenobiotic metabolism (both prediction and characterization), pharmacokinetics, pharmacodynamics and drug design/discovery. For this release, >1200 drug metabolites (including their structures, names, activity, abundance and other detailed data) have been added along with >1300 drug metabolism reactions (including metabolizing enzymes and reaction types) and dozens of drug metabolism pathways. Another 30 predicted or measured ADMET parameters have been added to each DrugCard, bringing the average number of quantitative ADMET values for Food and Drug Administration-approved drugs close to 40. Referential nuclear magnetic resonance and MS spectra have been added for almost 400 drugs as well as spectral and mass matching tools to facilitate compound identification. This expanded collection of drug information is complemented by a number of new or improved search tools, including one that provides a simple analysis of drug–target, –enzyme and –transporter associations to provide insight on drug–drug interactions. PMID:24203711
Functional Annotation of the Arabidopsis Genome Using Controlled Vocabularies
Berardini, Tanya Z.; Mundodi, Suparna; Reiser, Leonore; Huala, Eva; Garcia-Hernandez, Margarita; Zhang, Peifen; Mueller, Lukas A.; Yoon, Jungwoon; Doyle, Aisling; Lander, Gabriel; Moseyko, Nick; Yoo, Danny; Xu, Iris; Zoeckler, Brandon; Montoya, Mary; Miller, Neil; Weems, Dan; Rhee, Seung Y.
2004-01-01
Controlled vocabularies are increasingly used by databases to describe genes and gene products because they facilitate identification of similar genes within an organism or among different organisms. One of The Arabidopsis Information Resource's goals is to associate all Arabidopsis genes with terms developed by the Gene Ontology Consortium that describe the molecular function, biological process, and subcellular location of a gene product. We have also developed terms describing Arabidopsis anatomy and developmental stages and use these to annotate published gene expression data. As of March 2004, we used computational and manual annotation methods to make 85,666 annotations representing 26,624 unique loci. We focus on associating genes to controlled vocabulary terms based on experimental data from the literature and use The Arabidopsis Information Resource-developed PubSearch software to facilitate this process. Each annotation is tagged with a combination of evidence codes, evidence descriptions, and references that provide a robust means to assess data quality. Annotation of all Arabidopsis genes will allow quantitative comparisons between sets of genes derived from sources such as microarray experiments. The Arabidopsis annotation data will also facilitate annotation of newly sequenced plant genomes by using sequence similarity to transfer annotations to homologous genes. In addition, complete and up-to-date annotations will make unknown genes easy to identify and target for experimentation. Here, we describe the process of Arabidopsis functional annotation using a variety of data sources and illustrate several ways in which this information can be accessed and used to infer knowledge about Arabidopsis and other plant species. PMID:15173566
Kwon, Taejoon; Choi, Hyungwon; Vogel, Christine; Nesvizhskii, Alexey I; Marcotte, Edward M
2011-07-01
Shotgun proteomics using mass spectrometry is a powerful method for protein identification but suffers from limited sensitivity in complex samples. Integrating peptide identifications from multiple database search engines is a promising strategy to increase the number of peptide identifications and reduce the volume of unassigned tandem mass spectra. Existing methods pool statistical significance scores such as p-values or posterior probabilities of peptide-spectrum matches (PSMs) from multiple search engines after high scoring peptides have been assigned to spectra, but these methods lack reliable control of identification error rates as data are integrated from different search engines. We developed a statistically coherent method for integrative analysis, termed MSblender. MSblender converts raw search scores from search engines into a probability score for every possible PSM and properly accounts for the correlation between search scores. The method reliably estimates false discovery rates and identifies more PSMs than any single search engine at the same false discovery rate. Increased identifications increment spectral counts for most proteins and allow quantification of proteins that would not have been quantified by individual search engines. We also demonstrate that enhanced quantification contributes to improved sensitivity in differential expression analyses.
Kwon, Taejoon; Choi, Hyungwon; Vogel, Christine; Nesvizhskii, Alexey I.; Marcotte, Edward M.
2011-01-01
Shotgun proteomics using mass spectrometry is a powerful method for protein identification but suffers from limited sensitivity in complex samples. Integrating peptide identifications from multiple database search engines is a promising strategy to increase the number of peptide identifications and reduce the volume of unassigned tandem mass spectra. Existing methods pool statistical significance scores such as p-values or posterior probabilities of peptide-spectrum matches (PSMs) from multiple search engines after high scoring peptides have been assigned to spectra, but these methods lack reliable control of identification error rates as data are integrated from different search engines. We developed a statistically coherent method for integrative analysis, termed MSblender. MSblender converts raw search scores from search engines into a probability score for every possible PSM and properly accounts for the correlation between search scores. The method reliably estimates false discovery rates and identifies more PSMs than any single search engine at the same false discovery rate. Increased identifications increment spectral counts for most proteins and allow quantification of proteins that would not have been quantified by individual search engines. We also demonstrate that enhanced quantification contributes to improved sensitivity in differential expression analyses. PMID:21488652
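Once every PSM carries a probability of being correct, a false discovery rate for any accepted set can be estimated from those probabilities. The sketch below uses one common estimator, the mean of (1 − probability) over the accepted PSMs. It is a generic illustration and not necessarily the exact procedure implemented in MSblender; the function name and the example probabilities are assumptions.

```python
def estimated_fdr(probabilities, threshold):
    """Estimate the FDR among PSMs whose probability is >= threshold.

    Each accepted PSM contributes (1 - p), its expected chance of being wrong.
    Returns 0.0 when nothing passes the threshold.
    """
    accepted = [p for p in probabilities if p >= threshold]
    if not accepted:
        return 0.0
    return sum(1.0 - p for p in accepted) / len(accepted)


# Hypothetical PSM probabilities from an integrated search.
psm_probabilities = [0.99, 0.95, 0.90, 0.40]
fdr_at_090 = estimated_fdr(psm_probabilities, threshold=0.90)
print(round(fdr_at_090, 4))
```

Lowering the threshold admits more PSMs at the price of a higher estimated error rate, which is the trade-off behind comparing search engines at the same false discovery rate.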
Mining a human transcriptome database for Nrf2 modulators
Nuclear factor erythroid-2 related factor 2 (Nrf2) is a key transcription factor important in the protection against oxidative stress. We developed computational procedures to enable the identification of chemical, genetic and environmental modulators of Nrf2 in a large database ...
AN EPA SPONSORED LITERATURE REVIEW DATABASE TO SUPPORT STRESSOR IDENTIFICATION
The Causal Analysis/Diagnosis Decision Information System (CADDIS) is an EPA decision-support system currently under development for evaluating the biological impact of stressors on water bodies. In support of CADDIS, EPA is developing CADLIT, a searchable database of the scient...
Santos, Sara; Oliveira, Manuela; Amorim, António; van Asch, Barbara
2014-11-01
The grapevine (Vitis vinifera subsp. vinifera) is one of the most important agricultural crops worldwide. A long-standing interest in the historical origins of ancient and currently cultivated grapevines, as well as the need to establish phylogenetic relationships and parentage, solve homonymies and synonymies, fingerprint cultivars and clones, and assess the authenticity of plants and wines has encouraged the development of genetic identification methods. STR analysis is currently the most commonly used method for these purposes. A large dataset of grapevine genotypes for many cultivars worldwide has been produced in the last decade using a common set of recommended dinucleotide nuclear STRs. This type of marker has been replaced by long core-repeat loci in standardized state-of-the-art human forensic genotyping. Previous authors have already taken the first steps toward harmonized grapevine genotyping, bringing the genetic identification methods closer to human forensic STR standards. In this context, we bring forward a set of basic suggestions that reinforce the need to (i) guarantee trueness-to-type of the sample; (ii) use the long core-repeat markers; (iii) verify the specificity and amplification consistency of PCR primers; (iv) sequence frequent alleles and use these standardized allele ladders; (v) consider mutation rates when evaluating results of STR-based parentage and pedigree analysis; (vi) genotype large and representative samples in order to obtain allele frequency databases; (vii) standardize genotype data by establishing allele nomenclature based on repeat number to facilitate information exchange and data compilation. © 2014 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Code of Federal Regulations, 2014 CFR
2014-07-01
... maintenance of aerial photographic records? (a) Mark each aerial film container with a unique identification code to facilitate identification and filing. (b) Mark aerial film indexes with the unique aerial film identification codes or container codes for the aerial film that they index. Also, file and mark the aerial...
Code of Federal Regulations, 2013 CFR
2013-07-01
... maintenance of aerial photographic records? (a) Mark each aerial film container with a unique identification code to facilitate identification and filing. (b) Mark aerial film indexes with the unique aerial film identification codes or container codes for the aerial film that they index. Also, file and mark the aerial...
Code of Federal Regulations, 2012 CFR
2012-07-01
... maintenance of aerial photographic records? (a) Mark each aerial film container with a unique identification code to facilitate identification and filing. (b) Mark aerial film indexes with the unique aerial film identification codes or container codes for the aerial film that they index. Also, file and mark the aerial...
Code of Federal Regulations, 2011 CFR
2011-07-01
... maintenance of aerial photographic records? (a) Mark each aerial film container with a unique identification code to facilitate identification and filing. (b) Mark aerial film indexes with the unique aerial film identification codes or container codes for the aerial film that they index. Also, file and mark the aerial...
Code of Federal Regulations, 2010 CFR
2010-07-01
... maintenance of aerial photographic records? (a) Mark each aerial film container with a unique identification code to facilitate identification and filing. (b) Mark aerial film indexes with the unique aerial film identification codes or container codes for the aerial film that they index. Also, file and mark the aerial...
ERIC Educational Resources Information Center
Steele, Marcee M.
2004-01-01
The early identification of children with learning disabilities (LD) is difficult but can be accomplished. Observation of key behaviors which are indicators of LD by preschool and kindergarten teachers can assist in this process. This early identification facilitates the use of intervention strategies to provide a positive early experience for…
Basophile: Accurate Fragment Charge State Prediction Improves Peptide Identification Rates
Wang, Dong; Dasari, Surendra; Chambers, Matthew C.; ...
2013-03-07
In shotgun proteomics, database search algorithms rely on fragmentation models to predict fragment ions that should be observed for a given peptide sequence. The most widely used strategy (Naive model) is oversimplified, cleaving all peptide bonds with equal probability to produce fragments of all charges below that of the precursor ion. More accurate models, based on fragmentation simulation, are too computationally intensive for on-the-fly use in database search algorithms. We have created an ordinal-regression-based model called Basophile that takes fragment size and basic residue distribution into account when determining the charge retention during CID/higher-energy collision induced dissociation (HCD) of charged peptides. This model improves the accuracy of predictions by reducing the number of unnecessary fragments that are routinely predicted for highly-charged precursors. Basophile increased the identification rates by 26% (on average) over the Naive model, when analyzing triply-charged precursors from ion trap data. Basophile achieves simplicity and speed by solving the prediction problem with an ordinal regression equation, which can be incorporated into any database search software for shotgun proteomic identification.
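The Naive model described above can be sketched as a simple enumeration: cleave every peptide bond and predict each resulting fragment at every charge state below that of the precursor. The sketch assumes b- and y-type ions only and treats singly charged precursors as yielding charge-1 fragments; both choices, the function name and the example peptide are illustrative assumptions, not details taken from the abstract.

```python
def naive_fragments(peptide, precursor_charge):
    """Enumerate fragment ions predicted by a Naive-style model.

    Every peptide bond is cleaved, and each fragment is predicted at every
    charge from 1 up to (precursor_charge - 1). Assumption: a singly charged
    precursor still yields charge-1 fragments.
    Returns a list of (ion_type, fragment_length, charge) tuples.
    """
    max_fragment_charge = max(1, precursor_charge - 1)
    fragments = []
    for bond in range(1, len(peptide)):  # cleavage after residue number `bond`
        for charge in range(1, max_fragment_charge + 1):
            fragments.append(("b", bond, charge))                 # N-terminal piece
            fragments.append(("y", len(peptide) - bond, charge))  # C-terminal piece
    return fragments


# Hypothetical 8-residue peptide, for illustration only.
doubly = naive_fragments("PEPTIDEK", precursor_charge=2)
triply = naive_fragments("PEPTIDEK", precursor_charge=3)
print(len(doubly), len(triply))
```

The number of predicted fragments grows linearly with precursor charge, which is why a model that prunes unlikely charge states matters most for highly charged precursors.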
Carey, George B; Kazantsev, Stephanie; Surati, Mosmi; Rolle, Cleo E; Kanteti, Archana; Sadiq, Ahad; Bahroos, Neil; Raumann, Brigitte; Madduri, Ravi; Dave, Paul; Starkey, Adam; Hensing, Thomas; Husain, Aliya N; Vokes, Everett E; Vigneswaran, Wickii; Armato, Samuel G; Kindler, Hedy L; Salgia, Ravi
2012-01-01
Objective An area of need in cancer informatics is the ability to store images in a comprehensive database as part of translational cancer research. To meet this need, we have implemented a novel tandem database infrastructure that facilitates image storage and utilisation. Background We had previously implemented the Thoracic Oncology Program Database Project (TOPDP) database for our translational cancer research needs. While useful for many research endeavours, it is unable to store images, hence our need to implement an imaging database which could communicate easily with the TOPDP database. Methods The Thoracic Oncology Research Program (TORP) imaging database was designed using the Research Electronic Data Capture (REDCap) platform, which was developed by Vanderbilt University. To demonstrate proof of principle and evaluate utility, we performed a retrospective investigation into tumour response for malignant pleural mesothelioma (MPM) patients treated at the University of Chicago Medical Center with either of two analogous chemotherapy regimens and consented to at least one of two UCMC IRB protocols, 9571 and 13473A. Results A cohort of 22 MPM patients was identified using clinical data in the TOPDP database. After measurements were acquired, two representative CT images and 0–35 histological images per patient were successfully stored in the TORP database, along with clinical and demographic data. Discussion We implemented the TORP imaging database to be used in conjunction with our comprehensive TOPDP database. While it requires an additional effort to use two databases, our database infrastructure facilitates more comprehensive translational research. Conclusions The investigation described herein demonstrates the successful implementation of this novel tandem imaging database infrastructure, as well as the potential utility of investigations enabled by it. 
The data model presented here can be utilised as the basis for further development of other larger, more streamlined databases in the future. PMID:23103606
Identification of clinical yeasts by Vitek MS system compared with API ID 32 C.
Durán-Valle, M Teresa; Sanz-Rodríguez, Nuria; Muñoz-Paraíso, Carmen; Almagro-Moltó, María; Gómez-Garcés, José Luis
2014-05-01
We performed a clinical evaluation of the Vitek MS matrix-assisted laser desorption ionization-time-of-flight mass spectrometry (MALDI-TOF MS) system with the commercial database version 2.0 for rapid identification of medically important yeasts as compared with the conventional phenotypic method API ID 32 C. We tested 161 clinical isolates, nine isolates from culture collections and five reference strains. In case of discrepant results or no identification with one or both methods, molecular identification techniques were employed. Concordance between both methods was observed with 160/175 isolates (91.42%) and misidentifications by both systems occurred only when taxa were not included in the respective databases, i.e., one isolate of Candida etchellsii was identified as C. globosa by Vitek MS and two isolates of C. orthopsilosis were identified as C. parapsilosis by API ID 32 C. Vitek MS could not identify nine strains (5.14%) and API ID 32 C did not identify 13 (7.42%). Vitek MS was more reliable than API ID 32 C and reduced the time required for the identification of clinical isolates to only a few minutes.
Alshawa, Kinda; Beretti, Jean-Luc; Lacroix, Claire; Feuilhade, Martine; Dauphin, Brunhilde; Quesne, Gilles; Hassouni, Noura; Nassif, Xavier
2012-01-01
Dermatophytes are keratinolytic fungi responsible for a wide variety of diseases of glabrous skin, nails, and hair. Their identification, currently based on morphological criteria, is hindered by intraspecies morphological variability and the atypical morphology of some clinical isolates. The aim of this study was to evaluate matrix-assisted laser desorption ionization–time of flight mass spectrometry (MALDI-TOF MS) as a routine tool for identifying dermatophyte and Neoscytalidium species, both of which cause dermatomycoses. We first developed a spectral database of 12 different species of common and unusual dermatophytes and two molds responsible for dermatomycoses (Neoscytalidium dimidiatum and N. dimidiatum var. hyalinum). We then prospectively tested the performance of the database on 381 clinical dermatophyte and Neoscytalidium isolates. Correct identification of the species was obtained for 331/360 dermatophytes (91.9%) and 18/21 Neoscytalidium isolates (85.7%). The results of MALDI-TOF MS and standard identification disagreed for only 2 isolates. These results suggest that MALDI-TOF MS could be a useful tool for routine and fast identification of dermatophytes and Neoscytalidium spp. in clinical mycology laboratories. PMID:22535981
MaizeGDB: New tools and resource
USDA-ARS's Scientific Manuscript database
MaizeGDB, the USDA-ARS genetics and genomics database, is a highly curated, community-oriented informatics service to researchers focused on the crop plant and model organism Zea mays. MaizeGDB facilitates maize research by curating, integrating, and maintaining a database that serves as the central...
StreptomycesInforSys: A web-enabled information repository
Jain, Chakresh Kumar; Gupta, Vidhi; Gupta, Ashvarya; Gupta, Sanjay; Wadhwa, Gulshan; Sharma, Sanjeev Kumar; Sarethy, Indira P
2012-01-01
Members of Streptomyces produce 70% of natural bioactive products. A considerable amount of information based on the polyphasic approach is available for the classification of Streptomyces. This information, based on phenotypic, genotypic and bioactive component production profiles, is crucial for pharmacological screening programmes. However, it is scattered across various journals, books and other resources, many of which are not freely accessible. The designed database incorporates polyphasic typing information using combinations of search options to aid in efficient screening of new isolates. This will help in the preliminary categorization of appropriate groups. It is a free relational database compatible with existing operating systems. A cross-platform technology with the XAMPP Web server has been used to develop, manage, and facilitate user queries effectively with database support. Employment of PHP, a platform-independent scripting language, embedded in HTML and the database management software MySQL will facilitate dynamic information storage and retrieval. The user-friendly, open and flexible freeware (PHP, MySQL and Apache) is foreseen to reduce running and maintenance cost. Availability: www.sis.biowaves.org. PMID:23275736
StreptomycesInforSys: A web-enabled information repository.
Jain, Chakresh Kumar; Gupta, Vidhi; Gupta, Ashvarya; Gupta, Sanjay; Wadhwa, Gulshan; Sharma, Sanjeev Kumar; Sarethy, Indira P
2012-01-01
Members of Streptomyces produce 70% of natural bioactive products. A considerable amount of information based on the polyphasic approach is available for the classification of Streptomyces. This information, based on phenotypic, genotypic and bioactive component production profiles, is crucial for pharmacological screening programmes. However, it is scattered across various journals, books and other resources, many of which are not freely accessible. The designed database incorporates polyphasic typing information using combinations of search options to aid in efficient screening of new isolates. This will help in the preliminary categorization of appropriate groups. It is a free relational database compatible with existing operating systems. A cross-platform technology with the XAMPP Web server has been used to develop, manage, and facilitate user queries effectively with database support. Employment of PHP, a platform-independent scripting language, embedded in HTML and the database management software MySQL will facilitate dynamic information storage and retrieval. The user-friendly, open and flexible freeware (PHP, MySQL and Apache) is foreseen to reduce running and maintenance cost. www.sis.biowaves.org.
NASA Astrophysics Data System (ADS)
Li, Xiuyuan; Tang, Yanyan; Lu, Xinxin
2018-04-01
Currently, the capability of identification for Acinetobacter species using MALDI-TOF MS still remains unclear in clinical laboratories due to certain elusory phenomena. Thus, we conducted this research to evaluate this technique and reveal the causes of misidentification. Briefly, a total of 788 Acinetobacter strains were collected and confirmed at the species level by 16S rDNA and rpoB sequencing, and subsequently compared to the identification by MALDI-TOF MS using direct smear and bacterial extraction pretreatments. Cluster analysis was performed based on the mass spectra and 16S rDNA to reflect the diversity among different species. Eventually, 19 Acinetobacter species were confirmed, including 6 species unavailable in the Biotyper 3.0 database. Another novel species was observed, temporarily named A. corallinus. The accuracy of identification for Acinetobacter species using MALDI-TOF MS was 97.08% (765/788), regardless of which pretreatment was applied. The misidentification only occurred on 3 A. parvus strains and 20 strains of species unavailable in the database. The proportions of strains with identification score ≥ 2.000 using direct smear and bacterial extraction pretreatments were 86.04% (678/788) and 95.43% (752/788), respectively (χ² = 41.336, P < 0.001). The species similar in 16S rDNA were discriminative from the mass spectra, such as A. baumannii & A. junii, A. pittii & A. calcoaceticus, and A. nosocomialis & A. seifertii. Therefore, using MALDI-TOF MS to identify Acinetobacter strains isolated from clinical samples was deemed reliable. Misidentification occurred occasionally due to the insufficiency of the database rather than sample extraction failure. We suggest gene sequencing should be performed when the identification score is under 2.000 even when using bacterial extraction pretreatment.
Li, Xiuyuan; Tang, Yanyan; Lu, Xinxin
2018-04-09
Currently, the capability of identification for Acinetobacter species using MALDI-TOF MS still remains unclear in clinical laboratories due to certain elusory phenomena. Thus, we conducted this research to evaluate this technique and reveal the causes of misidentification. Briefly, a total of 788 Acinetobacter strains were collected and confirmed at the species level by 16S rDNA and rpoB sequencing, and subsequently compared to the identification by MALDI-TOF MS using direct smear and bacterial extraction pretreatments. Cluster analysis was performed based on the mass spectra and 16S rDNA to reflect the diversity among different species. Eventually, 19 Acinetobacter species were confirmed, including 6 species unavailable in the Biotyper 3.0 database. Another novel species was observed, temporarily named A. corallinus. The accuracy of identification for Acinetobacter species using MALDI-TOF MS was 97.08% (765/788), regardless of which pretreatment was applied. The misidentification only occurred on 3 A. parvus strains and 20 strains of species unavailable in the database. The proportions of strains with identification score ≥ 2.000 using direct smear and bacterial extraction pretreatments were 86.04% (678/788) and 95.43% (752/788), respectively (χ² = 41.336, P < 0.001). The species similar in 16S rDNA were discriminative from the mass spectra, such as A. baumannii & A. junii, A. pittii & A. calcoaceticus, and A. nosocomialis & A. seifertii. Therefore, using MALDI-TOF MS to identify Acinetobacter strains isolated from clinical samples was deemed reliable. Misidentification occurred occasionally due to the insufficiency of the database rather than sample extraction failure. We suggest gene sequencing should be performed when the identification score is under 2.000 even when using bacterial extraction pretreatment.
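The reported test statistic can be checked from the counts given in the abstract. The sketch below computes an uncorrected Pearson chi-square on the 2x2 table of strains scoring ≥ 2.000 versus below 2.000 for the two pretreatments; that the authors used this particular test is an assumption, inferred only from the fact that it reproduces a value close to the reported 41.336. The function and variable names are illustrative.

```python
def pearson_chi_square_2x2(a, b, c, d):
    """Uncorrected Pearson chi-square for the 2x2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


total_strains = 788
smear_high = 678        # direct smear, score >= 2.000
extraction_high = 752   # bacterial extraction, score >= 2.000

chi2 = pearson_chi_square_2x2(
    smear_high, total_strains - smear_high,
    extraction_high, total_strains - extraction_high,
)
print(f"chi-square = {chi2:.3f}")
```

Because both pretreatments were applied to the same strains, a paired test could also be argued for; the sketch only reproduces the arithmetic behind the reported statistic.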
Matajira, Carlos E C; Moreno, Luisa Z; Gomes, Vasco T M; Silva, Ana Paula S; Mesquita, Renan E; Doto, Daniela S; Calderaro, Franco F; de Souza, Fernando N; Christ, Ana Paula G; Sato, Maria Inês Z; Moreno, Andrea M
2017-03-01
Traditional microbiological methods enable genus-level identification of Streptococcus spp. isolates. However, as the species of this genus show broad phenotypic variation, species-level identification or even differentiation within the genus is difficult. Herein we report the evaluation of protein spectra cluster analysis for the identification of Streptococcus species associated with disease in swine by means of matrix-assisted laser desorption/ionization time-of-flight mass spectrometry (MALDI-TOF MS). A total of 250 S. suis-like isolates obtained from pigs with clinical signs of encephalitis, arthritis, pneumonia, metritis, and urinary or septicemic infection were studied. The isolates came from pigs in different Brazilian states from 2001 to 2014. The MALDI-TOF MS analysis identified 86% (215 of 250) as S. suis and 14% (35 of 250) as S. alactolyticus, S. dysgalactiae, S. gallinaceus, S. gallolyticus, S. gordonii, S. henryi, S. hyointestinalis, S. hyovaginalis, S. mitis, S. oralis, S. pluranimalium, and S. sanguinis. The MALDI-TOF MS identification was confirmed in 99.2% of the isolates by 16S rDNA sequencing, with MALDI-TOF MS misidentifying 2 S. pluranimalium as S. hyovaginalis. Isolates were also tested by a biochemical automated system that correctly identified all isolates of 8 of the 10 species in the database. Neither the isolates of the 3 species not in the database ( S. gallinaceus, S. henryi, and S. hyovaginalis) nor the isolates of 2 species that were in the database ( S. oralis and S. pluranimalium) could be identified. The topology of the protein spectra cluster analysis appears to sustain the species phylogenetic similarities, further supporting identification by MALDI-TOF MS examination as a rapid and accurate alternative to 16S rDNA sequencing.
DNA barcoding of medicinal plant material for identification
USDA-ARS's Scientific Manuscript database
Because of the increasing demand for herbal remedies and for authentication of the source material, it is vital to provide a single database containing information about authentic plant materials and their potential adulterants. The database should provide DNA barcodes for data retrieval and similar...
National Institute of Standards and Technology Data Gateway
SRD 100 Database for Simulation of Electron Spectra for Surface Analysis (SESSA) (PC database for purchase). This database has been designed to facilitate quantitative interpretation of Auger-electron and X-ray photoelectron spectra and to improve the accuracy of quantitation in routine analysis. The database contains all physical data needed to perform quantitative interpretation of an electron spectrum for a thin-film specimen of given composition. A simulation module provides an estimate of peak intensities as well as the energy and angular distributions of the emitted electron flux.
Gil de la Fuente, Alberto; Grace Armitage, Emily; Otero, Abraham; Barbas, Coral; Godzien, Joanna
2017-09-01
Metabolite identification is one of the most challenging steps in metabolomics studies and reflects one of the greatest bottlenecks in the entire workflow. The success of this step determines the success of the entire research, therefore the quality at which annotations are given requires special attention. A variety of tools and resources are available to aid metabolite identification or annotation, offering different and often complementary functionalities. In preparation for this article, almost 50 databases were reviewed, from which 17 were selected for discussion, chosen for their online ESI-MS functionality. The general characteristics and functions of each database is discussed in turn, considering the advantages and limitations of each along with recommendations for optimal use of each tool, as derived from experiences encountered at the Centre for Metabolomics and Bioanalysis (CEMBIO) in Madrid. These databases were evaluated considering their utility in non-targeted metabolomics, including aspects such as identifier assignment, structural assignment and interpretation of results. © 2017 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Sadygov, Rovshan G; Cociorva, Daniel; Yates, John R
2004-12-01
Database searching is an essential element of large-scale proteomics. Because these methods are widely used, it is important to understand the rationale of the algorithms. Most algorithms are based on concepts first developed in SEQUEST and PeptideSearch. Four basic approaches are used to determine a match between a spectrum and sequence: descriptive, interpretative, stochastic and probability-based matching. We review the basic concepts used by most search algorithms, the computational modeling of peptide identification and current challenges and limitations of this approach for protein identification.
Burisch, Johan; Cukovic-Cavka, Silvija; Kaimakliotis, Ioannis; Shonová, Olga; Andersen, Vibeke; Dahlerup, Jens F; Elkjaer, Margarita; Langholz, Ebbe; Pedersen, Natalia; Salupere, Riina; Kolho, Kaija-Leena; Manninen, Pia; Lakatos, Peter Laszlo; Shuhaibar, Mary; Odes, Selwyn; Martinato, Matteo; Mihu, Ion; Magro, Fernando; Belousova, Elena; Fernandez, Alberto; Almer, Sven; Halfvarson, Jonas; Hart, Ailsa; Munkholm, Pia
2011-08-01
The EpiCom study investigates a possible East-West gradient in Europe in the incidence of IBD and the association with environmental factors. A secured web-based database is used to facilitate and centralize data registration. The aim was to construct and validate a web-based inception cohort database available in both English and Russian. The EpiCom database has been constructed in collaboration with all 34 participating centers. The database was translated into Russian using forward translation; patient questionnaires were translated by simplified forward-backward translation. Data insertion implies fulfillment of international diagnostic criteria, disease activity, medical therapy, quality of life, work productivity and activity impairment, outcome of pregnancy, surgery, cancer and death. Data is secured by the WinLog3 System, developed in cooperation with the Danish Data Protection Agency. Validation of the database has been performed in two consecutive rounds, each followed by corrections in accordance with comments. The EpiCom database fulfills the requirements of the participating countries' local data security agencies by being stored at a single location. The database was found overall to be "good" or "very good" by 81% of the participants after the second validation round and the general applicability of the database was evaluated as "good" or "very good" by 77%. In the inclusion period January 1st to December 31st 2010, 1336 IBD patients were included in the database. A user-friendly, tailor-made and secure web-based inception cohort database has been successfully constructed, facilitating remote data input. The incidence of IBD in 23 European countries can be found at www.epicom-ecco.eu. Copyright © 2011 European Crohn's and Colitis Organisation. All rights reserved.
Hoijemberg, Pablo A; Pelczer, István
2018-01-05
A lot of time is spent by researchers in the identification of metabolites in NMR-based metabolomic studies. The usual metabolite identification starts employing public or commercial databases to match chemical shifts thought to belong to a given compound. Statistical total correlation spectroscopy (STOCSY), in use for more than a decade, speeds the process by finding statistical correlations among peaks, being able to create a better peak list as input for the database query. However, the (normally not automated) analysis becomes challenging due to the intrinsic issue of peak overlap, where correlations of more than one compound appear in the STOCSY trace. Here we present a fully automated methodology that analyzes all STOCSY traces at once (every peak is chosen as driver peak) and overcomes the peak overlap obstacle. Peak overlap detection by clustering analysis and sorting of traces (POD-CAST) first creates an overlap matrix from the STOCSY traces, then clusters the overlap traces based on their similarity and finally calculates a cumulative overlap index (COI) to account for both strong and intermediate correlations. This information is gathered in one plot to help the user identify the groups of peaks that would belong to a single molecule and perform a more reliable database query. The simultaneous examination of all traces reduces the time of analysis, compared to viewing STOCSY traces by pairs or small groups, and condenses the redundant information in the 2D STOCSY matrix into bands containing similar traces. The COI helps in the detection of overlapping peaks, which can be added to the peak list from another cross-correlated band. 
POD-CAST overcomes the generally overlooked and underestimated presence of overlapping peaks and it detects them to include them in the search of all compounds contributing to the peak overlap, enabling the user to accelerate the metabolite identification process with more successful database queries and searching all tentative compounds in the sample set.
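The basic STOCSY step that POD-CAST builds on can be sketched as follows: pick a driver peak and compute the Pearson correlation of its intensity with every other spectral variable across samples; repeating this with every peak as driver gives the full 2D STOCSY matrix. The sketch covers only this correlation step, not the overlap matrix, clustering or cumulative overlap index; the function names and the toy intensities are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (non-constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)


def stocsy_trace(spectra, driver):
    """Correlate the driver peak with every variable across samples.

    `spectra` is a list of samples, each a list of peak intensities.
    """
    columns = list(zip(*spectra))
    return [pearson(columns[driver], column) for column in columns]


# Toy data: 4 samples x 3 peaks. Peaks 0 and 1 vary together (as if from the
# same compound); peak 2 varies differently.
spectra = [
    [1.0, 2.0, 5.0],
    [2.0, 4.0, 3.0],
    [3.0, 6.0, 4.0],
    [4.0, 8.0, 1.0],
]
trace = stocsy_trace(spectra, driver=0)
all_traces = [stocsy_trace(spectra, d) for d in range(len(spectra[0]))]
print([round(r, 2) for r in trace])
```

Peaks that correlate strongly with the driver are candidates for belonging to the same molecule; overlapping peaks show intermediate correlations with more than one group, which is the case the method above is designed to detect.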
Charnot-Katsikas, Angella; Tesic, Vera; Boonlayangoor, Sue; Bethel, Cindy; Frank, Karen M
2014-02-01
This study assessed the accuracy of bacterial and yeast identification using the VITEK MS, and the time to reporting of isolates before and after its implementation in routine clinical practice. Three hundred and sixty-two isolates of bacteria and yeast, consisting of a variety of clinical isolates and American Type Culture Collection strains, were tested. Results were compared with reference identifications from the VITEK 2 system and with 16S rRNA sequence analysis. The VITEK MS provided an acceptable identification to species level for 283 (78 %) isolates. Considering organisms for which genus-level identification is acceptable for routine clinical care, 315 isolates (87 %) had an acceptable identification. Six isolates (2 %) were identified incorrectly, five of which were Shigella species. Finally, the time for reporting the identifications was decreased significantly after implementation of the VITEK MS for a total mean reduction in time of 10.52 h (P<0.0001). Overall, accuracy of the VITEK MS was comparable or superior to that from the VITEK 2. The findings were also comparable to other studies examining the accuracy of the VITEK MS, although differences exist, depending on the diversity of species represented as well as on the versions of the databases used. The VITEK MS can be incorporated effectively into routine use in a clinical microbiology laboratory and future expansion of the database should provide improved accuracy for the identification of micro-organisms.
Mathis, Alexander; Depaquit, Jérôme; Dvořák, Vit; Tuten, Holly; Bañuls, Anne-Laure; Halada, Petr; Zapata, Sonia; Lehrter, Véronique; Hlavačková, Kristýna; Prudhomme, Jorian; Volf, Petr; Sereno, Denis; Kaufmann, Christian; Pflüger, Valentin; Schaffner, Francis
2015-05-10
Rapid, accurate and high-throughput identification of vector arthropods is of paramount importance in surveillance programmes that are becoming more common due to the changing geographic occurrence and extent of many arthropod-borne diseases. Protein profiling by MALDI-TOF mass spectrometry fulfils these requirements for identification, and reference databases have recently been established for several vector taxa, mostly with specimens from laboratory colonies. We established and validated a reference database containing 20 phlebotomine sand fly (Diptera: Psychodidae, Phlebotominae) species by using specimens from colonies or field-collections that had been stored for various periods of time. Identical biomarker mass patterns ('superspectra') were obtained with colony- or field-derived specimens of the same species. In the validation study, high quality spectra (i.e. more than 30 evaluable masses) were obtained with all fresh insects from colonies, and with 55/59 insects deep-frozen (liquid nitrogen/-80 °C) for up to 25 years. In contrast, only 36/52 specimens stored in ethanol could be identified. This resulted in an overall sensitivity of 87 % (140/161); specificity was 100 %. Duration of storage impaired data counts in the high mass range, and thus cluster analyses of closely related specimens might reflect their storage conditions rather than phenotypic distinctness. A major drawback of MALDI-TOF MS is the restricted availability of in-house databases and the fact that mass spectrometers from 2 companies (Bruker, Shimadzu) are widely being used. We have analysed fingerprints of phlebotomine sand flies obtained by automatic routine procedure on a Bruker instrument by using our database and the software established on a Shimadzu system. The sensitivity with 312 specimens from 8 sand fly species from laboratory colonies when evaluating only high quality spectra was 98.3 %; the specificity was 100 %. 
The corresponding diagnostic values with 55 field-collected specimens from 4 species were 94.7 % and 97.4 %, respectively. A centralized high-quality database (created by expert taxonomists and experienced users of mass spectrometers) that is easily amenable to customer-oriented identification services is a highly desirable resource. As shown in the present work, spectra obtained from different specimens with different instruments can be analysed using a centralized database, which should be available in the near future via an online platform in a cost-efficient manner.
Naqvi, Ahmad Abu Turab; Shahbaaz, Mohd; Ahmad, Faizan; Hassan, Md Imtaiyaz
2015-01-01
Syphilis is a globally occurring venereal disease, and its infection is propagated through sexual contact. The causative agent of syphilis, Treponema pallidum ssp. pallidum, a Gram-negative spirochaete, is an obligate human parasite. The genome of the T. pallidum ssp. pallidum SS14 strain (RefSeq NC_010741.1) encodes 1,027 proteins, of which 444 are known as hypothetical proteins (HPs), i.e., proteins of unknown function. Here, we performed functional annotation of the HPs of T. pallidum ssp. pallidum using various databases, domain architecture predictors, protein function annotators and clustering tools. We analyzed the sequences of the 444 HPs and predicted the function of 207 HPs with a high level of confidence; the functions of the remaining 237 HPs were predicted with less accuracy. Among the annotated HPs we found various enzymes, transporters and binding proteins that may be possible molecular targets and may facilitate the survival of the pathogen. Our comprehensive analysis helps to understand the mechanism of pathogenesis and may point to many novel potential therapeutic interventions.
Learning from number board games: you learn what you encode.
Laski, Elida V; Siegler, Robert S
2014-03-01
We tested the hypothesis that encoding the numerical-spatial relations in a number board game is a key process in promoting learning from playing such games. Experiment 1 used a microgenetic design to examine the effects on learning of the type of counting procedure that children use. As predicted, having kindergartners count-on from their current number on the board while playing a 0-100 number board game facilitated their encoding of the numerical-spatial relations on the game board and improved their number line estimates, numeral identification, and count-on skill. Playing the same game using the standard count-from-1 procedure led to considerably less learning. Experiment 2 demonstrated that comparable improvement in number line estimation does not occur with practice encoding the numerals 1-100 outside of the context of a number board game. The general importance of aligning learning activities and physical materials with desired mental representations is discussed. PsycINFO Database Record (c) 2014 APA, all rights reserved.
ElemeNT: a computational tool for detecting core promoter elements.
Sloutskin, Anna; Danino, Yehuda M; Orenstein, Yaron; Zehavi, Yonathan; Doniger, Tirza; Shamir, Ron; Juven-Gershon, Tamar
2015-01-01
Core promoter elements play a pivotal role in the transcriptional output, yet they are often detected manually within sequences of interest. Here, we present 2 contributions to the detection and curation of core promoter elements within given sequences. First, the Elements Navigation Tool (ElemeNT) is a user-friendly web-based, interactive tool for prediction and display of putative core promoter elements and their biologically-relevant combinations. Second, the CORE database summarizes ElemeNT-predicted core promoter elements near CAGE and RNA-seq-defined Drosophila melanogaster transcription start sites (TSSs). ElemeNT's predictions are based on biologically-functional core promoter elements, and can be used to infer core promoter compositions. ElemeNT does not assume prior knowledge of the actual TSS position, and can therefore assist in annotation of any given sequence. These resources, freely accessible at http://lifefaculty.biu.ac.il/gershon-tamar/index.php/resources, facilitate the identification of core promoter elements as active contributors to gene expression.
Failure of anthropometry as a facial identification technique using high-quality photographs.
Kleinberg, Krista F; Vanezis, Peter; Burton, A Mike
2007-07-01
Anthropometry can be used in certain circumstances to facilitate comparison of a photograph of a suspect with that of the potential offender from surveillance footage. Experimental research was conducted to determine whether anthropometry has a place in forensic practice in confirming the identity of a suspect from a surveillance video. We examined an existing database of photographic lineups, where one video image was compared against 10 photographs, which has previously been used in psychological research. Target (1) and test (10) photos were of high quality, although taken with a different camera. The anthropometric landmarks of right and left ectocanthions, nasion, and stomion were chosen, and proportions and angle values between these landmarks were measured to compare target with test photos. Results indicate that these measurements failed to accurately identify targets. There was also no indication that any of the landmarks made a better comparison than another. It was concluded that, for these landmarks, this method does not generate the consistent results necessary for use as evidence in a court of law.
A Survey of Neonatal Pharmacokinetic and Pharmacodynamic Studies in Pediatric Drug Development.
Wang, J; Avant, D; Green, D; Seo, S; Fisher, J; Mulberg, A E; McCune, S K; Burckart, G J
2015-09-01
Conducting clinical trials in neonates is challenging, and knowledge gaps in neonatal clinical pharmacology exist. We surveyed the US Food and Drug Administration databases and identified 43 drugs studied in neonates or referring to neonates between 1998 and 2014. Twenty drugs were approved in neonates. For 10 drugs, approval was based on efficacy data in neonates, supplemented by pharmacokinetic data for four drugs. Approval for neonates was based on full extrapolation from older patients for six drugs, and partial extrapolation was the basis of approval for four drugs. Dosing recommendations differed from older patients for most drugs, and used body-size based adjustment in neonates. Trial failures were associated with various factors including inappropriate dose selection. Successful drug development in neonates could be facilitated by an improved understanding of the natural history and pathophysiology of neonatal diseases and identification and validation of clinically relevant biomarkers. Published 2015. This article is a U.S. Government work and is in the public domain in the USA.
Simple Approach for De Novo Structural Identification of Mannose Trisaccharides
NASA Astrophysics Data System (ADS)
Hsu, Hsu Chen; Liew, Chia Yen; Huang, Shih-Pei; Tsai, Shang-Ting; Ni, Chi-Kung
2018-03-01
Oligosaccharides have diverse functions in biological systems. However, the structural determination of oligosaccharides remains difficult and has created a bottleneck in carbohydrate research. In this study, a new approach for the de novo structural determination of underivatized oligosaccharides is demonstrated. A low-energy collision-induced dissociation (CID) of sodium ion adducts was used to facilitate the cleavage of desired chemical bonds during the dissociation. The selection of fragments for the subsequent CID was guided using a procedure that we built from the understanding of the saccharide dissociation mechanism. The linkages, anomeric configurations, and branch locations of oligosaccharides were determined by comparing the CID spectra of oligosaccharide with the fragmentation patterns based on the dissociation mechanism and our specially prepared disaccharide CID spectrum database. The usefulness of this method was demonstrated to determine the structures of several mannose trisaccharides. This method can also be applied in the structural determination of oligosaccharides larger than trisaccharides and containing hexose other than mannose if authentic standards are available.
Simple Approach for De Novo Structural Identification of Mannose Trisaccharides
NASA Astrophysics Data System (ADS)
Hsu, Hsu Chen; Liew, Chia Yen; Huang, Shih-Pei; Tsai, Shang-Ting; Ni, Chi-Kung
2017-12-01
Oligosaccharides have diverse functions in biological systems. However, the structural determination of oligosaccharides remains difficult and has created a bottleneck in carbohydrate research. In this study, a new approach for the de novo structural determination of underivatized oligosaccharides is demonstrated. A low-energy collision-induced dissociation (CID) of sodium ion adducts was used to facilitate the cleavage of desired chemical bonds during the dissociation. The selection of fragments for the subsequent CID was guided using a procedure that we built from the understanding of the saccharide dissociation mechanism. The linkages, anomeric configurations, and branch locations of oligosaccharides were determined by comparing the CID spectra of oligosaccharide with the fragmentation patterns based on the dissociation mechanism and our specially prepared disaccharide CID spectrum database. The usefulness of this method was demonstrated to determine the structures of several mannose trisaccharides. This method can also be applied in the structural determination of oligosaccharides larger than trisaccharides and containing hexose other than mannose if authentic standards are available.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Williams, Kelly Porter
Key goals towards national biosecurity include methods for analyzing pathogens, predicting their emergence, and developing countermeasures. These goals are served by studying bacterial genes that promote pathogenicity and the pathogenicity islands that mobilize them. Cyberinfrastructure promoting an island database advances this field and enables deeper bioinformatic analysis that may identify novel pathogenicity genes. New automated methods and rich visualizations were developed for identifying pathogenicity islands, based on the principle that islands occur sporadically among closely related strains. The chromosomally-ordered pan-genome organizes all genes from a clade of strains; gaps in this visualization indicate islands, and decorations of the gene matrix facilitate exploration of island gene functions. A “learned phyloblocks” method was developed for automated island identification, that trains on the phylogenetic patterns of islands identified by other methods. Learned phyloblocks better defined termini of previously identified islands in multidrug-resistant Klebsiella pneumoniae ATCC BAA-2146, and found its only antibiotic resistance island.
Li, Guo-Zhong; Vissers, Johannes P C; Silva, Jeffrey C; Golick, Dan; Gorenstein, Marc V; Geromanos, Scott J
2009-03-01
A novel database search algorithm is presented for the qualitative identification of proteins over a wide dynamic range, both in simple and complex biological samples. The algorithm has been designed for the analysis of data originating from data independent acquisitions, whereby multiple precursor ions are fragmented simultaneously. Measurements used by the algorithm include retention time, ion intensities, charge state, and accurate masses on both precursor and product ions from LC-MS data. The search algorithm uses an iterative process whereby each iteration incrementally increases the selectivity, specificity, and sensitivity of the overall strategy. Increased specificity is obtained by utilizing a subset database search approach, whereby for each subsequent stage of the search, only those peptides from securely identified proteins are queried. Tentative peptide and protein identifications are ranked and scored by their relative correlation to a number of models of known and empirically derived physicochemical attributes of proteins and peptides. In addition, the algorithm utilizes decoy database techniques for automatically determining the false positive identification rates. The search algorithm has been tested by comparing the search results from a four-protein mixture, the same four-protein mixture spiked into a complex biological background, and a variety of other "system" type protein digest mixtures. The method was validated independently by data dependent methods, while concurrently relying on replication and selectivity. Comparisons were also performed with other commercially and publicly available peptide fragmentation search algorithms. The presented results demonstrate the ability to correctly identify peptides and proteins from data independent acquisition strategies with high sensitivity and specificity. 
They also illustrate a more comprehensive analysis of the samples studied, providing approximately 20% more protein identifications than a more conventional data-directed approach using the same identification criteria, with a concurrent increase in both sequence coverage and the number of modified peptides.
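The abstract mentions decoy database techniques for estimating false positive identification rates. A minimal sketch of the generic target-decoy estimate, assuming one score per match and a simple decoys-over-targets ratio; this is a common textbook form, not necessarily the calculation used by the authors, and all scores below are invented:

```python
def decoy_false_positive_rate(target_scores, decoy_scores, threshold):
    """Estimate the false positive rate among target matches at a score threshold.

    Decoy matches passing the threshold are taken as a proxy for the number
    of incorrect target matches passing the same threshold.
    """
    n_target = sum(score >= threshold for score in target_scores)
    n_decoy = sum(score >= threshold for score in decoy_scores)
    return n_decoy / n_target if n_target else 0.0

# Hypothetical scores for illustration only.
targets = [9.1, 8.4, 7.7, 6.0, 5.2]
decoys = [6.5, 4.0, 3.1]
rate = decoy_false_positive_rate(targets, decoys, threshold=6.0)
print(f"estimated false positive rate: {rate:.2f}")
```

Raising the threshold trades identifications for a lower estimated error rate, which is the selectivity and sensitivity balance the abstract describes.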
Columba: an integrated database of proteins, structures, and annotations.
Trissl, Silke; Rother, Kristian; Müller, Heiko; Steinke, Thomas; Koch, Ina; Preissner, Robert; Frömmel, Cornelius; Leser, Ulf
2005-03-31
Structural and functional research often requires the computation of sets of protein structures based on certain properties of the proteins, such as sequence features, fold classification, or functional annotation. Compiling such sets using current web resources is tedious because the necessary data are spread over many different databases. To facilitate this task, we have created COLUMBA, an integrated database of annotations of protein structures. COLUMBA currently integrates twelve different databases, including PDB, KEGG, Swiss-Prot, CATH, SCOP, the Gene Ontology, and ENZYME. The database can be searched using either keyword search or data source-specific web forms. Users can thus quickly select and download PDB entries that, for instance, participate in a particular pathway, are classified as containing a certain CATH architecture, are annotated as having a certain molecular function in the Gene Ontology, and whose structures have a resolution under a defined threshold. The results of queries are provided in both machine-readable extensible markup language and human-readable format. The structures themselves can be viewed interactively on the web. The COLUMBA database facilitates the creation of protein structure data sets for many structure-based studies. It allows queries to be combined across a number of structure-related databases not covered by other projects at present. Thus, information on both many and few protein structures can be used efficiently. The web interface for COLUMBA is available at http://www.columba-db.de.
DBGC: A Database of Human Gastric Cancer
Wang, Chao; Zhang, Jun; Cai, Mingdeng; Zhu, Zhenggang; Gu, Wenjie; Yu, Yingyan; Zhang, Xiaoyan
2015-01-01
The Database of Human Gastric Cancer (DBGC) is a comprehensive database that integrates various human gastric cancer-related data resources. Human gastric cancer-related transcriptomics projects, proteomics projects, mutations, biomarkers and drug-sensitive genes from different sources were collected and unified in this database. Moreover, epidemiological statistics of gastric cancer patients in China and clinicopathological information annotated with gastric cancer cases were also integrated into the DBGC. We believe that this database will greatly facilitate research regarding human gastric cancer in many fields. DBGC is freely available at http://bminfor.tongji.edu.cn/dbgc/index.do PMID:26566288
Preimpoundment Water Quality Study
1981-12-01
standard taxonomic references were used for identification: Schmidt, et al., 1874-1879; Heurck, 1896; Hustedt, 1927-1930, 1930, 1931-1959, 1949, 1961-1966...critical identifications can only be performed if the diatoms are cleaned (all organic matter removed); thereby leaving only the silica cell walls...Diatom identification was facilitated by cleaning approximately 30 ml of some of the initial samples using the hydrogen peroxide method (Werff, 1953
MALDI-TOF MS versus VITEK 2 ANC card for identification of anaerobic bacteria.
Li, Yang; Gu, Bing; Liu, Genyan; Xia, Wenying; Fan, Kun; Mei, Yaning; Huang, Peijun; Pan, Shiyang
2014-05-01
Matrix-assisted laser desorption ionization time-of-flight mass spectrometry (MALDI-TOF MS) is an accurate, rapid and inexpensive technique that has initiated a revolution in the clinical microbiology laboratory for identification of pathogens. The Vitek 2 anaerobe and Corynebacterium (ANC) identification card is a newly developed method for identification of corynebacteria and anaerobic species. The aim of this study was to evaluate the effectiveness of the ANC card and MALDI-TOF MS techniques for identification of clinical anaerobic isolates. Five reference strains and a total of 50 anaerobic bacteria clinical isolates comprising ten different genera and 14 species were identified and analyzed by the ANC card together with the Vitek 2 identification system and by Vitek MS together with the version 2.0 database, respectively. 16S rRNA gene sequencing was used as the reference method for accuracy in the identification. Vitek 2 ANC card and Vitek MS provided comparable results at species level for the five reference strains. Of 50 clinical strains, the Vitek MS provided identification for 46 strains (92%) to the species level, 47 (94%) to genus level, one (2%) low discrimination, two (4%) no identification and one (2%) misidentification. The Vitek 2 ANC card provided identification for 43 strains (86%) correct to the species level, 47 (94%) correct to the genus level, three (6%) low discrimination, three (6%) no identification and one (2%) misidentification. Both Vitek MS and Vitek 2 ANC card can be used for accurate routine clinical anaerobe identification. Compared with the Vitek 2 ANC card, Vitek MS is easier, faster and more economical for each test. The databases currently available for both systems should be updated and further developed to enhance performance.
7 CFR 3430.55 - Technical reporting.
Code of Federal Regulations, 2010 CFR
2010-01-01
... the Current Research Information System (CRIS). (b) Initial Documentation in the CRIS Database... identification of equipment purchased with any Federal funds under the award and any subsequent use of such equipment. (e) CRIS Web Site Via Internet. The CRIS database is available to the public on the worldwide web...
7 CFR 3430.55 - Technical reporting.
Code of Federal Regulations, 2011 CFR
2011-01-01
... (CRIS). (b) Initial Documentation in the CRIS Database. Information collected in the “Work Unit... elect) to obtain patent(s) on any such invention; and an identification of equipment purchased with any.... The CRIS database is available to the public on the worldwide web. CRIS project information is...
7 CFR 3430.55 - Technical reporting.
Code of Federal Regulations, 2012 CFR
2012-01-01
... (CRIS). (b) Initial Documentation in the CRIS Database. Information collected in the “Work Unit... elect) to obtain patent(s) on any such invention; and an identification of equipment purchased with any.... The CRIS database is available to the public on the worldwide web. CRIS project information is...
7 CFR 3430.55 - Technical reporting.
Code of Federal Regulations, 2013 CFR
2013-01-01
... (CRIS). (b) Initial Documentation in the CRIS Database. Information collected in the “Work Unit... elect) to obtain patent(s) on any such invention; and an identification of equipment purchased with any.... The CRIS database is available to the public on the worldwide web. CRIS project information is...
7 CFR 3430.55 - Technical reporting.
Code of Federal Regulations, 2014 CFR
2014-01-01
... (CRIS). (b) Initial Documentation in the CRIS Database. Information collected in the “Work Unit... elect) to obtain patent(s) on any such invention; and an identification of equipment purchased with any.... The CRIS database is available to the public on the worldwide web. CRIS project information is...
Texture-based approach to palmprint retrieval for personal identification
NASA Astrophysics Data System (ADS)
Li, Wenxin; Zhang, David; Xu, Z.; You, J.
2000-12-01
This paper presents a new approach to palmprint retrieval for personal identification. Three key issues in image retrieval are considered - feature selection, similarity measures and dynamic search for the best matching of the sample in the image database. We propose a texture-based method for palmprint feature representation. The concept of texture energy is introduced to define a palm print's global and local features, which are characterized with high convergence of inner-palm similarities and good dispersion of inter-palm discrimination. The search is carried out in a layered fashion: first global features are used to guide the fast selection of a small set of similar candidates from the database and then local features are used to decide the final output within the candidate set. The experimental results demonstrate the effectiveness and accuracy of the proposed method.
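The layered search described above (coarse selection on global features, final decision on local features) can be sketched as follows. This is an illustration only: the Euclidean distance, the record layout, and all feature values are assumptions, not the texture-energy measures of the paper.

```python
import math

def distance(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def layered_search(query, database, n_candidates=2):
    """Two-stage retrieval: global features shortlist, local features decide."""
    # Stage 1: keep the n_candidates records closest in global features.
    shortlist = sorted(
        database, key=lambda rec: distance(query["global"], rec["global"])
    )[:n_candidates]
    # Stage 2: pick the shortlisted record closest in local features.
    best = min(shortlist, key=lambda rec: distance(query["local"], rec["local"]))
    return best["id"]

# Hypothetical templates; feature values are invented.
templates = [
    {"id": "A", "global": [1.1, 0.9], "local": [0.9, 0.9, 0.9]},
    {"id": "B", "global": [0.9, 1.0], "local": [0.5, 0.6, 0.5]},
    {"id": "C", "global": [5.0, 5.0], "local": [0.5, 0.5, 0.5]},
]
probe = {"global": [1.0, 1.0], "local": [0.5, 0.5, 0.5]}
match = layered_search(probe, templates)
print(match)
```

Template C matches the probe exactly on local features but is never compared at that level, because the global stage has already excluded it; that pruning is what makes the layered search fast.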
Texture-based approach to palmprint retrieval for personal identification
NASA Astrophysics Data System (ADS)
Li, Wenxin; Zhang, David; Xu, Z.; You, J.
2001-01-01
This paper presents a new approach to palmprint retrieval for personal identification. Three key issues in image retrieval are considered - feature selection, similarity measures and dynamic search for the best matching of the sample in the image database. We propose a texture-based method for palmprint feature representation. The concept of texture energy is introduced to define a palm print's global and local features, which are characterized with high convergence of inner-palm similarities and good dispersion of inter-palm discrimination. The search is carried out in a layered fashion: first global features are used to guide the fast selection of a small set of similar candidates from the database and then local features are used to decide the final output within the candidate set. The experimental results demonstrate the effectiveness and accuracy of the proposed method.
Raharimalala, F N; Andrianinarivomanana, T M; Rakotondrasoa, A; Collard, J M; Boyer, S
2017-09-01
Arthropod-borne diseases are important causes of morbidity and mortality. The identification of vector species relies mainly on morphological features and/or molecular biology tools. The first method requires specific technical skills and may result in misidentifications, and the second method is time-consuming and expensive. The aim of the present study is to assess the usefulness and accuracy of matrix-assisted laser desorption/ionization time-of-flight mass spectrometry (MALDI-TOF MS) as a supplementary tool with which to identify mosquito vector species and to invest in the creation of an international database. A total of 89 specimens belonging to 10 mosquito species were selected for the extraction of proteins from legs and for the establishment of a reference database. A blind test with 123 mosquitoes was performed to validate the MS method. Results showed that: (a) the spectra obtained in the study with a given species differed from the spectra of the same species collected in another country, which highlights the need for an international database; (b) MALDI-TOF MS is an accurate method for the rapid identification of mosquito species that are referenced in a database; (c) MALDI-TOF MS allows the separation of groups or complex species, and (d) laboratory specimens undergo a loss of proteins compared with those isolated in the field. In conclusion, MALDI-TOF MS is a useful supplementary tool for mosquito identification and can help inform vector control. © 2017 The Royal Entomological Society.
NASA Astrophysics Data System (ADS)
Ghiorso, M. S.
2013-12-01
Internally consistent thermodynamic databases are critical resources that facilitate the calculation of heterogeneous phase equilibria and thereby support geochemical, petrological, and geodynamical modeling. These 'databases' are actually derived data/model systems that depend on a diverse suite of physical property measurements, calorimetric data, and experimental phase equilibrium brackets. In addition, such databases are calibrated with the adoption of various models for extrapolation of heat capacities and volumetric equations of state to elevated temperature and pressure conditions. Finally, these databases require specification of thermochemical models for the mixing properties of solid, liquid, and fluid solutions, which are often rooted in physical theory and, in turn, depend on additional experimental observations. The process of 'calibrating' a thermochemical database involves considerable effort and an extensive computational infrastructure. Because of these complexities, the community tends to rely on a small number of thermochemical databases, generated by a few researchers; these databases often have limited longevity and are universally difficult to maintain. ThermoFit is a software framework and user interface whose aim is to provide a modeling environment that facilitates creation, maintenance and distribution of thermodynamic data/model collections. Underlying ThermoFit are data archives of fundamental physical property, calorimetric, crystallographic, and phase equilibrium constraints that provide the essential experimental information from which thermodynamic databases are traditionally calibrated. ThermoFit standardizes schema for accessing these data archives and provides web services for data mining these collections. Beyond simple data management and interoperability, ThermoFit provides a collection of visualization and software modeling tools that streamline the model/database generation process. 
Most notably, ThermoFit facilitates the rapid visualization of predicted model outcomes and permits the user to modify these outcomes using tactile- or mouse-based GUI interaction, permitting real-time updates that reflect users' choices, preferences, and priorities involving derived model results. This ability permits some resolution of the problem of correlated model parameters in the common situation where thermodynamic models must be calibrated from inadequate data resources. The ability also allows modeling constraints to be imposed using natural data and observations (i.e. petrologic or geochemical intuition). Once formulated, ThermoFit facilitates deployment of data/model collections by automated creation of web services. Users consume these services via web-, Excel-, or desktop-clients. ThermoFit is currently under active development and not yet generally available; a limited capability prototype system has been coded for Macintosh computers and utilized to construct thermochemical models for H2O-CO2 mixed fluid saturation in silicate liquids. The longer term goal is to release ThermoFit as a web portal application client with server-based cloud computations supporting the modeling environment.
ERIC Educational Resources Information Center
Williamson, Ben
2015-01-01
This article examines the emergence of "digital governance" in public education in England. Drawing on and combining concepts from software studies, policy and political studies, it identifies some specific approaches to digital governance facilitated by network-based communications and database-driven information processing software…
2014-01-01
Next generation sequencing (NGS) of metagenomic samples is becoming a standard approach to detect individual species or pathogenic strains of microorganisms. Computer programs used in the NGS community have to balance between speed and sensitivity and as a result, species or strain level identification is often inaccurate and low abundance pathogens can sometimes be missed. We have developed Taxoner, an open source, taxon assignment pipeline that includes a fast aligner (e.g. Bowtie2) and a comprehensive DNA sequence database. We tested the program on simulated datasets as well as experimental data from Illumina, IonTorrent, and Roche 454 sequencing platforms. We found that Taxoner performs as well as, and often better than BLAST, but requires two orders of magnitude less running time, meaning that it can be run on desktop or laptop computers. Taxoner is slower than the approaches that use small marker databases but is more sensitive due to the comprehensive reference database. In addition, it can be easily tuned to specific applications using small tailored databases. When applied to metagenomic datasets, Taxoner can provide a functional summary of the genes mapped and can provide strain level identification. Taxoner is written in C for Linux operating systems. The code and documentation are available for research applications at http://code.google.com/p/taxoner. PMID:25077800
Härtig, Claus
2008-01-04
A multidimensional approach for the identification of fatty acid methyl esters (FAME) based on GC/MS analysis is described. Mass spectra and retention data of more than 130 FAME from various sources (chain lengths in the range from 4 to 24 carbon atoms) were collected in a database. Hints for the interpretation of FAME mass spectra are given and relevant diagnostic marker ions are deduced indicating specific groups of fatty acids. To verify the identity of single species and to ensure an optimized chromatographic resolution, the database was compiled with retention data libraries acquired on columns of different polarity (HP-5, DB-23, and HP-88). For a combined use of mass spectra and retention data standardized methods of measurement for each of these columns are required. Such master methods were developed and always applied under the conditions of retention time locking (RTL) which allowed an excellent reproducibility and comparability of absolute retention times. Moreover, as a relative retention index system, equivalent chain lengths (ECL) of FAME were determined by linear interpolation. To compare and to predict ECL values by means of structural features, fractional chain lengths (FCL) were calculated and fitted as well. As shown in an example, the use of retention data and mass spectral information together in a database search leads to an improved and reliable identification of FAME (including positional and geometrical isomers) without further derivatizations.
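The abstract states that equivalent chain lengths (ECL) were determined by linear interpolation. A minimal sketch of that interpolation, assuming the reference points are saturated straight-chain FAME whose ECL equals their carbon number; the function name and retention times are invented for illustration and do not come from the database described:

```python
def equivalent_chain_length(retention_time, references):
    """Linearly interpolate an ECL between two bracketing reference FAME.

    references maps carbon number -> retention time (minutes) of the
    corresponding saturated straight-chain FAME measured on the same column.
    """
    carbons = sorted(references)
    for lower, upper in zip(carbons, carbons[1:]):
        t_lower, t_upper = references[lower], references[upper]
        if t_lower <= retention_time <= t_upper:
            fraction = (retention_time - t_lower) / (t_upper - t_lower)
            return lower + (upper - lower) * fraction
    raise ValueError("retention time is outside the reference range")

# Hypothetical retention times for C16:0 and C18:0 reference esters.
reference_times = {16: 10.0, 18: 14.0}
ecl = equivalent_chain_length(13.0, reference_times)
print(ecl)
```

Because ECL is a relative index, it is meaningful only for retention times measured under the same standardized, retention-time-locked method as the references.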
Schallmey, Marcus; Koopmeiners, Julia; Wells, Elizabeth; Wardenga, Rainer; Schallmey, Anett
2014-12-01
Halohydrin dehalogenases are very rare enzymes that are naturally involved in the mineralization of halogenated xenobiotics. Due to their catalytic potential and promiscuity, many biocatalytic reactions have been described that have led to several interesting and industrially important applications. Nevertheless, only a few of these enzymes have been made available through recombinant techniques; hence, it is of general interest to expand the repertoire of these enzymes so as to enable novel biocatalytic applications. After the identification of specific sequence motifs, 37 novel enzyme sequences were readily identified in public sequence databases. All enzymes that could be heterologously expressed also catalyzed typical halohydrin dehalogenase reactions. Phylogenetic inference for enzymes of the halohydrin dehalogenase enzyme family confirmed that all enzymes form a distinct monophyletic clade within the short-chain dehydrogenase/reductase superfamily. In addition, the majority of novel enzymes are substantially different from previously known phylogenetic subtypes. Consequently, four additional phylogenetic subtypes were defined, greatly expanding the halohydrin dehalogenase enzyme family. We show that the enormous wealth of environmental and genome sequences present in public databases can be tapped for in silico identification of very rare but biotechnologically important biocatalysts. Our findings help to readily identify halohydrin dehalogenases in ever-growing sequence databases and, as a consequence, make even more members of this interesting enzyme family available to the scientific and industrial community. Copyright © 2014, American Society for Microbiology. All Rights Reserved.
Why are they missing? : Bioinformatics characterization of missing human proteins.
Elguoshy, Amr; Magdeldin, Sameh; Xu, Bo; Hirao, Yoshitoshi; Zhang, Ying; Kinoshita, Naohiko; Takisawa, Yusuke; Nameta, Masaaki; Yamamoto, Keiko; El-Refy, Ali; El-Fiky, Fawzy; Yamamoto, Tadashi
2016-10-21
NeXtProt is a web-based protein knowledge platform that supports research on human proteins. NeXtProt (release 2015-04-28) lists 20,060 proteins, of which 3373 canonical proteins (16.8%) lack credible experimental evidence at the protein level (PE2:PE5) and are therefore considered "missing proteins". A comprehensive bioinformatic workflow is proposed to analyze these "missing" proteins. The aims of the current study were to analyze the physicochemical properties of the missing proteins and the existence and distribution of their tryptic cleavage sites, and to pinpoint their signature peptides. Our findings showed that 23.7% of missing proteins were hydrophobic proteins possessing transmembrane domains (TMD). Also, forty missing entries generated tryptic peptides that were either out of the mass detection range (>30aa) or mapped to different proteins (<9aa). Additionally, 21% of missing entries did not generate any unique tryptic peptides. An in silico endopeptidase combination strategy increased the possibility of identifying missing proteins. In line with this, using both a mature protein database and a signal peptidome database could be a promising option to identify some missing proteins by targeting their unique N-terminal tryptic peptide from the mature protein database and/or their C-terminal tryptic peptide from the signal peptidome database. In conclusion, identification of missing proteins requires additional consideration during sample preparation, extraction, digestion and data analysis to increase the likelihood of identification. Copyright © 2016. Published by Elsevier B.V.
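The tryptic-peptide analysis described above can be illustrated with a small in silico digestion. This is a sketch under stated assumptions rather than the authors' workflow: it uses the common simplified trypsin rule (cleave after K or R unless the next residue is P), allows no missed cleavages, and treats 9 to 30 residues as the detectable window suggested by the thresholds in the abstract. The function names and toy sequences are hypothetical.

```python
import re
from collections import defaultdict


def tryptic_peptides(sequence, min_len=9, max_len=30):
    """Digest a protein sequence and keep peptides in the detectable window."""
    # Zero-width split after K or R, except when followed by P (Python 3.7+).
    fragments = re.split(r"(?<=[KR])(?!P)", sequence)
    return [p for p in fragments if min_len <= len(p) <= max_len]


def unique_peptides(proteins):
    """Map each peptide that occurs in exactly one protein to that protein."""
    owners = defaultdict(set)
    for name, sequence in proteins.items():
        for peptide in tryptic_peptides(sequence):
            owners[peptide].add(name)
    return {pep: next(iter(names)) for pep, names in owners.items()
            if len(names) == 1}


# Toy sequences for illustration only; they are not real proteins.
SHARED = "G" * 9 + "R"
PROTEINS = {
    "protA": "M" + "A" * 8 + "K" + SHARED,
    "protB": SHARED + "T" * 9 + "KPLLK",
}
signature = unique_peptides(PROTEINS)
```

In this toy set the shared peptide is discarded because it maps to both entries, and the K followed by P in `protB` is not cleaved. A protein whose every in-window peptide is shared would end up with no signature peptide, which is the situation the abstract reports for 21% of missing entries.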
Pongor, Lőrinc S; Vera, Roberto; Ligeti, Balázs
2014-01-01
Next generation sequencing (NGS) of metagenomic samples is becoming a standard approach to detect individual species or pathogenic strains of microorganisms. Computer programs used in the NGS community have to balance speed against sensitivity; as a result, species or strain level identification is often inaccurate and low abundance pathogens can sometimes be missed. We have developed Taxoner, an open source taxon assignment pipeline that includes a fast aligner (e.g. Bowtie2) and a comprehensive DNA sequence database. We tested the program on simulated datasets as well as experimental data from Illumina, IonTorrent, and Roche 454 sequencing platforms. We found that Taxoner performs as well as, and often better than, BLAST, but requires two orders of magnitude less running time, meaning that it can be run on desktop or laptop computers. Taxoner is slower than the approaches that use small marker databases but is more sensitive due to the comprehensive reference database. In addition, it can be easily tuned to specific applications using small tailored databases. When applied to metagenomic datasets, Taxoner can provide a functional summary of the genes mapped and can provide strain level identification. Taxoner is written in C for Linux operating systems. The code and documentation are available for research applications at http://code.google.com/p/taxoner.
The use of high-throughput screening techniques to evaluate mitochondrial toxicity.
Wills, Lauren P
2017-11-01
Toxicologists and chemical regulators depend on accurate and effective methods to evaluate and predict the toxicity of thousands of current and future compounds. Robust high-throughput screening (HTS) experiments have the potential to efficiently test large numbers of chemical compounds for effects on biological pathways. HTS assays can be utilized to examine chemical toxicity across multiple mechanisms of action, experimental models, concentrations, and lengths of exposure. Many agricultural, industrial, and pharmaceutical chemicals classified as harmful to human and environmental health exert their effects through the mechanism of mitochondrial toxicity. Mitochondrial toxicants are compounds that cause a decrease in the number of mitochondria within a cell, and/or decrease the ability of mitochondria to perform normal functions including producing adenosine triphosphate (ATP) and maintaining cellular homeostasis. Mitochondrial dysfunction can lead to apoptosis, necrosis, altered metabolism, muscle weakness, neurodegeneration, decreased organ function, and eventually disease or death of the whole organism. The development of HTS techniques to identify mitochondrial toxicants will provide extensive databases with essential connections between mechanistic mitochondrial toxicity and chemical structure. Computational and bioinformatics approaches can be used to evaluate compound databases for specific chemical structures associated with toxicity, with the goal of developing quantitative structure-activity relationship (QSAR) models and mitochondrial toxicophores. Ultimately these predictive models will facilitate the identification of mitochondrial liabilities in consumer products, industrial compounds, pharmaceuticals and environmental hazards. Copyright © 2017 Elsevier B.V. All rights reserved.
Tikkanen, Tuomas; Leroy, Bernard; Fournier, Jean Louis; Risques, Rosa Ana; Malcikova, Jitka; Soussi, Thierry
2018-07-01
Accurate annotation of genomic variants in human diseases is essential to allow personalized medicine. Assessment of somatic and germline TP53 alterations has now reached the clinic and is required in several circumstances such as the identification of the most effective cancer therapy for patients with chronic lymphocytic leukemia (CLL). Here, we present Seshat, a Web service for annotating TP53 information derived from sequencing data. A flexible framework allows the use of standard file formats such as Mutation Annotation Format (MAF) or Variant Call Format (VCF), as well as common TXT files. Seshat performs accurate variant annotations using the Human Genome Variation Society (HGVS) nomenclature and the stable TP53 genomic reference provided by the Locus Reference Genomic (LRG). In addition, using the 2017 release of the UMD_TP53 database, Seshat provides several statistics for each TP53 variant, including database frequency, functional activity, and pathogenicity. The information is delivered in standardized output tables that minimize errors and facilitate comparison of mutational data across studies. Seshat is a useful tool for interpreting the ever-growing TP53 sequencing data generated by multiple sequencing platforms, and it is freely available via the TP53 Website, http://p53.fr, or directly at http://vps338341.ovh.net/. © 2018 Wiley Periodicals, Inc.
PCPPI: a comprehensive database for the prediction of Penicillium-crop protein-protein interactions.
Yue, Junyang; Zhang, Danfeng; Ban, Rongjun; Ma, Xiaojing; Chen, Danyang; Li, Guangwei; Liu, Jia; Wisniewski, Michael; Droby, Samir; Liu, Yongsheng
2017-01-01
Penicillium expansum, the causal agent of blue mold, is one of the most prevalent post-harvest pathogens, infecting a wide range of crops after harvest. In response, crops have evolved various defense systems to protect themselves against this and other pathogens. Penicillium-crop interaction is a multifaceted process mediated by pathogen- and host-derived proteins. Identification and characterization of the inter-species protein-protein interactions (PPIs) are fundamental to elucidating the molecular mechanisms underlying infection processes between P. expansum and plant crops. Here, we have developed PCPPI, the Penicillium-Crop Protein-Protein Interactions database, which is constructed based on the experimentally determined orthologous interactions in pathogen-plant systems and available domain-domain interactions (DDIs) in each PPI. Thus far, it stores information on 9911 proteins, 439 904 interactions and seven host species: apple, kiwifruit, maize, pear, rice, strawberry and tomato. Further analysis through gene ontology (GO) annotation indicated that proteins with more interacting partners tend to perform essential functions. Significantly, semantic statistics of the GO terms also provided strong support for the accuracy of our predicted interactions in PCPPI. We believe that the PCPPI datasets will facilitate the study of pathogen-crop interactions; they are freely available to the research community at http://bdg.hfut.edu.cn/pcppi/index.html. © The Author(s) 2017. Published by Oxford University Press.
Li, Min; Li, Wenkai; Wu, Fang-Xiang; Pan, Yi; Wang, Jianxin
2018-06-14
Essential proteins are important participants in various life activities and play a vital role in the survival and reproduction of living organisms. Identification of essential proteins from protein-protein interaction (PPI) networks is of great significance for the study of complex human diseases, the design of drugs and the development of bioinformatics and computational science. Studies have shown that highly connected proteins in a PPI network tend to be essential. A series of computational methods have been proposed to identify essential proteins by analyzing topological structures of PPI networks. However, the high noise in the PPI data can degrade the accuracy of essential protein prediction. Moreover, proteins must be located in the appropriate subcellular compartment to perform their functions, and two proteins can interact with each other only when they are located in the same subcellular compartment. In this paper, we propose a new network-based essential protein discovery method, named SPP, based on sub-network partition and prioritization by integrating subcellular localization information. SPP was tested on two different yeast PPI networks obtained from the DIP database and the BioGRID database. The experimental results show that SPP can effectively reduce the effect of false positives in PPI networks and predict essential proteins more accurately than the existing computational methods DC, BC, CC, SC, EC, IC and NC. Copyright © 2018 Elsevier Ltd. All rights reserved.
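Two ideas from the abstract above can be combined in a toy example: highly connected proteins tend to be essential, and an interaction is only plausible when both partners share a subcellular compartment. The sketch below is not the SPP algorithm, whose sub-network partition and prioritization steps are not described in the abstract. It only computes a degree count after discarding edges whose endpoints share no compartment, and the function name, protein names and localizations are hypothetical.

```python
from collections import defaultdict


def localization_filtered_degree(edges, localization):
    """Count interaction partners, ignoring edges without a shared compartment.

    edges: iterable of (protein_u, protein_v) pairs from a PPI network.
    localization: dict mapping a protein to the set of its compartments.
    """
    degree = defaultdict(int)
    for u, v in edges:
        shared = localization.get(u, set()) & localization.get(v, set())
        if shared:  # keep the edge only if the proteins can co-occur
            degree[u] += 1
            degree[v] += 1
    return dict(degree)


# Hypothetical toy network, not data from DIP or BioGRID.
EDGES = [("A", "B"), ("A", "C"), ("B", "C")]
LOCALIZATION = {
    "A": {"nucleus"},
    "B": {"nucleus", "cytoplasm"},
    "C": {"cytoplasm"},
}
scores = localization_filtered_degree(EDGES, LOCALIZATION)
```

Here the A-C edge is dropped as a likely false positive because A and C never share a compartment, so B ends up with the highest score. Proteins with no localization annotation lose all their edges under this rule, which a real method would need to handle more carefully.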
Rabal, Obdulia; Link, Wolfgang; Serelde, Beatriz G; Bischoff, James R; Oyarzabal, Julen
2010-04-01
Here we report the development and validation of a complete solution to manage and analyze the data produced by image-based phenotypic screening campaigns of small-molecule libraries. In one step, the initial crude images are analyzed for multiple cytological features, statistical analysis is performed and molecules that produce the desired phenotypic profile are identified. A naïve Bayes classifier, integrating chemical and phenotypic spaces, is built and utilized during the process to assess those images initially classified as "fuzzy", providing automated iterative feedback tuning. Simultaneously, all this information is directly annotated in a relational database containing the chemical data. This novel fully automated method was validated by conducting a re-analysis of results from a high-content screening campaign involving 33 992 molecules used to identify inhibitors of the PI3K/Akt signaling pathway. Ninety-two percent of confirmed hits identified by the conventional multistep analysis method were identified using this integrated one-step system, as were 40 new hits (14.9% of the total) that were originally false negatives. Ninety-six percent of true negatives were also properly recognized. Web-based access to the database, with customizable data retrieval and visualization tools, facilitates the subsequent analysis of annotated cytological features, which allows identification of additional phenotypic profiles; thus, further analysis of the original crude images is not required.
Earthquake detection through computationally efficient similarity search
Yoon, Clara E.; O’Reilly, Ossian; Bergen, Karianne J.; Beroza, Gregory C.
2015-01-01
Seismology is experiencing rapid growth in the quantity of data, which has outpaced the development of processing algorithms. Earthquake detection—identification of seismic events in continuous data—is a fundamental operation for observational seismology. We developed an efficient method to detect earthquakes using waveform similarity that overcomes the disadvantages of existing detection methods. Our method, called Fingerprint And Similarity Thresholding (FAST), can analyze a week of continuous seismic waveform data in less than 2 hours, or 140 times faster than autocorrelation. FAST adapts a data mining algorithm, originally designed to identify similar audio clips within large databases; it first creates compact “fingerprints” of waveforms by extracting key discriminative features, then groups similar fingerprints together within a database to facilitate fast, scalable search for similar fingerprint pairs, and finally generates a list of earthquake detections. FAST detected most (21 of 24) cataloged earthquakes and 68 uncataloged earthquakes in 1 week of continuous data from a station located near the Calaveras Fault in central California, achieving detection performance comparable to that of autocorrelation, with some additional false detections. FAST is expected to realize its full potential when applied to extremely long duration data sets over a distributed network of seismic stations. The widespread application of FAST has the potential to aid in the discovery of unexpected seismic signals, improve seismic monitoring, and promote a greater understanding of a variety of earthquake processes. PMID:26665176
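The "group similar fingerprints, then search for similar pairs" step described above can be sketched with min-hash signatures and locality-sensitive hashing, a standard technique for this kind of search. The abstract does not specify the exact scheme FAST uses, so this is an illustration under assumptions: fingerprints are represented as non-empty sets of active bit indices (each below `PRIME`), and all names and parameter values are hypothetical.

```python
import random
from collections import defaultdict
from itertools import combinations

PRIME = 2_147_483_647  # modulus for the hash family h(x) = (a*x + b) mod PRIME


def make_hashes(n_hashes, seed=0):
    """Draw n_hashes random (a, b) pairs defining the hash functions."""
    rng = random.Random(seed)
    return [(rng.randrange(1, PRIME), rng.randrange(0, PRIME))
            for _ in range(n_hashes)]


def minhash(active_bits, hashes):
    """Min-hash signature of a non-empty set of active bit indices."""
    return tuple(min((a * x + b) % PRIME for x in active_bits)
                 for a, b in hashes)


def candidate_pairs(fingerprints, n_hashes=32, rows_per_band=4, seed=0):
    """Return pairs of fingerprint names that share at least one LSH bucket."""
    hashes = make_hashes(n_hashes, seed)
    buckets = defaultdict(set)
    for name, bits in fingerprints.items():
        signature = minhash(bits, hashes)
        # Split the signature into bands; each band is one bucket key.
        for start in range(0, n_hashes, rows_per_band):
            band = signature[start:start + rows_per_band]
            buckets[(start, band)].add(name)
    pairs = set()
    for names in buckets.values():
        pairs.update(combinations(sorted(names), 2))
    return pairs


# Hypothetical fingerprints: two matching waveforms and one unrelated one.
FINGERPRINTS = {
    "w1": {1, 5, 9, 12},
    "w2": {1, 5, 9, 12},
    "w3": {100, 200, 300},
}
pairs = candidate_pairs(FINGERPRINTS)
```

Only fingerprints that land in the same bucket are compared, which is what avoids the all-pairs comparison that makes autocorrelation slow. More rows per band make the search stricter, while more bands make it more permissive, so these two parameters trade missed detections against false ones.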
DOE Office of Scientific and Technical Information (OSTI.GOV)
Enghauser, Michael
2016-02-01
The goal of the Domestic Nuclear Detection Office (DNDO) Algorithm Improvement Program (AIP) is to facilitate gamma-radiation detector nuclide identification algorithm development, improvement, and validation. Accordingly, scoring criteria have been developed to objectively assess the performance of nuclide identification algorithms. In addition, a Microsoft Excel spreadsheet application for automated nuclide identification scoring has been developed. This report provides an overview of the equations, nuclide weighting factors, nuclide equivalencies, and configuration weighting factors used by the application for scoring nuclide identification algorithm performance. Furthermore, this report presents a general overview of the nuclide identification algorithm scoring application including illustrative examples.
Kellogg, James A.; Bankert, David A.; Chaturvedi, Vishnu
1998-01-01
The ability of the rapid, computerized Microbial Identification System (MIS; Microbial ID, Inc.) to identify a variety of clinical isolates of yeast species was compared to the abilities of a combination of tests including the Yeast Biochemical Card (bioMerieux Vitek), determination of microscopic morphology on cornmeal agar with Tween 80, and when necessary, conventional biochemical tests and/or the API 20C Aux system (bioMerieux Vitek) to identify the same yeast isolates. The MIS chromatographically analyzes cellular fatty acids and compares the results with the fatty acid profiles in its database. Yeast isolates were subcultured onto Sabouraud dextrose agar and were incubated at 28°C for 24 h. The resulting colonies were saponified, methylated, extracted, and chromatographically analyzed (by version 3.8 of the MIS YSTCLN database) according to the manufacturer’s instructions. Of 477 isolates of 23 species tested, 448 (94%) were given species names by the MIS and 29 (6%) were unidentified (specified as “no match” by the MIS). Of the 448 isolates given names by the MIS, only 335 (75%) of the identifications were correct to the species level. While the MIS correctly identified only 102 (82%) of 124 isolates of Candida glabrata, the predictive value of an MIS identification of unknown isolates as C. glabrata was 100% (102 of 102) because no isolates of other species were misidentified as C. glabrata. In contrast, while the MIS correctly identified 100% (15 of 15) of the isolates of Saccharomyces cerevisiae, the predictive value of an MIS identification of unknown isolates as S. cerevisiae was only 47% (15 of 32), because 17 isolates of C. glabrata were misidentified as S. cerevisiae. The low predictive values for accuracy associated with MIS identifications for most of the remaining yeast species indicate that the procedure and/or database for the system need to be improved. PMID:9574676
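The difference between the fraction of isolates correctly identified and the predictive value of a given identification, which the abstract above relies on, can be made explicit with the reported counts. The function names are illustrative; the numbers are taken from the abstract.

```python
def fraction_correct(correct, isolates_of_species):
    """Share of isolates of a species that the system named correctly."""
    return correct / isolates_of_species


def predictive_value(correct, isolates_given_that_name):
    """Share of isolates given a species name that truly are that species."""
    return correct / isolates_given_that_name


# Candida glabrata: 102 of 124 isolates identified, and no other species
# was misidentified as C. glabrata, so 102 isolates received that name.
glabrata_correct = fraction_correct(102, 124)
glabrata_pv = predictive_value(102, 102)

# Saccharomyces cerevisiae: all 15 isolates identified, but 17 C. glabrata
# isolates were also called S. cerevisiae, so 32 isolates received that name.
cerevisiae_correct = fraction_correct(15, 15)
cerevisiae_pv = predictive_value(15, 15 + 17)
```

The two measures have the same numerator but different denominators, which is why a species can be identified every time it occurs and still have a poor predictive value when other species are mislabeled with its name.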
Mild Traumatic Brain Injury: Facilitating School Success.
ERIC Educational Resources Information Center
Hux, Karen; Hacksley, Carolyn
1996-01-01
A case study is used to demonstrate the effects of mild traumatic brain injury on educational efforts. Discussion covers factors complicating school reintegration, ways to facilitate school reintegration, identification of cognitive and behavioral consequences, minimization of educators' discomfort, reintegration program design, and family…