Sample records for database oriented variant

  1. Public variant databases: liability?

    PubMed

    Thorogood, Adrian; Cook-Deegan, Robert; Knoppers, Bartha Maria

    2017-07-01

    Public variant databases support the curation, clinical interpretation, and sharing of genomic data, thus reducing harmful errors or delays in diagnosis. As variant databases are increasingly relied on in the clinical context, there is concern that negligent variant interpretation will harm patients and attract liability. This article explores the evolving legal duties of laboratories, public variant databases, and physicians in clinical genomics and recommends a governance framework for databases to promote responsible data sharing. Genet Med advance online publication 15 December 2016.

  2. Public variant databases: liability?

    PubMed Central

    Thorogood, Adrian; Cook-Deegan, Robert; Knoppers, Bartha Maria

    2017-01-01

    Public variant databases support the curation, clinical interpretation, and sharing of genomic data, thus reducing harmful errors or delays in diagnosis. As variant databases are increasingly relied on in the clinical context, there is concern that negligent variant interpretation will harm patients and attract liability. This article explores the evolving legal duties of laboratories, public variant databases, and physicians in clinical genomics and recommends a governance framework for databases to promote responsible data sharing. Genet Med advance online publication 15 December 2016 PMID:27977006

  3. Comparison of locus-specific databases for BRCA1 and BRCA2 variants reveals disparity in variant classification within and among databases.

    PubMed

    Vail, Paris J; Morris, Brian; van Kan, Aric; Burdett, Brianna C; Moyes, Kelsey; Theisen, Aaron; Kerr, Iain D; Wenstrup, Richard J; Eggington, Julie M

    2015-10-01

    Genetic variants of uncertain clinical significance (VUSs) are a common outcome of clinical genetic testing. Locus-specific variant databases (LSDBs) have been established for numerous disease-associated genes as a research tool for the interpretation of genetic sequence variants to facilitate variant interpretation via aggregated data. If LSDBs are to be used for clinical practice, consistent and transparent criteria regarding the deposition and interpretation of variants are vital, as variant classifications are often used to make important and irreversible clinical decisions. In this study, we performed a retrospective analysis of 2017 consecutive BRCA1 and BRCA2 genetic variants identified from 24,650 consecutive patient samples referred to our laboratory to establish an unbiased dataset representative of the types of variants seen in the US patient population, submitted by clinicians and researchers for BRCA1 and BRCA2 testing. We compared the clinical classifications of these variants among five publicly accessible BRCA1 and BRCA2 variant databases: BIC, ClinVar, HGMD (paid version), LOVD, and the UMD databases. Our results show substantial disparity of variant classifications among publicly accessible databases. Furthermore, it appears that discrepant classifications are not the result of a single outlier but widespread disagreement among databases. This study also shows that databases sometimes favor a clinical classification when current best practice guidelines (ACMG/AMP/CAP) would suggest an uncertain classification. Although LSDBs have been well established for research applications, our results suggest several challenges preclude their wider use in clinical practice.

  4. The UCL low-density lipoprotein receptor gene variant database: pathogenicity update

    PubMed Central

    Futema, Marta; Whittall, Ros; Taylor-Beadling, Alison; Williams, Maggie; den Dunnen, Johan T; Humphries, Steve E

    2017-01-01

    Background: Familial hypercholesterolaemia (OMIM 143890) is most frequently caused by variations in the low-density lipoprotein receptor (LDLR) gene. Predicting whether novel variants are pathogenic may not be straightforward, especially for missense and synonymous variants. In 2013, the Association of Clinical Genetic Scientists published guidelines for the classification of variants, with categories 1 and 2 representing clearly not or unlikely pathogenic, respectively, 3 representing variants of unknown significance (VUS), and 4 and 5 representing likely to be or clearly pathogenic, respectively. Here, we update the University College London (UCL) LDLR variant database according to these guidelines. Methods: PubMed searches and alerts were used to identify novel LDLR variants for inclusion in the database. Standard in silico tools were used to predict potential pathogenicity. Variants were designated as class 4/5 only when the predictions from the different programs were concordant and as class 3 when predictions were discordant. Results: The updated database (http://www.lovd.nl/LDLR) now includes 2925 curated variants, representing 1707 independent events. All 129 nonsense variants, 337 small frame-shifting and 117/118 large rearrangements were classified as 4 or 5. Of the 795 missense variants, 115 were in classes 1 and 2, 605 in class 4 and 75 in class 3. 111/181 intronic variants, 4/34 synonymous variants and 14/37 promoter variants were assigned to classes 4 or 5. Overall, 112 (7%) of reported variants were class 3. Conclusions: This study updates the LDLR variant database and identifies a number of reported VUS where additional family and in vitro studies will be required to confirm or refute their pathogenicity. PMID:27821657
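
    The concordance rule described in the Methods of this record can be illustrated with a minimal Python sketch. The tool names, the "damaging"/"benign" labels, and the mapping of concordant benign predictions to classes 1/2 are assumptions made for illustration; the abstract only states that concordant predictions gave class 4/5 and discordant predictions gave class 3.

```python
def classify_by_concordance(predictions):
    """Assign a pathogenicity class from several in silico predictions.

    predictions: dict mapping a tool name to "damaging" or "benign".
    The labels and tool names are hypothetical placeholders.
    """
    if not predictions:
        raise ValueError("at least one prediction is required")
    calls = set(predictions.values())
    if calls == {"damaging"}:
        return "4/5"   # all tools agree the variant is damaging
    if calls == {"benign"}:
        return "1/2"   # assumption: concordant benign calls map to classes 1/2
    return "3"         # discordant predictions: variant of unknown significance


if __name__ == "__main__":
    print(classify_by_concordance({"tool_a": "damaging", "tool_b": "damaging"}))
    print(classify_by_concordance({"tool_a": "damaging", "tool_b": "benign"}))
```

    Using a set of calls keeps the rule independent of how many prediction programs are consulted.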

  5. DNA variant databases improve test accuracy and phenotype prediction in Alport syndrome.

    PubMed

    Savige, Judy; Ars, Elisabet; Cotton, Richard G H; Crockett, David; Dagher, Hayat; Deltas, Constantinos; Ding, Jie; Flinter, Frances; Pont-Kingdon, Genevieve; Smaoui, Nizar; Torra, Roser; Storey, Helen

    2014-06-01

    X-linked Alport syndrome is a form of progressive renal failure caused by pathogenic variants in the COL4A5 gene. More than 700 variants have been described and a further 400 are estimated to be known to individual laboratories but are unpublished. The major genetic testing laboratories for X-linked Alport syndrome worldwide have established a Web-based database for published and unpublished COL4A5 variants (https://grenada.lumc.nl/LOVD2/COL4A/home.php?select_db=COL4A5). This conforms with the recommendations of the Human Variome Project: it uses the Leiden Open Variation Database (LOVD) format, describes variants according to the human reference sequence with standardized nomenclature, indicates likely pathogenicity and associated clinical features, and credits the submitting laboratory. The database includes non-pathogenic and recurrent variants, and is linked to another COL4A5 mutation database and relevant bioinformatics sites. Access is free. Increasing the number of COL4A5 variants in the public domain helps patients, diagnostic laboratories, clinicians, and researchers. The database improves the accuracy and efficiency of genetic testing because its variants are already categorized for pathogenicity. The description of further COL4A5 variants and clinical associations will improve our ability to predict phenotype and our understanding of collagen IV biochemistry. The database for X-linked Alport syndrome represents a model for databases in other inherited renal diseases.

  6. Database for Parkinson Disease Mutations and Rare Variants

    DTIC Science & Technology

    2016-09-01

    AWARD NUMBER: W81XWH-14-1-0097 TITLE: “ Database for Parkinson Disease Mutations and Rare Variants” PRINCIPAL INVESTIGATOR: JEFFERY M. VANCE...TO THE ABOVE ADDRESS. 1. REPORT DATE September 2016 2. REPORT TYPE FINAL 3. DATES COVERED 1 Jul 2014 – 30 Jun 2016 4. TITLE AND SUBTITLE Database ...For Parkinson Disease (PD) specifically, the variant databases currently available are incomplete, don’t assess impact and/or are not equipped to

  7. Identification of Alternative Splice Variants Using Unique Tryptic Peptide Sequences for Database Searches.

    PubMed

    Tran, Trung T; Bollineni, Ravi C; Strozynski, Margarita; Koehler, Christian J; Thiede, Bernd

    2017-07-07

    Alternative splicing is a mechanism in eukaryotes by which different forms of mRNAs are generated from the same gene. Identification of alternative splice variants requires the identification of peptides specific for alternative splice forms. For this purpose, we generated a human database that contains only unique tryptic peptides specific for alternative splice forms from Swiss-Prot entries. Using this database allows easy access to splice variant-specific peptide sequences that match MS data. Furthermore, we combined this database without alternative splice variant-1-specific peptides with human Swiss-Prot. This combined database can be used as a general database for searching LC-MS data. LC-MS data derived from in-solution digests of two different cell lines (LNCaP, HeLa) and phosphoproteomics studies were analyzed using these two databases. Several nonalternative splice variant-1-specific peptides were found in both cell lines, and some of them seemed to be cell-line-specific. Control and apoptotic phosphoproteomes from Jurkat T cells revealed several nonalternative splice variant-1-specific peptides, and some of them showed clear quantitative differences between the two states.
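
    The idea of keeping only tryptic peptides that are unique to one splice form can be sketched in Python as follows. This is not the authors' pipeline: the cleavage rule (after K or R, but not before P), the absence of missed cleavages, the minimum peptide length, and the toy sequences are all assumptions chosen for illustration.

```python
import re


def tryptic_peptides(sequence, min_length=6):
    """In silico trypsin digest: cleave after K or R unless followed by P.

    Assumes no missed cleavages; min_length is an illustrative cutoff.
    """
    pieces = re.split(r"(?<=[KR])(?!P)", sequence)
    return {p for p in pieces if len(p) >= min_length}


def isoform_specific_peptides(isoforms, min_length=6):
    """Return, for each isoform, the peptides found in no other isoform.

    isoforms: dict mapping an isoform name to its protein sequence.
    """
    digests = {name: tryptic_peptides(seq, min_length)
               for name, seq in isoforms.items()}
    unique = {}
    for name, peptides in digests.items():
        # Union of the peptides produced by every other isoform.
        others = set().union(*(p for n, p in digests.items() if n != name))
        unique[name] = peptides - others
    return unique


if __name__ == "__main__":
    toy = {  # hypothetical sequences, not real Swiss-Prot entries
        "isoform_1": "MAAAAAKCCCCCCRDDDDDDK",
        "isoform_2": "MAAAAAKEEEEEERDDDDDDK",
    }
    print(isoform_specific_peptides(toy))
```

    Shared peptides drop out of every isoform's set, so a match to any remaining peptide points to exactly one splice form within the input collection.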

  8. Clinical Variant Classification: A Comparison of Public Databases and a Commercial Testing Laboratory.

    PubMed

    Gradishar, William; Johnson, KariAnne; Brown, Krystal; Mundt, Erin; Manley, Susan

    2017-07-01

    There is a growing move to consult public databases following receipt of a genetic test result from a clinical laboratory; however, the well-documented limitations of these databases call into question how often clinicians will encounter discordant variant classifications that may introduce uncertainty into patient management. Here, we evaluate discordance in BRCA1 and BRCA2 variant classifications between a single commercial testing laboratory and a public database commonly consulted in clinical practice. BRCA1 and BRCA2 variant classifications were obtained from ClinVar and compared with the classifications from a reference laboratory. Full concordance and discordance were determined for variants whose ClinVar entries were of the same pathogenicity (pathogenic, benign, or uncertain). Variants with conflicting ClinVar classifications were considered partially concordant if ≥1 of the listed classifications agreed with the reference laboratory classification. Four thousand two hundred and fifty unique BRCA1 and BRCA2 variants were available for analysis. Overall, 73.2% of classifications were fully concordant and 12.3% were partially concordant. The remaining 14.5% of variants had discordant classifications, most of which had a definitive classification (pathogenic or benign) from the reference laboratory compared with an uncertain classification in ClinVar (14.0%). Here, we show that discrepant classifications between a public database and single reference laboratory potentially account for 26.7% of variants in BRCA1 and BRCA2. The time and expertise required of clinicians to research these discordant classifications call into question the practicality of checking all test results against a database and suggest that discordant classifications should be interpreted with these limitations in mind. With the increasing use of clinical genetic testing for hereditary cancer risk, accurate variant classification is vital to ensuring appropriate medical management.
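
    The concordance categories defined in this record can be expressed as a short Python sketch. The function and label names are hypothetical, and treating a variant with conflicting ClinVar entries as discordant when none of them matches the laboratory call is an assumption; the abstract does not state how that case was handled.

```python
from collections import Counter


def concordance(lab_class, clinvar_classes):
    """Compare one laboratory classification with the ClinVar entries.

    lab_class: "pathogenic", "benign", or "uncertain".
    clinvar_classes: list of classifications listed in ClinVar.
    """
    if not clinvar_classes:
        raise ValueError("no ClinVar classification to compare against")
    listed = set(clinvar_classes)
    if len(listed) == 1:
        # All ClinVar entries agree with each other.
        return "full" if lab_class in listed else "discordant"
    # Conflicting ClinVar entries: partial if at least one agrees.
    # Assumption: no agreement at all is counted as discordant.
    return "partial" if lab_class in listed else "discordant"


def summarize(pairs):
    """Percentage of variants in each concordance category."""
    counts = Counter(concordance(lab, listed) for lab, listed in pairs)
    total = sum(counts.values())
    return {category: 100.0 * n / total for category, n in counts.items()}


if __name__ == "__main__":
    example = [  # made-up inputs for illustration only
        ("pathogenic", ["pathogenic", "pathogenic"]),
        ("benign", ["benign", "uncertain"]),
        ("benign", ["uncertain"]),
    ]
    print(summarize(example))
```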

  9. The Clinical Next-Generation Sequencing Database: A Tool for the Unified Management of Clinical Information and Genetic Variants to Accelerate Variant Pathogenicity Classification.

    PubMed

    Nishio, Shin-Ya; Usami, Shin-Ichi

    2017-03-01

    Recent advances in next-generation sequencing (NGS) have given rise to new challenges due to the difficulties in variant pathogenicity interpretation and large dataset management, including many kinds of public population databases as well as public or commercial disease-specific databases. Here, we report a new database development tool, named the "Clinical NGS Database," for improving clinical NGS workflow through the unified management of variant information and clinical information. This database software offers a two-feature approach to variant pathogenicity classification. The first of these approaches is a phenotype similarity-based approach. This database allows the easy comparison of the detailed phenotype of each patient with the average phenotype of the same gene mutation at the variant or gene level. It is also possible to browse patients with the same gene mutation quickly. The other approach is a statistical approach to variant pathogenicity classification based on the use of the odds ratio for comparisons between the case and the control for each inheritance mode (families with apparently autosomal dominant inheritance vs. control, and families with apparently autosomal recessive inheritance vs. control). A number of case studies are also presented to illustrate the utility of this database. © 2016 The Authors. Human Mutation published by Wiley Periodicals, Inc.
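
    The statistical approach mentioned above rests on the odds ratio between cases and controls. A minimal Python sketch of that calculation follows; the function name and argument layout are hypothetical, and the 0.5 continuity correction for empty cells (Haldane-Anscombe) is an addition made here so the example never divides by zero, not something the abstract describes.

```python
def odds_ratio(case_carriers, case_total, control_carriers, control_total):
    """Odds ratio for carrying a variant in cases versus controls."""
    if not (0 <= case_carriers <= case_total
            and 0 <= control_carriers <= control_total):
        raise ValueError("carrier counts must lie between 0 and the group total")
    a = case_carriers                      # cases with the variant
    b = case_total - case_carriers         # cases without the variant
    c = control_carriers                   # controls with the variant
    d = control_total - control_carriers   # controls without the variant
    if 0 in (a, b, c, d):
        # Assumption: add 0.5 to every cell when any cell is empty.
        a, b, c, d = a + 0.5, b + 0.5, c + 0.5, d + 0.5
    return (a * d) / (b * c)


if __name__ == "__main__":
    # Made-up counts: one call per inheritance mode against the same controls.
    print(odds_ratio(20, 100, 10, 100))  # e.g. dominant families vs. control
    print(odds_ratio(5, 10, 0, 10))      # e.g. recessive families vs. control
```

    Running the function once per inheritance mode mirrors the two comparisons the record lists (dominant vs. control and recessive vs. control).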

  10. The Saccharomyces Genome Database Variant Viewer

    PubMed Central

    Sheppard, Travis K.; Hitz, Benjamin C.; Engel, Stacia R.; Song, Giltae; Balakrishnan, Rama; Binkley, Gail; Costanzo, Maria C.; Dalusag, Kyla S.; Demeter, Janos; Hellerstedt, Sage T.; Karra, Kalpana; Nash, Robert S.; Paskov, Kelley M.; Skrzypek, Marek S.; Weng, Shuai; Wong, Edith D.; Cherry, J. Michael

    2016-01-01

    The Saccharomyces Genome Database (SGD; http://www.yeastgenome.org) is the authoritative community resource for the Saccharomyces cerevisiae reference genome sequence and its annotation. In recent years, we have moved toward increased representation of sequence variation and allelic differences within S. cerevisiae. The publication of numerous additional genomes has motivated the creation of new tools for their annotation and analysis. Here we present the Variant Viewer: a dynamic open-source web application for the visualization of genomic and proteomic differences. Multiple sequence alignments have been constructed across high quality genome sequences from 11 different S. cerevisiae strains and stored in the SGD. The alignments and summaries are encoded in JSON and used to create a two-tiered dynamic view of the budding yeast pan-genome, available at http://www.yeastgenome.org/variant-viewer. PMID:26578556

  11. Object-oriented structures supporting remote sensing databases

    NASA Technical Reports Server (NTRS)

    Wichmann, Keith; Cromp, Robert F.

    1995-01-01

    Object-oriented databases show promise for modeling the complex interrelationships pervasive in scientific domains. To examine the utility of this approach, we have developed an Intelligent Information Fusion System based on this technology, and applied it to the problem of managing an active repository of remotely-sensed satellite scenes. The design and implementation of the system is compared and contrasted with conventional relational database techniques, followed by a presentation of the underlying object-oriented data structures used to enable fast indexing into the data holdings.

  12. The Saccharomyces Genome Database Variant Viewer.

    PubMed

    Sheppard, Travis K; Hitz, Benjamin C; Engel, Stacia R; Song, Giltae; Balakrishnan, Rama; Binkley, Gail; Costanzo, Maria C; Dalusag, Kyla S; Demeter, Janos; Hellerstedt, Sage T; Karra, Kalpana; Nash, Robert S; Paskov, Kelley M; Skrzypek, Marek S; Weng, Shuai; Wong, Edith D; Cherry, J Michael

    2016-01-04

    The Saccharomyces Genome Database (SGD; http://www.yeastgenome.org) is the authoritative community resource for the Saccharomyces cerevisiae reference genome sequence and its annotation. In recent years, we have moved toward increased representation of sequence variation and allelic differences within S. cerevisiae. The publication of numerous additional genomes has motivated the creation of new tools for their annotation and analysis. Here we present the Variant Viewer: a dynamic open-source web application for the visualization of genomic and proteomic differences. Multiple sequence alignments have been constructed across high quality genome sequences from 11 different S. cerevisiae strains and stored in the SGD. The alignments and summaries are encoded in JSON and used to create a two-tiered dynamic view of the budding yeast pan-genome, available at http://www.yeastgenome.org/variant-viewer. © The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research.

  13. Korean Variant Archive (KOVA): a reference database of genetic variations in the Korean population.

    PubMed

    Lee, Sangmoon; Seo, Jihae; Park, Jinman; Nam, Jae-Yong; Choi, Ahyoung; Ignatius, Jason S; Bjornson, Robert D; Chae, Jong-Hee; Jang, In-Jin; Lee, Sanghyuk; Park, Woong-Yang; Baek, Daehyun; Choi, Murim

    2017-06-27

    Despite efforts to interrogate human genome variation through large-scale databases, systematic preference toward populations of Caucasian descendants has resulted in unintended reduction of power in studying non-Caucasians. Here we report a compilation of coding variants from 1,055 healthy Korean individuals (KOVA; Korean Variant Archive). The samples were sequenced to a mean depth of 75x, yielding 101 singleton variants per individual. Population genetics analysis demonstrates that the Korean population is a distinct ethnic group comparable to other discrete ethnic groups in Africa and Europe, providing a rationale for such independent genomic datasets. Indeed, KOVA conferred 22.8% increased variant filtering power in addition to Exome Aggregation Consortium (ExAC) when used on Korean exomes. Functional assessment of nonsynonymous variants supported the presence of purifying selection in Koreans. Analysis of copy number variants detected 5.2 deletions and 10.3 amplifications per individual with an increased fraction of novel variants among smaller and rarer copy number variable segments. We also report a list of germline variants that are associated with increased tumor susceptibility. This catalog can function as a critical addition to the pre-existing variant databases in pursuing genetic studies of Korean individuals.

  14. EMEN2: An Object Oriented Database and Electronic Lab Notebook

    PubMed Central

    Rees, Ian; Langley, Ed; Chiu, Wah; Ludtke, Steven J.

    2013-01-01

    Transmission electron microscopy and associated methods such as single particle analysis, 2-D crystallography, helical reconstruction and tomography, are highly data-intensive experimental sciences, which also have substantial variability in experimental technique. Object-oriented databases present an attractive alternative to traditional relational databases for situations where the experiments themselves are continually evolving. We present EMEN2, an easy to use object-oriented database with a highly flexible infrastructure originally targeted for transmission electron microscopy and tomography, which has been extended to be adaptable for use in virtually any experimental science. It is a pure object-oriented database designed for easy adoption in diverse laboratory environments, and does not require professional database administration. It includes a full featured, dynamic web interface in addition to APIs for programmatic access. EMEN2 installations currently support roughly 800 scientists worldwide with over 1/2 million experimental records and over 20 TB of experimental data. The software is freely available with complete source. PMID:23360752

  15. HbVar: A relational database of human hemoglobin variants and thalassemia mutations at the globin gene server.

    PubMed

    Hardison, Ross C; Chui, David H K; Giardine, Belinda; Riemer, Cathy; Patrinos, George P; Anagnou, Nicholas; Miller, Webb; Wajcman, Henri

    2002-03-01

    We have constructed a relational database of hemoglobin variants and thalassemia mutations, called HbVar, which can be accessed on the web at http://globin.cse.psu.edu. Extensive information is recorded for each variant and mutation, including a description of the variant and associated pathology, hematology, electrophoretic mobility, methods of isolation, stability information, ethnic occurrence, structure studies, functional studies, and references. The initial information was derived from books by Dr. Titus Huisman and colleagues [Huisman et al., 1996, 1997, 1998]. The current database is updated regularly with the addition of new data and corrections to previous data. Queries can be formulated based on fields in the database. Tables of common categories of variants, such as all those involving the alpha1-globin gene (HBA1) or all those that result in high oxygen affinity, are maintained by automated queries on the database. Users can formulate more precise queries, such as identifying "all beta-globin variants associated with instability and found in Scottish populations." This new database should be useful for clinical diagnosis as well as in fundamental studies of hemoglobin biochemistry, globin gene regulation, and human sequence variation at these loci. Copyright 2002 Wiley-Liss, Inc.

  16. LenVarDB: database of length-variant protein domains.

    PubMed

    Mutt, Eshita; Mathew, Oommen K; Sowdhamini, Ramanathan

    2014-01-01

    Protein domains are functionally and structurally independent modules, which add to the functional variety of proteins. This array of functional diversity has been enabled by evolutionary changes, such as amino acid substitutions or insertions or deletions, occurring in these protein domains. Length variations (indels) can introduce changes at structural, functional and interaction levels. LenVarDB (freely available at http://caps.ncbs.res.in/lenvardb/) traces these length variations, starting from structure-based sequence alignments in our Protein Alignments organized as Structural Superfamilies (PASS2) database, across 731 structural classification of proteins (SCOP)-based protein domain superfamilies connected to 2 730 625 sequence homologues. Alignment of sequence homologues corresponding to a structural domain is available, starting from a structure-based sequence alignment of the superfamily. Orientation of the length-variant (indel) regions in protein domains can be visualized by mapping them on the structure and on the alignment. Knowledge about location of length variations within protein domains and their visual representation will be useful in predicting changes within structurally or functionally relevant sites, which may ultimately regulate protein function. Non-technical summary: Evolutionary changes bring about natural changes to proteins that may be found in many organisms. Such changes could be reflected as amino acid substitutions or insertions-deletions (indels) in protein sequences. LenVarDB is a database that provides an early overview of length variations observed among 731 protein families after examining >2 million sequences. Indels are followed up to observe if they are close to the active site such that they can affect the activity of proteins. Inclusion of such information can aid the design of bioengineering experiments.

  17. Genetic variants of the DNA repair genes from Exome Aggregation Consortium (EXAC) database: significance in cancer.

    PubMed

    Das, Raima; Ghosh, Sankar Kumar

    2017-04-01

    The DNA repair pathway is a primary defense system that eliminates wide varieties of DNA damage. Any deficiencies in it are likely to cause the chromosomal instability that leads to cell malfunctioning and tumorigenesis. Genetic polymorphisms in DNA repair genes have demonstrated a significant association with cancer risk. Our study attempts to give a glimpse of the overall scenario of the germline polymorphisms in the DNA repair genes by taking into account the Exome Aggregation Consortium (ExAC) database as well as the Human Gene Mutation Database (HGMD) for evaluating the disease link, particularly in cancer. It has been found that the ExAC DNA repair dataset (which consists of 228 DNA repair genes) comprises 30.4% missense, 12.5% dbSNP reported and 3.2% ClinVar significant variants. 27% of all the missense variants have the deleterious SIFT score of 0.00 and 6% of variants carry the most damaging Polyphen-2 score of 1.00, thus affecting the protein structure and function. However, as per HGMD, only a fraction (1.2%) of ExAC DNA repair variants was found to be cancer-related, indicating that the remaining variants reported in both databases need to be further analyzed. This, in turn, may provide an increased spectrum of the reported cancer linked variants in the DNA repair genes present in the ExAC database. Moreover, further in silico functional assay of the identified vital cancer-associated variants, which is essential to get their actual biological significance, may shed some light in the field of targeted drug development in the near future. Copyright © 2017. Published by Elsevier B.V.

  18. Asynchronous Data Retrieval from an Object-Oriented Database

    NASA Astrophysics Data System (ADS)

    Gilbert, Jonathan P.; Bic, Lubomir

    We present an object-oriented semantic database model which, similar to other object-oriented systems, combines the virtues of four concepts: the functional data model, a property inheritance hierarchy, abstract data types and message-driven computation. The main emphasis is on the last of these four concepts. We describe generic procedures that permit queries to be processed in a purely message-driven manner. A database is represented as a network of nodes and directed arcs, in which each node is a logical processing element, capable of communicating with other nodes by exchanging messages. This eliminates the need for shared memory and for centralized control during query processing. Hence, the model is suitable for implementation on a multiprocessor computer architecture, consisting of large numbers of loosely coupled processing elements.

  19. Prototyping Visual Database Interface by Object-Oriented Language

    DTIC Science & Technology

    1988-06-01

    approach is to use object-oriented programming. Object-oriented languages are characterized by three criteria [Ref. 4:p. 1.2.1]: - encapsulation of...made it a sub-class of our DMWindow.Cls, which is discussed later in this chapter. This extension to the application had to be integrated with our... abnormal behaviors similar to Korth’s discussion of pitfalls in relational database designing. Even extensions like GEM [Ref. 8] that are powerful and

  20. CFTR-France, a national relational patient database for sharing genetic and phenotypic data associated with rare CFTR variants.

    PubMed

    Claustres, Mireille; Thèze, Corinne; des Georges, Marie; Baux, David; Girodon, Emmanuelle; Bienvenu, Thierry; Audrezet, Marie-Pierre; Dugueperoux, Ingrid; Férec, Claude; Lalau, Guy; Pagin, Adrien; Kitzis, Alain; Thoreau, Vincent; Gaston, Véronique; Bieth, Eric; Malinge, Marie-Claire; Reboul, Marie-Pierre; Fergelot, Patricia; Lemonnier, Lydie; Mekki, Chadia; Fanen, Pascale; Bergougnoux, Anne; Sasorith, Souphatta; Raynal, Caroline; Bareil, Corinne

    2017-10-01

    Most of the 2,000 variants identified in the CFTR (cystic fibrosis transmembrane regulator) gene are rare or private. Their interpretation is hampered by the lack of available data and resources, making patient care and genetic counseling challenging. We developed a patient-based database dedicated to the annotations of rare CFTR variants in the context of their cis- and trans-allelic combinations. Based on almost 30 years of experience of CFTR testing, CFTR-France (https://cftr.iurc.montp.inserm.fr/cftr) currently compiles 16,819 variant records from 4,615 individuals with cystic fibrosis (CF) or CFTR-RD (related disorders), fetuses with ultrasound bowel anomalies, newborns awaiting clinical diagnosis, and asymptomatic compound heterozygotes. For each of the 736 different variants reported in the database, patient characteristics and genetic information (other variations in cis or in trans) have been thoroughly checked by a dedicated curator. Combining updated clinical, epidemiological, in silico, or in vitro functional data helps in the interpretation of unclassified variants and the reassessment of misclassified variants. This comprehensive CFTR database is now an invaluable tool for diagnostic laboratories gathering information on rare variants, especially in the context of genetic counseling, prenatal and preimplantation genetic diagnosis. CFTR-France is thus highly complementary to the international database CFTR2 focused so far on the most common CF-causing alleles. © 2017 Wiley Periodicals, Inc.

  1. CYP21A2 mutation update: Comprehensive analysis of databases and published genetic variants.

    PubMed

    Simonetti, Leandro; Bruque, Carlos D; Fernández, Cecilia S; Benavides-Mori, Belén; Delea, Marisol; Kolomenski, Jorge E; Espeche, Lucía D; Buzzalino, Noemí D; Nadra, Alejandro D; Dain, Liliana

    2018-01-01

    Congenital adrenal hyperplasia (CAH) is a group of autosomal recessive disorders of adrenal steroidogenesis. Disorders in steroid 21-hydroxylation account for over 95% of patients with CAH. Clinically, the 21-hydroxylase deficiency has been classified in a broad spectrum of clinical forms, ranging from severe or classical, to mild late onset or non-classical. Known allelic variants in the disease-causing CYP21A2 gene are spread among different sources. Until recently, most variants reported have been identified in the clinical setting, which presumably biases described variants toward pathogenic ones, such as those found in the CYPAlleles database. Nevertheless, a large number of variants are being described in massive genome projects, many of which are found in dbSNP, but lack functional implications and/or their phenotypic effect. In this work, we gathered a total of 1,340 genetic variants (GVs) in the CYP21A2 gene, from which 899 variants were unique and 230 have an effect on human health, and compiled all this information in an integrated database. We also connected CYP21A2 sequence information to phenotypic effects for all available mutations, including double mutants in cis. Data compiled in the present work could help physicians in the genetic counseling of families affected with 21-hydroxylase deficiency. © 2017 Wiley Periodicals, Inc.

  2. eMelanoBase: an online locus-specific variant database for familial melanoma.

    PubMed

    Fung, David C Y; Holland, Elizabeth A; Becker, Therese M; Hayward, Nicholas K; Bressac-de Paillerets, Brigitte; Mann, Graham J

    2003-01-01

    A proportion of melanoma-prone individuals in both familial and non-familial contexts has been shown to carry inactivating mutations in either CDKN2A or, rarely, CDK4. CDKN2A is a complex locus that encodes two unrelated proteins from alternately spliced transcripts that are read in different frames. The alpha transcript (exons 1alpha, 2, and 3) produces the p16INK4A cyclin-dependent kinase inhibitor, while the beta transcript (exons 1beta and 2) is translated as p14ARF, a stabilizing factor of p53 levels through binding to MDM2. Mutations in exon 2 can impair both polypeptides, and insertions and deletions in exons 1alpha, 1beta, and 2 can theoretically generate p16INK4A-p14ARF fusion proteins. No online database currently takes into account all the consequences of these genotypes, a situation compounded by some problematic previous annotations of CDKN2A-related sequences and descriptions of their mutations. As an initiative of the international Melanoma Genetics Consortium, we have therefore established a database of germline variants observed in all loci implicated in familial melanoma susceptibility. Such a comprehensive, publicly accessible database is an essential foundation for research on melanoma susceptibility and its clinical application. Our database serves two types of data as defined by HUGO. The core dataset includes the nucleotide variants on the genomic and transcript levels, amino acid variants, and citation. The ancillary dataset includes keyword description of events at the transcription and translation levels and epidemiological data. The application that handles users' queries was designed in the model-view-controller architecture and was implemented in Java. The object-relational database schema was deduced using functional dependency analysis. We hereby present our first functional prototype of eMelanoBase. The service is accessible via the URL www.wmi.usyd.edu.au:8080/melanoma.html. Copyright 2002 Wiley-Liss, Inc.

  3. Large scale database scrubbing using object oriented software components.

    PubMed

    Herting, R L; Barnes, M R

    1998-01-01

    Now that case managers, quality improvement teams, and researchers use medical databases extensively, the ability to share and disseminate such databases while maintaining patient confidentiality is paramount. A process called scrubbing addresses this problem by removing personally identifying information while keeping the integrity of the medical information intact. Scrubbing entire databases, containing multiple tables, requires that the implicit relationships between data elements in different tables of the database be maintained. To address this issue we developed DBScrub, a Java program that interfaces with any JDBC compliant database and scrubs the database while maintaining the implicit relationships within it. DBScrub uses a small number of highly configurable object-oriented software components to carry out the scrubbing. We describe the structure of these software components and how they maintain the implicit relationships within the database.
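
    The idea of scrubbing several tables while keeping their implicit relationships intact can be sketched as follows. This is a minimal illustration, not DBScrub itself (which the abstract describes as a Java program using JDBC): the table contents, column names, the `pseudonym` and `scrub_table` helpers, and the keyed-hash approach are all assumptions made for the example.

```python
import hashlib
import hmac

# Hypothetical secret key; a real deployment would manage this securely.
SECRET_KEY = b"example-key"


def pseudonym(value: str) -> str:
    """Map an identifier to a stable pseudonym using a keyed hash."""
    digest = hmac.new(SECRET_KEY, value.encode("utf-8"), hashlib.sha256)
    return "P" + digest.hexdigest()[:12]


def scrub_table(rows, key_columns, drop_columns):
    """Return scrubbed copies of rows.

    Columns in key_columns are replaced by pseudonyms, so the same
    original value maps to the same pseudonym in every table.
    Columns in drop_columns are removed entirely.
    """
    scrubbed = []
    for row in rows:
        clean = {}
        for column, value in row.items():
            if column in drop_columns:
                continue
            clean[column] = pseudonym(value) if column in key_columns else value
        scrubbed.append(clean)
    return scrubbed


# Two hypothetical tables linked implicitly through patient_id.
patients = [{"patient_id": "MRN001", "name": "Jane Doe", "birth_year": 1950}]
visits = [{"visit_id": "V1", "patient_id": "MRN001", "diagnosis": "I10"}]

scrubbed_patients = scrub_table(patients, {"patient_id"}, {"name"})
scrubbed_visits = scrub_table(visits, {"patient_id"}, set())
```

    Because the pseudonym is a deterministic function of the original value, a join between the scrubbed tables on `patient_id` still returns the same rows as before, while the name and the original identifier are no longer present.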

  4. A survey of commercial object-oriented database management systems

    NASA Technical Reports Server (NTRS)

    Atkins, John

    1992-01-01

    The object-oriented data model is the culmination of over thirty years of database research. Initially, database research focused on the need to provide information in a consistent and efficient manner to the business community. Early data models such as the hierarchical model and the network model met the goal of consistent and efficient access to data and were substantial improvements over simple file mechanisms for storing and accessing data. However, these models required highly skilled programmers to provide access to the data. Consequently, in the early 70's E.F. Codd, an IBM research computer scientist, proposed a new data model based on the simple mathematical notion of the relation. This model is known as the Relational Model. In the relational model, data is represented in flat tables (or relations) which have no physical or internal links between them. The simplicity of this model fostered the development of powerful but relatively simple query languages that now made data directly accessible to the general database user. Except for large, multi-user database systems, a database professional was in general no longer necessary. Database professionals found that traditional data in the form of character data, dates, and numeric data were easily represented and managed via the relational model. Commercial relational database management systems proliferated and performance of relational databases improved dramatically. However, there was a growing community of potential database users whose needs were not met by the relational model. These users needed to store data with data types not available in the relational model and required a far richer modelling environment than that provided by the relational model. Indeed, the complexity of the objects to be represented in the model mandated a new approach to database technology. The Object-Oriented Model was the result.

  5. dbWGFP: a database and web server of human whole-genome single nucleotide variants and their functional predictions.

    PubMed

    Wu, Jiaxin; Wu, Mengmeng; Li, Lianshuo; Liu, Zhuo; Zeng, Wanwen; Jiang, Rui

    2016-01-01

    The recent advancement of next generation sequencing technology has enabled the fast and low-cost detection of all genetic variants spreading across the entire human genome, making the application of whole-genome sequencing a trend in the study of disease-causing genetic variants. Nevertheless, a repository that collects predictions of the functionally damaging effects of human genetic variants is still lacking, though it has been well recognized that such predictions play a central role in the analysis of whole-genome sequencing data. To fill this gap, we developed a database named dbWGFP (a database and web server of human whole-genome single nucleotide variants and their functional predictions) that contains functional predictions and annotations of nearly 8.58 billion possible human whole-genome single nucleotide variants. Specifically, this database integrates 48 functional predictions calculated by 17 popular computational methods and 44 valuable annotations obtained from various data sources. Standalone software, user-friendly query services and free downloads of this database are available at http://bioinfo.au.tsinghua.edu.cn/dbwgfp. dbWGFP provides a valuable resource for the analysis of whole-genome sequencing, exome sequencing and SNP array data, thereby complementing existing data sources and computational resources in deciphering genetic bases of human inherited diseases. © The Author(s) 2016. Published by Oxford University Press.

  6. VAS: A Vision Advisor System combining agents and object-oriented databases

    NASA Technical Reports Server (NTRS)

    Eilbert, James L.; Lim, William; Mendelsohn, Jay; Braun, Ron; Yearwood, Michael

    1994-01-01

    A model-based approach to identifying and finding the orientation of non-overlapping parts on a tray has been developed. The part models contain both exact and fuzzy descriptions of part features, and are stored in an object-oriented database. Full identification of the parts involves several interacting tasks, each of which is handled by a distinct agent. Using fuzzy information stored in the model allowed part features that were essentially at the noise level to be extracted and used for identification. This was done by focusing attention on the portion of the part where the feature must be found if the current hypothesis of the part ID is correct. In going from one set of parts to another, the only thing that needs to be changed is the database of part models. This work is part of an effort in developing a Vision Advisor System (VAS) that combines agents and object-oriented databases.

  7. A Toolkit for Active Object-Oriented Databases with Application to Interoperability

    NASA Technical Reports Server (NTRS)

    King, Roger

    1996-01-01

    In our original proposal we stated that our research would 'develop a novel technology that provides a foundation for collaborative information processing.' The essential ingredient of this technology is the notion of 'deltas,' which are first-class values representing collections of proposed updates to a database. The Heraclitus framework provides a variety of algebraic operators for building up, combining, inspecting, and comparing deltas. Deltas can be directly applied to the database to yield a new state, or used 'hypothetically' in queries against the state that would arise if the delta were applied. The central point here is that the step of elevating deltas to 'first-class' citizens in database programming languages will yield tremendous leverage on the problem of supporting updates in collaborative information processing. In short, our original intention was to develop the theoretical and practical foundation for a technology based on deltas in an object-oriented database context, develop a toolkit for active object-oriented databases, and apply this toward collaborative information processing.
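
    The notion of a delta as a first-class value can be illustrated with a short sketch. This is not the Heraclitus API, which the abstract does not show; the `Delta` class, its `apply` and `then` methods, and the dictionary-as-database model are assumptions chosen only to show applying a delta, querying hypothetically, and combining deltas.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Delta:
    """A proposed set of updates: keys to upsert and keys to delete."""
    upserts: dict = field(default_factory=dict)
    deletes: frozenset = frozenset()

    def apply(self, state: dict) -> dict:
        """Return the state that would arise; the input is not modified."""
        new_state = {k: v for k, v in state.items() if k not in self.deletes}
        new_state.update(self.upserts)
        return new_state

    def then(self, later: "Delta") -> "Delta":
        """Combine two deltas; the later one wins on conflicting keys."""
        upserts = {k: v for k, v in self.upserts.items()
                   if k not in later.deletes}
        upserts.update(later.upserts)
        deletes = (self.deletes - set(later.upserts)) | later.deletes
        return Delta(upserts, frozenset(deletes))


state = {"a": 1, "b": 2}
d1 = Delta(upserts={"c": 3}, deletes=frozenset({"a"}))
d2 = Delta(upserts={"a": 10})

# Hypothetical query: inspect the would-be state without committing it.
hypothetical = d1.apply(state)

# Combining deltas gives the same result as applying them in sequence.
combined = d1.then(d2)
```

    Because `apply` returns a new dictionary, the same delta can be used hypothetically in a query and later applied for real, or discarded, without touching the current state.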

  8. A Toolkit for Active Object-Oriented Databases with Application to Interoperability

    NASA Technical Reports Server (NTRS)

    King, Roger

    1996-01-01

    In our original proposal we stated that our research would 'develop a novel technology that provides a foundation for collaborative information processing.' The essential ingredient of this technology is the notion of 'deltas,' which are first-class values representing collections of proposed updates to a database. The Heraclitus framework provides a variety of algebraic operators for building up, combining, inspecting, and comparing deltas. Deltas can be directly applied to the database to yield a new state, or used 'hypothetically' in queries against the state that would arise if the delta were applied. The central point here is that the step of elevating deltas to 'first-class' citizens in database programming languages will yield tremendous leverage on the problem of supporting updates in collaborative information processing. In short, our original intention was to develop the theoretical and practical foundation for a technology based on deltas in an object-oriented database context, develop a toolkit for active object-oriented databases, and apply this toward collaborative information processing.

  9. Object-oriented parsing of biological databases with Python.

    PubMed

    Ramu, C; Gemünd, C; Gibson, T J

    2000-07-01

    While database activities in the biological area are increasing rapidly, rather little is done in the area of parsing them in a simple and object-oriented way. We present here an elegant, simple yet powerful way of parsing biological flat-file databases. We have taken EMBL, SWISS-PROT and GENBANK as examples. EMBL and SWISS-PROT do not differ much in format structure, whereas GENBANK has a very different format structure from both. Extracting the desired fields in an entry (for example a sub-sequence with an associated feature) for later analysis is a constant need in the biological sequence-analysis community: this is illustrated with tools to make new splice-site databases. The interface to the parser is abstract in the sense that the access to all the databases is independent from their different formats, since parsing instructions are hidden.
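
    An abstract parser interface of the kind described, where callers ask for a field and the format-specific parsing stays hidden, might look like the sketch below. The class names, the `open_entry` helper, and the heavily abbreviated sample entries are assumptions for illustration; they are not taken from the authors' code.

```python
from abc import ABC, abstractmethod


class FlatFileParser(ABC):
    """Common interface: callers ask for fields, not for format details."""

    def __init__(self, text: str):
        self.lines = text.splitlines()

    @abstractmethod
    def accession(self) -> str:
        """Return the primary accession number of the entry."""


class EmblParser(FlatFileParser):
    """EMBL-style entries use two-letter line codes such as AC."""

    def accession(self) -> str:
        for line in self.lines:
            if line.startswith("AC"):
                return line[2:].strip().split(";")[0]
        raise ValueError("no AC line found")


class GenBankParser(FlatFileParser):
    """GenBank-style entries use keyword lines such as ACCESSION."""

    def accession(self) -> str:
        for line in self.lines:
            if line.startswith("ACCESSION"):
                return line.split()[1]
        raise ValueError("no ACCESSION line found")


def open_entry(text: str) -> FlatFileParser:
    """Pick a parser from the first line of the entry."""
    first = text.lstrip().split("\n", 1)[0]
    return GenBankParser(text) if first.startswith("LOCUS") else EmblParser(text)


# Abbreviated, illustrative entries (only the lines used here).
embl_entry = "ID   X56734; SV 1; linear; mRNA; STD; PLN; 1859 BP.\nAC   X56734; S46826;\n//"
genbank_entry = "LOCUS       SCU49845     5028 bp    DNA\nACCESSION   U49845\n//"

accessions = [open_entry(e).accession() for e in (embl_entry, genbank_entry)]
```

    The calling code on the last line is identical for both formats, which is the sense in which the interface is independent of the underlying file layout.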

  10. TransAtlasDB: an integrated database connecting expression data, metadata and variants

    PubMed Central

    Adetunji, Modupeore O; Lamont, Susan J; Schmidt, Carl J

    2018-01-01

    High-throughput transcriptome sequencing (RNAseq) is the universally applied method for target-free transcript identification and gene expression quantification, generating huge amounts of data. The constraint of accessing such data and interpreting results can be a major impediment in postulating suitable hypotheses; thus an innovative storage solution that addresses limitations such as hard disk storage requirements, efficiency and reproducibility is paramount. By offering a uniform data storage and retrieval mechanism, various data can be compared and easily investigated. We present a sophisticated system, TransAtlasDB, which incorporates a hybrid architecture of both relational and NoSQL databases for fast and efficient data storage, processing and querying of large datasets from transcript expression analysis with corresponding metadata, as well as gene-associated variants (such as SNPs) and their predicted gene effects. TransAtlasDB provides a data model for accurate storage of the large amount of data derived from RNAseq analysis and also methods of interacting with the database, either via the command-line data management workflows, written in Perl, with useful functionalities that simplify the complexity of data storage and possibly manipulation of the massive amounts of data generated from RNAseq analysis, or through the web interface. The database application is currently modeled to handle analysis data from agricultural species, and will be expanded to include more species groups. Overall, TransAtlasDB aims to serve as an accessible repository for the large complex results data files derived from RNAseq gene expression profiling and variant analysis. Database URL: https://modupeore.github.io/TransAtlasDB/ PMID:29688361

  11. Knowledge Discovery in Variant Databases Using Inductive Logic Programming

    PubMed Central

    Nguyen, Hoan; Luu, Tien-Dao; Poch, Olivier; Thompson, Julie D.

    2013-01-01

    Understanding the effects of genetic variation on the phenotype of an individual is a major goal of biomedical research, especially for the development of diagnostics and effective therapeutic solutions. In this work, we describe the use of a recent knowledge discovery from database (KDD) approach using inductive logic programming (ILP) to automatically extract knowledge about human monogenic diseases. We extracted background knowledge from MSV3d, a database of all human missense variants mapped to 3D protein structure. In this study, we identified 8,117 mutations in 805 proteins with known three-dimensional structures that were known to be involved in human monogenic disease. Our results help to improve our understanding of the relationships between structural, functional or evolutionary features and deleterious mutations. Our inferred rules can also be applied to predict the impact of any single amino acid replacement on the function of a protein. The interpretable rules are available at http://decrypthon.igbmc.fr/kd4v/. PMID:23589683

  12. Knowledge discovery in variant databases using inductive logic programming.

    PubMed

    Nguyen, Hoan; Luu, Tien-Dao; Poch, Olivier; Thompson, Julie D

    2013-01-01

    Understanding the effects of genetic variation on the phenotype of an individual is a major goal of biomedical research, especially for the development of diagnostics and effective therapeutic solutions. In this work, we describe the use of a recent knowledge discovery from database (KDD) approach using inductive logic programming (ILP) to automatically extract knowledge about human monogenic diseases. We extracted background knowledge from MSV3d, a database of all human missense variants mapped to 3D protein structure. In this study, we identified 8,117 mutations in 805 proteins with known three-dimensional structures that were known to be involved in human monogenic disease. Our results help to improve our understanding of the relationships between structural, functional or evolutionary features and deleterious mutations. Our inferred rules can also be applied to predict the impact of any single amino acid replacement on the function of a protein. The interpretable rules are available at http://decrypthon.igbmc.fr/kd4v/.

  13. Graphical user interfaces for symbol-oriented database visualization and interaction

    NASA Astrophysics Data System (ADS)

    Brinkschulte, Uwe; Siormanolakis, Marios; Vogelsang, Holger

    1997-04-01

    In this approach, two basic services designed for the engineering of computer based systems are combined: a symbol-oriented man-machine-service and a high speed database-service. The man-machine service is used to build graphical user interfaces (GUIs) for the database service; these interfaces are stored using the database service. The idea is to create a GUI-builder and a GUI-manager for the database service based upon the man-machine service using the concept of symbols. With user-definable and predefined symbols, database contents can be visualized and manipulated in a very flexible and intuitive way. Using the GUI-builder and GUI-manager, users can build and operate their own graphical user interface for a given database according to their needs without writing a single line of code.

  14. BISQUE: locus- and variant-specific conversion of genomic, transcriptomic and proteomic database identifiers.

    PubMed

    Meyer, Michael J; Geske, Philip; Yu, Haiyuan

    2016-05-15

    Biological sequence databases are integral to efforts to characterize and understand biological molecules and share biological data. However, when analyzing these data, scientists are often left holding disparate biological currency: molecular identifiers from different databases. For downstream applications that require converting the identifiers themselves, there are many resources available, but analyzing associated loci and variants can be cumbersome if data is not given in a form amenable to particular analyses. Here we present BISQUE, a web server and customizable command-line tool for converting molecular identifiers and their contained loci and variants between different database conventions. BISQUE uses a graph traversal algorithm to generalize the conversion process for residues in the human genome, genes, transcripts and proteins, allowing for conversion across classes of molecules and in all directions through an intuitive web interface and a URL-based web service. BISQUE is freely available via the web using any major web browser (http://bisque.yulab.org/). Source code is available in a public GitHub repository (https://github.com/hyulab/BISQUE). Contact: haiyuan.yu@cornell.edu. Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press. All rights reserved.
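
    The abstract states only that a graph traversal algorithm generalizes the conversion; it does not say which one. As a hedged illustration, the sketch below uses breadth-first search over a hypothetical graph of identifier types, where each edge stands for a direct mapping. The graph contents and the `conversion_path` function are assumptions and do not reflect the tool's actual internals.

```python
from collections import deque

# Hypothetical conversion graph: each edge means a direct mapping exists.
CONVERSIONS = {
    "ensembl_gene": ["ensembl_transcript"],
    "ensembl_transcript": ["ensembl_gene", "ensembl_protein", "refseq_transcript"],
    "ensembl_protein": ["ensembl_transcript", "uniprot"],
    "refseq_transcript": ["ensembl_transcript"],
    "uniprot": ["ensembl_protein"],
}


def conversion_path(source, target):
    """Breadth-first search for the shortest chain of direct mappings."""
    queue = deque([[source]])
    seen = {source}
    while queue:
        path = queue.popleft()
        if path[-1] == target:
            return path
        for neighbour in CONVERSIONS.get(path[-1], []):
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(path + [neighbour])
    return None  # no chain of mappings connects the two identifier types


path = conversion_path("ensembl_gene", "uniprot")
```

    With this graph, a gene identifier reaches a protein identifier by passing through the transcript and protein levels, so adding a new identifier type only requires adding its direct edges.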

  15. Embedding CLIPS in a database-oriented diagnostic system

    NASA Technical Reports Server (NTRS)

    Conway, Tim

    1990-01-01

    This paper describes the integration of C Language Production Systems (CLIPS) into a powerful portable maintenance aid (PMA) system used for flightline diagnostics. The current diagnostic target of the system is the Garrett GTCP85-180L, a gas turbine engine used as an Auxiliary Power Unit (APU) on some C-130 military transport aircraft. This project is a database oriented approach to a generic diagnostic system. CLIPS is used for 'many-to-many' pattern matching within the diagnostics process. Patterns are stored in database format, and CLIPS code is generated by a 'compilation' process on the database. Multiple CLIPS rule sets and working memories (in sequence) are supported and communication between the rule sets is achieved via the export and import commands. Work is continuing on using CLIPS in other portions of the diagnostic system and in re-implementing the diagnostic system in the Ada language.

  16. Description and analysis of genetic variants in French hereditary breast and ovarian cancer families recorded in the UMD-BRCA1/BRCA2 databases.

    PubMed

    Caputo, Sandrine; Benboudjema, Louisa; Sinilnikova, Olga; Rouleau, Etienne; Béroud, Christophe; Lidereau, Rosette

    2012-01-01

    BRCA1 and BRCA2 are the two main genes responsible for predisposition to breast and ovarian cancers, as a result of protein-inactivating monoallelic mutations. It remains to be established whether many of the variants identified in these two genes, so-called unclassified/unknown variants (UVs), contribute to the disease phenotype or are simply neutral variants (or polymorphisms). Given the clinical importance of establishing their status, a nationwide effort to annotate these UVs was launched by laboratories belonging to the French GGC consortium (Groupe Génétique et Cancer), leading to the creation of the UMD-BRCA1/BRCA2 databases (http://www.umd.be/BRCA1/ and http://www.umd.be/BRCA2/). These databases have been endorsed by the French National Cancer Institute (INCa) and are designed to collect all variants detected in France, whether causal, neutral or UV. They differ from other BRCA databases in that they contain co-occurrence data for all variants. Using these data, the GGC French consortium has been able to classify certain UVs also contained in other databases. In this article, we report some novel UVs not contained in the BIC database and explore their impact in cancer predisposition based on a structural approach.

  17. Applying an Object-Oriented Database Model to a Scientific Database Problem: Managing Experimental Data at CEBAF.

    NASA Astrophysics Data System (ADS)

    Ehlmann, Bryon K.

    Current scientific experiments are often characterized by massive amounts of very complex data and the need for complex data analysis software. Object-oriented database (OODB) systems have the potential of improving the description of the structure and semantics of this data and of integrating the analysis software with the data. This dissertation results from research to enhance OODB functionality and methodology to support scientific databases (SDBs) and, more specifically, to support a nuclear physics experiments database for the Continuous Electron Beam Accelerator Facility (CEBAF). This research to date has identified a number of problems related to the practical application of OODB technology to the conceptual design of the CEBAF experiments database and other SDBs: the lack of a generally accepted OODB design methodology, the lack of a standard OODB model, the lack of a clear conceptual level in existing OODB models, and the limited support in existing OODB systems for many common object relationships inherent in SDBs. To address these problems, the dissertation describes an Object-Relationship Diagram (ORD) and an Object-oriented Database Definition Language (ODDL) that provide tools that allow SDB design and development to proceed systematically and independently of existing OODB systems. These tools define multi-level, conceptual data models for SDB design, which incorporate a simple notation for describing common types of relationships that occur in SDBs. ODDL allows these relationships and other desirable SDB capabilities to be supported by an extended OODB system. A conceptual model of the CEBAF experiments database is presented in terms of ORDs and the ODDL to demonstrate their functionality and use and provide a foundation for future development of experimental nuclear physics software using an OODB approach.

  18. Standardisation of the FAERS database: a systematic approach to manually recoding drug name variants.

    PubMed

    Wong, Carmen K; Ho, Samuel S; Saini, Bandana; Hibbs, David E; Fois, Romano A

    2015-07-01

    The US Food and Drug Administration Adverse Event Reporting System (FAERS), one of the world's largest spontaneous reporting systems, is difficult to use because of report duplication and a lack of standardisation in the recording of drug names. Unresolved data quality issues may distort statistical analyses, rendering the results difficult to interpret when detecting and monitoring adverse effects of pharmaceutical products. The aim of this study was to develop and implement a data cleaning protocol to identify and resolve drug nomenclature issues. The key 'data treatment' plan involved standardising drug names held in the FAERS database. A total of 4 506 577 Individual Safety Reports submitted to the FAERS between 1 January 2003 and 31 August 2012 were included in this study. OpenRefine was used to standardise drug name variants in the database such that they were consistent with international non-proprietary nomenclature defined by the World Health Organisation Anatomical Therapeutic Chemical classification. Drug variants where generic constituents could not be confidently determined, undecipherable drug names and non-medicinal products were retained verbatim. After the standardisation process, more than 16 611 916 drug entries were cleaned to their relevant international non-proprietary name. The cleaned drug table comprised 71 858 drug name variants and includes both standardised and original terms. Ninety-nine per cent of drug names were standardised using this method. The millions of reports enclosed in the FAERS contain valuable information that is of interest to pharmacovigilance, toxicology and post-marketing surveillance researchers. With the standardisation of the drug nomenclature, the database can be better utilised by research groups around the world. Copyright © 2015 John Wiley & Sons, Ltd.

  19. Building a genome database using an object-oriented approach.

    PubMed

    Barbasiewicz, Anna; Liu, Lin; Lang, B Franz; Burger, Gertraud

    2002-01-01

    GOBASE is a relational database that integrates data associated with mitochondria and chloroplasts. The most important data in GOBASE, i.e., molecular sequences and taxonomic information, are obtained from the public sequence data repository at the National Center for Biotechnology Information (NCBI), and are validated by our experts. Maintaining a curated genomic database comes with a towering labor cost, due to the sheer volume of available genomic sequences and the plethora of annotation errors and omissions in records retrieved from public repositories. Here we describe our approach to increase automation of the database population process, thereby reducing manual intervention. As a first step, we used Unified Modeling Language (UML) to construct a list of potential errors. Each case was evaluated independently, and an expert solution was devised and represented as a diagram. Subsequently, the UML diagrams were used as templates for writing object-oriented automation programs in the Java programming language.

  20. Reliability database development for use with an object-oriented fault tree evaluation program

    NASA Technical Reports Server (NTRS)

    Heger, A. Sharif; Harringtton, Robert J.; Koen, Billy V.; Patterson-Hine, F. Ann

    1989-01-01

    A description is given of the development of a fault-tree analysis method using object-oriented programming. In addition, the authors discuss the programs that have been developed or are under development to connect a fault-tree analysis routine to a reliability database. To assess the performance of the routines, a relational database simulating one of the nuclear power industry databases has been constructed. For a realistic assessment of the results of this project, the use of one of the existing nuclear power reliability databases is planned.

  1. Integrating heterogeneous databases in clustered medic care environments using object-oriented technology

    NASA Astrophysics Data System (ADS)

    Thakore, Arun K.; Sauer, Frank

    1994-05-01

    The organization of modern medical care environments into disease-related clusters, such as a cancer center, a diabetes clinic, etc., has the side-effect of introducing multiple heterogeneous databases, often containing similar information, within the same organization. This heterogeneity fosters incompatibility and prevents the effective sharing of data amongst applications at different sites. Although integration of heterogeneous databases is now feasible, in the medical arena this is often an ad hoc process, not founded on proven database technology or formal methods. In this paper we illustrate the use of a high-level object-oriented semantic association method to model information found in different databases into an integrated conceptual global model that integrates the databases. We provide examples from the medical domain to illustrate an integration approach resulting in a consistent global view, without attacking the autonomy of the underlying databases.

  2. Benchmarking distributed data warehouse solutions for storing genomic variant information

    PubMed Central

    Wiewiórka, Marek S.; Wysakowicz, Dawid P.; Okoniewski, Michał J.

    2017-01-01

    Genomic-based personalized medicine encompasses storing, analysing and interpreting genomic variants as its central issues. At a time when thousands of patients' sequenced exomes and genomes are becoming available, there is a growing need for efficient database storage and querying. The answer could be the application of modern distributed storage systems and query engines. However, the application of large genomic variant databases to this problem has not been sufficiently explored so far in the literature. To investigate the effectiveness of modern columnar storage [column-oriented Database Management System (DBMS)] and query engines, we have developed a prototypic genomic variant data warehouse, populated with large generated content of genomic variants and phenotypic data. Next, we have benchmarked performance of a number of combinations of distributed storages and query engines on a set of SQL queries that address biological questions essential for both research and medical applications. In addition, a non-distributed, analytical database (MonetDB) has been used as a baseline. Comparison of query execution times confirms that distributed data warehousing solutions outperform classic relational DBMSs. Moreover, pre-aggregation and further denormalization of data, which reduce the number of distributed join operations, significantly improve query performance by several orders of magnitude. Most of the distributed back-ends offer good performance for complex analytical queries, while the Optimized Row Columnar (ORC) format paired with Presto and Parquet with Spark 2 query engines provide, on average, the lowest execution times. Apache Kudu, on the other hand, is the only solution that guarantees sub-second performance for simple genome range queries returning a small subset of data, where low-latency response is expected, while still offering decent performance for running analytical queries. In summary, research and clinical applications that require

  3. ARACHNID: A prototype object-oriented database tool for distributed systems

    NASA Technical Reports Server (NTRS)

    Younger, Herbert; Oreilly, John; Frogner, Bjorn

    1994-01-01

    This paper discusses the results of a Phase 2 SBIR project sponsored by NASA and performed by MIMD Systems, Inc. A major objective of this project was to develop specific concepts for improved performance in accessing large databases. An object-oriented and distributed approach was used for the general design, while a geographical decomposition was used as a specific solution. The resulting software framework is called ARACHNID. The Faint Source Catalog developed by NASA was the initial database testbed. This is a database of many giga-bytes, where an order of magnitude improvement in query speed is being sought. This database contains faint infrared point sources obtained from telescope measurements of the sky. A geographical decomposition of this database is an attractive approach to dividing it into pieces. Each piece can then be searched on individual processors with only a weak data linkage between the processors being required. As a further demonstration of the concepts implemented in ARACHNID, a tourist information system is discussed. This version of ARACHNID is the commercial result of the project. It is a distributed, networked, database application where speed, maintenance, and reliability are important considerations. This paper focuses on the design concepts and technologies that form the basis for ARACHNID.

  4. Nominal ISOMERs (Incorrect Spellings Of Medicines Eluding Researchers)-variants in the spellings of drug names in PubMed: a database review.

    PubMed

    Ferner, Robin E; Aronson, Jeffrey K

    2016-12-14

    To examine how misspellings of drug names could impede searches for published literature. Database review. PubMed. The study included 30 drug names that are commonly misspelt on prescription charts in hospitals in Birmingham, UK (test set), and 30 control names randomly chosen from a hospital formulary (control set). The following definitions were used: standard names (the international non-proprietary names), variant names (deviations in spelling from standard names that are not themselves standard names in English language nomenclature), and hidden reference variants (variant spellings that identified publications in textword (tw) searches of PubMed or other databases, and which were not identified by textword searches for the standard names). Variant names were generated from standard names by applying letter substitutions, omissions, additions, transpositions, duplications, deduplications, and combinations of these. Searches were carried out in PubMed (30 June 2016) for "standard name[tw]" and "variant name[tw] NOT standard name[tw]." The 30 standard names of drugs in the test set gave 325 979 hits in total, and 160 hidden reference variants gave 3872 hits (1.17%). The standard names of the control set gave 470 064 hits, and 79 hidden reference variants gave 766 hits (0.16%). Letter substitutions (particularly i to y and vice versa) and omissions together accounted for 2924 (74%) of the variants. Amitriptyline (8530 hits) yielded 18 hidden reference variants (179 (2.1%) hits). Names ending in "in," "ine," or "micin" were commonly misspelt. Failing to search for hidden reference variants of "gentamicin," "amitriptyline," "mirtazapine," and "trazodone" would miss at least 19 systematic reviews. A hidden reference variant related to Christmas, "No-el", was rare; variants of "X-miss" were rarer. When performing searches, researchers should include misspellings of drug names among their search terms. Published by the BMJ Publishing Group Limited. For
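
    The variant-generation step and the two search forms quoted in the abstract can be sketched as follows. This is a simplified illustration, not the authors' procedure: it covers only single-edit i/y substitutions, omissions, transpositions and duplications, and the function names `spelling_variants` and `pubmed_query` are assumptions.

```python
def spelling_variants(name: str) -> set:
    """Single-edit variants: i/y substitution, omission, transposition, duplication."""
    swaps = {"i": "y", "y": "i"}
    variants = set()
    for i, ch in enumerate(name):
        if ch in swaps:
            # Letter substitution (i to y and vice versa).
            variants.add(name[:i] + swaps[ch] + name[i + 1:])
        # Omission of one letter.
        variants.add(name[:i] + name[i + 1:])
        # Duplication of one letter.
        variants.add(name[:i] + ch + name[i:])
        if i + 1 < len(name):
            # Transposition of two adjacent letters.
            variants.add(name[:i] + name[i + 1] + ch + name[i + 2:])
    variants.discard(name)  # a variant must differ from the standard name
    return variants


def pubmed_query(standard: str, variant: str) -> str:
    """Build the 'variant[tw] NOT standard[tw]' textword search string."""
    return f"{variant}[tw] NOT {standard}[tw]"


variants = spelling_variants("amitriptyline")
query = pubmed_query("amitriptyline", "amitryptyline")
```

    Each generated variant would then be searched with the `NOT` form so that only publications missed by the standard spelling are counted.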

  5. Concept-oriented indexing of video databases: toward semantic sensitive retrieval and browsing.

    PubMed

    Fan, Jianping; Luo, Hangzai; Elmagarmid, Ahmed K

    2004-07-01

    Digital video now plays an important role in medical education, health care, telemedicine and other medical applications. Several content-based video retrieval (CBVR) systems have been proposed in the past, but they still suffer from the following challenging problems: semantic gap, semantic video concept modeling, semantic video classification, and concept-oriented video database indexing and access. In this paper, we propose a novel framework to make some advances toward the final goal to solve these problems. Specifically, the framework includes: 1) a semantic-sensitive video content representation framework by using principal video shots to enhance the quality of features; 2) semantic video concept interpretation by using flexible mixture model to bridge the semantic gap; 3) a novel semantic video-classifier training framework by integrating feature selection, parameter estimation, and model selection seamlessly in a single algorithm; and 4) a concept-oriented video database organization technique through a certain domain-dependent concept hierarchy to enable semantic-sensitive video retrieval and browsing.

  6. Benefits of an Object-oriented Database Representation for Controlled Medical Terminologies

    PubMed Central

    Gu, Huanying; Halper, Michael; Geller, James; Perl, Yehoshua

    1999-01-01

    Objective: Controlled medical terminologies (CMTs) have been recognized as important tools in a variety of medical informatics applications, ranging from patient-record systems to decision-support systems. Controlled medical terminologies are typically organized in semantic network structures consisting of tens to hundreds of thousands of concepts. This overwhelming size and complexity can be a serious barrier to their maintenance and widespread utilization. The authors propose the use of object-oriented databases to address the problems posed by the extensive scope and high complexity of most CMTs for maintenance personnel and general users alike. Design: The authors present a methodology that allows an existing CMT, modeled as a semantic network, to be represented as an equivalent object-oriented database. Such a representation is called an object-oriented health care terminology repository (OOHTR). Results: The major benefit of an OOHTR is its schema, which provides an important layer of structural abstraction. Using the high-level view of a CMT afforded by the schema, one can gain insight into the CMT's overarching organization and begin to better comprehend it. The authors' methodology is applied to the Medical Entities Dictionary (MED), a large CMT developed at Columbia-Presbyterian Medical Center. Examples of how the OOHTR schema facilitated updating, correcting, and improving the design of the MED are presented. Conclusion: The OOHTR schema can serve as an important abstraction mechanism for enhancing comprehension of a large CMT, and thus promotes its usability. PMID:10428002

  7. Application of a five-tiered scheme for standardized classification of 2,360 unique mismatch repair gene variants lodged on the InSiGHT locus-specific database

    PubMed Central

    Plazzer, John-Paul; Greenblatt, Marc S.; Akagi, Kiwamu; Al-Mulla, Fahd; Bapat, Bharati; Bernstein, Inge; Capellá, Gabriel; den Dunnen, Johan T.; du Sart, Desiree; Fabre, Aurelie; Farrell, Michael P.; Farrington, Susan M.; Frayling, Ian M.; Frebourg, Thierry; Goldgar, David E.; Heinen, Christopher D.; Holinski-Feder, Elke; Kohonen-Corish, Maija; Robinson, Kristina Lagerstedt; Leung, Suet Yi; Martins, Alexandra; Moller, Pal; Morak, Monika; Nystrom, Minna; Peltomaki, Paivi; Pineda, Marta; Qi, Ming; Ramesar, Rajkumar; Rasmussen, Lene Juel; Royer-Pokora, Brigitte; Scott, Rodney J.; Sijmons, Rolf; Tavtigian, Sean V.; Tops, Carli M.; Weber, Thomas; Wijnen, Juul; Woods, Michael O.; Macrae, Finlay; Genuardi, Maurizio

    2015-01-01

    Clinical classification of sequence variants identified in hereditary disease genes directly affects clinical management of patients and their relatives. The International Society for Gastrointestinal Hereditary Tumours (InSiGHT) undertook a collaborative effort to develop, test and apply a standardized classification scheme to constitutional variants in the Lynch Syndrome genes MLH1, MSH2, MSH6 and PMS2. Unpublished data submission was encouraged to assist variant classification, and recognized by microattribution. The scheme was refined by multidisciplinary expert committee review of clinical and functional data available for variants, applied to 2,360 sequence alterations, and disseminated online. Assessment using validated criteria altered classifications for 66% of 12,006 database entries. Clinical recommendations based on transparent evaluation are now possible for 1,370 variants not obviously protein-truncating from nomenclature. This large-scale endeavor will facilitate consistent management of suspected Lynch Syndrome families, and demonstrates the value of multidisciplinary collaboration for curation and classification of variants in public locus-specific databases. PMID:24362816

  8. Application of a 5-tiered scheme for standardized classification of 2,360 unique mismatch repair gene variants in the InSiGHT locus-specific database.

    PubMed

    Thompson, Bryony A; Spurdle, Amanda B; Plazzer, John-Paul; Greenblatt, Marc S; Akagi, Kiwamu; Al-Mulla, Fahd; Bapat, Bharati; Bernstein, Inge; Capellá, Gabriel; den Dunnen, Johan T; du Sart, Desiree; Fabre, Aurelie; Farrell, Michael P; Farrington, Susan M; Frayling, Ian M; Frebourg, Thierry; Goldgar, David E; Heinen, Christopher D; Holinski-Feder, Elke; Kohonen-Corish, Maija; Robinson, Kristina Lagerstedt; Leung, Suet Yi; Martins, Alexandra; Moller, Pal; Morak, Monika; Nystrom, Minna; Peltomaki, Paivi; Pineda, Marta; Qi, Ming; Ramesar, Rajkumar; Rasmussen, Lene Juel; Royer-Pokora, Brigitte; Scott, Rodney J; Sijmons, Rolf; Tavtigian, Sean V; Tops, Carli M; Weber, Thomas; Wijnen, Juul; Woods, Michael O; Macrae, Finlay; Genuardi, Maurizio

    2014-02-01

    The clinical classification of hereditary sequence variants identified in disease-related genes directly affects clinical management of patients and their relatives. The International Society for Gastrointestinal Hereditary Tumours (InSiGHT) undertook a collaborative effort to develop, test and apply a standardized classification scheme to constitutional variants in the Lynch syndrome-associated genes MLH1, MSH2, MSH6 and PMS2. Unpublished data submission was encouraged to assist in variant classification and was recognized through microattribution. The scheme was refined by multidisciplinary expert committee review of the clinical and functional data available for variants, applied to 2,360 sequence alterations, and disseminated online. Assessment using validated criteria altered classifications for 66% of 12,006 database entries. Clinical recommendations based on transparent evaluation are now possible for 1,370 variants that were not obviously protein truncating from nomenclature. This large-scale endeavor will facilitate the consistent management of families suspected to have Lynch syndrome and demonstrates the value of multidisciplinary collaboration in the curation and classification of variants in public locus-specific databases.

  9. Determining object orientation with a hierarchical database of binary synthetic discriminant function filters

    NASA Technical Reports Server (NTRS)

    Reid, Max B.; Ma, Paul W.; Downie, John D.

    1990-01-01

    An optical correlation-based system is demonstrated which recognizes an object and determines its angular orientation by traversing a hierarchical data base of binary filters. The data-base architecture is made possible by the development of binary synthetic discriminant function filters.

  10. TMC-SNPdb: an Indian germline variant database derived from whole exome sequences.

    PubMed

    Upadhyay, Pawan; Gardi, Nilesh; Desai, Sanket; Sahoo, Bikram; Singh, Ankita; Togar, Trupti; Iyer, Prajish; Prasad, Ratnam; Chandrani, Pratik; Gupta, Sudeep; Dutt, Amit

    2016-01-01

    Cancer is predominantly a somatic disease. A mutant allele present in a cancer cell genome is considered somatic when it is absent from the paired normal genome and from public SNP databases. The current build of dbSNP, the most comprehensive public SNP database, however, inadequately represents several non-European Caucasian populations, posing a limitation in cancer genomic analyses of data from these populations. We present the Tata Memorial Centre-SNP DataBase (TMC-SNPdb) as the first open source, flexible, upgradable, and freely available SNP database (accessible through dbSNP build 149 and ANNOVAR), representing 114 309 unique germline variants generated from whole exome data of 62 normal samples derived from cancer patients of Indian origin. The TMC-SNPdb is presented with a companion subtraction tool that can be executed from the command line or through an easy-to-use graphical user interface, with the ability to deplete additional Indian population specific SNPs over and above the dbSNP and 1000 Genomes databases. Using an institutionally generated whole exome data set of 132 samples of Indian origin, we demonstrate that TMC-SNPdb could deplete 42%, 33%, and 28% of false positive somatic events post dbSNP depletion in Indian origin tongue, gallbladder, and cervical cancer samples, respectively. Beyond cancer somatic analyses, we anticipate utility of the TMC-SNPdb in several Mendelian germline diseases. In addition to dbSNP build 149 and ANNOVAR, the TMC-SNPdb along with the subtraction tool is available for download in the public domain at the following Database URL: http://www.actrec.gov.in/pi-webpages/AmitDutt/TMCSNP/TMCSNPdp.html. © The Author(s) 2016. Published by Oxford University Press.
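    The subtraction step described in this record can be illustrated with a minimal sketch. It is not the TMC-SNPdb subtraction tool: the Variant key (chromosome, position, reference allele, alternate allele) and the deplete function are hypothetical names chosen for illustration, and the coordinates in the usage note are made up.

```python
from typing import Iterable, List, NamedTuple, Set


class Variant(NamedTuple):
    """Hypothetical variant key; real tools may normalise alleles differently."""
    chrom: str
    pos: int
    ref: str
    alt: str


def deplete(candidates: Iterable[Variant], *germline_sets: Set[Variant]) -> List[Variant]:
    """Drop candidate somatic calls that appear in any germline variant set."""
    known = set().union(*germline_sets)
    return [v for v in candidates if v not in known]
```

    With made-up inputs, deplete(calls, dbsnp, population_db) would keep only the calls found in neither set, which mirrors the idea of removing population-specific germline variants that remain after dbSNP depletion.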

  11. Nominal ISOMERs (Incorrect Spellings Of Medicines Eluding Researchers)—variants in the spellings of drug names in PubMed: a database review

    PubMed Central

    Aronson, Jeffrey K

    2016-01-01

    Objective To examine how misspellings of drug names could impede searches for published literature. Design Database review. Data source PubMed. Review methods The study included 30 drug names that are commonly misspelt on prescription charts in hospitals in Birmingham, UK (test set), and 30 control names randomly chosen from a hospital formulary (control set). The following definitions were used: standard names—the international non-proprietary names, variant names—deviations in spelling from standard names that are not themselves standard names in English language nomenclature, and hidden reference variants—variant spellings that identified publications in textword (tw) searches of PubMed or other databases, and which were not identified by textword searches for the standard names. Variant names were generated from standard names by applying letter substitutions, omissions, additions, transpositions, duplications, deduplications, and combinations of these. Searches were carried out in PubMed (30 June 2016) for “standard name[tw]” and “variant name[tw] NOT standard name[tw].” Results The 30 standard names of drugs in the test set gave 325 979 hits in total, and 160 hidden reference variants gave 3872 hits (1.17%). The standard names of the control set gave 470 064 hits, and 79 hidden reference variants gave 766 hits (0.16%). Letter substitutions (particularly i to y and vice versa) and omissions together accounted for 2924 (74%) of the variants. Amitriptyline (8530 hits) yielded 18 hidden reference variants (179 (2.1%) hits). Names ending in “in,” “ine,” or “micin” were commonly misspelt. Failing to search for hidden reference variants of “gentamicin,” “amitriptyline,” “mirtazapine,” and “trazodone” would miss at least 19 systematic reviews. A hidden reference variant related to Christmas, “No-el”, was rare; variants of “X-miss” were rarer. Conclusion When performing searches, researchers should include
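    The variant-generation and search steps in this record can be sketched as follows. This is an illustration only, not the author's procedure: it covers single edits (the study also used combinations), it assumes lowercase ASCII names, and the function names spelling_variants and hidden_variant_query are hypothetical. The query string follows the form quoted in the abstract.

```python
import string


def spelling_variants(name):
    """Return single-edit misspellings of a drug name (lowercase ASCII assumed)."""
    name = name.lower()
    letters = string.ascii_lowercase
    variants = set()
    for i in range(len(name)):
        # Omission: drop the letter at position i (this also covers deduplication).
        variants.add(name[:i] + name[i + 1:])
        # Duplication: repeat the letter at position i.
        variants.add(name[:i + 1] + name[i:])
        # Substitution: replace the letter at position i.
        for c in letters:
            variants.add(name[:i] + c + name[i + 1:])
    for i in range(len(name) + 1):
        # Addition: insert a letter before position i.
        for c in letters:
            variants.add(name[:i] + c + name[i:])
    for i in range(len(name) - 1):
        # Transposition: swap two adjacent letters.
        variants.add(name[:i] + name[i + 1] + name[i] + name[i + 2:])
    variants.discard(name)  # the standard name itself is not a variant
    return variants


def hidden_variant_query(standard, variant):
    """Build a textword query that finds the variant but not the standard name."""
    return f"{variant}[tw] NOT {standard}[tw]"
```

    For example, the i-to-y substitution turns "gentamicin" into "gentamycin", and hidden_variant_query("gentamicin", "gentamycin") builds the corresponding search string. Whether a given variant is a hidden reference variant can only be decided by running the search.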

  12. A comparative study of six European databases of medically oriented Web resources.

    PubMed

    Abad García, Francisca; González Teruel, Aurora; Bayo Calduch, Patricia; de Ramón Frias, Rosa; Castillo Blasco, Lourdes

    2005-10-01

    The paper describes six European medically oriented databases of Web resources, pertaining to five quality-controlled subject gateways, and compares their performance. The characteristics, coverage, procedure for selecting Web resources, record structure, searching possibilities, and existence of user assistance were described for each database. Performance indicators for each database were obtained by means of searches carried out using the key words, "myocardial infarction." Most of the databases originated in the 1990s in an academic or library context and include all types of Web resources of an international nature. Five databases use Medical Subject Headings. The number of fields per record varies between three and nineteen. The language of the search interfaces is mostly English, and some of them allow searches in other languages. In some databases, the search can be extended to PubMed. Organizing Medical Networked Information, Catalogue et Index des Sites Médicaux Francophones, and Diseases, Disorders and Related Topics produced the best results. The usefulness of these databases as quick reference resources is clear. In addition, their lack of content overlap means that, for the user, they complement each other. Their continued survival faces three challenges: the instability of the Internet, maintenance costs, and lack of use in spite of their potential usefulness.

  13. A Bioinformatics Workflow for Variant Peptide Detection in Shotgun Proteomics

    PubMed Central

    Li, Jing; Su, Zengliu; Ma, Ze-Qiang; Slebos, Robbert J. C.; Halvey, Patrick; Tabb, David L.; Liebler, Daniel C.; Pao, William; Zhang, Bing

    2011-01-01

    Shotgun proteomics data analysis usually relies on database search. However, commonly used protein sequence databases do not contain information on protein variants and thus prevent variant peptides and proteins from being identified. Including known coding variations in protein sequence databases could help alleviate this problem. Based on our recently published human Cancer Proteome Variation Database, we have created a protein sequence database that comprehensively annotates thousands of cancer-related coding variants collected in the Cancer Proteome Variation Database as well as noncancer-specific ones from the Single Nucleotide Polymorphism Database (dbSNP). Using this database, we then developed a data analysis workflow for variant peptide identification in shotgun proteomics. The high risk of false positive variant identifications was addressed by a modified false discovery rate estimation method. Analysis of colorectal cancer cell lines SW480, RKO, and HCT-116 revealed a total of 81 peptides that contain either noncancer-specific or cancer-related variations. Twenty-three out of 26 variants randomly selected from the 81 were confirmed by genomic sequencing. We further applied the workflow on data sets from three individual colorectal tumor specimens. A total of 204 distinct variant peptides were detected, and five carried known cancer-related mutations. Each individual showed a specific pattern of cancer-related mutations, suggesting potential use of this type of information for personalized medicine. Compatibility of the workflow has been tested with four popular database search engines including Sequest, Mascot, X!Tandem, and MyriMatch. In summary, we have developed a workflow that effectively uses existing genomic data to enable variant peptide detection in proteomics. PMID:21389108
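    One way a known coding variant could be turned into searchable variant peptides is sketched below. This is an assumption-laden illustration, not the workflow from this record: it applies a single amino acid substitution, uses a simplified trypsin rule (cleave after K or R unless the next residue is P, no missed cleavages), and the function names and the example sequence are hypothetical.

```python
def apply_substitution(seq, pos, ref, alt):
    """Apply a single amino acid substitution at a 1-based position."""
    if seq[pos - 1] != ref:
        raise ValueError(f"expected {ref} at position {pos}, found {seq[pos - 1]}")
    return seq[:pos - 1] + alt + seq[pos:]


def tryptic_peptides(seq):
    """Digest a protein with a simplified trypsin rule (no missed cleavages)."""
    peptides, start = [], 0
    for i, residue in enumerate(seq):
        next_residue = seq[i + 1] if i + 1 < len(seq) else ""
        if residue in "KR" and next_residue != "P":
            peptides.append(seq[start:i + 1])
            start = i + 1
    if start < len(seq):
        peptides.append(seq[start:])  # C-terminal peptide without K/R
    return peptides


def variant_peptides(seq, pos, ref, alt):
    """Return peptides of the variant protein that the reference digest lacks."""
    reference = set(tryptic_peptides(seq))
    mutated = tryptic_peptides(apply_substitution(seq, pos, ref, alt))
    return [p for p in mutated if p not in reference]
```

    Only the peptides returned by variant_peptides would need to be appended to the search database in this sketch. A substitution that creates or removes K or R changes the cleavage pattern, so more than one new peptide can result.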

  14. Compression of Index Term Dictionary in an Inverted-File-Oriented Database: Some Effective Algorithms.

    ERIC Educational Resources Information Center

    Wisniewski, Janusz L.

    1986-01-01

    Discussion of a new method of index term dictionary compression in an inverted-file-oriented database highlights a technique of word coding, which generates short fixed-length codes obtained from the index terms themselves by analysis of monogram and bigram statistical distributions. Substantial savings in communication channel utilization are…

  15. Difficulties in diagnosing Marfan syndrome using current FBN1 databases.

    PubMed

    Groth, Kristian A; Gaustadnes, Mette; Thorsen, Kasper; Østergaard, John R; Jensen, Uffe Birk; Gravholt, Claus H; Andersen, Niels H

    2016-01-01

    The diagnostic criteria of Marfan syndrome (MFS) highlight the importance of a FBN1 mutation test in diagnosing MFS. As genetic sequencing becomes better, cheaper, and more accessible, the expected increase in the number of genetic tests will become evident, resulting in numerous genetic variants that need to be evaluated for disease-causing effects based on database information. The aim of this study was to evaluate genetic variants in four databases and review the relevant literature. We assessed background data on 23 common variants registered in ESP6500 and classified as causing MFS in the Human Gene Mutation Database (HGMD). We evaluated data in four variant databases (HGMD, UMD-FBN1, ClinVar, and UniProt) according to the diagnostic criteria for MFS and compared the results with the classification of each variant in the four databases. None of the 23 variants was clearly associated with MFS, even though all classifications in the databases stated otherwise. A genetic diagnosis of MFS cannot reliably be based on current variant databases because they contain incorrectly interpreted conclusions on variants. Variants must be evaluated by time-consuming review of the background material in the databases and by combining these data with expert knowledge on MFS. This is a major problem because we expect even more genetic test results in the near future as a result of the reduced cost and process time for next-generation sequencing. Genet Med 18 1, 98-102.

  16. UMD-USHbases: a comprehensive set of databases to record and analyse pathogenic mutations and unclassified variants in seven Usher syndrome causing genes.

    PubMed

    Baux, David; Faugère, Valérie; Larrieu, Lise; Le Guédard-Méreuze, Sandie; Hamroun, Dalil; Béroud, Christophe; Malcolm, Sue; Claustres, Mireille; Roux, Anne-Françoise

    2008-08-01

    Using the Universal Mutation Database (UMD) software, we have constructed "UMD-USHbases", a set of relational databases of nucleotide variations for seven genes involved in Usher syndrome (MYO7A, CDH23, PCDH15, USH1C, USH1G, USH3A and USH2A). Mutations in the Usher syndrome type I causing genes are also recorded in non-syndromic hearing loss cases and mutations in USH2A in non-syndromic retinitis pigmentosa. Usher syndrome provides a particular challenge for molecular diagnostics because of the clinical and molecular heterogeneity. As many mutations are missense changes, and all the genes also contain apparently non-pathogenic polymorphisms, well-curated databases are crucial for accurate interpretation of pathogenicity. Tools are provided to assess the pathogenicity of mutations, including conservation of amino acids and analysis of splice-sites. Reference amino acid alignments are provided. Apparently non-pathogenic variants in patients with Usher syndrome, at both the nucleotide and amino acid level, are included. The UMD-USHbases currently contain more than 2,830 entries including disease causing mutations, unclassified variants or non-pathogenic polymorphisms identified in over 938 patients. In addition to data collected from 89 publications, 15 novel mutations identified in our laboratory are recorded in MYO7A (6), CDH23 (8), or PCDH15 (1) genes. Information is given on the relative involvement of the seven genes, the number and distribution of variants in each gene. UMD-USHbases give access to a software package that provides specific routines and optimized multicriteria research and sorting tools. These databases should assist clinicians and geneticists seeking information about mutations responsible for Usher syndrome.

  17. Standards for Clinical Grade Genomic Databases.

    PubMed

    Yohe, Sophia L; Carter, Alexis B; Pfeifer, John D; Crawford, James M; Cushman-Vokoun, Allison; Caughron, Samuel; Leonard, Debra G B

    2015-11-01

    Next-generation sequencing performed in a clinical environment must meet clinical standards, which requires reproducibility of all aspects of the testing. Clinical-grade genomic databases (CGGDs) are required to classify a variant and to assist in the professional interpretation of clinical next-generation sequencing. Applying quality laboratory standards to the reference databases used for sequence-variant interpretation presents a new challenge for validation and curation. To define CGGD and the categories of information contained in CGGDs and to frame recommendations for the structure and use of these databases in clinical patient care. Members of the College of American Pathologists Personalized Health Care Committee reviewed the literature and existing state of genomic databases and developed a framework for guiding CGGD development in the future. Clinical-grade genomic databases may provide different types of information. This work group defined 3 layers of information in CGGDs: clinical genomic variant repositories, genomic medical data repositories, and genomic medicine evidence databases. The layers are differentiated by the types of genomic and medical information contained and the utility in assisting with clinical interpretation of genomic variants. Clinical-grade genomic databases must meet specific standards regarding submission, curation, and retrieval of data, as well as the maintenance of privacy and security. These organizing principles for CGGDs should serve as a foundation for future development of specific standards that support the use of such databases for patient care.

  18. A comprehensive global genotype-phenotype database for rare diseases.

    PubMed

    Trujillano, Daniel; Oprea, Gabriela-Elena; Schmitz, Yvonne; Bertoli-Avella, Aida M; Abou Jamra, Rami; Rolfs, Arndt

    2017-01-01

    The ability to discover genetic variants in a patient runs far ahead of the ability to interpret them. Databases with accurate descriptions of the causal relationship between the variants and the phenotype are valuable since these are critical tools in clinical genetic diagnostics. Here, we introduce a comprehensive and global genotype-phenotype database focusing on rare diseases. This database (CentoMD®) is a browser-based tool that enables access to a comprehensive, independently curated system utilizing stringent high-quality criteria and a quickly growing repository of genetic and human phenotype ontology (HPO)-based clinical information. Its main goals are to aid the evaluation of genetic variants, to enhance the validity of the genetic analytical workflow, to increase the quality of genetic diagnoses, and to improve evaluation of treatment options for patients with hereditary diseases. The database software correlates clinical information from consented patients and probands of different geographical backgrounds with a large dataset of genetic variants and, when available, biomarker information. An automated follow-up tool is incorporated that informs all users whenever a variant classification has changed. These unique features fully embedded in a CLIA/CAP-accredited quality management system allow appropriate data quality and enhanced patient safety. More than 100,000 genetically screened individuals are documented in the database, resulting in more than 470 million variant detections. Approximately 57% of the clinically relevant and uncertain variants in the database are novel. Notably, 3% of the genetic variants identified and previously reported in the literature as being associated with a particular rare disease were reclassified, based on internal evidence, as clinically irrelevant. The database offers a comprehensive summary of the clinical validity and causality of detected gene variants with their associated phenotypes, and is a valuable tool

  19. Evaluating the quality of Marfan genotype-phenotype correlations in existing FBN1 databases.

    PubMed

    Groth, Kristian A; Von Kodolitsch, Yskert; Kutsche, Kerstin; Gaustadnes, Mette; Thorsen, Kasper; Andersen, Niels H; Gravholt, Claus H

    2017-07-01

    Genetic FBN1 testing is pivotal for confirming the clinical diagnosis of Marfan syndrome. In an effort to evaluate variant causality, FBN1 databases are often used. We evaluated the current databases regarding FBN1 variants and validated associated phenotype records with a new Marfan syndrome geno-phenotyping tool called the Marfan score. We evaluated four databases (UMD-FBN1, ClinVar, the Human Gene Mutation Database (HGMD), and Uniprot) containing 2,250 FBN1 variants supported by 4,904 records presented in 307 references. The Marfan score calculated for phenotype data from the records quantified variant associations with Marfan syndrome phenotype. We calculated a Marfan score for 1,283 variants, of which we confirmed the database diagnosis of Marfan syndrome in 77.1%. This represented only 35.8% of the total registered variants; 18.5-33.3% (UMD-FBN1 versus HGMD) of variants associated with Marfan syndrome in the databases could not be confirmed by the recorded phenotype. FBN1 databases can be imprecise and incomplete. Data should be used with caution when evaluating FBN1 variants. At present, the UMD-FBN1 database seems to be the biggest and best curated; therefore, it is the most comprehensive database. However, the need for better genotype-phenotype curated databases is evident, and we hereby present such a database. Genet Med advance online publication 01 December 2016.

  20. BRCA Share: A Collection of Clinical BRCA Gene Variants.

    PubMed

    Béroud, Christophe; Letovsky, Stanley I; Braastad, Corey D; Caputo, Sandrine M; Beaudoux, Olivia; Bignon, Yves Jean; Bressac-De Paillerets, Brigitte; Bronner, Myriam; Buell, Crystal M; Collod-Béroud, Gwenaëlle; Coulet, Florence; Derive, Nicolas; Divincenzo, Christina; Elzinga, Christopher D; Garrec, Céline; Houdayer, Claude; Karbassi, Izabela; Lizard, Sarab; Love, Angela; Muller, Danièle; Nagan, Narasimhan; Nery, Camille R; Rai, Ghadi; Revillion, Françoise; Salgado, David; Sévenet, Nicolas; Sinilnikova, Olga; Sobol, Hagay; Stoppa-Lyonnet, Dominique; Toulas, Christine; Trautman, Edwin; Vaur, Dominique; Vilquin, Paul; Weymouth, Katelyn S; Willis, Alecia; Eisenberg, Marcia; Strom, Charles M

    2016-12-01

    As next-generation sequencing increases access to human genetic variation, the challenge of determining clinical significance of variants becomes ever more acute. Germline variants in the BRCA1 and BRCA2 genes can confer substantial lifetime risk of breast and ovarian cancer. Assessment of variant pathogenicity is a vital part of clinical genetic testing for these genes. A database of clinical observations of BRCA variants is a critical resource in that process. This article describes BRCA Share™, a database created by a unique international alliance of academic centers and commercial testing laboratories. By integrating the content of the Universal Mutation Database generated by the French Unicancer Genetic Group with the testing results of two large commercial laboratories, Quest Diagnostics and Laboratory Corporation of America (LabCorp), BRCA Share™ has assembled one of the largest publicly accessible collections of BRCA variants currently available. Although access is available to academic researchers without charge, commercial participants in the project are required to pay a support fee and contribute their data. The fees fund the ongoing curation effort, as well as planned experiments to functionally characterize variants of uncertain significance. BRCA Share™ databases can therefore be considered as models of successful data sharing between private companies and the academic world. © 2016 WILEY PERIODICALS, INC.

  1. GetData: A filesystem-based, column-oriented database format for time-ordered binary data

    NASA Astrophysics Data System (ADS)

    Wiebe, Donald V.; Netterfield, Calvin B.; Kisner, Theodore S.

    2015-12-01

    The GetData Project is the reference implementation of the Dirfile Standards, a filesystem-based, column-oriented database format for time-ordered binary data. Dirfiles provide a fast, simple format for storing and reading data, suitable for both quicklook and analysis pipelines. GetData provides a C API and bindings exist for various other languages. GetData is distributed under the terms of the GNU Lesser General Public License.
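    The column-oriented idea behind this record can be illustrated with a short sketch. It does not use the GetData API and does not follow the Dirfile Standards layout: the one-file-per-field layout, the little-endian float64 encoding, and the function names write_columns and read_column are assumptions made for illustration.

```python
import os
import struct

SAMPLE = "<d"  # assumed encoding: little-endian 64-bit float per sample


def write_columns(directory, columns):
    """Write each field to its own binary file, one file per column."""
    os.makedirs(directory, exist_ok=True)
    for name, samples in columns.items():
        with open(os.path.join(directory, name), "wb") as fh:
            fh.write(struct.pack(f"<{len(samples)}d", *samples))


def read_column(directory, name, first=0, count=None):
    """Read `count` samples of one field, starting at sample index `first`."""
    size = struct.calcsize(SAMPLE)
    with open(os.path.join(directory, name), "rb") as fh:
        fh.seek(first * size)  # fixed-width samples allow direct seeking
        data = fh.read() if count is None else fh.read(count * size)
    return list(struct.unpack(f"<{len(data) // size}d", data))
```

    Because every field lives in its own file of fixed-width samples, reading a time slice of one field touches only that file, which is the property that makes a column-oriented layout attractive for time-ordered data.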

  2. The utilization of neural nets in populating an object-oriented database

    NASA Technical Reports Server (NTRS)

    Campbell, William J.; Hill, Scott E.; Cromp, Robert F.

    1989-01-01

    Existing NASA-supported scientific data bases are usually developed, managed and populated in a tedious, error-prone and self-limiting way in terms of what can be described in a relational Data Base Management System (DBMS). The next generation Earth remote sensing platforms (i.e., the Earth Observation System, EOS) will be capable of generating data at a rate of over 300 Mbs per second from a suite of instruments designed for different applications. What is needed is an innovative approach that creates object-oriented databases that segment, characterize, catalog and are manageable in a domain-specific context and whose contents are available interactively and in near-real-time to the user community. Described here is work in progress that utilizes an artificial neural net approach to characterize satellite imagery of undefined objects into high-level data objects. The characterized data is then dynamically allocated to an object-oriented data base where it can be reviewed and assessed by a user. The definition, development, and evolution of the overall data system model are steps in the creation of an application-driven knowledge-based scientific information system.

  3. Mutation databases for inherited renal disease: are they complete, accurate, clinically relevant, and freely available?

    PubMed

    Savige, Judy; Dagher, Hayat; Povey, Sue

    2014-07-01

    This study examined whether gene-specific DNA variant databases for inherited diseases of the kidney fulfilled the Human Variome Project recommendations of being complete, accurate, clinically relevant and freely available. A recent review identified 60 inherited renal diseases caused by mutations in 132 genes. The disease name, MIM number, gene name, together with "mutation" or "database," were used to identify web-based databases. Fifty-nine diseases (98%) due to mutations in 128 genes had a variant database. Altogether there were 349 databases (a median of 3 per gene, range 0-6), but no gene had two databases with the same number of variants, and 165 (50%) databases included fewer than 10 variants. About half the databases (180, 54%) had been updated in the previous year. Few (77, 23%) were curated by "experts" but these included nine of the 11 with the most variants. Even fewer databases (41, 12%) included clinical features apart from the name of the associated disease. Most (223, 67%) could be accessed without charge, including those for 50 genes (40%) with the maximum number of variants. Future efforts should focus on encouraging experts to collaborate on a single database for each gene affected in inherited renal disease, including both unpublished variants, and clinical phenotypes. © 2014 WILEY PERIODICALS, INC.

  4. Constraints on Biological Mechanism from Disease Comorbidity Using Electronic Medical Records and Database of Genetic Variants

    PubMed Central

    Bagley, Steven C.; Sirota, Marina; Chen, Richard; Butte, Atul J.; Altman, Russ B.

    2016-01-01

    Patterns of disease co-occurrence that deviate from statistical independence may represent important constraints on biological mechanism, which sometimes can be explained by shared genetics. In this work we study the relationship between disease co-occurrence and commonly shared genetic architecture of disease. Records of pairs of diseases were combined from two different electronic medical systems (Columbia, Stanford), and compared to a large database of published disease-associated genetic variants (VARIMED); data on 35 disorders were available across all three sources, which include medical records for over 1.2 million patients and variants from over 17,000 publications. Based on the sources in which they appeared, disease pairs were categorized as having predominant clinical, genetic, or both kinds of manifestations. Confounding effects of age on disease incidence were controlled for by only comparing diseases when they fall in the same cluster of similarly shaped incidence patterns. We find that disease pairs that are overrepresented in both electronic medical record systems and in VARIMED come from two main disease classes, autoimmune and neuropsychiatric. We furthermore identify specific genes that are shared within these disease groups. PMID:27115429

  5. Constraints on Biological Mechanism from Disease Comorbidity Using Electronic Medical Records and Database of Genetic Variants.

    PubMed

    Bagley, Steven C; Sirota, Marina; Chen, Richard; Butte, Atul J; Altman, Russ B

    2016-04-01

    Patterns of disease co-occurrence that deviate from statistical independence may represent important constraints on biological mechanism, which sometimes can be explained by shared genetics. In this work we study the relationship between disease co-occurrence and commonly shared genetic architecture of disease. Records of pairs of diseases were combined from two different electronic medical systems (Columbia, Stanford), and compared to a large database of published disease-associated genetic variants (VARIMED); data on 35 disorders were available across all three sources, which include medical records for over 1.2 million patients and variants from over 17,000 publications. Based on the sources in which they appeared, disease pairs were categorized as having predominant clinical, genetic, or both kinds of manifestations. Confounding effects of age on disease incidence were controlled for by only comparing diseases when they fall in the same cluster of similarly shaped incidence patterns. We find that disease pairs that are overrepresented in both electronic medical record systems and in VARIMED come from two main disease classes, autoimmune and neuropsychiatric. We furthermore identify specific genes that are shared within these disease groups.

  6. Generalized Database Management System Support for Numeric Database Environments.

    ERIC Educational Resources Information Center

    Dominick, Wayne D.; Weathers, Peggy G.

    1982-01-01

    This overview of potential for utilizing database management systems (DBMS) within numeric database environments highlights: (1) major features, functions, and characteristics of DBMS; (2) applicability to numeric database environment needs and user needs; (3) current applications of DBMS technology; and (4) research-oriented and…

  7. The curation of genetic variants: difficulties and possible solutions.

    PubMed

    Pandey, Kapil Raj; Maden, Narendra; Poudel, Barsha; Pradhananga, Sailendra; Sharma, Amit Kumar

    2012-12-01

    The curation of genetic variants from biomedical articles is required for various clinical and research purposes. Nowadays, establishment of variant databases that include overall information about variants is becoming quite popular. These databases have immense utility, serving as a user-friendly information storehouse of variants for information seekers. While manual curation is the gold standard method for curation of variants, it can turn out to be time-consuming on a large scale thus necessitating the need for automation. Curation of variants described in biomedical literature may not be straightforward mainly due to various nomenclature and expression issues. Though current trends in paper writing on variants is inclined to the standard nomenclature such that variants can easily be retrieved, we have a massive store of variants in the literature that are present as non-standard names and the online search engines that are predominantly used may not be capable of finding them. For effective curation of variants, knowledge about the overall process of curation, nature and types of difficulties in curation, and ways to tackle the difficulties during the task are crucial. Only by effective curation, can variants be correctly interpreted. This paper presents the process and difficulties of curation of genetic variants with possible solutions and suggestions from our work experience in the field including literature support. The paper also highlights aspects of interpretation of genetic variants and the importance of writing papers on variants following standard and retrievable methods. Copyright © 2012. Published by Elsevier Ltd.

  8. The Curation of Genetic Variants: Difficulties and Possible Solutions

    PubMed Central

    Pandey, Kapil Raj; Maden, Narendra; Poudel, Barsha; Pradhananga, Sailendra; Sharma, Amit Kumar

    2012-01-01

    The curation of genetic variants from biomedical articles is required for various clinical and research purposes. Nowadays, establishment of variant databases that include overall information about variants is becoming quite popular. These databases have immense utility, serving as a user-friendly information storehouse of variants for information seekers. While manual curation is the gold standard method for curation of variants, it can turn out to be time-consuming on a large scale thus necessitating the need for automation. Curation of variants described in biomedical literature may not be straightforward mainly due to various nomenclature and expression issues. Though current trends in paper writing on variants is inclined to the standard nomenclature such that variants can easily be retrieved, we have a massive store of variants in the literature that are present as non-standard names and the online search engines that are predominantly used may not be capable of finding them. For effective curation of variants, knowledge about the overall process of curation, nature and types of difficulties in curation, and ways to tackle the difficulties during the task are crucial. Only by effective curation, can variants be correctly interpreted. This paper presents the process and difficulties of curation of genetic variants with possible solutions and suggestions from our work experience in the field including literature support. The paper also highlights aspects of interpretation of genetic variants and the importance of writing papers on variants following standard and retrievable methods. PMID:23317699

  9. DBATE: database of alternative transcripts expression.

    PubMed

    Bianchi, Valerio; Colantoni, Alessio; Calderone, Alberto; Ausiello, Gabriele; Ferrè, Fabrizio; Helmer-Citterich, Manuela

    2013-01-01

    The use of high-throughput RNA sequencing technology (RNA-seq) allows whole transcriptome analysis, providing an unbiased and unabridged view of alternative transcript expression. Coupling splicing variant-specific expression with its functional inference is still an open and difficult issue for which we created the DataBase of Alternative Transcripts Expression (DBATE), a web-based repository storing expression values and functional annotation of alternative splicing variants. We processed 13 large RNA-seq panels from human healthy tissues and in disease conditions, reporting expression levels and functional annotations gathered and integrated from different sources for each splicing variant, using a variant-specific annotation transfer pipeline. The possibility to perform complex queries by cross-referencing different functional annotations permits the retrieval of desired subsets of splicing variant expression values that can be visualized in several ways, from simple to more informative. DBATE is intended as a novel tool to help appreciate how, and possibly why, the transcriptome expression is shaped. DATABASE URL: http://bioinformatica.uniroma2.it/DBATE/.

  10. Image Engine: an object-oriented multimedia database for storing, retrieving and sharing medical images and text.

    PubMed Central

    Lowe, H. J.

    1993-01-01

    This paper describes Image Engine, an object-oriented, microcomputer-based, multimedia database designed to facilitate the storage and retrieval of digitized biomedical still images, video, and text using inexpensive desktop computers. The current prototype runs on Apple Macintosh computers and allows network database access via peer to peer file sharing protocols. Image Engine supports both free text and controlled vocabulary indexing of multimedia objects. The latter is implemented using the TView thesaurus model developed by the author. The current prototype of Image Engine uses the National Library of Medicine's Medical Subject Headings (MeSH) vocabulary (with UMLS Meta-1 extensions) as its indexing thesaurus. PMID:8130596

  11. GTRAC: fast retrieval from compressed collections of genomic variants

    PubMed Central

    Tatwawadi, Kedar; Hernaez, Mikel; Ochoa, Idoia; Weissman, Tsachy

    2016-01-01

    Motivation: The dramatic decrease in the cost of sequencing has resulted in the generation of huge amounts of genomic data, as evidenced by projects such as the UK10K and the Million Veteran Project, with the number of sequenced genomes ranging in the order of 10 K to 1 M. Due to the large redundancies among genomic sequences of individuals from the same species, most of the medical research deals with the variants in the sequences as compared with a reference sequence, rather than with the complete genomic sequences. Consequently, millions of genomes represented as variants are stored in databases. These databases are constantly updated and queried to extract information such as the common variants among individuals or groups of individuals. Previous algorithms for compression of this type of databases lack efficient random access capabilities, rendering querying the database for particular variants and/or individuals extremely inefficient, to the point where compression is often relinquished altogether. Results: We present a new algorithm for this task, called GTRAC, that achieves significant compression ratios while allowing fast random access over the compressed database. For example, GTRAC is able to compress a Homo sapiens dataset containing 1092 samples in 1.1 GB (compression ratio of 160), while allowing for decompression of specific samples in less than a second and decompression of specific variants in 17 ms. GTRAC uses and adapts techniques from information theory, such as a specialized Lempel-Ziv compressor, and tailored succinct data structures. Availability and Implementation: The GTRAC algorithm is available for download at: https://github.com/kedartatwawadi/GTRAC Contact: kedart@stanford.edu Supplementary information: Supplementary data are available at Bioinformatics online. PMID:27587665
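
    The figures quoted in this abstract can be cross-checked with a little arithmetic. The sketch below is not part of GTRAC; it assumes that "compression ratio" means original size divided by compressed size, and the function names are hypothetical, chosen for illustration only.

```python
def implied_original_gb(compressed_gb: float, ratio: float) -> float:
    """Original size implied by a ratio defined as original / compressed (assumption)."""
    return compressed_gb * ratio


def compressed_mb_per_sample(compressed_gb: float, samples: int) -> float:
    """Average compressed footprint per sample, using 1 GB = 1000 MB."""
    return compressed_gb * 1000.0 / samples


# Figures taken from the abstract: 1092 samples, 1.1 GB, ratio of 160.
original_gb = implied_original_gb(1.1, 160)
per_sample_mb = compressed_mb_per_sample(1.1, 1092)
print(f"implied original size: about {original_gb:.0f} GB")
print(f"compressed size per sample: about {per_sample_mb:.2f} MB")
```

    Under that assumption the dataset would occupy roughly 176 GB uncompressed, and each sample costs about 1 MB after compression.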

  12. Using semantic data modeling techniques to organize an object-oriented database for extending the mass storage model

    NASA Technical Reports Server (NTRS)

    Campbell, William J.; Short, Nicholas M., Jr.; Roelofs, Larry H.; Dorfman, Erik

    1991-01-01

    A methodology for optimizing organization of data obtained by NASA earth and space missions is discussed. The methodology uses a concept based on semantic data modeling techniques implemented in a hierarchical storage model. The modeling is used to organize objects in mass storage devices, relational database systems, and object-oriented databases. The semantic data modeling at the metadata record level is examined, including the simulation of a knowledge base and semantic metadata storage issues. The semantic data model hierarchy and its application for efficient data storage is addressed, as is the mapping of the application structure to the mass storage.

  13. Human Variome Project Quality Assessment Criteria for Variation Databases.

    PubMed

    Vihinen, Mauno; Hancock, John M; Maglott, Donna R; Landrum, Melissa J; Schaafsma, Gerard C P; Taschner, Peter

    2016-06-01

    Numerous databases containing information about DNA, RNA, and protein variations are available. Gene-specific variant databases (locus-specific variation databases, LSDBs) are typically curated and maintained for single genes or groups of genes for a certain disease(s). These databases are widely considered as the most reliable information source for a particular gene/protein/disease, but it should also be made clear they may have widely varying contents, infrastructure, and quality. Quality is very important to evaluate because these databases may affect health decision-making, research, and clinical practice. The Human Variome Project (HVP) established a Working Group for Variant Database Quality Assessment. The basic principle was to develop a simple system that nevertheless provides a good overview of the quality of a database. The HVP quality evaluation criteria that resulted are divided into four main components: data quality, technical quality, accessibility, and timeliness. This report elaborates on the developed quality criteria and how implementation of the quality scheme can be achieved. Examples are provided for the current status of the quality items in two different databases, BTKbase, an LSDB, and ClinVar, a central archive of submissions about variants and their clinical significance. © 2016 WILEY PERIODICALS, INC.

  14. Automated database-guided expert-supervised orientation for immunophenotypic diagnosis and classification of acute leukemia

    PubMed Central

    Lhermitte, L; Mejstrikova, E; van der Sluijs-Gelling, A J; Grigore, G E; Sedek, L; Bras, A E; Gaipa, G; Sobral da Costa, E; Novakova, M; Sonneveld, E; Buracchi, C; de Sá Bacelar, T; te Marvelde, J G; Trinquand, A; Asnafi, V; Szczepanski, T; Matarraz, S; Lopez, A; Vidriales, B; Bulsa, J; Hrusak, O; Kalina, T; Lecrevisse, Q; Martin Ayuso, M; Brüggemann, M; Verde, J; Fernandez, P; Burgos, L; Paiva, B; Pedreira, C E; van Dongen, J J M; Orfao, A; van der Velden, V H J

    2018-01-01

    Precise classification of acute leukemia (AL) is crucial for adequate treatment. EuroFlow has previously designed an AL orientation tube (ALOT) to guide towards the relevant classification panel (T-cell acute lymphoblastic leukemia (T-ALL), B-cell precursor (BCP)-ALL and/or acute myeloid leukemia (AML)) and final diagnosis. Now we built a reference database with 656 typical AL samples (145 T-ALL, 377 BCP-ALL, 134 AML), processed and analyzed via standardized protocols. Using principal component analysis (PCA)-based plots and automated classification algorithms for direct comparison of single-cells from individual patients against the database, another 783 cases were subsequently evaluated. Depending on the database-guided results, patients were categorized as: (i) typical T, B or Myeloid without or; (ii) with a transitional component to another lineage; (iii) atypical; or (iv) mixed-lineage. Using this automated algorithm, in 781/783 cases (99.7%) the right panel was selected, and data comparable to the final WHO-diagnosis was already provided in >93% of cases (85% T-ALL, 97% BCP-ALL, 95% AML and 87% mixed-phenotype AL patients), even without data on the full-characterization panels. Our results show that database-guided analysis facilitates standardized interpretation of ALOT results and allows accurate selection of the relevant classification panels, hence providing a solid basis for designing future WHO AL classifications. PMID:29089646

  15. Database for Safety-Oriented Tracking of Chemicals

    NASA Technical Reports Server (NTRS)

    Stump, Jacob; Carr, Sandra; Plumlee, Debrah; Slater, Andy; Samson, Thomas M.; Holowaty, Toby L.; Skeete, Darren; Haenz, Mary Alice; Hershman, Scot; Raviprakash, Pushpa

    2010-01-01

    SafetyChem is a computer program that maintains a relational database for tracking chemicals and associated hazards at Johnson Space Center (JSC) by use of a Web-based graphical user interface. The SafetyChem database is accessible to authorized users via a JSC intranet. All new chemicals pass through a safety office, where information on hazards, required personal protective equipment (PPE), fire-protection warnings, and target organ effects (TOEs) is extracted from material safety data sheets (MSDSs) and recorded in the database. The database facilitates real-time management of inventory with attention to such issues as stability, shelf life, reduction of waste through transfer of unused chemicals to laboratories that need them, quantification of chemical wastes, and identification of chemicals for which disposal is required. Upon searching the database for a chemical, the user receives information on physical properties of the chemical, hazard warnings, required PPE, a link to the MSDS, and references to the applicable International Standards Organization (ISO) 9000 standard work instructions and the applicable job hazard analysis. Also, to reduce the labor hours needed to comply with reporting requirements of the Occupational Safety and Health Administration, the data can be directly exported into the JSC hazardous- materials database.

  16. GTRAC: fast retrieval from compressed collections of genomic variants.

    PubMed

    Tatwawadi, Kedar; Hernaez, Mikel; Ochoa, Idoia; Weissman, Tsachy

    2016-09-01

    The dramatic decrease in the cost of sequencing has resulted in the generation of huge amounts of genomic data, as evidenced by projects such as the UK10K and the Million Veteran Project, with the number of sequenced genomes ranging in the order of 10 K to 1 M. Due to the large redundancies among genomic sequences of individuals from the same species, most of the medical research deals with the variants in the sequences as compared with a reference sequence, rather than with the complete genomic sequences. Consequently, millions of genomes represented as variants are stored in databases. These databases are constantly updated and queried to extract information such as the common variants among individuals or groups of individuals. Previous algorithms for compression of this type of databases lack efficient random access capabilities, rendering querying the database for particular variants and/or individuals extremely inefficient, to the point where compression is often relinquished altogether. We present a new algorithm for this task, called GTRAC, that achieves significant compression ratios while allowing fast random access over the compressed database. For example, GTRAC is able to compress a Homo sapiens dataset containing 1092 samples in 1.1 GB (compression ratio of 160), while allowing for decompression of specific samples in less than a second and decompression of specific variants in 17 ms. GTRAC uses and adapts techniques from information theory, such as a specialized Lempel-Ziv compressor, and tailored succinct data structures. The GTRAC algorithm is available for download at: https://github.com/kedartatwawadi/GTRAC Contact: kedart@stanford.edu Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.

  17. The Israeli National Genetic database: a 10-year experience.

    PubMed

    Zlotogora, Joël; Patrinos, George P

    2017-03-16

    The Israeli National and Ethnic Mutation database ( http://server.goldenhelix.org/israeli ) was launched in September 2006 on the ETHNOS software to include clinically relevant genomic variants reported among Jewish and Arab Israeli patients. In 2016, the database was reviewed and corrected according to ClinVar ( https://www.ncbi.nlm.nih.gov/clinvar ) and ExAC ( http://exac.broadinstitute.org ) database entries. The present article summarizes some key aspects from the development and continuous update of the database over a 10-year period, which could serve as a paradigm of successful database curation for other similar resources. In September 2016, there were 2444 entries in the database, 890 among Jews, 1376 among Israeli Arabs, and 178 entries among Palestinian Arabs, corresponding to an ~4× data content increase compared to when originally launched. While the Israeli Arab population is much smaller than the Jewish population, the number of pathogenic variants causing recessive disorders reported in the database is higher among Arabs (934) than among Jews (648). Nevertheless, the number of pathogenic variants classified as founder mutations in the database is smaller among Arabs (175) than among Jews (192). In 2016, the entire database content was compared to that of other databases such as ClinVar and ExAC. We show that a significant difference in the percentage of pathogenic variants from the Israeli genetic database that were present in ExAC was observed between the Jewish population (31.8%) and the Israeli Arab population (20.6%). The Israeli genetic database was launched in 2006 on the ETHNOS software and is available online ever since. It allows querying the database according to the disorder and the ethnicity; however, many other features are not available, in particular the possibility to search according to the name of the gene. 
    In addition, due to the technical limitations of the previous ETHNOS software, new features and data are not included in the…

  18. A case study for a digital seabed database: Bohai Sea engineering geology database

    NASA Astrophysics Data System (ADS)

    Tianyun, Su; Shikui, Zhai; Baohua, Liu; Ruicai, Liang; Yanpeng, Zheng; Yong, Wang

    2006-07-01

    This paper discusses the designing plan of ORACLE-based Bohai Sea engineering geology database structure from requisition analysis, conceptual structure analysis, logical structure analysis, physical structure analysis and security designing. In the study, we used the object-oriented Unified Modeling Language (UML) to model the conceptual structure of the database and used the powerful function of data management which the object-oriented and relational database ORACLE provides to organize and manage the storage space and improve its security performance. By this means, the database can provide rapid and highly effective performance in data storage, maintenance and query to satisfy the application requisition of the Bohai Sea Oilfield Paradigm Area Information System.

  19. Initial experiences with building a health care infrastructure based on Java and object-oriented database technology.

    PubMed

    Dionisio, J D; Sinha, U; Dai, B; Johnson, D B; Taira, R K

    1999-01-01

    A multi-tiered telemedicine system based on Java and object-oriented database technology has yielded a number of practical insights and experiences on their effectiveness and suitability as implementation bases for a health care infrastructure. The advantages and drawbacks to their use, as seen within the context of the telemedicine system's development, are discussed. Overall, these technologies deliver on their early promise, with a few remaining issues that are due primarily to their relative newness.

  20. Query by forms: User-oriented relational database retrieving system and its application in analysis of experiment data

    NASA Astrophysics Data System (ADS)

    Skotniczny, Zbigniew

    1989-12-01

    The Query by Forms (QbF) system is a user-oriented interactive tool for querying large relational databases with minimal query definition cost. The system was developed under the assumption that the user's time and effort for defining the needed queries is the most severe bottleneck. The system may be applied in any Rdb/VMS database system and is recommended for specific information systems of any project where end-user queries cannot be foreseen. The tool is dedicated to specialists of an application domain who have to analyze data maintained in a database from any needed point of view and who do not need to know commercial database languages. The paper presents the system developed as a compromise between its functionality and usability. User-system communication via a menu-driven "tree-like" structure of screen-forms, which produces a query definition and execution, is discussed in detail. Output of query results (printed reports and graphics) is also discussed. Finally, the paper shows one application of QbF to a HERA project.

  1. The Finnish disease heritage database (FinDis) update-a database for the genes mutated in the Finnish disease heritage brought to the next-generation sequencing era.

    PubMed

    Polvi, Anne; Linturi, Henna; Varilo, Teppo; Anttonen, Anna-Kaisa; Byrne, Myles; Fokkema, Ivo F A C; Almusa, Henrikki; Metzidis, Anthony; Avela, Kristiina; Aula, Pertti; Kestilä, Marjo; Muilu, Juha

    2013-11-01

    The Finnish Disease Heritage Database (FinDis) (http://findis.org) was originally published in 2004 as a centralized information resource for rare monogenic diseases enriched in the Finnish population. The FinDis database originally contained 405 causative variants for 30 diseases. At the time, the FinDis database was a comprehensive collection of data, but since 1994, a large amount of new information has emerged, making the necessity to update the database evident. We collected information and updated the database to contain genes and causative variants for 35 diseases, including six more genes and more than 1,400 additional disease-causing variants. Information for causative variants for each gene is collected under the LOVD 3.0 platform, enabling easy updating. The FinDis portal provides a centralized resource and user interface to link information on each disease and gene with variant data in the LOVD 3.0 platform. The software written to achieve this has been open-sourced and made available on GitHub (http://github.com/findis-db), allowing biomedical institutions in other countries to present their national data in a similar way, and to both contribute to, and benefit from, standardized variation data. The updated FinDis portal provides a unique resource to assist patient diagnosis, research, and the development of new cures. © 2013 WILEY PERIODICALS, INC.

  2. Comparison and optimization of in silico algorithms for predicting the pathogenicity of sodium channel variants in epilepsy.

    PubMed

    Holland, Katherine D; Bouley, Thomas M; Horn, Paul S

    2017-07-01

    Variants in neuronal voltage-gated sodium channel α-subunits genes SCN1A, SCN2A, and SCN8A are common in early onset epileptic encephalopathies and other autosomal dominant childhood epilepsy syndromes. However, in clinical practice, missense variants are often classified as variants of uncertain significance when missense variants are identified but heritability cannot be determined. Genetic testing reports often include results of computational tests to estimate pathogenicity and the frequency of that variant in population-based databases. The objective of this work was to enhance clinicians' understanding of results by (1) determining how effectively computational algorithms predict epileptogenicity of sodium channel (SCN) missense variants; (2) optimizing their predictive capabilities; and (3) determining if epilepsy-associated SCN variants are present in population-based databases. This will help clinicians better understand the results of indeterminate SCN test results in people with epilepsy. Pathogenic, likely pathogenic, and benign variants in SCNs were identified using databases of sodium channel variants. Benign variants were also identified from population-based databases. Eight algorithms commonly used to predict pathogenicity were compared. In addition, logistic regression was used to determine if a combination of algorithms could better predict pathogenicity. Based on American College of Medical Genetic Criteria, 440 variants were classified as pathogenic or likely pathogenic and 84 were classified as benign or likely benign. Twenty-eight variants previously associated with epilepsy were present in population-based gene databases. The output provided by most computational algorithms had a high sensitivity but low specificity with an accuracy of 0.52-0.77. Accuracy could be improved by adjusting the threshold for pathogenicity. Using this adjustment, the Mendelian Clinically Applicable Pathogenicity (M-CAP) algorithm had an accuracy of 0.90 and a

  3. Clinical Views: Object-Oriented Views for Clinical Databases

    PubMed Central

    Portoni, Luisa; Combi, Carlo; Pinciroli, Francesco

    1998-01-01

    We present here a prototype of a clinical information system for the archiving and the management of multimedia and temporally-oriented clinical data related to PTCA patients. The system is based on an object-oriented DBMS and supports multiple views and view schemas on patients' data. Remote data access is supported too.

  4. Fraction-variant beam orientation optimization for non-coplanar IMRT

    NASA Astrophysics Data System (ADS)

    O'Connor, Daniel; Yu, Victoria; Nguyen, Dan; Ruan, Dan; Sheng, Ke

    2018-02-01

    Conventional beam orientation optimization (BOO) algorithms for IMRT assume that the same set of beam angles is used for all treatment fractions. In this paper we present a BOO formulation based on group sparsity that simultaneously optimizes non-coplanar beam angles for all fractions, yielding a fraction-variant (FV) treatment plan. Beam angles are selected by solving a multi-fraction fluence map optimization problem involving 500-700 candidate beams per fraction, with an additional group sparsity term that encourages most candidate beams to be inactive. The optimization problem is solved using the fast iterative shrinkage-thresholding algorithm. Our FV BOO algorithm is used to create five-fraction treatment plans for digital phantom, prostate, and lung cases as well as a 30-fraction plan for a head and neck case. A homogeneous PTV dose coverage is maintained in all fractions. The treatment plans are compared with fraction-invariant plans that use a fixed set of beam angles for all fractions. The FV plans reduced OAR mean dose and D2 values on average by 3.3% and 3.8% of the prescription dose, respectively. Notably, mean OAR dose was reduced by 14.3% of prescription dose (rectum), 11.6% (penile bulb), 10.7% (seminal vesicle), 5.5% (right femur), 3.5% (bladder), 4.0% (normal left lung), 15.5% (cochleas), and 5.2% (chiasm). D2 was reduced by 14.9% of prescription dose (right femur), 8.2% (penile bulb), 12.7% (proximal bronchus), 4.1% (normal left lung), 15.2% (cochleas), 10.1% (orbits), 9.1% (chiasm), 8.7% (brainstem), and 7.1% (parotids). Meanwhile, PTV homogeneity defined as D95/D5 improved from 0.92 to 0.95 (digital phantom), from 0.95 to 0.98 (prostate case), and from 0.94 to 0.97 (lung case), and remained constant for the head and neck case. Moreover, the FV plans are dosimetrically similar to conventional plans that use twice as many beams per fraction. Thus, FV BOO offers the potential to reduce delivery time for non-coplanar IMRT.
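
    To illustrate how a group sparsity term can switch whole beams off, the sketch below shows the block soft-thresholding step that shrinkage-thresholding methods apply to each group of variables. It assumes the common sum-of-Euclidean-norms penalty with one group per candidate beam; the abstract does not state the exact penalty used, so this may differ from the authors' formulation, and the function name is hypothetical.

```python
import math


def group_soft_threshold(beam_fluence, threshold):
    """Shrink one beam's fluence vector toward zero as a block.

    Assumes the penalty is threshold * ||beam_fluence||_2 (an illustrative
    choice, not confirmed by the abstract). If the vector's norm does not
    exceed the threshold, the whole beam is set to zero (inactive).
    """
    norm = math.sqrt(sum(v * v for v in beam_fluence))
    if norm <= threshold:
        return [0.0] * len(beam_fluence)
    scale = 1.0 - threshold / norm
    return [scale * v for v in beam_fluence]


# A weak beam is removed entirely; a strong beam is only scaled down.
print(group_soft_threshold([0.3, 0.4], 1.0))
print(group_soft_threshold([3.0, 4.0], 2.5))
```

    Because the shrinkage acts on the norm of the whole group rather than on individual entries, a beam is either kept (with reduced intensity) or dropped as a unit, which is what makes the penalty act as a beam selector.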

  5. DaMold: A data-mining platform for variant annotation and visualization in molecular diagnostics research.

    PubMed

    Pandey, Ram Vinay; Pabinger, Stephan; Kriegner, Albert; Weinhäusel, Andreas

    2017-07-01

    Next-generation sequencing (NGS) has become a powerful and efficient tool for routine mutation screening in clinical research. As each NGS test yields hundreds of variants, the current challenge is to meaningfully interpret the data and select potential candidates. Analyzing each variant while manually investigating several relevant databases to collect specific information is a cumbersome and time-consuming process, and it requires expertise and familiarity with these databases. Thus, a tool that can seamlessly annotate variants with clinically relevant databases under one common interface would be of great help for variant annotation, cross-referencing, and visualization. This tool would allow variants to be processed in an automated and high-throughput manner and facilitate the investigation of variants in several genome browsers. Several analysis tools are available for raw sequencing-read processing and variant identification, but an automated variant filtering, annotation, cross-referencing, and visualization tool is still lacking. To fulfill these requirements, we developed DaMold, a Web-based, user-friendly tool that can filter and annotate variants and can access and compile information from 37 resources. It is easy to use, provides flexible input options, and accepts variants from NGS and Sanger sequencing as well as hotspots in VCF and BED formats. DaMold is available as an online application at http://damold.platomics.com/index.html, and as a Docker container and virtual machine at https://sourceforge.net/projects/damold/. © 2017 Wiley Periodicals, Inc.

  6. LitVar: a semantic search engine for linking genomic variant data in PubMed and PMC.

    PubMed

    Allot, Alexis; Peng, Yifan; Wei, Chih-Hsuan; Lee, Kyubum; Phan, Lon; Lu, Zhiyong

    2018-05-14

    The identification and interpretation of genomic variants play a key role in the diagnosis of genetic diseases and related research. These tasks increasingly rely on accessing relevant manually curated information from domain databases (e.g. SwissProt or ClinVar). However, due to the sheer volume of medical literature and high cost of expert curation, curated variant information in existing databases is often incomplete and out-of-date. In addition, the same genetic variant can be mentioned in publications with various names (e.g. 'A146T' versus 'c.436G>A' versus 'rs121913527'). A search in PubMed using only one name usually cannot retrieve all relevant articles for the variant of interest. Hence, to help scientists, healthcare professionals, and database curators find the most up-to-date published variant research, we have developed LitVar for the search and retrieval of standardized variant information. In addition, LitVar uses advanced text mining techniques to compute and extract relationships between variants and other associated entities such as diseases and chemicals/drugs. LitVar is publicly available at https://www.ncbi.nlm.nih.gov/CBBresearch/Lu/Demo/LitVar.

  7. Harmonizing the interpretation of genetic variants across the world: the Malaysian experience.

    PubMed

    Hassan, Nik Norliza Nik; Plazzer, John-Paul; Smith, Timothy D; Halim-Fikri, Hashim; Macrae, Finlay; Zubaidi, A A L; Zilfalil, Bin Alwi

    2016-02-26

    Databases for gene variants are very useful for sharing genetic data and to facilitate the understanding of the genetic basis of diseases. This report summarises the issues surrounding the development of the Malaysian Human Variome Project Country Node. The focus is on human germline variants. Somatic variants, mitochondrial variants and other types of genetic variation have corresponding databases which are not covered here, as they have specific issues that do not necessarily apply to germline variations. The ethical, legal, social issues, intellectual property, ownership of the data, information technology implementation, and efforts to improve the standards and systems used in data sharing are discussed. An overarching framework such as provided by the Human Variome Project to co-ordinate activities is invaluable. Country Nodes, such as MyHVP, enable human gene variation associated with human diseases to be collected, stored and shared by all disciplines (clinicians, molecular biologists, pathologists, bioinformaticians) for a consistent interpretation of genetic variants locally and across the world.

  8. Expanded national database collection and data coverage in the FINDbase worldwide database for clinically relevant genomic variation allele frequencies

    PubMed Central

    Viennas, Emmanouil; Komianou, Angeliki; Mizzi, Clint; Stojiljkovic, Maja; Mitropoulou, Christina; Muilu, Juha; Vihinen, Mauno; Grypioti, Panagiota; Papadaki, Styliani; Pavlidis, Cristiana; Zukic, Branka; Katsila, Theodora; van der Spek, Peter J.; Pavlovic, Sonja; Tzimas, Giannis; Patrinos, George P.

    2017-01-01

    FINDbase (http://www.findbase.org) is a comprehensive data repository that records the prevalence of clinically relevant genomic variants in various populations worldwide, such as pathogenic variants leading mostly to monogenic disorders and pharmacogenomics biomarkers. The database also records the incidence of rare genetic diseases in various populations, all in well-distinct data modules. Here, we report extensive data content updates in all data modules, with direct implications to clinical pharmacogenomics. Also, we report significant new developments in FINDbase, namely (i) the release of a new version of the ETHNOS software that catalyzes development and curation of national/ethnic genetic databases, (ii) the migration of all FINDbase data content into 90 distinct national/ethnic mutation databases, all built around Microsoft's PivotViewer (http://www.getpivot.com) software, (iii) new data visualization tools and (iv) the interrelation of FINDbase with DruGeVar database with direct implications in clinical pharmacogenomics. The abovementioned updates further enhance the impact of FINDbase, as a key resource for Genomic Medicine applications. PMID:27924022

  9. Nomenclature- and Database-Compatible Names for the Two Ebola Virus Variants that Emerged in Guinea and the Democratic Republic of the Congo in 2014

    PubMed Central

    Kuhn, Jens H.; Andersen, Kristian G.; Baize, Sylvain; Bào, Yīmíng; Bavari, Sina; Berthet, Nicolas; Blinkova, Olga; Brister, J. Rodney; Clawson, Anna N.; Fair, Joseph; Gabriel, Martin; Garry, Robert F.; Gire, Stephen K.; Goba, Augustine; Gonzalez, Jean-Paul; Günther, Stephan; Happi, Christian T.; Jahrling, Peter B.; Kapetshi, Jimmy; Kobinger, Gary; Kugelman, Jeffrey R.; Leroy, Eric M.; Maganga, Gael Darren; Mbala, Placide K.; Moses, Lina M.; Muyembe-Tamfum, Jean-Jacques; N’Faly, Magassouba; Nichol, Stuart T.; Omilabu, Sunday A.; Palacios, Gustavo; Park, Daniel J.; Paweska, Janusz T.; Radoshitzky, Sheli R.; Rossi, Cynthia A.; Sabeti, Pardis C.; Schieffelin, John S.; Schoepp, Randal J.; Sealfon, Rachel; Swanepoel, Robert; Towner, Jonathan S.; Wada, Jiro; Wauquier, Nadia; Yozwiak, Nathan L.; Formenty, Pierre

    2014-01-01

    In 2014, Ebola virus (EBOV) was identified as the etiological agent of a large and still expanding outbreak of Ebola virus disease (EVD) in West Africa and a much more confined EVD outbreak in Middle Africa. Epidemiological and evolutionary analyses confirmed that all cases of both outbreaks are connected to a single introduction each of EBOV into human populations and that both outbreaks are not directly connected. Coding-complete genomic sequence analyses of isolates revealed that the two outbreaks were caused by two novel EBOV variants, and initial clinical observations suggest that neither of them should be considered strains. Here we present consensus decisions on naming for both variants (West Africa: “Makona”, Middle Africa: “Lomela”) and provide database-compatible full, shortened, and abbreviated names that are in line with recently established filovirus sub-species nomenclatures. PMID:25421896

  10. Meta-analysis of gene-level associations for rare variants based on single-variant statistics.

    PubMed

    Hu, Yi-Juan; Berndt, Sonja I; Gustafsson, Stefan; Ganna, Andrea; Hirschhorn, Joel; North, Kari E; Ingelsson, Erik; Lin, Dan-Yu

    2013-08-08

    Meta-analysis of genome-wide association studies (GWASs) has led to the discoveries of many common variants associated with complex human diseases. There is a growing recognition that identifying "causal" rare variants also requires large-scale meta-analysis. The fact that association tests with rare variants are performed at the gene level rather than at the variant level poses unprecedented challenges in the meta-analysis. First, different studies may adopt different gene-level tests, so the results are not compatible. Second, gene-level tests require multivariate statistics (i.e., components of the test statistic and their covariance matrix), which are difficult to obtain. To overcome these challenges, we propose to perform gene-level tests for rare variants by combining the results of single-variant analysis (i.e., p values of association tests and effect estimates) from participating studies. This simple strategy is possible because of an insight that multivariate statistics can be recovered from single-variant statistics, together with the correlation matrix of the single-variant test statistics, which can be estimated from one of the participating studies or from a publicly available database. We show both theoretically and numerically that the proposed meta-analysis approach provides accurate control of the type I error and is as powerful as joint analysis of individual participant data. This approach accommodates any disease phenotype and any study design and produces all commonly used gene-level tests. An application to the GWAS summary results of the Genetic Investigation of ANthropometric Traits (GIANT) consortium reveals rare and low-frequency variants associated with human height. The relevant software is freely available. Copyright © 2013 The American Society of Human Genetics. Published by Elsevier Inc. All rights reserved.
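
    As a rough illustration of how a gene-level statistic can be rebuilt from single-variant results, the sketch below forms a weighted burden test from per-variant effect estimates, their standard errors, and a correlation matrix. It assumes the usual large-sample approximations (score U_j ≈ beta_j / se_j^2, covariance V_jk ≈ r_jk / (se_j se_k)); the function name and the toy inputs are hypothetical, and this is not the authors' released software.

```python
import math

def burden_from_summary(betas, ses, corr, weights=None):
    """Weighted burden test from single-variant summary statistics.

    betas, ses : per-variant effect estimates and standard errors
    corr       : correlation matrix of the single-variant test statistics
    Returns the 1-df chi-square statistic and its p value.
    """
    m = len(betas)
    w = weights if weights is not None else [1.0] * m
    # Approximate score statistics recovered from (beta, se).
    u = [b / (s * s) for b, s in zip(betas, ses)]
    # Approximate covariance matrix of the scores.
    v = [[corr[j][k] / (ses[j] * ses[k]) for k in range(m)] for j in range(m)]
    num = sum(wj * uj for wj, uj in zip(w, u))
    var = sum(w[j] * v[j][k] * w[k] for j in range(m) for k in range(m))
    chi2 = num * num / var
    # Upper tail of a chi-square with 1 degree of freedom.
    p = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p

# Toy inputs: two uncorrelated rare variants with identical estimates.
chi2, p = burden_from_summary([0.5, 0.5], [1.0, 1.0], [[1.0, 0.0], [0.0, 1.0]])
```

    Other gene-level tests can be built from the same recovered U and V; the burden form is used here only because it is the simplest to write down.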

  11. Mutation Update for GNE Gene Variants Associated with GNE Myopathy

    PubMed Central

    Celeste, Frank V.; Vilboux, Thierry; Ciccone, Carla; de Dios, John Karl; Malicdan, May Christine V.; Leoyklang, Petcharat; McKew, John C.; Gahl, William A.; Carrillo-Carrasco, Nuria; Huizing, Marjan

    2014-01-01

    The GNE gene encodes the rate-limiting, bifunctional enzyme of sialic acid biosynthesis, UDP-N-acetylglucosamine 2-epimerase/N-acetylmannosamine kinase (GNE). Biallelic GNE mutations underlie GNE myopathy, an adult-onset progressive myopathy. GNE myopathy-associated GNE mutations are predominantly missense, resulting in reduced, but not absent, GNE enzyme activities. The exact pathomechanism of GNE myopathy remains unknown, but likely involves aberrant (muscle) sialylation. Here we summarize 154 reported and novel GNE variants associated with GNE myopathy, including 122 missense, 11 nonsense, 14 insertion/deletions and 7 intronic variants. All variants were deposited in the online GNE variation database (http://www.dmd.nl/nmdb2/home.php?select_db=GNE). We report the predicted effects on protein function of all variants as well as the predicted effects on epimerase and/or kinase enzymatic activities of selected variants. By analyzing exome sequence databases, we identified three frequently occurring, unreported GNE missense variants/polymorphisms, important for future sequence interpretations. Based on allele frequencies, we estimate the world-wide prevalence of GNE myopathy to be ~ 4–21/1,000,000. This previously unrecognized high prevalence confirms suspicions that many patients may escape diagnosis. Awareness among physicians for GNE myopathy is essential for the identification of new patients, which is required for better understanding of the disorder’s pathomechanism and for the success of ongoing treatment trials. PMID:24796702

  12. Establishment of an international database for genetic variants in esophageal cancer.

    PubMed

    Vihinen, Mauno

    2016-10-01

    The establishment of a database has been suggested in order to collect, organize, and distribute genetic information about esophageal cancer. The World Organization for Specialized Studies on Diseases of the Esophagus and the Human Variome Project will be in charge of a central database of information about esophageal cancer-related variations from publications, databases, and laboratories; in addition to genetic details, clinical parameters will also be included. The aim will be to get all the central players in research, clinical, and commercial laboratories to contribute. The database will follow established recommendations and guidelines. The database will require a team of dedicated curators with different backgrounds. Numerous layers of systematics will be applied to facilitate computational analyses. The data items will be extensively integrated with other information sources. The database will be distributed as open access to ensure exchange of the data with other databases. Variations will be reported in relation to reference sequences on three levels (DNA, RNA, and protein) whenever applicable. In the first phase, the database will concentrate on genetic variations including both somatic and germline variations for susceptibility genes. Additional types of information can be integrated at a later stage. © 2016 New York Academy of Sciences.

  13. GALT protein database: querying structural and functional features of GALT enzyme.

    PubMed

    d'Acierno, Antonio; Facchiano, Angelo; Marabotti, Anna

    2014-09-01

    Knowledge of the impact of variations on protein structure can enhance the comprehension of the mechanisms of genetic diseases related to that protein. Here, we present a new version of GALT Protein Database, a Web-accessible data repository for the storage and interrogation of structural effects of variations of the enzyme galactose-1-phosphate uridylyltransferase (GALT), the impairment of which leads to classic Galactosemia, a rare genetic disease. This new version of this database now contains the models of 201 missense variants of GALT enzyme, including heterozygous variants, and it allows users not only to retrieve information about the missense variations affecting this protein, but also to investigate their impact on substrate binding, intersubunit interactions, stability, and other structural features. In addition, it allows the interactive visualization of the models of variants collected into the database. We have developed additional tools to improve the use of the database by nonspecialized users. This Web-accessible database (http://bioinformatica.isa.cnr.it/GALT/GALT2.0) represents a model of tools potentially suitable for application to other proteins that are involved in human pathologies and that are subjected to genetic variations. © 2014 WILEY PERIODICALS, INC.

  14. SPINS: standardized protein NMR storage. A data dictionary and object-oriented relational database for archiving protein NMR spectra.

    PubMed

    Baran, Michael C; Moseley, Hunter N B; Sahota, Gurmukh; Montelione, Gaetano T

    2002-10-01

    Modern protein NMR spectroscopy laboratories have a rapidly growing need for an easily queried local archival system of raw experimental NMR datasets. SPINS (Standardized ProteIn Nmr Storage) is an object-oriented relational database that provides facilities for high-volume NMR data archival, organization of analyses, and dissemination of results to the public domain by automatic preparation of the header files required for submission of data to the BioMagResBank (BMRB). The current version of SPINS coordinates the process from data collection to BMRB deposition of raw NMR data by standardizing and integrating the storage and retrieval of these data in a local laboratory file system. Additional facilities include a data mining query tool, graphical database administration tools, and an NMRStar v2.1.1 file generator. SPINS also includes a user-friendly internet-based graphical user interface, which is optionally integrated with Varian VNMR NMR data collection software. This paper provides an overview of the data model underlying the SPINS database system, a description of its implementation in Oracle, and an outline of future plans for the SPINS project.

  15. GAVIN: Gene-Aware Variant INterpretation for medical sequencing.

    PubMed

    van der Velde, K Joeri; de Boer, Eddy N; van Diemen, Cleo C; Sikkema-Raddatz, Birgit; Abbott, Kristin M; Knopperts, Alain; Franke, Lude; Sijmons, Rolf H; de Koning, Tom J; Wijmenga, Cisca; Sinke, Richard J; Swertz, Morris A

    2017-01-16

    We present Gene-Aware Variant INterpretation (GAVIN), a new method that accurately classifies variants for clinical diagnostic purposes. Classifications are based on gene-specific calibrations of allele frequencies from the ExAC database, likely variant impact using SnpEff, and estimated deleteriousness based on CADD scores for >3000 genes. In a benchmark on 18 clinical gene sets, we achieve a sensitivity of 91.4% and a specificity of 76.9%. This accuracy is unmatched by 12 other tools. We provide GAVIN as an online MOLGENIS service to annotate VCF files and as an open source executable for use in bioinformatic pipelines. It can be found at http://molgenis.org/gavin.

  16. GETPrime: a gene- or transcript-specific primer database for quantitative real-time PCR.

    PubMed

    Gubelmann, Carine; Gattiker, Alexandre; Massouras, Andreas; Hens, Korneel; David, Fabrice; Decouttere, Frederik; Rougemont, Jacques; Deplancke, Bart

    2011-01-01

    The vast majority of genes in humans and other organisms undergo alternative splicing, yet the biological function of splice variants is still very poorly understood in large part because of the lack of simple tools that can map the expression profiles and patterns of these variants with high sensitivity. High-throughput quantitative real-time polymerase chain reaction (qPCR) is an ideal technique to accurately quantify nucleic acid sequences including splice variants. However, currently available primer design programs do not distinguish between splice variants and also differ substantially in overall quality, functionality or throughput mode. Here, we present GETPrime, a primer database supported by a novel platform that uniquely combines and automates several features critical for optimal qPCR primer design. These include the consideration of all gene splice variants to enable either gene-specific (covering the majority of splice variants) or transcript-specific (covering one splice variant) expression profiling, primer specificity validation, automated best primer pair selection according to strict criteria and graphical visualization of the latter primer pairs within their genomic context. GETPrime primers have been extensively validated experimentally, demonstrating high transcript specificity in complex samples. Thus, the free-access, user-friendly GETPrime database allows fast primer retrieval and visualization for genes or groups of genes of most common model organisms, and is available at http://updepla1srv1.epfl.ch/getprime/. Database URL: http://deplanckelab.epfl.ch.

  17. Detection of alternative splice variants at the proteome level in Aspergillus flavus.

    PubMed

    Chang, Kung-Yen; Georgianna, D Ryan; Heber, Steffen; Payne, Gary A; Muddiman, David C

    2010-03-05

    Identification of proteins from proteolytic peptides or intact proteins plays an essential role in proteomics. Researchers use search engines to match the acquired peptide sequences to the target proteins. However, search engines depend on protein databases to provide candidates for consideration. Alternative splicing (AS), the mechanism where the exons of pre-mRNAs can be spliced and rearranged to generate distinct mRNA and therefore protein variants, enables higher eukaryotic organisms, with only a limited number of genes, to have the requisite complexity and diversity at the proteome level. Multiple alternative isoforms from one gene often share common segments of sequences. However, many protein databases only include a limited number of isoforms to keep minimal redundancy. As a result, the database search might not identify a target protein even with high quality tandem MS data and accurate intact precursor ion mass. We computationally predicted an exhaustive list of putative isoforms of Aspergillus flavus proteins from 20 371 expressed sequence tags to investigate whether an alternative splicing protein database can assign a greater proportion of mass spectrometry data. The newly constructed AS database provided 9807 new alternatively spliced variants in addition to 12 832 previously annotated proteins. The searches of the existing tandem MS spectra data set using the AS database identified 29 new proteins encoded by 26 genes. Nine fungal genes appeared to have multiple protein isoforms. In addition to the discovery of splice variants, the AS database also showed potential to improve genome annotation. In summary, the introduction of an alternative splicing database helps identify more proteins and unveils more information about a proteome.

  18. Human Chromosome Y and Haplogroups; introducing YDHS Database.

    PubMed

    Tiirikka, Timo; Moilanen, Jukka S

    2015-12-01

    As the high throughput sequencing efforts generate more biological information, scientists from different disciplines are interpreting the polymorphisms that make us unique. In addition, there is an increasing trend among the general public to research their own genealogy, find distant relatives and to know more about their biological background. Commercial vendors are providing analyses of mitochondrial and Y-chromosomal markers for such purposes. Clearly, an easy-to-use free interface to the existing data on the identified variants would be in the interest of the general public and professionals less familiar with the field. Here we introduce a novel metadatabase YDHS that aims to provide such an interface for Y-chromosomal DNA (Y-DNA) haplogroups and sequence variants. The database uses the ISOGG Y-DNA tree as the source of mutations and haplogroups, and by using genomic positions of the mutations the database links them to genes and other biological entities. YDHS contains analysis tools for deeper Y-SNP analysis. YDHS addresses the shortage of Y-DNA related databases. We have tested our database using a set of different cases from literature ranging from infertility to autism. The database is at http://www.semanticgen.net/ydhs. Y-chromosomal DNA (Y-DNA) haplogroups and sequence variants have not been in the scientific limelight, excluding certain specialized fields like forensics, mainly because there is not much freely available information or it is scattered in different sources. However, as we have demonstrated, Y-SNPs do play a role in various cases on the haplogroup level and it is possible to create a free Y-DNA dedicated bioinformatics resource.

  19. Data Rods: High Speed, Time-Series Analysis of Massive Cryospheric Data Sets Using Object-Oriented Database Methods

    NASA Astrophysics Data System (ADS)

    Liang, Y.; Gallaher, D. W.; Grant, G.; Lv, Q.

    2011-12-01

    Change over time is the central driver of climate change detection. The goal is to diagnose the underlying causes, and make projections into the future. In an effort to optimize this process we have developed the Data Rod model, an object-oriented approach that provides the ability to query grid cell changes and their relationships to neighboring grid cells through time. The time series data is organized in time-centric structures called "data rods." A single data rod can be pictured as the multi-spectral data history at one grid cell: a vertical column of data through time. This resolves the long-standing problem of managing time-series data and opens new possibilities for temporal data analysis. This structure enables rapid time-centric analysis at any grid cell across multiple sensors and satellite platforms. Collections of data rods can be spatially and temporally filtered, statistically analyzed, and aggregated for use with pattern matching algorithms. Likewise, individual image pixels can be extracted to generate multi-spectral imagery at any spatial and temporal location. The Data Rods project has created a series of prototype databases to store and analyze massive datasets containing multi-modality remote sensing data. Using object-oriented technology, this method overcomes the operational limitations of traditional relational databases. To demonstrate the speed and efficiency of time-centric analysis using the Data Rods model, we have developed a sea ice detection algorithm. This application determines the concentration of sea ice in a small spatial region across a long temporal window. If performed using traditional analytical techniques, this task would typically require extensive data downloads and spatial filtering. Using Data Rods databases, the exact spatio-temporal data set is immediately available. No extraneous data is downloaded, and all selected data querying occurs transparently on the server side. Moreover, fundamental statistical
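
    The time-centric layout described in this record can be pictured with a small in-memory sketch: one "rod" per grid cell holding that cell's observations through time, with spatial and temporal filtering done on the rods themselves. The class, its method names, and the toy values are hypothetical; the project's actual object-oriented database is not described in enough detail here to reproduce.

```python
from collections import defaultdict

class DataRods:
    """Toy store: one time-ordered 'rod' of (time, value) pairs per grid cell."""

    def __init__(self):
        self._rods = defaultdict(list)  # (row, col) -> [(time, value), ...]

    def add(self, row, col, time, value):
        self._rods[(row, col)].append((time, value))

    def rod(self, row, col, t0=None, t1=None):
        """Time series at one grid cell, optionally clipped to [t0, t1]."""
        series = self._rods.get((row, col), [])
        return sorted(
            (t, v) for t, v in series
            if (t0 is None or t >= t0) and (t1 is None or t <= t1)
        )

    def region_mean(self, rows, cols, t0, t1):
        """Mean value over a small spatial window and a temporal window."""
        values = [v for r in rows for c in cols for _, v in self.rod(r, c, t0, t1)]
        return sum(values) / len(values) if values else None

# Made-up sea-ice concentrations for two neighbouring cells over three years.
store = DataRods()
for year, a, b in [(2001, 0.9, 0.8), (2002, 0.7, 0.6), (2003, 0.5, 0.4)]:
    store.add(10, 20, year, a)
    store.add(10, 21, year, b)
recent = store.region_mean(rows=[10], cols=[20, 21], t0=2002, t1=2003)
```

    Only the cells and years inside the requested window are touched, which is the property the record attributes to the rod layout.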

  20. GETPrime: a gene- or transcript-specific primer database for quantitative real-time PCR

    PubMed Central

    Gubelmann, Carine; Gattiker, Alexandre; Massouras, Andreas; Hens, Korneel; David, Fabrice; Decouttere, Frederik; Rougemont, Jacques; Deplancke, Bart

    2011-01-01

    The vast majority of genes in humans and other organisms undergo alternative splicing, yet the biological function of splice variants is still very poorly understood in large part because of the lack of simple tools that can map the expression profiles and patterns of these variants with high sensitivity. High-throughput quantitative real-time polymerase chain reaction (qPCR) is an ideal technique to accurately quantify nucleic acid sequences including splice variants. However, currently available primer design programs do not distinguish between splice variants and also differ substantially in overall quality, functionality or throughput mode. Here, we present GETPrime, a primer database supported by a novel platform that uniquely combines and automates several features critical for optimal qPCR primer design. These include the consideration of all gene splice variants to enable either gene-specific (covering the majority of splice variants) or transcript-specific (covering one splice variant) expression profiling, primer specificity validation, automated best primer pair selection according to strict criteria and graphical visualization of the latter primer pairs within their genomic context. GETPrime primers have been extensively validated experimentally, demonstrating high transcript specificity in complex samples. Thus, the free-access, user-friendly GETPrime database allows fast primer retrieval and visualization for genes or groups of genes of most common model organisms, and is available at http://updepla1srv1.epfl.ch/getprime/. Database URL: http://deplanckelab.epfl.ch. PMID:21917859

  1. Monogenic diabetes syndromes: Locus‐specific databases for Alström, Wolfram, and Thiamine‐responsive megaloblastic anemia

    PubMed Central

    Astuti, Dewi; Sabir, Ataf; Fulton, Piers; Zatyka, Malgorzata; Williams, Denise; Hardy, Carol; Milan, Gabriella; Favaretto, Francesca; Yu‐Wai‐Man, Patrick; Rohayem, Julia; López de Heredia, Miguel; Hershey, Tamara; Tranebjaerg, Lisbeth; Chen, Jian‐Hua; Chaussenot, Annabel; Nunes, Virginia; Marshall, Bess; McAfferty, Susan; Tillmann, Vallo; Maffei, Pietro; Paquis‐Flucklinger, Veronique; Geberhiwot, Tarekign; Mlynarski, Wojciech; Parkinson, Kay; Picard, Virginie; Bueno, Gema Esteban; Dias, Renuka; Arnold, Amy; Richens, Caitlin; Paisey, Richard; Urano, Fumihiko; Semple, Robert; Sinnott, Richard

    2017-01-01

    We developed a variant database for diabetes syndrome genes, using the Leiden Open Variation Database platform, containing observed phenotypes matched to the genetic variations. We populated it with 628 published disease‐associated variants (December 2016) for: WFS1 (n = 309), CISD2 (n = 3), ALMS1 (n = 268), and SLC19A2 (n = 48) for Wolfram type 1, Wolfram type 2, Alström, and Thiamine‐responsive megaloblastic anemia syndromes, respectively; and included 23 previously unpublished novel germline variants in WFS1 and 17 variants in ALMS1. We then investigated genotype–phenotype relations for the WFS1 gene. The presence of biallelic loss‐of‐function variants predicted Wolfram syndrome defined by insulin‐dependent diabetes and optic atrophy, with a sensitivity of 79% (95% CI 75%–83%) and specificity of 92% (83%–97%). The presence of minor loss‐of‐function variants in WFS1 predicted isolated diabetes, isolated deafness, or isolated congenital cataracts without development of the full syndrome (sensitivity 100% [93%–100%]; specificity 78% [73%–82%]). The ability to provide a prognostic prediction based on genotype will lead to improvements in patient care and counseling. The development of the database as a repository for monogenic diabetes gene variants will allow prognostic predictions for other diabetes syndromes as next‐generation sequencing expands the repertoire of genotypes and phenotypes. The database is publicly available online at https://lovd.euro-wabb.org. PMID:28432734

  2. Fine-Mapping of Common Genetic Variants Associated with Colorectal Tumor Risk Identified Potential Functional Variants

    PubMed Central

    Gala, Manish; Abecasis, Goncalo; Bezieau, Stephane; Brenner, Hermann; Butterbach, Katja; Caan, Bette J.; Carlson, Christopher S.; Casey, Graham; Chang-Claude, Jenny; Conti, David V.; Curtis, Keith R.; Duggan, David; Gallinger, Steven; Haile, Robert W.; Harrison, Tabitha A.; Hayes, Richard B.; Hoffmeister, Michael; Hopper, John L.; Hudson, Thomas J.; Jenkins, Mark A.; Küry, Sébastien; Le Marchand, Loic; Leal, Suzanne M.; Newcomb, Polly A.; Nickerson, Deborah A.; Potter, John D.; Schoen, Robert E.; Schumacher, Fredrick R.; Seminara, Daniela; Slattery, Martha L.; Hsu, Li; Chan, Andrew T.; White, Emily; Berndt, Sonja I.; Peters, Ulrike

    2016-01-01

    Genome-wide association studies (GWAS) have identified many common single nucleotide polymorphisms (SNPs) associated with colorectal cancer risk. These SNPs may tag correlated variants with biological importance. Fine-mapping around GWAS loci can facilitate detection of functional candidates and additional independent risk variants. We analyzed 11,900 cases and 14,311 controls in the Genetics and Epidemiology of Colorectal Cancer Consortium and the Colon Cancer Family Registry. To fine-map genomic regions containing all known common risk variants, we imputed high-density genetic data from the 1000 Genomes Project. We tested single-variant associations with colorectal tumor risk for all variants spanning genomic regions 250-kb upstream or downstream of 31 GWAS-identified SNPs (index SNPs). We queried the University of California, Santa Cruz Genome Browser to examine evidence for biological function. Index SNPs did not show the strongest association signals with colorectal tumor risk in their respective genomic regions. Bioinformatics analysis of SNPs showing smaller P-values in each region revealed 21 functional candidates in 12 loci (5q31.1, 8q24, 11q13.4, 11q23, 12p13.32, 12q24.21, 14q22.2, 15q13, 18q21, 19q13.1, 20p12.3, and 20q13.33). We did not observe evidence of additional independent association signals in GWAS-identified regions. Our results support the utility of integrating data from comprehensive fine-mapping with expanding publicly available genomic databases to help clarify GWAS associations and identify functional candidates that warrant more onerous laboratory follow-up. Such efforts may aid the eventual discovery of disease-causing variant(s). PMID:27379672

  3. Object-orientated DBMS techniques for time-oriented medical record.

    PubMed

    Pinciroli, F; Combi, C; Pozzi, G

    1992-01-01

    In implementing time-orientated medical record (TOMR) management systems, the relational model has played a major role. Many applications have been developed to extend query and data manipulation languages to temporal aspects of information. Our experience in developing TOMR revealed some deficiencies in the relational model, such as: (a) abstract data type definition; (b) unified view of data, at a programming level; (c) management of temporal data; (d) management of signals and images. We identified some initial topics to be addressed by an object-orientated approach to database design. This paper describes the first steps in designing and implementing a TOMR with an object-orientated DBMS.

  4. Identification of Inherited Retinal Disease-Associated Genetic Variants in 11 Candidate Genes.

    PubMed

    Astuti, Galuh D N; van den Born, L Ingeborgh; Khan, M Imran; Hamel, Christian P; Bocquet, Béatrice; Manes, Gaël; Quinodoz, Mathieu; Ali, Manir; Toomes, Carmel; McKibbin, Martin; El-Asrag, Mohammed E; Haer-Wigman, Lonneke; Inglehearn, Chris F; Black, Graeme C M; Hoyng, Carel B; Cremers, Frans P M; Roosing, Susanne

    2018-01-10

    Inherited retinal diseases (IRDs) display an enormous genetic heterogeneity. Whole exome sequencing (WES) recently identified genes that were mutated in a small proportion of IRD cases. Consequently, finding a second case or family carrying pathogenic variants in the same candidate gene often is challenging. In this study, we searched for novel candidate IRD gene-associated variants in isolated IRD families, assessed their causality, and searched for novel genotype-phenotype correlations. Whole exome sequencing was performed in 11 probands affected with IRDs. Homozygosity mapping data was available for five cases. Variants with minor allele frequencies ≤ 0.5% in public databases were selected as candidate disease-causing variants. These variants were ranked based on their: (a) presence in a gene that was previously implicated in IRD; (b) minor allele frequency in the Exome Aggregation Consortium database (ExAC); (c) in silico pathogenicity assessment using the combined annotation dependent depletion (CADD) score; and (d) interaction of the corresponding protein with known IRD-associated proteins. Twelve unique variants were found in 11 different genes in 11 IRD probands. Novel autosomal recessive and dominant inheritance patterns were found for variants in Small Nuclear Ribonucleoprotein U5 Subunit 200 (SNRNP200) and Zinc Finger Protein 513 (ZNF513), respectively. Using our pathogenicity assessment, a variant in DEAH-Box Helicase 32 (DHX32) was the top ranked novel candidate gene to be associated with IRDs, followed by eight medium and lower ranked candidate genes. The identification of candidate disease-associated sequence variants in 11 single families underscores the notion that the previously identified IRD-associated genes collectively carry > 90% of the defects implicated in IRDs. To identify multiple patients or families with variants in the same gene and thereby provide extra proof for pathogenicity, worldwide data sharing is needed.
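
    The selection step in this record (minor allele frequency ≤ 0.5%, then ranking) can be sketched in a few lines. The abstract lists the ranking criteria but not how they were combined, so the sort order below (known IRD gene first, then lower frequency, then higher CADD score) is an assumption, as are the field names, the placeholder gene labels, and the scores; criterion (d), protein interaction, is left out.

```python
MAF_CUTOFF = 0.005  # 0.5%, the threshold stated in the abstract

def rank_candidates(variants, known_genes):
    """Keep rare variants and order them by an assumed priority rule."""
    rare = [v for v in variants if v["maf"] <= MAF_CUTOFF]
    return sorted(
        rare,
        key=lambda v: (
            v["gene"] not in known_genes,  # previously implicated genes first
            v["maf"],                      # then rarer variants
            -v["cadd"],                    # then higher CADD score
        ),
    )

# Placeholder records; gene labels and numbers are invented for illustration.
variants = [
    {"gene": "GENE_A", "maf": 0.0001, "cadd": 25.0},
    {"gene": "GENE_B", "maf": 0.02, "cadd": 30.0},   # too common, dropped
    {"gene": "GENE_C", "maf": 0.004, "cadd": 12.0},
    {"gene": "GENE_D", "maf": 0.0, "cadd": 18.0},
]
ranked = rank_candidates(variants, known_genes={"GENE_C"})
ranked_genes = [v["gene"] for v in ranked]
```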

  5. Who's Gonna Pay the Piper for Free Online Databases?

    ERIC Educational Resources Information Center

    Jacso, Peter

    1996-01-01

    Discusses new pricing models for some online services and considers the possibilities for the traditional online database market. Topics include multimedia music databases, including copyright implications; other retail-oriented databases; and paying for free databases with advertising. (LRW)

  6. BlackOPs: increasing confidence in variant detection through mappability filtering.

    PubMed

    Cabanski, Christopher R; Wilkerson, Matthew D; Soloway, Matthew; Parker, Joel S; Liu, Jinze; Prins, Jan F; Marron, J S; Perou, Charles M; Hayes, D Neil

    2013-10-01

    Identifying variants using high-throughput sequencing data is currently a challenge because true biological variants can be indistinguishable from technical artifacts. One source of technical artifact results from incorrectly aligning experimentally observed sequences to their true genomic origin ('mismapping') and inferring differences in mismapped sequences to be true variants. We developed BlackOPs, an open-source tool that simulates experimental RNA-seq and DNA whole exome sequences derived from the reference genome, aligns these sequences by custom parameters, detects variants and outputs a blacklist of positions and alleles caused by mismapping. Blacklists contain thousands of artifact variants that are indistinguishable from true variants and, for a given sample, are expected to be almost completely false positives. We show that these blacklist positions are specific to the alignment algorithm and read length used, and BlackOPs allows users to generate a blacklist specific to their experimental setup. We queried the dbSNP and COSMIC variant databases and found numerous variants indistinguishable from mapping errors. We demonstrate how filtering against blacklist positions reduces the number of potential false variants using an RNA-seq glioblastoma cell line data set. In summary, accounting for mapping-caused variants tuned to experimental setups reduces false positives and, therefore, improves genome characterization by high-throughput sequencing.
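
    The filtering step described in this abstract, removing calls that match a blacklist of positions and alleles, can be sketched as follows. This is only an illustration and not the BlackOPs implementation: the `Variant` record, the `filter_blacklisted` helper, and the sample data are all hypothetical, and an exact match on chromosome, position, and both alleles is an assumption.

```python
from typing import NamedTuple


class Variant(NamedTuple):
    """Hypothetical minimal variant record (not a BlackOPs data structure)."""
    chrom: str
    pos: int
    ref: str
    alt: str


def filter_blacklisted(calls, blacklist):
    """Split calls into (kept, removed) by exact position-and-allele match."""
    blocked = set(blacklist)
    kept = [v for v in calls if v not in blocked]
    removed = [v for v in calls if v in blocked]
    return kept, removed


# Hypothetical blacklist and variant calls, for illustration only.
blacklist = [Variant("chr1", 1000, "A", "G"), Variant("chr2", 500, "C", "T")]
calls = [
    Variant("chr1", 1000, "A", "G"),  # same position and allele: removed
    Variant("chr1", 1000, "A", "T"),  # same position, different allele: kept
    Variant("chr3", 42, "G", "A"),    # not on the blacklist: kept
]
kept, removed = filter_blacklisted(calls, blacklist)
```

    Matching on the allele as well as the position keeps a call at a blacklisted site when its alternate allele differs from the blacklisted one.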

  7. Pose-variant facial expression recognition using an embedded image system

    NASA Astrophysics Data System (ADS)

    Song, Kai-Tai; Han, Meng-Ju; Chang, Shuo-Hung

    2008-12-01

    In recent years, one of the most attractive research areas in human-robot interaction is automated facial expression recognition. Through recognizing the facial expression, a pet robot can interact with humans in a more natural manner. In this study, we focus on the facial pose-variant problem. A novel method is proposed in this paper to recognize pose-variant facial expressions. After locating the face position in an image frame, the active appearance model (AAM) is applied to track facial features. Fourteen feature points are extracted to represent the variation of facial expressions. The distances between feature points are defined as the feature values. These feature values are sent to a support vector machine (SVM) for facial expression determination. The pose-variant facial expression is classified into happiness, neutral, sadness, surprise or anger. Furthermore, in order to evaluate the performance for practical applications, this study also built a low resolution database (160x120 pixels) using a CMOS image sensor. Experimental results show that the recognition rate is 84% with the self-built database.
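
    As a rough sketch of how distances between feature points can serve as feature values, the snippet below computes Euclidean distances for every pair of fourteen points. The abstract does not say which pairs the authors used, so using all pairs is an assumption, and the coordinates and the `pairwise_distances` helper are hypothetical.

```python
import itertools
import math


def pairwise_distances(points):
    """Return Euclidean distances for every unordered pair of (x, y) points."""
    return [math.dist(p, q) for p, q in itertools.combinations(points, 2)]


# Fourteen hypothetical feature points on a 7 x 2 grid, in pixel coordinates.
points = [(10 * (i % 7), 20 * (i // 7)) for i in range(14)]

# One feature vector per image frame; a classifier such as an SVM would be
# trained on vectors like this one.
features = pairwise_distances(points)
```

    With fourteen points, the all-pairs choice gives a feature vector of 14 * 13 / 2 = 91 values.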

  8. Scripps Genome ADVISER: Annotation and Distributed Variant Interpretation SERver

    PubMed Central

    Pham, Phillip H.; Shipman, William J.; Erikson, Galina A.; Schork, Nicholas J.; Torkamani, Ali

    2015-01-01

    Interpretation of human genomes is a major challenge. We present the Scripps Genome ADVISER (SG-ADVISER) suite, which aims to fill the gap between data generation and genome interpretation by performing holistic, in-depth, annotations and functional predictions on all variant types and effects. The SG-ADVISER suite includes a de-identification tool, a variant annotation web-server, and a user interface for inheritance and annotation-based filtration. SG-ADVISER allows users with no bioinformatics expertise to manipulate large volumes of variant data with ease – without the need to download large reference databases, install software, or use a command line interface. SG-ADVISER is freely available at genomics.scripps.edu/ADVISER. PMID:25706643

  9. Novel LOVD databases for hereditary breast cancer and colorectal cancer genes in the Chinese population.

    PubMed

    Pan, Min; Cong, Peikuan; Wang, Yue; Lin, Changsong; Yuan, Ying; Dong, Jian; Banerjee, Santasree; Zhang, Tao; Chen, Yanling; Zhang, Ting; Chen, Mingqing; Hu, Peter; Zheng, Shu; Zhang, Jin; Qi, Ming

    2011-12-01

    The Human Variome Project (HVP) is an international consortium of clinicians, geneticists, and researchers from over 30 countries, aiming to facilitate the establishment and maintenance of standards, systems, and infrastructure for the worldwide collection and sharing of all genetic variations affecting human disease. The HVP-China Node will build new and supplement existing databases of genetic diseases. As the first effort, we have created a novel variant database of BRCA1 and BRCA2, mismatch repair genes (MMR), and APC genes for breast cancer, Lynch syndrome, and familial adenomatous polyposis (FAP), respectively, in the Chinese population using the Leiden Open Variation Database (LOVD) format. We searched PubMed and some Chinese search engines to collect all the variants of these genes in the Chinese population that have already been detected and reported. There are some differences in the gene variants between the Chinese population and that of other ethnicities. The database is available online at http://www.genomed.org/LOVD/. Our database will appear to users who survey other LOVD databases (e.g., by Google search, or by NCBI GeneTests search). Remote submissions are accepted, and the information is updated monthly. © 2011 Wiley Periodicals, Inc.

  10. Meta-analysis of CHEK2 1100delC variant and colorectal cancer susceptibility.

    PubMed

    Xiang, He-ping; Geng, Xiao-ping; Ge, Wei-wei; Li, He

    2011-11-01

    The cell cycle checkpoint kinase 2 (CHEK2) gene has been inconsistently associated with colorectal cancer (CRC), particularly the 1100delC variant. To generate large-scale evidence on whether the CHEK2 1100delC variant is associated with CRC susceptibility, we conducted a meta-analysis. Data were collected from the following electronic databases: PubMed, Excerpta Medica Database and Chinese Biomedical Literature Database, with the last report up to November 2010. The odds ratio (OR) and its 95% confidence interval (95% CI) were used to assess the strength of association. We evaluated the contrast of carriers versus non-carriers. Meta-analysis was performed in a fixed/random effect model by using the software Review Manager 4.2. A total of six studies including 4194 cases and 10,010 controls based on the search criteria were involved in this meta-analysis. A significant association of the CHEK2 1100delC variant with unselected CRC was found (OR=2.11, 95% CI=1.41-3.16, P=0.0003). We also found an association of the CHEK2 1100delC variant with familial CRC (OR=2.80, 95% CI=1.74-4.51, P<0.0001). However, the association was not established for sporadic CRC (OR=1.45, 95% CI=0.49-4.30, P=0.50). This meta-analysis demonstrates that CHEK2 1100delC may be an important CRC-predisposing variant that increases CRC risk. Copyright © 2011. Published by Elsevier Ltd.
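
    For readers unfamiliar with the statistic, the sketch below computes an odds ratio and its 95% confidence interval for carriers versus non-carriers in a single 2x2 table, using the standard log-odds (Woolf) approximation. It is not the pooled fixed/random-effects estimate produced by Review Manager. The abstract does not report per-study counts, so the counts here are hypothetical, as is the `odds_ratio_ci` helper.

```python
import math


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio and CI from one 2x2 table (log-odds / Woolf method).

    a = carrier cases,    b = non-carrier cases,
    c = carrier controls, d = non-carrier controls.
    All four cells must be non-zero.
    """
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts, for illustration only (not taken from the study).
odds_ratio, lower, upper = odds_ratio_ci(a=20, b=980, c=10, d=990)
print(f"OR={odds_ratio:.2f}, 95% CI={lower:.2f}-{upper:.2f}")
```

    An interval that excludes 1 corresponds to a statistically significant association at the 5% level, which is how the OR and CI pairs in the abstract are read.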

  11. Identification of Candidate Gene Variants in Korean MODY Families by Whole-Exome Sequencing.

    PubMed

    Shim, Ye Jee; Kim, Jung Eun; Hwang, Su-Kyeong; Choi, Bong Seok; Choi, Byung Ho; Cho, Eun-Mi; Jang, Kyoung Mi; Ko, Cheol Woo

    2015-01-01

    To date, 13 genes causing maturity-onset diabetes of the young (MODY) have been identified. However, there is a big discrepancy in the genetic locus between Asian and Caucasian patients with MODY. Thus, we conducted whole-exome sequencing in Korean MODY families to identify causative gene variants. Six MODY probands and their family members were included. Variants in the dbSNP135 and TIARA databases for Koreans and the variants with minor allele frequencies >0.5% of the 1000 Genomes database were excluded. We selected only the functional variants (gain of stop codon, frameshifts and nonsynonymous single-nucleotide variants) and conducted a case-control comparison in the family members. The selected variants were scanned for the previously introduced gene set implicated in glucose metabolism. Three variants c.620C>T:p.Thr207Ile in PTPRD, c.559C>G:p.Gln187Glu in SYT9, and c.1526T>G:p.Val509Gly in WFS1 were respectively identified in 3 families. We could not find any disease-causative alleles of known MODY 1-13 genes. Based on the predictive program, Thr207Ile in PTPRD was considered pathogenic. Whole-exome sequencing is a valuable method for the genetic diagnosis of MODY. Further evaluation is necessary about the role of PTPRD, SYT9 and WFS1 in normal insulin release from pancreatic beta cells. © 2015 S. Karger AG, Basel.
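
    The exclusion and selection criteria in this abstract amount to a simple per-variant filter, sketched below. The `Call` record, its field names, the effect labels, and the example calls are hypothetical. Only the criteria themselves come from the abstract: listed in dbSNP135 or TIARA, minor allele frequency above 0.5% in 1000 Genomes, and the functional classes.

```python
from dataclasses import dataclass
from typing import Optional

# Effect classes retained by the study: stop gain, frameshift, nonsynonymous SNV.
# The label strings are hypothetical.
FUNCTIONAL_EFFECTS = {"stopgain", "frameshift", "nonsynonymous_snv"}


@dataclass
class Call:
    """Hypothetical annotated variant call."""
    variant_id: str
    effect: str
    in_known_db: bool            # listed in dbSNP135 or TIARA
    maf_1000g: Optional[float]   # None when absent from 1000 Genomes


def passes_filters(call, maf_cutoff=0.005):
    """Return True if the call survives all exclusion criteria."""
    if call.in_known_db:
        return False
    if call.maf_1000g is not None and call.maf_1000g > maf_cutoff:
        return False
    return call.effect in FUNCTIONAL_EFFECTS


# Hypothetical calls, for illustration only.
calls = [
    Call("v1", "nonsynonymous_snv", in_known_db=False, maf_1000g=None),
    Call("v2", "nonsynonymous_snv", in_known_db=True, maf_1000g=None),
    Call("v3", "frameshift", in_known_db=False, maf_1000g=0.02),
    Call("v4", "synonymous_snv", in_known_db=False, maf_1000g=0.001),
]
candidates = [c.variant_id for c in calls if passes_filters(c)]
```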

  12. A Survey of Object-Oriented Database Technology

    DTIC Science & Technology

    1990-05-01

    now mention briefly the various security and authorization schemes provided by GEMSTONE. 1. Login Authorization. There are two ways to login to...GemStone- through the OPAL programming environment or through the GemStone C interface. A user ID and password is required in both cases to login. 2. Name...lIlj A. Black. Object structure in the Emerald system. Proc. 1st Intl. Conf. on Object-Oriented Programming Systems, Languages and Applications, pp

  13. Object-Oriented Approach to Integrating Database Semantics. Volume 4.

    DTIC Science & Technology

    1987-12-01

    schemata for: 1. Object Classification Schema -- Entities 2. Object Structure and Relationship Schema -- Relations 3. Operation Classification and... relationships are represented in a database is non-intuitive for naive users. *It is difficult to access and combine information in multiple databases. In this...from the CURRENT-.CLASSES table. Choosing a selected item de-selects it. Choose 0 to exit. 1. STUDENTS 2. CURRENT-.CLASSES 3. MANAGMNT -.CLASS

  14. CancerDR: cancer drug resistance database.

    PubMed

    Kumar, Rahul; Chaudhary, Kumardeep; Gupta, Sudheer; Singh, Harinder; Kumar, Shailesh; Gautam, Ankur; Kapoor, Pallavi; Raghava, Gajendra P S

    2013-01-01

    Cancer therapies are limited by the development of drug resistance, and mutations in drug targets are one of the main reasons for developing acquired resistance. Adequate knowledge of these mutations in drug targets would help to design effective personalized therapies. Keeping this in mind, we have developed a database, "CancerDR", which provides information on 148 anti-cancer drugs and their pharmacological profiling across 952 cancer cell lines. CancerDR provides comprehensive information about each drug target, including: (i) sequence of natural variants, (ii) mutations, (iii) tertiary structure, and (iv) alignment profile of mutants/variants. A number of web-based tools have been integrated in CancerDR. This database will be very useful for identification of genetic alterations in genes encoding drug targets, and in turn the residues responsible for drug resistance. CancerDR allows users to identify promiscuous drug molecules that can kill a wide range of cancer cells. CancerDR is freely accessible at http://crdd.osdd.net/raghava/cancerdr/

  15. Rare missense variants in POT1 predispose to familial cutaneous malignant melanoma

    PubMed Central

    Shi, Jianxin; Yang, Xiaohong R.; Ballew, Bari; Rotunno, Melissa; Calista, Donato; Fargnoli, Maria Concetta; Ghiorzo, Paola; Paillerets, Brigitte Bressac-de; Nagore, Eduardo; Avril, Marie Francoise; Caporaso, Neil E.; McMaster, Mary L.; Cullen, Michael; Wang, Zhaoming; Zhang, Xijun; Bruno, William; Pastorino, Lorenza; Queirolo, Paola; Banuls-Roca, Jose; Garcia-Casado, Zaida; Vaysse, Amaury; Mohamdi, Hamida; Riazalhosseini, Yasser; Foglio, Mario; Jouenne, Fanélie; Hua, Xing; Hyland, Paula L.; Yin, Jinhu; Vallabhaneni, Haritha; Chai, Weihang; Minghetti, Paola; Pellegrini, Cristina; Ravichandran, Sarangan; Eggermont, Alexander; Lathrop, Mark; Peris, Ketty; Scarra, Giovanna Bianchi; Landi, Giorgio; Savage, Sharon A.; Sampson, Joshua N.; He, Ji; Yeager, Meredith; Goldin, Lynn R.; Demenais, Florence; Chanock, Stephen J.; Tucker, Margaret A.; Goldstein, Alisa M.; Liu, Yie; Landi, Maria Teresa

    2014-01-01

    Although CDKN2A is the most frequent high-risk melanoma susceptibility gene, the underlying genetic factors for most melanoma-prone families remain unknown. Using whole exome sequencing, we identified a rare variant that arose as a founder mutation in the telomere shelterin POT1 gene (g.7:124493086 C>T, Ser270Asn) in five unrelated melanoma-prone families from Romagna, Italy. Carriers of this variant had increased telomere length and elevated fragile telomeres suggesting that this variant perturbs telomere maintenance. Two additional rare POT1 variants were identified in all cases sequenced in two other Italian families, yielding a frequency of POT1 variants comparable to that of CDKN2A mutations in this population. These variants were not found in public databases or in 2,038 genotyped Italian controls. We also identified two rare recurrent POT1 variants in American and French familial melanoma cases. Our findings suggest that POT1 is a major susceptibility gene for familial melanoma in several populations. PMID:24686846

  16. Towards the Architecture of an Instructional Multimedia Database.

    ERIC Educational Resources Information Center

    Verhagen, Plin W.; Bestebreurtje, R.

    1994-01-01

    Discussion of multimedia databases in education focuses on the development of an adaptable database in The Netherlands that uses optical storage media to hold the audiovisual components. Highlights include types of applications; types of users; accessibility; adaptation; an object-oriented approach; levels of the database architecture; and…

  17. Spanish personal name variations in national and international biomedical databases: implications for information retrieval and bibliometric studies

    PubMed Central

    Ruiz-Pérez, R.; López-Cózar, E. Delgado; Jiménez-Contreras, E.

    2002-01-01

    Objectives: The study sought to investigate how Spanish names are handled by national and international databases and to identify mistakes that can undermine the usefulness of these databases for locating and retrieving works by Spanish authors. Methods: The authors sampled 172 articles published by authors from the University of Granada Medical School between 1987 and 1996 and analyzed the variations in how each of their names was indexed in Science Citation Index (SCI), MEDLINE, and Índice Médico Español (IME). The number and types of variants that appeared for each author's name were recorded and compared across databases to identify inconsistencies in indexing practices. We analyzed the relationship between variability (number of variants of an author's name) and productivity (number of items the name was associated with as an author), the consequences for retrieval of information, and the most frequent indexing structures used for Spanish names. Results: The proportion of authors who appeared under more than one name was 48.1% in SCI, 50.7% in MEDLINE, and 69.0% in IME. Productivity correlated directly with variability: more than 50% of the authors listed on five to ten items appeared under more than one name in any given database, and close to 100% of the authors listed on more than ten items appeared under two or more variants. Productivity correlated inversely with retrievability: as the number of variants for a name increased, the number of items retrieved under each variant decreased. For the most highly productive authors, the number of items retrieved under each variant tended toward one. The most frequent indexing methods varied between databases. In MEDLINE and IME, names were indexed correctly as “first surname second surname, first name initial middle name initial” (if present) in 41.7% and 49.5% of the records, respectively. However, in SCI, the most frequent method was “first surname, first name initial second name initial” (48.0% of

  18. The Spectrum of Pedagogical Orientations of Malawian and South African Physical Science Teachers towards Inquiry

    ERIC Educational Resources Information Center

    Ramnarain, Umesh; Nampota, Dorothy; Schuster, David

    2016-01-01

    This study investigated and compared the pedagogical orientations of physical sciences teachers in Malawi and South Africa towards inquiry or direct methods of science teaching. Pedagogical orientation has been theorized as a component of pedagogical content knowledge. Orientations were characterized along a spectrum of two variants of inquiry and…

  19. Arabidopsis Gene Family Profiler (aGFP)--user-oriented transcriptomic database with easy-to-use graphic interface.

    PubMed

    Dupl'áková, Nikoleta; Renák, David; Hovanec, Patrik; Honysová, Barbora; Twell, David; Honys, David

    2007-07-23

    Microarray technologies now belong to the standard functional genomics toolbox and have undergone massive development leading to increased genome coverage, accuracy and reliability. The number of experiments exploiting microarray technology has markedly increased in recent years. In parallel with the rapid accumulation of transcriptomic data, on-line analysis tools are being introduced to simplify their use. Global statistical data analysis methods contribute to the development of overall concepts about gene expression patterns and to query and compose working hypotheses. More recently, these applications are being supplemented with more specialized products offering visualization and specific data mining tools. We present a curated gene family-oriented gene expression database, Arabidopsis Gene Family Profiler (aGFP; http://agfp.ueb.cas.cz), which gives the user access to a large collection of normalised Affymetrix ATH1 microarray datasets. The database currently contains NASC Array and AtGenExpress transcriptomic datasets for various tissues at different developmental stages of wild type plants gathered from nearly 350 gene chips. The Arabidopsis GFP database has been designed as an easy-to-use tool for users needing an easily accessible resource for expression data of single genes, pre-defined gene families or custom gene sets, with the further possibility of keyword search. Arabidopsis Gene Family Profiler presents a user-friendly web interface using both graphic and text output. Data are stored at the MySQL server and individual queries are created in PHP script. The most distinguishable features of Arabidopsis Gene Family Profiler database are: 1) the presentation of normalized datasets (Affymetrix MAS algorithm and calculation of model-based gene-expression values based on the Perfect Match-only model); 2) the choice between two different normalization algorithms (Affymetrix MAS4 or MAS5 algorithms); 3) an intuitive interface; 4) an interactive "virtual

  20. MitBASE : a comprehensive and integrated mitochondrial DNA database. The present status

    PubMed Central

    Attimonelli, M.; Altamura, N.; Benne, R.; Brennicke, A.; Cooper, J. M.; D’Elia, D.; Montalvo, A. de; Pinto, B. de; De Robertis, M.; Golik, P.; Knoop, V.; Lanave, C.; Lazowska, J.; Licciulli, F.; Malladi, B. S.; Memeo, F.; Monnerot, M.; Pasimeni, R.; Pilbout, S.; Schapira, A. H. V.; Sloof, P.; Saccone, C.

    2000-01-01

    MitBASE is an integrated and comprehensive database of mitochondrial DNA data which collects, under a single interface, databases for Plant, Vertebrate, Invertebrate, Human, Protist and Fungal mtDNA and a Pilot database on nuclear genes involved in mitochondrial biogenesis in Saccharomyces cerevisiae. MitBASE reports all available information from different organisms and from intraspecies variants and mutants. Data have been drawn from the primary databases and from the literature; value adding information has been structured, e.g., editing information on protist mtDNA genomes, pathological information for human mtDNA variants, etc. The different databases, some of which are structured using commercial packages (Microsoft Access, File Maker Pro) while others use a flat-file format, have been integrated under ORACLE. Ad hoc retrieval systems have been devised for some of the above listed databases keeping into account their peculiarities. The database is resident at the EBI and is available at the following site: http://www3.ebi.ac.uk/Research/Mitbase/mitbase.pl . The impact of this project is intended for both basic and applied research. The study of mitochondrial genetic diseases and mitochondrial DNA intraspecies diversity are key topics in several biotechnological fields. The database has been funded within the EU Biotechnology programme. PMID:10592207

  1. Database computing in HEP

    NASA Technical Reports Server (NTRS)

    Day, C. T.; Loken, S.; Macfarlane, J. F.; May, E.; Lifka, D.; Lusk, E.; Price, L. E.; Baden, A.; Grossman, R.; Qin, X.

    1992-01-01

    The major SSC experiments are expected to produce up to 1 Petabyte of data per year each. Once the primary reconstruction is completed by farms of inexpensive processors, I/O becomes a major factor in further analysis of the data. We believe that the application of database techniques can significantly reduce the I/O performed in these analyses. We present examples of such I/O reductions in prototypes based on relational and object-oriented databases of CDF data samples.

  2. A rare variant of the mtDNA HVS1 sequence in the hairs of Napoléon's family.

    PubMed

    Lucotte, Gérard

    2010-10-04

    This paper describes the finding of a rare variant in the sequence of the hypervariable segment (HVS1) of mitochondrial DNA (mtDNA) extracted from two preserved hairs, authenticated as belonging to the French Emperor Napoléon I (Napoléon Bonaparte). This rare variant is a mutation that changes the base C to T at position 16,184 (16184C→T), and it constitutes the only mutation found in this HVS1 sequence. This mutation is rare, because it was not found in a reference database (P < 0.05). In a personal database (M. Pala) comprising 37,000 different sequences, the 16184C→T mutation was found in only three samples, thus in this database the mutation frequency was about 0.00008 (roughly 0.008%). This mutation 16184C→T was also the only variant found subsequently in the HVS1 sequences of mtDNAs extracted from Napoléon's mother (Letizia) and from his youngest sister (Caroline), confirming that this mutation is maternally inherited. This 16184C→T variant could be used for genetic verification to authenticate any doubtful material and determine whether it should indeed be attributed to Napoléon.
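
    The frequency in the personal database follows directly from the two counts given in the abstract (three carriers among 37,000 sequences). The short calculation below shows the value both as a proportion and as a percentage, since the two differ by a factor of 100. The variable names are illustrative.

```python
carriers = 3            # samples carrying 16184C→T, from the abstract
total_sequences = 37_000

proportion = carriers / total_sequences   # fraction of sequences
percent = 100 * proportion                # the same value as a percentage

print(f"proportion = {proportion:.5f}")   # prints "proportion = 0.00008"
print(f"percent    = {percent:.4f}%")     # prints "percent    = 0.0081%"
```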

  3. A rare variant of the mtDNA HVS1 sequence in the hairs of Napoléon's family

    PubMed Central

    2010-01-01

    This paper describes the finding of a rare variant in the sequence of the hypervariable segment (HVS1) of mitochondrial DNA (mtDNA) extracted from two preserved hairs, authenticated as belonging to the French Emperor Napoléon I (Napoléon Bonaparte). This rare variant is a mutation that changes the base C to T at position 16,184 (16184C→T), and it constitutes the only mutation found in this HVS1 sequence. This mutation is rare, because it was not found in a reference database (P < 0.05). In a personal database (M. Pala) comprising 37,000 different sequences, the 16184C→T mutation was found in only three samples, thus in this database the mutation frequency was about 0.00008 (roughly 0.008%). This mutation 16184C→T was also the only variant found subsequently in the HVS1 sequences of mtDNAs extracted from Napoléon's mother (Letizia) and from his youngest sister (Caroline), confirming that this mutation is maternally inherited. This 16184C→T variant could be used for genetic verification to authenticate any doubtful material and determine whether it should indeed be attributed to Napoléon. PMID:21092341

  4. Three-dimensional spatial analysis of missense variants in RTEL1 identifies pathogenic variants in patients with Familial Interstitial Pneumonia.

    PubMed

    Sivley, R Michael; Sheehan, Jonathan H; Kropski, Jonathan A; Cogan, Joy; Blackwell, Timothy S; Phillips, John A; Bush, William S; Meiler, Jens; Capra, John A

    2018-01-23

    Next-generation sequencing of individuals with genetic diseases often detects candidate rare variants in numerous genes, but determining which are causal remains challenging. We hypothesized that the spatial distribution of missense variants in protein structures contains information about function and pathogenicity that can help prioritize variants of unknown significance (VUS) and elucidate the structural mechanisms leading to disease. To illustrate this approach in a clinical application, we analyzed 13 candidate missense variants in regulator of telomere elongation helicase 1 (RTEL1) identified in patients with Familial Interstitial Pneumonia (FIP). We curated pathogenic and neutral RTEL1 variants from the literature and public databases. We then used homology modeling to construct a 3D structural model of RTEL1 and mapped known variants into this structure. We next developed a pathogenicity prediction algorithm based on proximity to known disease causing and neutral variants and evaluated its performance with leave-one-out cross-validation. We further validated our predictions with segregation analyses, telomere lengths, and mutagenesis data from the homologous XPD protein. Our algorithm for classifying RTEL1 VUS based on spatial proximity to pathogenic and neutral variation accurately distinguished 7 known pathogenic from 29 neutral variants (ROC AUC = 0.85) in the N-terminal domains of RTEL1. Pathogenic proximity scores were also significantly correlated with effects on ATPase activity (Pearson r = -0.65, p = 0.0004) in XPD, a related helicase. Applying the algorithm to 13 VUS identified from sequencing of RTEL1 from patients predicted five out of six disease-segregating VUS to be pathogenic. We provide structural hypotheses regarding how these mutations may disrupt RTEL1 ATPase and helicase function. Spatial analysis of missense variation accurately classified candidate VUS in RTEL1 and suggests how such variants cause disease. Incorporating

  5. Databases in the Area of Pharmacogenetics

    PubMed Central

    Sim, Sarah C.; Altman, Russ B.; Ingelman-Sundberg, Magnus

    2012-01-01

    In the area of pharmacogenetics and personalized health care it is obvious that databases, providing important information of the occurrence and consequences of variant genes encoding drug metabolizing enzymes, drug transporters, drug targets, and other proteins of importance for drug response or toxicity, are of critical value for scientists, physicians, and industry. The primary outcome of the pharmacogenomic field is the identification of biomarkers that can predict drug toxicity and drug response, thereby individualizing and improving drug treatment of patients. The drug in question and the polymorphic gene exerting the impact are the main issues to be searched for in the databases. Here, we review the databases that provide useful information in this respect, of benefit for the development of the pharmacogenomic field. PMID:21309040

  6. A comprehensive SNP and indel imputability database.

    PubMed

    Duan, Qing; Liu, Eric Yi; Croteau-Chonka, Damien C; Mohlke, Karen L; Li, Yun

    2013-02-15

    Genotype imputation has become an indispensable step in genome-wide association studies (GWAS). Imputation accuracy, which directly influences downstream analysis, has been shown to improve with re-sequencing-based reference panels; however, this comes at the cost of high computational burden due to the huge number of potentially imputable markers (tens of millions) discovered through sequencing a large number of individuals. Therefore, there is an increasing need for access to imputation quality information without actually conducting imputation. To facilitate this process, we have established a publicly available SNP and indel imputability database, aiming to provide direct access to imputation accuracy information for markers identified by the 1000 Genomes Project across four major populations and covering multiple GWAS genotyping platforms. SNP and indel imputability information can be retrieved through a user-friendly interface by providing the ID(s) of the desired variant(s) or by specifying the desired genomic region. The query results can be refined by selecting relevant GWAS genotyping platform(s). This is the first database providing variant imputability information specific to each continental group and to each genotyping platform. In Filipino individuals from the Cebu Longitudinal Health and Nutrition Survey, our database can achieve an area under the receiver-operating characteristic curve of 0.97, 0.91, 0.88 and 0.79 for markers with minor allele frequency >5%, 3-5%, 1-3% and 0.5-1%, respectively. Specifically, by filtering out 48.6% of markers (corresponding to a reduction of up to 48.6% in computational costs for actual imputation) based on the imputability information in our database, we can remove 77%, 58%, 51% and 42% of the poorly imputed markers at the cost of only 0.3%, 0.8%, 1.5% and 4.6% of the well-imputed markers with minor allele frequency >5%, 3-5%, 1-3% and 0.5-1%, respectively. http://www.unc.edu/~yunmli/imputability.html

  7. Microstructural development inside the stress induced martensite variant in a Ti-Ni-Nb shape memory alloy

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Zheng, Y.F.; Cai, W.; Zhang, J.X.

    2000-04-03

    The microstructural development inside the stress induced martensite (SIM) variants in Ti-Ni-Nb alloy with various degrees of deformation has been revealed by electron microscopic observations. The orientation relationship between the SIM and the parent phase has been found: $[1\bar{1}0]_{M} \parallel [11\bar{1}]_{B2}$, with $(001)_{M}$ 5° away from $(101)_{B2}$. The lattice invariant shear of the SIM variants at the slightly deformed stage is dominantly $(11\bar{1})$ Type I twin. Besides the ordinary slip, the adjustment and development of the internal secondary twinning from $(11\bar{1})$ Type I twin to $\langle 011 \rangle$ Type II or $(011)$ Type I twin, $(001)$ compound twin and $(111)$ Type I twin happen concurrently or in combination inside the SIM variants with the further deformation. The corresponding deformation mechanisms include stress induced reorientation of SIM substructural bands by the most favorably oriented twin system, stress induced migration of the SIM substructural boundary through internal twinning and stress induced injection of foreign SIM variant to the preexisting substructural bands.

  8. Ankle fracture spur sign is pathognomonic for a variant ankle fracture.

    PubMed

    Hinds, Richard M; Garner, Matthew R; Lazaro, Lionel E; Warner, Stephen J; Loftus, Michael L; Birnbaum, Jacqueline F; Burket, Jayme C; Lorich, Dean G

    2015-02-01

    The hyperplantarflexion variant ankle fracture is composed of a posterior tibial lip fracture with posterolateral and posteromedial fracture fragments separated by a vertical fracture line. This infrequently reported injury pattern often includes an associated "spur sign" or double cortical density at the inferomedial tibial metaphysis. The objective of this study was to quantitatively establish the association of the ankle fracture spur sign with the hyperplantarflexion variant ankle fracture. Our clinical database of operative ankle fractures was retrospectively reviewed for the incidence of hyperplantarflexion variant and nonvariant ankle fractures as determined by assessment of injury radiographs, preoperative advanced imaging, and intraoperative observation. Injury radiographs were then evaluated for the presence of the spur sign, and association between the spur sign and variant fractures was analyzed. The incidence of the hyperplantarflexion variant fracture among all ankle fractures was 6.7% (43/640). The spur sign was present in 79% (34/43) of variant fractures and absent in all nonvariant fractures, conferring a specificity of 100% in identifying variant fractures. Positive predictive value and negative predictive value were 100% and 99%, respectively. The ankle fracture spur sign was pathognomonic for the hyperplantarflexion variant ankle fracture. It is important to identify variant fractures preoperatively as patient positioning, operative approach, and fixation construct of variant fractures often differ from those employed for osteosynthesis of nonvariant fractures. Identification of the spur sign should prompt acquisition of advanced imaging to formulate an appropriate operative plan to address the variant fracture pattern. Level III, retrospective comparative study. © The Author(s) 2014.
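
    The diagnostic statistics quoted in this abstract can be reproduced from the counts it reports: 640 fractures in total, 43 of them variant, and the spur sign present in 34 variant fractures and in no nonvariant fracture. The sketch below rebuilds the 2x2 table from those counts. The variable names are illustrative, and 597 nonvariant fractures is derived as 640 - 43.

```python
# Counts taken or derived from the abstract.
total = 640
variant = 43
nonvariant = total - variant        # 597

true_pos = 34                       # spur sign present, variant fracture
false_neg = variant - true_pos      # 9: spur sign absent, variant fracture
false_pos = 0                       # spur sign present, nonvariant fracture
true_neg = nonvariant - false_pos   # 597

incidence = variant / total
sensitivity = true_pos / (true_pos + false_neg)
specificity = true_neg / (true_neg + false_pos)
ppv = true_pos / (true_pos + false_pos)
npv = true_neg / (true_neg + false_neg)

for name, value in [("incidence", incidence), ("sensitivity", sensitivity),
                    ("specificity", specificity), ("PPV", ppv), ("NPV", npv)]:
    print(f"{name}: {100 * value:.1f}%")
```

    The 79% figure in the abstract is the sensitivity (34/43). The 99% negative predictive value reflects the nine variant fractures that lacked the spur sign.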

  9. CanvasDB: a local database infrastructure for analysis of targeted- and whole genome re-sequencing projects

    PubMed Central

    Ameur, Adam; Bunikis, Ignas; Enroth, Stefan; Gyllensten, Ulf

    2014-01-01

    CanvasDB is an infrastructure for management and analysis of genetic variants from massively parallel sequencing (MPS) projects. The system stores SNP and indel calls in a local database, designed to handle very large datasets, to allow for rapid analysis using simple commands in R. Functional annotations are included in the system, making it suitable for direct identification of disease-causing mutations in human exome- (WES) or whole-genome sequencing (WGS) projects. The system has a built-in filtering function implemented to simultaneously take into account variant calls from all individual samples. This enables advanced comparative analysis of variant distribution between groups of samples, including detection of candidate causative mutations within family structures and genome-wide association by sequencing. In most cases, these analyses are executed within just a matter of seconds, even when there are several hundreds of samples and millions of variants in the database. We demonstrate the scalability of canvasDB by importing the individual variant calls from all 1092 individuals present in the 1000 Genomes Project into the system, over 4.4 billion SNPs and indels in total. Our results show that canvasDB makes it possible to perform advanced analyses of large-scale WGS projects on a local server. Database URL: https://github.com/UppsalaGenomeCenter/CanvasDB PMID:25281234
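
    The kind of cross-sample filtering described above can be sketched in a few lines. This is not CanvasDB's R interface: the function name, the data layout, and the toy variant calls are all hypothetical, and the rule used here (present in every affected sample, absent from every unaffected one) is a deliberately simplified family filter.

```python
def candidate_variants(calls, affected, unaffected):
    """Return variants shared by all affected samples and seen in no unaffected sample.

    calls maps a sample name to a set of variant keys (chrom, pos, ref, alt).
    """
    if not affected:
        raise ValueError("at least one affected sample is required")
    shared = set.intersection(*(calls[s] for s in affected))
    seen_in_unaffected = set().union(*(calls[s] for s in unaffected))
    return shared - seen_in_unaffected


# Hypothetical toy calls for a family of four.
calls = {
    "child_1": {("chr1", 1000, "A", "G"), ("chr2", 500, "C", "T")},
    "child_2": {("chr1", 1000, "A", "G"), ("chr3", 42, "G", "A")},
    "mother": {("chr2", 500, "C", "T")},
    "father": {("chr3", 42, "G", "A")},
}

result = candidate_variants(
    calls, affected=["child_1", "child_2"], unaffected=["mother", "father"]
)
print(sorted(result))
```

    Plain set operations keep the sketch short; a database-backed system such as the one described would express the same filter as queries over indexed tables.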

  10. CanvasDB: a local database infrastructure for analysis of targeted- and whole genome re-sequencing projects.

    PubMed

    Ameur, Adam; Bunikis, Ignas; Enroth, Stefan; Gyllensten, Ulf

    2014-01-01

    CanvasDB is an infrastructure for management and analysis of genetic variants from massively parallel sequencing (MPS) projects. The system stores SNP and indel calls in a local database, designed to handle very large datasets, to allow for rapid analysis using simple commands in R. Functional annotations are included in the system, making it suitable for direct identification of disease-causing mutations in human exome- (WES) or whole-genome sequencing (WGS) projects. The system has a built-in filtering function implemented to simultaneously take into account variant calls from all individual samples. This enables advanced comparative analysis of variant distribution between groups of samples, including detection of candidate causative mutations within family structures and genome-wide association by sequencing. In most cases, these analyses are executed within just a matter of seconds, even when there are several hundreds of samples and millions of variants in the database. We demonstrate the scalability of canvasDB by importing the individual variant calls from all 1092 individuals present in the 1000 Genomes Project into the system, over 4.4 billion SNPs and indels in total. Our results show that canvasDB makes it possible to perform advanced analyses of large-scale WGS projects on a local server. Database URL: https://github.com/UppsalaGenomeCenter/CanvasDB. © The Author(s) 2014. Published by Oxford University Press.

  11. A systematic approach to assessing the clinical significance of genetic variants.

    PubMed

    Duzkale, H; Shen, J; McLaughlin, H; Alfares, A; Kelly, M A; Pugh, T J; Funke, B H; Rehm, H L; Lebo, M S

    2013-11-01

    Molecular genetic testing informs diagnosis, prognosis, and risk assessment for patients and their family members. Recent advances in low-cost, high-throughput DNA sequencing and computing technologies have enabled the rapid expansion of genetic test content, resulting in dramatically increased numbers of DNA variants identified per test. To address this challenge, our laboratory has developed a systematic approach to thorough and efficient assessments of variants for pathogenicity determination. We first search for existing data in publications and databases including internal, collaborative and public resources. We then perform full evidence-based assessments through statistical analyses of observations in the general population and disease cohorts, evaluation of experimental data from in vivo or in vitro studies, and computational predictions of potential impacts of each variant. Finally, we weigh all evidence to reach an overall conclusion on the potential for each variant to be disease causing. In this report, we highlight the principles of variant assessment, address the caveats and pitfalls, and provide examples to illustrate the process. By sharing our experience and providing a framework for variant assessment, including access to a freely available customizable tool, we hope to help move towards standardized and consistent approaches to variant assessment. © 2013 John Wiley & Sons A/S. Published by John Wiley & Sons Ltd.

  12. Probing the Orientation of Surface-Immobilized Protein G B1 Using ToF-SIMS, Sum Frequency Generation, and NEXAFS Spectroscopy

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    L Baugh; T Weidner; J Baio

    2011-12-31

    The ability to orient active proteins on surfaces is a critical aspect of many medical technologies. An important related challenge is characterizing protein orientation in these surface films. This study uses a combination of time-of-flight secondary ion mass spectrometry (ToF-SIMS), sum frequency generation (SFG) vibrational spectroscopy, and near-edge X-ray absorption fine structure (NEXAFS) spectroscopy to characterize the orientation of surface-immobilized Protein G B1, a rigid 6 kDa domain that binds the Fc fragment of IgG. Two Protein G B1 variants with a single cysteine introduced at either end were immobilized via the cysteine thiol onto maleimide-oligo(ethylene glycol)-functionalized gold and bare gold substrates. X-ray photoelectron spectroscopy was used to measure the amount of immobilized protein, and ToF-SIMS was used to measure the amino acid composition of the exposed surface of the protein films and to confirm covalent attachment of protein thiol to the substrate maleimide groups. SFG and NEXAFS were used to characterize the ordering and orientation of peptide or side chain bonds. On both substrates and for both cysteine positions, ToF-SIMS data showed enrichment of mass peaks from amino acids located at the end of the protein opposite to the cysteine surface position as compared with nonspecifically immobilized protein, indicating end-on protein orientations. Orientation on the maleimide substrate was enhanced by increasing pH (7.0-9.5) and salt concentration (0-1.5 M NaCl). SFG spectral peaks characteristic of ordered α-helix and β-sheet elements were observed for both variants but not for cysteine-free wild type protein on the maleimide surface. The phase of the α-helix and β-sheet peaks indicated a predominantly upright orientation for both variants, consistent with an end-on protein binding configuration. Polarization dependence of the NEXAFS signal from the N 1s to π* transition of β-sheet peptide bonds

  13. Implementing Relational Operations in an Object-Oriented Database

    DTIC Science & Technology

    1992-03-01

    computer aided software engineering (CASE) and computer aided design (CAD) tools. There has been some research done in the area of combining... 2. Prograph Database Engine ... III. WHY AN R/O... in most business applications where the bulk of data being stored and manipulated is simply textual or numeric data that can be stored and manipulated

  14. A Database Practicum for Teaching Database Administration and Software Development at Regis University

    ERIC Educational Resources Information Center

    Mason, Robert T.

    2013-01-01

    This research paper compares a database practicum at the Regis University College for Professional Studies (CPS) with technology oriented practicums at other universities. Successful andragogy for technology courses can motivate students to develop a genuine interest in the subject, share their knowledge with peers and can inspire students to…

  15. The 2014 Nucleic Acids Research Database Issue and an updated NAR online Molecular Biology Database Collection.

    PubMed

    Fernández-Suárez, Xosé M; Rigden, Daniel J; Galperin, Michael Y

    2014-01-01

    The 2014 Nucleic Acids Research Database Issue includes descriptions of 58 new molecular biology databases and recent updates to 123 databases previously featured in NAR or other journals. For convenience, the issue is now divided into eight sections that reflect major subject categories. Among the highlights of this issue are six databases of the transcription factor binding sites in various organisms and updates on such popular databases as CAZy, Database of Genomic Variants (DGV), dbGaP, DrugBank, KEGG, miRBase, Pfam, Reactome, SEED, TCDB and UniProt. There is a strong block of structural databases, which includes, among others, the new RNA Bricks database, updates on PDBe, PDBsum, ArchDB, Gene3D, ModBase, Nucleic Acid Database and the recently revived iPfam database. An update on the NCBI's MMDB describes VAST+, an improved tool for protein structure comparison. Two articles highlight the development of the Structural Classification of Proteins (SCOP) database: one describes SCOPe, which automates assignment of new structures to the existing SCOP hierarchy; the other one describes the first version of SCOP2, with its more flexible approach to classifying protein structures. This issue also includes a collection of articles on bacterial taxonomy and metagenomics, which includes updates on the List of Prokaryotic Names with Standing in Nomenclature (LPSN), Ribosomal Database Project (RDP), the Silva/LTP project and several new metagenomics resources. The NAR online Molecular Biology Database Collection, http://www.oxfordjournals.org/nar/database/c/, has been expanded to 1552 databases. The entire Database Issue is freely available online on the Nucleic Acids Research website (http://nar.oxfordjournals.org/).

  16. Genetic polymorphisms of pharmacogenomic VIP variants in the Kyrgyz population from northwest China.

    PubMed

    Yunus, Zulfiya; Liu, Lijun; Wang, Hong; Zhang, Le; Li, Xiaolan; Geng, Tingting; Kang, Longli; Jin, Tianbo; Chen, Chao

    2013-10-15

    Pharmacogenomic variant information is well known for major human populations; however, this information is less commonly studied in minorities. In the present study, we genotyped 85 very important pharmacogenetic (VIP) variants (selected from the PharmGKB database) in the Kyrgyz population and compared our data with four other major human populations: Han Chinese in Beijing, China (CHB), the Japanese in Tokyo, Japan (JPT), a northern and western Europe population (CEU), and the Yoruba in Ibadan, Nigeria (YRI). There were 13, 12 and 16 of the selected VIP variant genotype frequencies in the Kyrgyz which differed from those of the CHB, JPT and CEU, respectively (p<0.005). In the YRI, there were 32 different variants, compared to the Kyrgyz (p<0.005). Genotype frequencies of ADH1B, AHR, CYP3A5, PTGS2, VDR, and VKORC1 in the Kyrgyz differed widely from those in the four populations. Haplotype analyses also showed differences among the Kyrgyz and the other four populations. Our results complement the information provided by the database of pharmacogenomics on Kyrgyz. We provide a theoretical basis for safer drug administration and individualized treatment plans for the Kyrgyz. We also provide a template for the study of pharmacogenomics in various ethnic minority groups in China. © 2013 Elsevier B.V. All rights reserved.

  17. Changes in classification of genetic variants in BRCA1 and BRCA2.

    PubMed

    Kast, Karin; Wimberger, Pauline; Arnold, Norbert

    2018-02-01

    Classification of variants of unknown significance (VUS) in the breast cancer genes BRCA1 and BRCA2 changes with accumulating evidence for clinical relevance. In most cases down-staging towards neutral variants without clinical significance is possible. We searched the database of the German Consortium for Hereditary Breast and Ovarian Cancer (GC-HBOC) for changes in classification of genetic variants as an update to our earlier publication on genetic variants in the Centre of Dresden. Changes between 2015 and 2017 were recorded. In the group of variants of unclassified significance (VUS, Class 3, uncertain), only changes of classification towards neutral genetic variants were noted. In BRCA1, 25% of the Class 3 variants (n = 2/8) changed to Class 2 (likely benign) and Class 1 (benign). In BRCA2, in 50% of the Class 3 variants (n = 16/32), a change to Class 2 (n = 10/16) or Class 1 (n = 6/16) was observed. No change in classification was noted in Class 4 (likely pathogenic) and Class 5 (pathogenic) genetic variants in both genes. No up-staging from Class 1, Class 2 or Class 3 to more clinical significance was observed. All variants with a change in classification in our cohort were down-staged towards no clinical significance by a panel of experts of the German Consortium for Hereditary Breast and Ovarian Cancer (GC-HBOC). Prevention in families with Class 3 variants should be based on pedigree based risks and should not be guided by the presence of a VUS.

  18. Large-scale mass spectrometric detection of variant peptides resulting from non-synonymous nucleotide differences

    PubMed Central

    Sheynkman, Gloria M.; Shortreed, Michael R.; Frey, Brian L.; Scalf, Mark; Smith, Lloyd M.

    2013-01-01

    Each individual carries thousands of non-synonymous single nucleotide variants (nsSNVs) in their genome, each corresponding to a single amino acid polymorphism (SAP) in the encoded proteins. It is important to be able to directly detect and quantify these variations at the protein level in order to study post-transcriptional regulation, differential allelic expression, and other important biological processes. However, such variant peptides are not generally detected in standard proteomic analyses, due to their absence from the generic databases that are employed for mass spectrometry searching. Here, we extend previous work that demonstrated the use of customized SAP databases constructed from sample-matched RNA-Seq data. We collected deep coverage RNA-Seq data from the Jurkat cell line, compiled the set of nsSNVs that are expressed, used this information to construct a customized SAP database, and searched it against deep coverage shotgun MS data obtained from the same sample. This approach enabled detection of 421 SAP peptides mapping to 395 nsSNVs. We compared these peptides to peptides identified from a large generic search database containing all known nsSNVs (dbSNP) and found that more than 70% of the SAP peptides from this dbSNP-derived search were not supported by the RNA-Seq data, and thus are likely false positives. Next, we increased the SAP coverage from the RNA-Seq derived database by utilizing multiple protease digestions, thereby increasing variant detection to 695 SAP peptides mapping to 504 nsSNV sites. These detected SAP peptides corresponded to moderate to high abundance transcripts (30+ transcripts per million, TPM). The SAP peptides included 192 allelic pairs; the relative expression levels of the two alleles were evaluated for 51 of those pairs, and found to be comparable in all cases. PMID:24175627

  19. fMRI orientation decoding in V1 does not require global maps or globally coherent orientation stimuli.

    PubMed

    Alink, Arjen; Krugliak, Alexandra; Walther, Alexander; Kriegeskorte, Nikolaus

    2013-01-01

    The orientation of a large grating can be decoded from V1 functional magnetic resonance imaging (fMRI) data, even at low resolution (3-mm isotropic voxels). This finding has suggested that columnar-level neuronal information might be accessible to fMRI at 3T. However, orientation decodability might alternatively arise from global orientation-preference maps. Such global maps across V1 could result from bottom-up processing, if the preferences of V1 neurons were biased toward particular orientations (e.g., radial from fixation, or cardinal, i.e., vertical or horizontal). Global maps could also arise from local recurrent or top-down processing, reflecting pre-attentive perceptual grouping, attention spreading, or predictive coding of global form. Here we investigate whether fMRI orientation decoding with 2-mm voxels requires (a) globally coherent orientation stimuli and/or (b) global-scale patterns of V1 activity. We used opposite-orientation gratings (balanced about the cardinal orientations) and spirals (balanced about the radial orientation), along with novel patch-swapped variants of these stimuli. The two stimuli of a patch-swapped pair have opposite orientations everywhere (like their globally coherent parent stimuli). However, the two stimuli appear globally similar, a patchwork of opposite orientations. We find that all stimulus pairs are robustly decodable, demonstrating that fMRI orientation decoding does not require globally coherent orientation stimuli. Furthermore, decoding remained robust after spatial high-pass filtering for all stimuli, showing that fine-grained components of the fMRI patterns reflect visual orientations. Consistent with previous studies, we found evidence for global radial and vertical preference maps in V1. However, these were weak or absent for patch-swapped stimuli, suggesting that global preference maps depend on globally coherent orientations and might arise through recurrent or top-down processes related to the perception of

  20. Assigning Main Orientation to an EOH Descriptor on Multispectral Images.

    PubMed

    Li, Yong; Shi, Xiang; Wei, Lijun; Zou, Junwei; Chen, Fang

    2015-07-01

    This paper proposes an approach to compute an EOH (edge-oriented histogram) descriptor with main orientation. EOH has a better matching ability than SIFT (scale-invariant feature transform) on multispectral images, but does not assign a main orientation to keypoints. Alternatively, it tends to assign the same main orientation to every keypoint, e.g., zero degrees. This limits EOH to matching keypoints between images of translation misalignment only. Observing this limitation, we propose assigning to keypoints the main orientation that is computed with PIIFD (partial intensity invariant feature descriptor). In the proposed method, SIFT keypoints are detected from images as the extrema of difference of Gaussians, and every keypoint is assigned to the main orientation computed with PIIFD. Then, EOH is computed for every keypoint with respect to its main orientation. In addition, an implementation variant is proposed for fast computation of the EOH descriptor. Experimental results show that the proposed approach performs more robustly than the original EOH on image pairs that have a rotation misalignment.

  1. Childhood Abuse Experiences and the COMT and MTHFR Genetic Variants Associated With Male Sexual Orientation in the Han Chinese Populations: A Case-Control Study.

    PubMed

    Qin, Jia-Bi; Zhao, Guang-Lu; Wang, Feng; Cai, Yu-Mao; Lan, Li-Na; Yang, Lin; Feng, Tie-Jian

    2018-01-01

    variants could be positively associated with the development of homosexuality. However, it remains unknown how these factors jointly play a role in the development of homosexuality, and more studies in different ethnic populations and with a larger sample and a prospective design are required to confirm our findings. Qin J-B, Zhao G-L, Wang F, et al. Childhood Abuse Experiences and the COMT and MTHFR Genetic Variants Associated With Male Sexual Orientation in the Han Chinese Populations: A Case-Control Study. J Sex Med 2018;15:29-42. Copyright © 2017 International Society for Sexual Medicine. Published by Elsevier Inc. All rights reserved.

  2. Energies and 2'-Hydroxyl Group Orientations of RNA Backbone Conformations. Benchmark CCSD(T)/CBS Database, Electronic Analysis, and Assessment of DFT Methods and MD Simulations.

    PubMed

    Mládek, Arnošt; Banáš, Pavel; Jurečka, Petr; Otyepka, Michal; Zgarbová, Marie; Šponer, Jiří

    2014-01-14

    Sugar-phosphate backbone is an electronically complex molecular segment imparting RNA molecules high flexibility and architectonic heterogeneity necessary for their biological functions. The structural variability of RNA molecules is amplified by the presence of the 2'-hydroxyl group, capable of forming a multitude of intra- and intermolecular interactions. Bioinformatics studies based on X-ray structure database revealed that RNA backbone samples at least 46 substates known as rotameric families. The present study provides a comprehensive analysis of RNA backbone conformational preferences and 2'-hydroxyl group orientations. First, we create a benchmark database of estimated CCSD(T)/CBS relative energies of all rotameric families and test performance of dispersion-corrected DFT-D3 methods and molecular mechanics in vacuum and in continuum solvent. The performance of the DFT-D3 methods is in general quite satisfactory. The B-LYP-D3 method provides the best trade-off between accuracy and computational demands. B3-LYP-D3 slightly outperforms the new PW6B95-D3 and MPW1B95-D3 and is the second most accurate density functional of the study. The best agreement with CCSD(T)/CBS is provided by DSD-B-LYP-D3 double-hybrid functional, although its large-scale applications may be limited by high computational costs. Molecular mechanics does not reproduce the fine energy differences between the RNA backbone substates. We also demonstrate that the differences in the magnitude of the hyperconjugation effect do not correlate with the energy ranking of the backbone conformations. Further, we investigated the 2'-hydroxyl group orientation preferences. For all families, we conducted a QM and MM hydroxyl group rigid scan in gas phase and solvent. We then carried out a set of explicit solvent MD simulations of folded RNAs and analyzed 2'-hydroxyl group orientations of different backbone families in MD. The solvent energy profiles determined primarily by the sugar pucker match well with the

  3. Retrovirus Integration Database (RID): a public database for retroviral insertion sites into host genomes.

    PubMed

    Shao, Wei; Shan, Jigui; Kearney, Mary F; Wu, Xiaolin; Maldarelli, Frank; Mellors, John W; Luke, Brian; Coffin, John M; Hughes, Stephen H

    2016-07-04

    The NCI Retrovirus Integration Database is a MySQL-based relational database created for storing and retrieving comprehensive information about retroviral integration sites, primarily, but not exclusively, HIV-1. The database is accessible to the public for submission or extraction of data originating from experiments aimed at collecting information related to retroviral integration sites including: the site of integration into the host genome, the virus family and subtype, the origin of the sample, gene exons/introns associated with integration, and proviral orientation. Information about the references from which the data were collected is also stored in the database. Tools are built into the website that can be used to map the integration sites to UCSC genome browser, to plot the integration site patterns on a chromosome, and to display provirus LTRs in their inserted genome sequence. The website is robust, user friendly, and allows users to query the database and analyze the data dynamically. https://rid.ncifcrf.gov; or http://home.ncifcrf.gov/hivdrp/resources.htm.

  4. Patient-oriented cancer information on the internet: a comparison of Wikipedia and a professionally maintained database.

    PubMed

    Rajagopalan, Malolan S; Khanna, Vineet K; Leiter, Yaacov; Stott, Meghan; Showalter, Timothy N; Dicker, Adam P; Lawrence, Yaacov R

    2011-09-01

    A wiki is a collaborative Web site, such as Wikipedia, that can be freely edited. Because of a wiki's lack of formal editorial control, we hypothesized that the content would be less complete and accurate than that of a professional peer-reviewed Web site. In this study, the coverage, accuracy, and readability of cancer information on Wikipedia were compared with those of the patient-orientated National Cancer Institute's Physician Data Query (PDQ) comprehensive cancer database. For each of 10 cancer types, medically trained personnel scored PDQ and Wikipedia articles for accuracy and presentation of controversies by using an appraisal form. Reliability was assessed by using interobserver variability and test-retest reproducibility. Readability was calculated from word and sentence length. Evaluators were able to rapidly assess articles (18 minutes/article), with a test-retest reliability of 0.71 and interobserver variability of 0.53. For both Web sites, inaccuracies were rare, less than 2% of information examined. PDQ was significantly more readable than Wikipedia: Flesch-Kincaid grade level 9.6 versus 14.1. There was no difference in depth of coverage between PDQ and Wikipedia (29.9, 34.2, respectively; maximum possible score 72). Controversial aspects of cancer care were relatively poorly discussed in both resources (2.9 and 6.1 for PDQ and Wikipedia, respectively, NS; maximum possible score 18). A planned subanalysis comparing common and uncommon cancers demonstrated no difference. Although the wiki resource had similar accuracy and depth as the professionally edited database, it was significantly less readable. Further research is required to assess how this influences patients' understanding and retention.
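
    The Flesch-Kincaid grade level quoted above is a function of average sentence length and average syllables per word. A minimal sketch of the standard formula follows; the function name and the example counts are hypothetical, and the abstract does not say which tool the authors used to obtain their counts.

```python
def flesch_kincaid_grade(words: int, sentences: int, syllables: int) -> float:
    """Flesch-Kincaid grade level from raw text counts (standard formula)."""
    if words <= 0 or sentences <= 0:
        raise ValueError("words and sentences must be positive")
    return 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59


# Hypothetical passage: 100 words, 5 sentences, 150 syllables.
grade = flesch_kincaid_grade(words=100, sentences=5, syllables=150)
print(f"grade level: {grade:.2f}")
```

    Taking counts as input sidesteps syllable counting, which in practice needs a pronunciation dictionary or a heuristic and is the main source of disagreement between readability tools.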

  5. New workflow for classification of genetic variants' pathogenicity applied to hereditary recurrent fevers by the International Study Group for Systemic Autoinflammatory Diseases (INSAID).

    PubMed

    Van Gijn, Marielle E; Ceccherini, Isabella; Shinar, Yael; Carbo, Ellen C; Slofstra, Mariska; Arostegui, Juan I; Sarrabay, Guillaume; Rowczenio, Dorota; Omoyımnı, Ebun; Balci-Peynircioglu, Banu; Hoffman, Hal M; Milhavet, Florian; Swertz, Morris A; Touitou, Isabelle

    2018-03-29

    Hereditary recurrent fevers (HRFs) are rare inflammatory diseases sharing similar clinical symptoms and effectively treated with anti-inflammatory biological drugs. Accurate diagnosis of HRF relies heavily on genetic testing. This study aimed to obtain an experts' consensus on the clinical significance of gene variants in four well-known HRF genes: MEFV, TNFRSF1A, NLRP3 and MVK. We configured a MOLGENIS web platform to share and analyse pathogenicity classifications of the variants and to manage a consensus-based classification process. Four experts in HRF genetics submitted independent classifications of 858 variants. Classifications were driven to consensus by recruiting four more expert opinions and by targeting discordant classifications in five iterative rounds. Consensus classification was reached for 804/858 variants (94%). None of the unsolved variants (6%) remained with opposite classifications (eg, pathogenic vs benign). New mutational hotspots were found in all genes. We noted a lower pathogenic variant load and a higher fraction of variants with unknown or unsolved clinical significance in the MEFV gene. Applying a consensus-driven process on the pathogenicity assessment of experts yielded rapid classification of almost all variants of four HRF genes. The high-throughput database will profoundly assist clinicians and geneticists in the diagnosis of HRFs. The configured MOLGENIS platform and consensus evolution protocol are usable for assembly of other variant pathogenicity databases. The MOLGENIS software is available for reuse at http://github.com/molgenis/molgenis; the specific HRF configuration is available at http://molgenis.org/said/. The HRF pathogenicity classifications will be published on the INFEVERS database at https://fmf.igh.cnrs.fr/ISSAID/infevers/. © Article author(s) (or their employer(s) unless otherwise stated in the text of the article) 2018. All rights reserved. No commercial use is permitted unless otherwise expressly granted.

  6. Synthesis of spatially variant lattices.

    PubMed

    Rumpf, Raymond C; Pazos, Javier

    2012-07-02

    It is often desired to functionally grade and/or spatially vary a periodic structure like a photonic crystal or metamaterial, yet no general method for doing this has been offered in the literature. A straightforward procedure is described here that allows many properties of the lattice to be spatially varied at the same time while producing a final lattice that is still smooth and continuous. Properties include unit cell orientation, lattice spacing, fill fraction, and more. This adds many degrees of freedom to a design such as spatially varying the orientation to exploit directional phenomena. The method is not a coordinate transformation technique so it can more easily produce complicated and arbitrary spatial variance. To demonstrate, the algorithm is used to synthesize a spatially variant self-collimating photonic crystal to flow a Gaussian beam around a 90° bend. The performance of the structure was confirmed through simulation and it showed virtually no scattering around the bend that would have arisen if the lattice had defects or discontinuities.

  7. Multiple endocrine neoplasia type 1 (MEN1): An update of 208 new germline variants reported in the last nine years.

    PubMed

    Concolino, Paola; Costella, Alessandra; Capoluongo, Ettore

    2016-01-01

    This review will focus on the germline MEN1 mutations that have been reported in patients with MEN1 and other hereditary endocrine disorders from 2007 to September 2015. A comprehensive review regarding the analysis of 1336 MEN1 mutations reported in the first decade following the gene's identification was performed by Lemos and Thakker in 2008. No other similar papers are available in literature apart from these data. We also checked for the list of Locus-Specific DataBases (LSDBs) and we found five MEN1 free-online mutational databases. 151 articles from the NCBI PubMed literature database were read and evaluated and a total of 75 MEN1 variants were found. On the contrary, 67, 22 and 44 novel MEN1 variants were obtained from ClinVar, MEN1 at Café Variome and HGMD (The Human Gene Mutation Database) databases respectively. A final careful analysis of MEN1 mutations affecting the coding region was performed. Copyright © 2016 Elsevier Inc. All rights reserved.

  8. Multilevel biological characterization of exomic variants at the protein level significantly improves the identification of their deleterious effects.

    PubMed

    Raimondi, Daniele; Gazzo, Andrea M; Rooman, Marianne; Lenaerts, Tom; Vranken, Wim F

    2016-06-15

    There are now many predictors capable of identifying the likely phenotypic effects of single nucleotide variants (SNVs) or short in-frame Insertions or Deletions (INDELs) on the increasing amount of genome sequence data. Most of these predictors focus on SNVs and use a combination of features related to sequence conservation, biophysical, and/or structural properties to link the observed variant to either neutral or disease phenotype. Despite notable successes, the mapping between genetic variants and their phenotypic effects is riddled with levels of complexity that are not yet fully understood and that are often not taken into account in the predictions, despite their promise of significantly improving the prediction of deleterious mutants. We present DEOGEN, a novel variant effect predictor that can handle both missense SNVs and in-frame INDELs. By integrating information from different biological scales and mimicking the complex mixture of effects that lead from the variant to the phenotype, we obtain significant improvements in the variant-effect prediction results. Next to the typical variant-oriented features based on the evolutionary conservation of the mutated positions, we added a collection of protein-oriented features that are based on functional aspects of the gene affected. We cross-validated DEOGEN on 36 825 polymorphisms, 20 821 deleterious SNVs, and 1038 INDELs from SwissProt. The multilevel contextualization of each (variant, protein) pair in DEOGEN provides a 10% improvement of MCC with respect to current state-of-the-art tools. The software and the data presented here are publicly available at http://ibsquare.be/deogen. Contact: wvranken@vub.ac.be. Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
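
    MCC here refers to the Matthews correlation coefficient, a single-number summary of a binary confusion matrix that stays informative when the two classes are unbalanced. The sketch below shows the standard definition; the function name and the counts are hypothetical and are not DEOGEN results.

```python
import math


def matthews_corrcoef(tp: int, tn: int, fp: int, fn: int) -> float:
    """Matthews correlation coefficient from confusion-matrix counts.

    Returns 0.0 when any marginal total is zero, a common convention
    for the otherwise undefined case.
    """
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        return 0.0
    return (tp * tn - fp * fn) / denominator


# Hypothetical predictor: deleterious variants are the positive class.
mcc = matthews_corrcoef(tp=90, tn=80, fp=20, fn=10)
print(f"MCC: {mcc:.3f}")
```

    The coefficient ranges from -1 (total disagreement) through 0 (no better than chance) to 1 (perfect prediction).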

  9. The Clock Is Ticking: Library Orientation as Puzzle Room

    ERIC Educational Resources Information Center

    Reade, Tripp

    2017-01-01

    Tripp Reade is the school librarian at Cardinal Gibbons High School in Raleigh, North Carolina. This article describes how he redesigned his school's library orientation program after learning about escape rooms and a variant known as puzzle rooms. Puzzle rooms present players with a set of challenges to solve; they require "teamwork,…

  10. Filovirus RefSeq Entries: Evaluation and Selection of Filovirus Type Variants, Type Sequences, and Names

    PubMed Central

    Kuhn, Jens H.; Andersen, Kristian G.; Bào, Yīmíng; Bavari, Sina; Becker, Stephan; Bennett, Richard S.; Bergman, Nicholas H.; Blinkova, Olga; Bradfute, Steven; Brister, J. Rodney; Bukreyev, Alexander; Chandran, Kartik; Chepurnov, Alexander A.; Davey, Robert A.; Dietzgen, Ralf G.; Doggett, Norman A.; Dolnik, Olga; Dye, John M.; Enterlein, Sven; Fenimore, Paul W.; Formenty, Pierre; Freiberg, Alexander N.; Garry, Robert F.; Garza, Nicole L.; Gire, Stephen K.; Gonzalez, Jean-Paul; Griffiths, Anthony; Happi, Christian T.; Hensley, Lisa E.; Herbert, Andrew S.; Hevey, Michael C.; Hoenen, Thomas; Honko, Anna N.; Ignatyev, Georgy M.; Jahrling, Peter B.; Johnson, Joshua C.; Johnson, Karl M.; Kindrachuk, Jason; Klenk, Hans-Dieter; Kobinger, Gary; Kochel, Tadeusz J.; Lackemeyer, Matthew G.; Lackner, Daniel F.; Leroy, Eric M.; Lever, Mark S.; Mühlberger, Elke; Netesov, Sergey V.; Olinger, Gene G.; Omilabu, Sunday A.; Palacios, Gustavo; Panchal, Rekha G.; Park, Daniel J.; Patterson, Jean L.; Paweska, Janusz T.; Peters, Clarence J.; Pettitt, James; Pitt, Louise; Radoshitzky, Sheli R.; Ryabchikova, Elena I.; Saphire, Erica Ollmann; Sabeti, Pardis C.; Sealfon, Rachel; Shestopalov, Aleksandr M.; Smither, Sophie J.; Sullivan, Nancy J.; Swanepoel, Robert; Takada, Ayato; Towner, Jonathan S.; van der Groen, Guido; Volchkov, Viktor E.; Volchkova, Valentina A.; Wahl-Jensen, Victoria; Warren, Travis K.; Warfield, Kelly L.; Weidmann, Manfred; Nichol, Stuart T.

    2014-01-01

    Sequence determination of complete or coding-complete genomes of viruses is becoming common practice for supporting the work of epidemiologists, ecologists, virologists, and taxonomists. Sequencing duration and costs are rapidly decreasing, sequencing hardware is under modification for use by non-experts, and software is constantly being improved to simplify sequence data management and analysis. Thus, analysis of virus disease outbreaks on the molecular level is now feasible, including characterization of the evolution of individual virus populations in single patients over time. The increasing accumulation of sequencing data creates a management problem for the curators of commonly used sequence databases and an entry retrieval problem for end users. Therefore, utilizing the data to their fullest potential will require setting nomenclature and annotation standards for virus isolates and associated genomic sequences. The National Center for Biotechnology Information’s (NCBI’s) RefSeq is a non-redundant, curated database for reference (or type) nucleotide sequence records that supplies source data to numerous other databases. Building on recently proposed templates for filovirus variant naming [<virus name> (<strain>)/<isolation host-suffix>/<country of sampling>/<year of sampling>/<genetic variant designation>-<isolate designation>], we report consensus decisions from a majority of past and currently active filovirus experts on the eight filovirus type variants and isolates to be represented in RefSeq, their final designations, and their associated sequences. PMID:25256396

  11. A Database Design and Development Case: NanoTEK Networks

    ERIC Educational Resources Information Center

    Ballenger, Robert M.

    2010-01-01

    This case provides a real-world project-oriented case study for students enrolled in a management information systems, database management, or systems analysis and design course in which database design and development are taught. The case consists of a business scenario to provide background information and details of the unique operating…

  12. A Quality-Control-Oriented Database for a Mesoscale Meteorological Observation Network

    NASA Astrophysics Data System (ADS)

    Lussana, C.; Ranci, M.; Uboldi, F.

    2012-04-01

    In the operational context of a local weather service, data accessibility and quality related issues must be managed by taking into account a wide set of user needs. This work describes the structure and the operational choices made for the operational implementation of a database system storing data from highly automated observing stations, metadata and information on data quality. Lombardy's environmental protection agency, ARPA Lombardia, manages a highly automated mesoscale meteorological network. A Quality Assurance System (QAS) ensures that reliable observational information is collected and disseminated to the users. The weather unit in ARPA Lombardia, at the same time an important QAS component and an intensive data user, has developed a database specifically aimed to: 1) providing quick access to data for operational activities and 2) ensuring data quality for real-time applications, by means of an Automatic Data Quality Control (ADQC) procedure. Quantities stored in the archive include hourly aggregated observations of: precipitation amount, temperature, wind, relative humidity, pressure, global and net solar radiation. The ADQC performs several independent tests on raw data and compares their results in a decision-making procedure. An important ADQC component is the Spatial Consistency Test based on Optimal Interpolation. Interpolated and Cross-Validation analysis values are also stored in the database, providing further information to human operators and useful estimates in case of missing data. The technical solution adopted is based on a LAMP (Linux, Apache, MySQL and Php) system, constituting an open source environment suitable for both development and operational practice. The ADQC procedure itself is performed by R scripts directly interacting with the MySQL database. Users and network managers can access the database by using a set of web-based Php applications.

  13. Space Launch System Booster Separation Aerodynamic Database Development and Uncertainty Quantification

    NASA Technical Reports Server (NTRS)

    Chan, David T.; Pinier, Jeremy T.; Wilcox, Floyd J., Jr.; Dalle, Derek J.; Rogers, Stuart E.; Gomez, Reynaldo J.

    2016-01-01

    The development of the aerodynamic database for the Space Launch System (SLS) booster separation environment has presented many challenges because of the complex physics of the flow around three independent bodies due to proximity effects and jet interactions from the booster separation motors and the core stage engines. This aerodynamic environment is difficult to simulate in a wind tunnel experiment and also difficult to simulate with computational fluid dynamics. The database is further complicated by the high dimensionality of the independent variable space, which includes the orientation of the core stage, the relative positions and orientations of the solid rocket boosters, and the thrust levels of the various engines. Moreover, the clearance between the core stage and the boosters during the separation event is sensitive to the aerodynamic uncertainties of the database. This paper will present the development process for Version 3 of the SLS booster separation aerodynamic database and the statistics-based uncertainty quantification process for the database.

  14. Medical Image Databases

    PubMed Central

    Tagare, Hemant D.; Jaffe, C. Carl; Duncan, James

    1997-01-01

    Information contained in medical images differs considerably from that residing in alphanumeric format. The difference can be attributed to four characteristics: (1) the semantics of medical knowledge extractable from images is imprecise; (2) image information contains form and spatial data, which are not expressible in conventional language; (3) a large part of image information is geometric; (4) diagnostic inferences derived from images rest on an incomplete, continuously evolving model of normality. This paper explores the differentiating characteristics of text versus images and their impact on design of a medical image database intended to allow content-based indexing and retrieval. One strategy for implementing medical image databases is presented, which employs object-oriented iconic queries, semantics by association with prototypes, and a generic schema. PMID:9147338

  15. The EUVE Proposal Database

    NASA Astrophysics Data System (ADS)

    Christian, C. A.; Olson, E. C.

    1993-01-01

    The proposal database and scheduling system for the Extreme Ultraviolet Explorer is described. The proposal database has been implemented to take input for approved observations selected by the EUVE Peer Review Panel and output target information suitable for the scheduling system to digest. The scheduling system is a hybrid of the SPIKE program and EUVE software which checks spacecraft constraints, produces a proposed schedule and selects spacecraft orientations with optimal configurations for acquiring star trackers, etc. This system is used to schedule the In Orbit Calibration activities that took place this summer, following the EUVE launch in early June 1992. The strategy we have implemented has implications for the selection of approved targets, which have impacted the Peer Review process. In addition, we will discuss how the proposal database, founded on Sybase, controls the processing of EUVE Guest Observer data.

  16. Identifying Mendelian disease genes with the Variant Effect Scoring Tool

    PubMed Central

    2013-01-01

    Background: Whole exome sequencing studies identify hundreds to thousands of rare protein coding variants of ambiguous significance for human health. Computational tools are needed to accelerate the identification of specific variants and genes that contribute to human disease. Results: We have developed the Variant Effect Scoring Tool (VEST), a supervised machine learning-based classifier, to prioritize rare missense variants with likely involvement in human disease. The VEST classifier training set comprised ~45,000 disease mutations from the latest Human Gene Mutation Database release and another ~45,000 high frequency (allele frequency >1%) putatively neutral missense variants from the Exome Sequencing Project. VEST outperforms some of the most popular methods for prioritizing missense variants in carefully designed holdout benchmarking experiments (VEST ROC AUC = 0.91, PolyPhen2 ROC AUC = 0.86, SIFT4.0 ROC AUC = 0.84). VEST estimates variant score p-values against a null distribution of VEST scores for neutral variants not included in the VEST training set. These p-values can be aggregated at the gene level across multiple disease exomes to rank genes for probable disease involvement. We tested the ability of an aggregate VEST gene score to identify candidate Mendelian disease genes, based on whole-exome sequencing of a small number of disease cases. We used whole-exome data for two Mendelian disorders for which the causal gene is known. Considering only genes that contained variants in all cases, the VEST gene score ranked dihydroorotate dehydrogenase (DHODH) number 2 of 2253 genes in four cases of Miller syndrome, and myosin-3 (MYH3) number 2 of 2313 genes in three cases of Freeman Sheldon syndrome. Conclusions: Our results demonstrate the potential power gain of aggregating bioinformatics variant scores into gene-level scores and the general utility of bioinformatics in assisting the search for disease genes in large-scale exome sequencing studies. VEST is

  17. Patient-Oriented Cancer Information on the Internet: A Comparison of Wikipedia and a Professionally Maintained Database

    PubMed Central

    Rajagopalan, Malolan S.; Khanna, Vineet K.; Leiter, Yaacov; Stott, Meghan; Showalter, Timothy N.; Dicker, Adam P.; Lawrence, Yaacov R.

    2011-01-01

    Purpose: A wiki is a collaborative Web site, such as Wikipedia, that can be freely edited. Because of a wiki's lack of formal editorial control, we hypothesized that the content would be less complete and accurate than that of a professional peer-reviewed Web site. In this study, the coverage, accuracy, and readability of cancer information on Wikipedia were compared with those of the patient-orientated National Cancer Institute's Physician Data Query (PDQ) comprehensive cancer database. Methods: For each of 10 cancer types, medically trained personnel scored PDQ and Wikipedia articles for accuracy and presentation of controversies by using an appraisal form. Reliability was assessed by using interobserver variability and test-retest reproducibility. Readability was calculated from word and sentence length. Results: Evaluators were able to rapidly assess articles (18 minutes/article), with a test-retest reliability of 0.71 and interobserver variability of 0.53. For both Web sites, inaccuracies were rare, less than 2% of information examined. PDQ was significantly more readable than Wikipedia: Flesch-Kincaid grade level 9.6 versus 14.1. There was no difference in depth of coverage between PDQ and Wikipedia (29.9, 34.2, respectively; maximum possible score 72). Controversial aspects of cancer care were relatively poorly discussed in both resources (2.9 and 6.1 for PDQ and Wikipedia, respectively, NS; maximum possible score 18). A planned subanalysis comparing common and uncommon cancers demonstrated no difference. Conclusion: Although the wiki resource had similar accuracy and depth as the professionally edited database, it was significantly less readable. Further research is required to assess how this influences patients' understanding and retention. PMID:22211130
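
    The readability figures in the abstract above are Flesch-Kincaid grade levels, which depend only on average sentence length and average syllables per word. The Python sketch below applies the standard published formula, 0.39 x (words/sentences) + 11.8 x (syllables/words) - 15.59. The sentence splitter and the vowel-group syllable counter are rough heuristics assumed here for illustration; they are not the tooling used in the study.

```python
import re


def count_syllables(word):
    """Approximate syllables as runs of consecutive vowels (heuristic assumption)."""
    groups = re.findall(r"[aeiouy]+", word.lower())
    return max(1, len(groups))


def flesch_kincaid_grade(text):
    """Flesch-Kincaid grade level of a text using the standard formula."""
    # Naive sentence split on terminal punctuation (assumption; ignores abbreviations).
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")
    syllables = sum(count_syllables(w) for w in words)
    return 0.39 * len(words) / len(sentences) + 11.8 * syllables / len(words) - 15.59


if __name__ == "__main__":
    simple = "The cat sat. The dog ran."
    dense = "Controversial aspects of multidisciplinary oncological management were inadequately discussed."
    print(flesch_kincaid_grade(simple), flesch_kincaid_grade(dense))
```

    Longer sentences and longer words both push the grade level up, which is the sense in which the study found one resource harder to read than the other.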

  18. Rationale and uses of a public HIV drug-resistance database.

    PubMed

    Shafer, Robert W

    2006-09-15

    Knowledge regarding the drug resistance of human immunodeficiency virus (HIV) is critical for surveillance of drug resistance, development of antiretroviral drugs, and management of infections with drug-resistant viruses. Such knowledge is derived from studies that correlate genetic variation in the targets of therapy with the antiretroviral treatments received by persons from whom the variant was obtained (genotype-treatment), with drug-susceptibility data on genetic variants (genotype-phenotype), and with virological and clinical response to a new treatment regimen (genotype-outcome). An HIV drug-resistance database is required to represent, store, and analyze the diverse forms of data underlying our knowledge of drug resistance and to make these data available to the broad community of researchers studying drug resistance in HIV and clinicians using HIV drug-resistance tests. Such genotype-treatment, genotype-phenotype, and genotype-outcome correlations are contained in the Stanford HIV RT and Protease Sequence Database and have specific usefulness.

  19. Assessment of epithelial sodium channel variants in nonwhite cystic fibrosis patients with non-diagnostic CFTR genotypes.

    PubMed

    Brennan, Marie-Luise; Pique, Lynn M; Schrijver, Iris

    2016-01-01

    Several lines of evidence suggest a role for the epithelial sodium channel (ENaC) in cystic fibrosis (CF). The purpose of our study was to assess the contribution of genetic variants in the ENaC subunits (α, β, γ) in nonwhite CF patients in whom CFTR molecular testing has been non-diagnostic. Samples were obtained from patients who were nonwhite and whose molecular CFTR testing did not identify two mutations. Sequencing of the SCNN1A, B, and G genes was performed and variants assessed for pathogenicity and association with CF using databases, protein and splice site mutation analysis software, and literature review. We identified four nonsynonymous amino acid variants in SCNN1A, three in SCNN1B and one in SCNN1G. There was no convincing evidence of pathogenicity. Whereas all have been reported in the dbSNP database, only p.Ala334Thr, p.Val573Ile, and p.Thr663Ala in SCNN1A, p.Gly442Val in SCNN1B and p.Gly183Ser in SCNN1G were previously reported in ENaC genetic studies of CF or CF-like patients. Synonymous substitutions were also observed but novel synonymous variants were not detected. There is no conclusive association of ENaC genetic variants with CF in nonwhite CF patients. Copyright © 2015 European Cystic Fibrosis Society. Published by Elsevier B.V. All rights reserved.

  20. Pan-cancer analysis reveals technical artifacts in TCGA germline variant calls.

    PubMed

    Buckley, Alexandra R; Standish, Kristopher A; Bhutani, Kunal; Ideker, Trey; Lasken, Roger S; Carter, Hannah; Harismendy, Olivier; Schork, Nicholas J

    2017-06-12

    Cancer research to date has largely focused on somatically acquired genetic aberrations. In contrast, the degree to which germline, or inherited, variation contributes to tumorigenesis remains unclear, possibly due to a lack of accessible germline variant data. Here we called germline variants on 9618 cases from The Cancer Genome Atlas (TCGA) database representing 31 cancer types. We identified batch effects affecting loss of function (LOF) variant calls that can be traced back to differences in the way the sequence data were generated both within and across cancer types. Overall, LOF indel calls were more sensitive to technical artifacts than LOF Single Nucleotide Variant (SNV) calls. In particular, whole genome amplification of DNA prior to sequencing led to an artificially increased burden of LOF indel calls, which confounded association analyses relating germline variants to tumor type despite stringent indel filtering strategies. The samples affected by these technical artifacts include all acute myeloid leukemia and practically all ovarian cancer samples. We demonstrate how technical artifacts induced by whole genome amplification of DNA can lead to false positive germline-tumor type associations and suggest TCGA whole genome amplified samples be used with caution. This study draws attention to the need to be sensitive to problems associated with a lack of uniformity in data generation in TCGA data.

  1. Representing metabolic pathway information: an object-oriented approach.

    PubMed

    Ellis, L B; Speedie, S M; McLeish, R

    1998-01-01

    The University of Minnesota Biocatalysis/Biodegradation Database (UM-BBD) is a website providing information and dynamic links for microbial metabolic pathways, enzyme reactions, and their substrates and products. The Compound, Organism, Reaction and Enzyme (CORE) object-oriented database management system was developed to contain and serve this information. CORE was developed using Java, an object-oriented programming language, and PSE persistent object classes from Object Design, Inc. CORE dynamically generates descriptive web pages for reactions, compounds and enzymes, and reconstructs ad hoc pathway maps starting from any UM-BBD reaction. CORE code is available from the authors upon request. CORE is accessible through the UM-BBD at: http://www.labmed.umn.edu/umbbd/index.html.

  2. STOPGAP: a database for systematic target opportunity assessment by genetic association predictions.

    PubMed

    Shen, Judong; Song, Kijoung; Slater, Andrew J; Ferrero, Enrico; Nelson, Matthew R

    2017-09-01

    We developed the STOPGAP (Systematic Target OPportunity assessment by Genetic Association Predictions) database, an extensive catalog of human genetic associations mapped to effector gene candidates. STOPGAP draws on a variety of publicly available GWAS associations, linkage disequilibrium (LD) measures, functional genomic and variant annotation sources. Algorithms were developed to merge the association data, partition associations into non-overlapping LD clusters, map variants to genes and produce a variant-to-gene score used to rank the relative confidence among potential effector genes. This database can be used for a multitude of investigations into the genes and genetic mechanisms underlying inter-individual variation in human traits, as well as supporting drug discovery applications. Shell, R, Perl and Python scripts and STOPGAP R data files (version 2.5.1 at publication) are available at https://github.com/StatGenPRD/STOPGAP. Some of the most useful STOPGAP fields can be queried through an R Shiny web application at http://stopgapwebapp.com. Contact: matthew.r.nelson@gsk.com. Supplementary data are available at Bioinformatics online. © The Author (2017). Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com

  3. Diversity and impact of rare variants in genes encoding the platelet G protein-coupled receptors.

    PubMed

    Jones, Matthew L; Norman, Jane E; Morgan, Neil V; Mundell, Stuart J; Lordkipanidzé, Marie; Lowe, Gillian C; Daly, Martina E; Simpson, Michael A; Drake, Sian; Watson, Steve P; Mumford, Andrew D

    2015-04-01

    Platelet responses to activating agonists are influenced by common population variants within or near G protein-coupled receptor (GPCR) genes that affect receptor activity. However, the impact of rare GPCR gene variants is unknown. We describe the rare single nucleotide variants (SNVs) in the coding and splice regions of 18 GPCR genes in 7,595 exomes from the 1,000-genomes and Exome Sequencing Project databases and in 31 cases with inherited platelet function disorders (IPFDs). In the population databases, the GPCR gene target regions contained 740 SNVs (318 synonymous, 410 missense, 7 stop gain and 6 splice region) of which 70% had global minor allele frequency (MAF) < 0.05%. Functional annotation using six computational algorithms, experimental evidence and structural data identified 156/740 (21%) SNVs as potentially damaging to GPCR function, most commonly in regions encoding the transmembrane and C-terminal intracellular receptor domains. In 31 index cases with IPFDs (Gi-pathway defect n=15; secretion defect n=11; thromboxane pathway defect n=3 and complex defect n=2) there were 256 SNVs in the target regions of 15 stimulatory platelet GPCRs (34 unique; 12 with MAF < 1% and 22 with MAF ≥ 1%). These included rare variants predicting R122H, P258T and V207A substitutions in the P2Y12 receptor that were annotated as potentially damaging, but only partially explained the platelet function defects in each case. Our data highlight that potentially damaging variants in platelet GPCR genes have low individual frequencies, but are collectively abundant in the population. Potentially damaging variants are also present in pedigrees with IPFDs and may contribute to complex laboratory phenotypes.

  4. Efficient analysis of mouse genome sequences reveal many nonsense variants

    PubMed Central

    Steeland, Sophie; Timmermans, Steven; Van Ryckeghem, Sara; Hulpiau, Paco; Saeys, Yvan; Van Montagu, Marc; Vandenbroucke, Roosmarijn E.; Libert, Claude

    2016-01-01

    Genetic polymorphisms in coding genes play an important role when using mouse inbred strains as research models. They have been shown to influence research results, explain phenotypical differences between inbred strains, and increase the amount of interesting gene variants present in the many available inbred lines. SPRET/Ei is an inbred strain derived from Mus spretus that has ∼1% sequence difference with the C57BL/6J reference genome. We obtained a listing of all SNPs and insertions/deletions (indels) present in SPRET/Ei from the Mouse Genomes Project (Wellcome Trust Sanger Institute) and processed these data to obtain an overview of all transcripts having nonsynonymous coding sequence variants. We identified 8,883 unique variants affecting 10,096 different transcripts from 6,328 protein-coding genes, which is about 28% of all coding genes. Because only a subset of these variants results in drastic changes in proteins, we focused on variations that are nonsense mutations that ultimately resulted in a gain of a stop codon. These genes were identified by in silico changing the C57BL/6J coding sequences to the SPRET/Ei sequences, converting them to amino acid (AA) sequences, and comparing the AA sequences. All variants and transcripts affected were also stored in a database, which can be browsed using a SPRET/Ei M. spretus variants web tool (www.spretus.org), including a manual. We validated the tool by demonstrating the loss of function of three proteins predicted to be severely truncated, namely Fas, IRAK2, and IFNγR1. PMID:27147605
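
    The in silico step described above (substitute the strain's variants into the reference coding sequence, translate both, compare the amino acid sequences) can be illustrated with a short Python sketch. It is not the authors' pipeline: the function names, the 0-based position convention, the restriction to single-base substitutions, and the toy sequence are all assumptions made for illustration. Only the standard genetic code is taken as given.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order; '*' marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(cds):
    """Translate a coding sequence up to (not including) the first stop codon."""
    protein = []
    for i in range(0, len(cds) - len(cds) % 3, 3):
        aa = CODON_TABLE[cds[i:i + 3].upper()]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


def apply_snvs(cds, snvs):
    """Return a copy of cds with substitutions applied.

    snvs maps a 0-based CDS position to the alternate base (hypothetical format).
    """
    seq = list(cds)
    for pos, alt in snvs.items():
        seq[pos] = alt
    return "".join(seq)


def is_stop_gain(ref_cds, alt_cds):
    """True if the variant sequence yields a shorter protein than the reference."""
    return len(translate(alt_cds)) < len(translate(ref_cds))


if __name__ == "__main__":
    reference = "ATGCGATGGTAA"                 # toy CDS: Met-Arg-Trp-stop
    variant = apply_snvs(reference, {3: "T"})  # CGA (Arg) becomes TGA (stop)
    print(translate(reference), translate(variant), is_stop_gain(reference, variant))
```

    Comparing protein lengths is enough for substitutions; handling indels would also require frameshift-aware comparison, which this sketch leaves out.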

  5. Italian Present-day Stress Indicators: IPSI Database

    NASA Astrophysics Data System (ADS)

    Mariucci, M. T.; Montone, P.

    2017-12-01

    In Italy, since the 1990s, research on the contemporary stress field has been carried out at Istituto Nazionale di Geofisica e Vulcanologia (INGV) through local and regional scale studies. Over the years many data have been analysed and collected; they are now organized and available online for easy end use. The IPSI (Italian Present-day Stress Indicators) database is the first geo-referenced repository of information on the crustal present-day stress field maintained at INGV, through a web application database and website developed by Gabriele Tarabusi. Data consist of horizontal stress orientations analysed and compiled in a standardized format and quality-ranked for reliability and comparability on a global scale with other databases. Our first database release includes 855 data records updated to December 2015. Here we present an updated version that will be released in 2018, after entry of new earthquake data up to December 2017. The IPSI web site (http://ipsi.rm.ingv.it/) allows users to access data on a standard map viewer and to choose easily which data (category and/or quality) to plot. The main information for each element (type, quality, orientation) can be viewed by hovering over the related symbol; all the information appears on clicking the element. Basic information on the different data types, tectonic regime assignment, and quality ranking method is available in pop-up windows. Data records can be downloaded in some common formats; it is also possible to download a file directly usable with SHINE, a web-based application to interpolate stress orientations (http://shine.rm.ingv.it). IPSI is mainly conceived for those interested in studying the characteristics of the Italian peninsula and surroundings, although Italian data are part of the World Stress Map (http://www.world-stress-map.org/), as evidenced by many links that redirect to this database for more details on standard practices in this field.

  6. Clinical spectrum of KIAA2022 pathogenic variants in males: Case report of two boys with KIAA2022 pathogenic variants and review of the literature.

    PubMed

    Lorenzo, Melissa; Stolte-Dijkstra, Irene; van Rheenen, Patrick; Smith, Ronald Garth; Scheers, Tom; Walia, Jagdeep S

    2018-06-01

    KIAA2022 is an X-linked intellectual disability (XLID) syndrome affecting males more severely than females. Few males with KIAA2022 variants and XLID have been reported. We present a clinical report of two unrelated males, with two nonsense KIAA2022 pathogenic variants, with profound intellectual disabilities, limited language development, strikingly similar autistic behavior, delay in motor milestones, and postnatal growth restriction. Patient 1, 19-years-old, has long ears, deeply set eyes with keratoconus, strabismus, a narrow forehead, anteverted nares, café-au-lait spots, macroglossia, thick vermilion of the upper and lower lips, and prognathism. He has gastroesophageal reflux, constipation with delayed rectosigmoid colonic transit time, difficulty regulating temperature, several musculoskeletal issues, and a history of one grand mal seizure. Patient 2, 10-years-old, has mild dysmorphic features, therapy resistant vomiting with diminished motility of the stomach, mild constipation, cortical visual impairment with intermittent strabismus, axial hypotonia, difficulty regulating temperature, and cutaneous mastocytosis. Genetic testing identified KIAA2022 variant c.652C > T(p.Arg218*) in Patient 1, and a novel nonsense de novo variant c.2707G > T(p.Glu903*) in Patient 2. We also summarized features of all reported males with KIAA2022 variants to date. This report not only adds knowledge of a novel pathogenic variant to the KIAA2022 variant database, but also likely extends the spectrum by describing novel dysmorphic features and medical conditions including macroglossia, café-au-lait spots, keratoconus, severe cutaneous mastocytosis, and motility problems of the GI tract, which may help physicians involved in the care of patients with this syndrome. Lastly, we describe the power of social media in bringing families with rare medical conditions together. © 2018 Wiley Periodicals, Inc.

  7. Heterogenous database integration in a physician workstation.

    PubMed

    Annevelink, J; Young, C Y; Tang, P C

    1991-01-01

    We discuss the integration of a variety of data and information sources in a Physician Workstation (PWS), focusing on the integration of data from DHCP, the Veteran Administration's Distributed Hospital Computer Program. We designed a logically centralized, object-oriented data-schema, used by end users and applications to explore the data accessible through an object-oriented database using a declarative query language. We emphasize the use of procedural abstraction to transparently integrate a variety of information sources into the data schema.

  8. Heterogenous database integration in a physician workstation.

    PubMed Central

    Annevelink, J.; Young, C. Y.; Tang, P. C.

    1991-01-01

    We discuss the integration of a variety of data and information sources in a Physician Workstation (PWS), focusing on the integration of data from DHCP, the Veteran Administration's Distributed Hospital Computer Program. We designed a logically centralized, object-oriented data-schema, used by end users and applications to explore the data accessible through an object-oriented database using a declarative query language. We emphasize the use of procedural abstraction to transparently integrate a variety of information sources into the data schema. PMID:1807624

  9. SAbDab: the structural antibody database

    PubMed Central

    Dunbar, James; Krawczyk, Konrad; Leem, Jinwoo; Baker, Terry; Fuchs, Angelika; Georges, Guy; Shi, Jiye; Deane, Charlotte M.

    2014-01-01

    Structural antibody database (SAbDab; http://opig.stats.ox.ac.uk/webapps/sabdab) is an online resource containing all the publicly available antibody structures annotated and presented in a consistent fashion. The data are annotated with several properties including experimental information, gene details, correct heavy and light chain pairings, antigen details and, where available, antibody–antigen binding affinity. The user can select structures, according to these attributes as well as structural properties such as complementarity determining region loop conformation and variable domain orientation. Individual structures, datasets and the complete database can be downloaded. PMID:24214988

  10. Competitive region orientation code for palmprint verification and identification

    NASA Astrophysics Data System (ADS)

    Tang, Wenliang

    2015-11-01

    Orientation features of the palmprint have been widely investigated in coding-based palmprint-recognition methods. Conventional orientation-based coding methods usually used discrete filters to extract the orientation feature of palmprint. However, in real operations, the orientations of the filter usually are not consistent with the lines of the palmprint. We thus propose a competitive region orientation-based coding method. Furthermore, an effective weighted balance scheme is proposed to improve the accuracy of the extracted region orientation. Compared with conventional methods, the region orientation of the palmprint extracted using the proposed method can precisely and robustly describe the orientation feature of the palmprint. Extensive experiments on the baseline PolyU and multispectral palmprint databases are performed and the results show that the proposed method achieves a promising performance in comparison to conventional state-of-the-art orientation-based coding methods in both palmprint verification and identification.

  11. Brute-Force Approach for Mass Spectrometry-Based Variant Peptide Identification in Proteogenomics without Personalized Genomic Data

    NASA Astrophysics Data System (ADS)

    Ivanov, Mark V.; Lobas, Anna A.; Levitsky, Lev I.; Moshkovskii, Sergei A.; Gorshkov, Mikhail V.

    2018-02-01

    In a proteogenomic approach based on tandem mass spectrometry analysis of proteolytic peptide mixtures, customized exome or RNA-seq databases are employed for identifying protein sequence variants. However, the problem of variant peptide identification without personalized genomic data is important for a variety of applications. Following the recent proposal by Chick et al. (Nat. Biotechnol. 33, 743-749, 2015) on the feasibility of such variant peptide search, we evaluated two available approaches based on the previously suggested "open" search and the "brute-force" strategy. To improve the efficiency of these approaches, we propose an algorithm for exclusion of false variant identifications from the search results involving analysis of modifications mimicking single amino acid substitutions. Also, we propose a de novo based scoring scheme for assessment of identified point mutations. In the scheme, the search engine analyzes y-type fragment ions in MS/MS spectra to confirm the location of the mutation in the variant peptide sequence.

  12. APADB: a database for alternative polyadenylation and microRNA regulation events

    PubMed Central

    Müller, Sören; Rycak, Lukas; Afonso-Grunz, Fabian; Winter, Peter; Zawada, Adam M.; Damrath, Ewa; Scheider, Jessica; Schmäh, Juliane; Koch, Ina; Kahl, Günter; Rotter, Björn

    2014-01-01

    Alternative polyadenylation (APA) is a widespread mechanism that contributes to the sophisticated dynamics of gene regulation. Approximately 50% of all protein-coding human genes harbor multiple polyadenylation (PA) sites; their selective and combinatorial use gives rise to transcript variants that differ in the length of their 3′ untranslated region (3′UTR). Shortened variants escape UTR-mediated regulation by microRNAs (miRNAs), especially in cancer, where global 3′UTR shortening accelerates disease progression, dedifferentiation and proliferation. Here we present APADB, a database of vertebrate PA sites determined by 3′ end sequencing, using massive analysis of complementary DNA ends. APADB provides (A)PA sites for coding and non-coding transcripts of human, mouse and chicken genes. For human and mouse, several tissue types, including different cancer specimens, are available. APADB records the loss of predicted miRNA binding sites and visualizes next-generation sequencing reads that support each PA site in a genome browser. The database tables can either be browsed according to organism and tissue or alternatively searched for a gene of interest. APADB is the largest database of APA in human, chicken and mouse. The stored information provides experimental evidence for thousands of PA sites and APA events. APADB combines 3′ end sequencing data with prediction algorithms of miRNA binding sites, which allows further improvement of the prediction algorithms. Current databases lack correct information about 3′UTR lengths, especially for chicken, and APADB provides the information needed to close this gap. Database URL: http://tools.genxpro.net/apadb/ PMID:25052703

  13. Cellulase variants

    DOEpatents

    Blazej, Robert; Toriello, Nicholas; Emrich, Charles; Cohen, Richard N.; Koppel, Nitzan

    2015-07-14

    This invention provides novel variant cellulolytic enzymes having improved activity and/or stability. In certain embodiments the variant cellulolytic enzymes comprise a glycoside hydrolase with or comprising a substitution at one or more positions corresponding to one or more of residues F64, A226, and/or E246 in Thermobifida fusca Cel9A enzyme. In certain embodiments the glycoside hydrolase is a variant of a family 9 glycoside hydrolase. In certain embodiments the glycoside hydrolase is a variant of a theme B family 9 glycoside hydrolase.

  14. Identification of Rare Variants in TNNI3 with Atrial Fibrillation in a Chinese GeneID Population

    PubMed Central

    Wang, Chuchu; Wu, Manman; Qian, Jin; Li, Bin; Tu, Xin; Xu, Chengqi; Li, Sisi; Chen, Shanshan; Zhao, Yuanyuan; Huang, Yufeng; Shi, Lisong; Cheng, Xiang; Liao, Yuhua; Chen, Qiuyun; Xia, Yunlong; Yao, Wei; Wu, Gang; Cheng, Mian; Wang, Qing K.

    2015-01-01

    Despite advances by genome-wide association studies (GWAS), much of the heritability of common human diseases remains missing, a phenomenon referred to as ‘missing heritability’. One potential cause of ‘missing heritability’ is the rare susceptibility variants overlooked by GWAS. Atrial fibrillation (AF) is the most common arrhythmia seen at hospitals; it increases the risk of stroke 5-fold and doubles the risk of heart failure and sudden death. Here we studied one large Chinese family with AF and hypertrophic cardiomyopathy (HCM). Whole-exome sequencing analysis identified a mutation in TNNI3, R186Q, that co-segregated with the disease in the family, but did not exist in >1,583 controls, suggesting that R186Q causes AF and HCM. High-resolution melting curve analysis and direct DNA sequence analysis were then used to screen mutations in all exons and exon-intron boundaries of TNNI3 in a panel of 1,127 unrelated AF patients and 1,583 non-AF subjects. Four novel missense variants were identified in TNNI3, including E64G, M154L, E187G and D196G in four independent AF patients, but no variant was found in 1,583 non-AF subjects. None of the variants was found in public databases, including the ExAC Browser database with 60,706 exomes. These data suggest that rare TNNI3 variants are associated with AF (P=0.03). TNNI3 encodes troponin I, a key regulator of the contraction-relaxation function of cardiac muscle and was not previously implicated in AF. Thus, this study may identify a new biological pathway for the pathogenesis of AF and provides evidence to support the rare variant hypothesis for missing heritability. PMID:26169204
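
    The reported association (P=0.03) can be roughly reproduced from the counts given in the abstract. The sketch below assumes a one-sided Fisher exact test on the 2x2 table of carriers versus non-carriers; the abstract does not say which test the authors used, so this is an illustration under that assumption, and the function names are hypothetical.

```python
from math import comb

def hypergeom_pmf(k, n_cases, n_controls, n_carriers):
    """Probability that exactly k of the n_carriers fall among the cases."""
    total = n_cases + n_controls
    return comb(n_cases, k) * comb(n_controls, n_carriers - k) / comb(total, n_carriers)

def fisher_one_sided(k, n_cases, n_controls, n_carriers):
    """P(at least k carriers among the cases) under the null of no association."""
    upper = min(n_cases, n_carriers)
    return sum(hypergeom_pmf(i, n_cases, n_controls, n_carriers)
               for i in range(k, upper + 1))

# Counts from the abstract: 4 carriers among 1,127 AF patients,
# 0 carriers among 1,583 non-AF subjects (assumed test: one-sided Fisher exact).
p_value = fisher_one_sided(k=4, n_cases=1127, n_controls=1583, n_carriers=4)
print(f"one-sided Fisher exact P = {p_value:.3f}")
```

    With only four carriers, the result is essentially the chance that all four land in the AF group, roughly (1127/2710)^4, which is consistent with the rounded value in the abstract.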

  15. Germline contamination and leakage in whole genome somatic single nucleotide variant detection.

    PubMed

    Sendorek, Dorota H; Caloian, Cristian; Ellrott, Kyle; Bare, J Christopher; Yamaguchi, Takafumi N; Ewing, Adam D; Houlahan, Kathleen E; Norman, Thea C; Margolin, Adam A; Stuart, Joshua M; Boutros, Paul C

    2018-01-31

    The clinical sequencing of cancer genomes to personalize therapy is becoming routine across the world. However, concerns over patient re-identification from these data lead to questions about how tightly access should be controlled. It is not thought to be possible to re-identify patients from somatic variant data. However, somatic variant detection pipelines can mistakenly identify germline variants as somatic ones, a process called "germline leakage". The rate of germline leakage across different somatic variant detection pipelines is not well-understood, and it is uncertain whether or not somatic variant calls should be considered re-identifiable. To fill this gap, we quantified germline leakage across 259 sets of whole-genome somatic single nucleotide variant (SNVs) predictions made by 21 teams as part of the ICGC-TCGA DREAM Somatic Mutation Calling Challenge. The median somatic SNV prediction set contained 4325 somatic SNVs and leaked one germline polymorphism. The level of germline leakage was inversely correlated with somatic SNV prediction accuracy and positively correlated with the amount of infiltrating normal cells. The specific germline variants leaked differed by tumour and algorithm. To aid in quantitation and correction of leakage, we created a tool, called GermlineFilter, for use in public-facing somatic SNV databases. The potential for patient re-identification from leaked germline variants in somatic SNV predictions has led to divergent open data access policies, based on different assessments of the risks. Indeed, a single, well-publicized re-identification event could reshape public perceptions of the values of genomic data sharing. We find that modern somatic SNV prediction pipelines have low germline-leakage rates, which can be further reduced, especially for cloud-sharing, using pre-filtering software.
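
    The idea of germline leakage can be illustrated with a minimal sketch: count how many predicted somatic SNVs coincide with a patient's known germline variants. This is not the GermlineFilter tool's actual algorithm or interface, which the abstract does not describe; the function name, the tuple representation and the toy variants are all hypothetical.

```python
def find_germline_leakage(somatic_calls, germline_variants):
    """Return the predicted somatic SNVs that also appear in the germline set."""
    germline = set(germline_variants)
    return [variant for variant in somatic_calls if variant in germline]

# Hypothetical toy data: each variant is (chromosome, position, ref, alt).
germline_variants = [("chr1", 1000, "A", "G"), ("chr2", 5000, "C", "T")]
somatic_calls = [
    ("chr1", 1000, "A", "G"),   # matches a germline variant: leaked
    ("chr3", 750, "G", "A"),
    ("chr7", 4200, "T", "C"),
]

leaked = find_germline_leakage(somatic_calls, germline_variants)
leak_rate = len(leaked) / len(somatic_calls)
print(f"{len(leaked)} leaked germline variant(s), rate = {leak_rate:.2f}")
```

    Removing the entries in `leaked` before publishing a call set is one simple form of the pre-filtering the abstract mentions.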

  16. Common variants in Mendelian kidney disease genes and their association with renal function.

    PubMed

    Parsa, Afshin; Fuchsberger, Christian; Köttgen, Anna; O'Seaghdha, Conall M; Pattaro, Cristian; de Andrade, Mariza; Chasman, Daniel I; Teumer, Alexander; Endlich, Karlhans; Olden, Matthias; Chen, Ming-Huei; Tin, Adrienne; Kim, Young J; Taliun, Daniel; Li, Man; Feitosa, Mary; Gorski, Mathias; Yang, Qiong; Hundertmark, Claudia; Foster, Meredith C; Glazer, Nicole; Isaacs, Aaron; Rao, Madhumathi; Smith, Albert V; O'Connell, Jeffrey R; Struchalin, Maksim; Tanaka, Toshiko; Li, Guo; Hwang, Shih-Jen; Atkinson, Elizabeth J; Lohman, Kurt; Cornelis, Marilyn C; Johansson, Asa; Tönjes, Anke; Dehghan, Abbas; Couraki, Vincent; Holliday, Elizabeth G; Sorice, Rossella; Kutalik, Zoltan; Lehtimäki, Terho; Esko, Tõnu; Deshmukh, Harshal; Ulivi, Sheila; Chu, Audrey Y; Murgia, Federico; Trompet, Stella; Imboden, Medea; Kollerits, Barbara; Pistis, Giorgio; Harris, Tamara B; Launer, Lenore J; Aspelund, Thor; Eiriksdottir, Gudny; Mitchell, Braxton D; Boerwinkle, Eric; Schmidt, Helena; Hofer, Edith; Hu, Frank; Demirkan, Ayse; Oostra, Ben A; Turner, Stephen T; Ding, Jingzhong; Andrews, Jeanette S; Freedman, Barry I; Giulianini, Franco; Koenig, Wolfgang; Illig, Thomas; Döring, Angela; Wichmann, H-Erich; Zgaga, Lina; Zemunik, Tatijana; Boban, Mladen; Minelli, Cosetta; Wheeler, Heather E; Igl, Wilmar; Zaboli, Ghazal; Wild, Sarah H; Wright, Alan F; Campbell, Harry; Ellinghaus, David; Nöthlings, Ute; Jacobs, Gunnar; Biffar, Reiner; Ernst, Florian; Homuth, Georg; Kroemer, Heyo K; Nauck, Matthias; Stracke, Sylvia; Völker, Uwe; Völzke, Henry; Kovacs, Peter; Stumvoll, Michael; Mägi, Reedik; Hofman, Albert; Uitterlinden, Andre G; Rivadeneira, Fernando; Aulchenko, Yurii S; Polasek, Ozren; Hastie, Nick; Vitart, Veronique; Helmer, Catherine; Wang, Jie Jin; Stengel, Bénédicte; Ruggiero, Daniela; Bergmann, Sven; Kähönen, Mika; Viikari, Jorma; Nikopensius, Tiit; Province, Michael; Colhoun, Helen; Doney, Alex; Robino, Antonietta; Krämer, Bernhard K; Portas, Laura; Ford, Ian; Buckley, Brendan M; Adam, 
Martin; Thun, Gian-Andri; Paulweber, Bernhard; Haun, Margot; Sala, Cinzia; Mitchell, Paul; Ciullo, Marina; Vollenweider, Peter; Raitakari, Olli; Metspalu, Andres; Palmer, Colin; Gasparini, Paolo; Pirastu, Mario; Jukema, J Wouter; Probst-Hensch, Nicole M; Kronenberg, Florian; Toniolo, Daniela; Gudnason, Vilmundur; Shuldiner, Alan R; Coresh, Josef; Schmidt, Reinhold; Ferrucci, Luigi; van Duijn, Cornelia M; Borecki, Ingrid; Kardia, Sharon L R; Liu, Yongmei; Curhan, Gary C; Rudan, Igor; Gyllensten, Ulf; Wilson, James F; Franke, Andre; Pramstaller, Peter P; Rettig, Rainer; Prokopenko, Inga; Witteman, Jacqueline; Hayward, Caroline; Ridker, Paul M; Bochud, Murielle; Heid, Iris M; Siscovick, David S; Fox, Caroline S; Kao, W Linda; Böger, Carsten A

    2013-12-01

    Many common genetic variants identified by genome-wide association studies for complex traits map to genes previously linked to rare inherited Mendelian disorders. A systematic analysis of common single-nucleotide polymorphisms (SNPs) in genes responsible for Mendelian diseases with kidney phenotypes has not been performed. We thus developed a comprehensive database of genes for Mendelian kidney conditions and evaluated the association between common genetic variants within these genes and kidney function in the general population. Using the Online Mendelian Inheritance in Man database, we identified 731 unique disease entries related to specific renal search terms and confirmed a kidney phenotype in 218 of these entries, corresponding to mutations in 258 genes. We interrogated common SNPs (minor allele frequency >5%) within these genes for association with the estimated GFR in 74,354 European-ancestry participants from the CKDGen Consortium. However, the top four candidate SNPs (rs6433115 at LRP2, rs1050700 at TSC1, rs249942 at PALB2, and rs9827843 at ROBO2) did not achieve significance in a stage 2 meta-analysis performed in 56,246 additional independent individuals, indicating that these common SNPs are not associated with estimated GFR. The effect of less common or rare variants in these genes on kidney function in the general population and disease-specific cohorts requires further research.

  17. Common Variants in Mendelian Kidney Disease Genes and Their Association with Renal Function

    PubMed Central

    Fuchsberger, Christian; Köttgen, Anna; O’Seaghdha, Conall M.; Pattaro, Cristian; de Andrade, Mariza; Chasman, Daniel I.; Teumer, Alexander; Endlich, Karlhans; Olden, Matthias; Chen, Ming-Huei; Tin, Adrienne; Kim, Young J.; Taliun, Daniel; Li, Man; Feitosa, Mary; Gorski, Mathias; Yang, Qiong; Hundertmark, Claudia; Foster, Meredith C.; Glazer, Nicole; Isaacs, Aaron; Rao, Madhumathi; Smith, Albert V.; O’Connell, Jeffrey R.; Struchalin, Maksim; Tanaka, Toshiko; Li, Guo; Hwang, Shih-Jen; Atkinson, Elizabeth J.; Lohman, Kurt; Cornelis, Marilyn C.; Johansson, Åsa; Tönjes, Anke; Dehghan, Abbas; Couraki, Vincent; Holliday, Elizabeth G.; Sorice, Rossella; Kutalik, Zoltan; Lehtimäki, Terho; Esko, Tõnu; Deshmukh, Harshal; Ulivi, Sheila; Chu, Audrey Y.; Murgia, Federico; Trompet, Stella; Imboden, Medea; Kollerits, Barbara; Pistis, Giorgio; Harris, Tamara B.; Launer, Lenore J.; Aspelund, Thor; Eiriksdottir, Gudny; Mitchell, Braxton D.; Boerwinkle, Eric; Schmidt, Helena; Hofer, Edith; Hu, Frank; Demirkan, Ayse; Oostra, Ben A.; Turner, Stephen T.; Ding, Jingzhong; Andrews, Jeanette S.; Freedman, Barry I.; Giulianini, Franco; Koenig, Wolfgang; Illig, Thomas; Döring, Angela; Wichmann, H.-Erich; Zgaga, Lina; Zemunik, Tatijana; Boban, Mladen; Minelli, Cosetta; Wheeler, Heather E.; Igl, Wilmar; Zaboli, Ghazal; Wild, Sarah H.; Wright, Alan F.; Campbell, Harry; Ellinghaus, David; Nöthlings, Ute; Jacobs, Gunnar; Biffar, Reiner; Ernst, Florian; Homuth, Georg; Kroemer, Heyo K.; Nauck, Matthias; Stracke, Sylvia; Völker, Uwe; Völzke, Henry; Kovacs, Peter; Stumvoll, Michael; Mägi, Reedik; Hofman, Albert; Uitterlinden, Andre G.; Rivadeneira, Fernando; Aulchenko, Yurii S.; Polasek, Ozren; Hastie, Nick; Vitart, Veronique; Helmer, Catherine; Wang, Jie Jin; Stengel, Bénédicte; Ruggiero, Daniela; Bergmann, Sven; Kähönen, Mika; Viikari, Jorma; Nikopensius, Tiit; Province, Michael; Colhoun, Helen; Doney, Alex; Robino, Antonietta; Krämer, Bernhard K.; Portas, Laura; Ford, Ian; Buckley, Brendan M.; 
Adam, Martin; Thun, Gian-Andri; Paulweber, Bernhard; Haun, Margot; Sala, Cinzia; Mitchell, Paul; Ciullo, Marina; Vollenweider, Peter; Raitakari, Olli; Metspalu, Andres; Palmer, Colin; Gasparini, Paolo; Pirastu, Mario; Jukema, J. Wouter; Probst-Hensch, Nicole M.; Kronenberg, Florian; Toniolo, Daniela; Gudnason, Vilmundur; Shuldiner, Alan R.; Coresh, Josef; Schmidt, Reinhold; Ferrucci, Luigi; van Duijn, Cornelia M.; Borecki, Ingrid; Kardia, Sharon L.R.; Liu, Yongmei; Curhan, Gary C.; Rudan, Igor; Gyllensten, Ulf; Wilson, James F.; Franke, Andre; Pramstaller, Peter P.; Rettig, Rainer; Prokopenko, Inga; Witteman, Jacqueline; Hayward, Caroline; Ridker, Paul M.; Bochud, Murielle; Heid, Iris M.; Siscovick, David S.; Fox, Caroline S.; Kao, W. Linda; Böger, Carsten A.

    2013-01-01

    Many common genetic variants identified by genome-wide association studies for complex traits map to genes previously linked to rare inherited Mendelian disorders. A systematic analysis of common single-nucleotide polymorphisms (SNPs) in genes responsible for Mendelian diseases with kidney phenotypes has not been performed. We thus developed a comprehensive database of genes for Mendelian kidney conditions and evaluated the association between common genetic variants within these genes and kidney function in the general population. Using the Online Mendelian Inheritance in Man database, we identified 731 unique disease entries related to specific renal search terms and confirmed a kidney phenotype in 218 of these entries, corresponding to mutations in 258 genes. We interrogated common SNPs (minor allele frequency >5%) within these genes for association with the estimated GFR in 74,354 European-ancestry participants from the CKDGen Consortium. However, the top four candidate SNPs (rs6433115 at LRP2, rs1050700 at TSC1, rs249942 at PALB2, and rs9827843 at ROBO2) did not achieve significance in a stage 2 meta-analysis performed in 56,246 additional independent individuals, indicating that these common SNPs are not associated with estimated GFR. The effect of less common or rare variants in these genes on kidney function in the general population and disease-specific cohorts requires further research. PMID:24029420

  18. The Effects of Purpose Orientations on Recent High School Graduates' College Application Decisions

    ERIC Educational Resources Information Center

    Sharma, Gitima; Kim, Jungnam; Bryan, Julia

    2017-01-01

    Using the 2002 Educational Longitudinal Study database, the authors examined the different types of purpose orientations amongst a nationally representative sample of adolescents and the effect of these purpose orientations on high school graduates' college application decisions. Results indicated four types of purpose orientations: career,…

  19. Conceptual and logical level of database modeling

    NASA Astrophysics Data System (ADS)

    Hunka, Frantisek; Matula, Jiri

    2016-06-01

    Conceptual and logical levels form the topmost levels of database modeling. Usually, ORM (Object Role Modeling) and ER diagrams are used to capture the corresponding schema. The final aim of business process modeling is to store its results in the form of a database solution. For this reason, value-oriented business process modeling, which uses ER diagrams to express the modeling entities and the relationships between them, is used. However, ER diagrams form the logical level of the database schema. To extend the possibilities of different business process modeling methodologies, the conceptual level of database modeling is needed. The paper deals with the REA value modeling approach to business process modeling using ER diagrams and derives a conceptual model using the ORM modeling approach. The conceptual model extends the possibilities of value modeling to other business modeling approaches.

  20. A resource oriented webs service for environmental modeling

    NASA Astrophysics Data System (ADS)

    Ferencik, Ioan

    2013-04-01

    Environmental modeling is a widely adopted practice in the study of natural phenomena. Environmental models can be difficult to build and use, and thus sharing them within the community is an important aspect. The most common approach to sharing a model is to expose it as a web service. In practice the interaction with this web service is cumbersome due to the lack of a standardized contract and the complexity of the model being exposed. In this work we investigate the use of a resource-oriented approach in exposing environmental models as web services. We view a model as a layered resource built atop the object concept from Object Oriented Programming, augmented with persistence capabilities provided by an embedded object database to keep track of its state, and implementing the four basic principles of resource-oriented architectures: addressability, statelessness, representation and uniform interface. For implementation we use exclusively open source software: the Django framework, the dyBase object-oriented database and the Python programming language. We developed a generic framework of resources structured into a hierarchy of types and consequently extended this typology with resources specific to the domain of environmental modeling. To test our web service we used cURL, a robust command-line based web client.

  1. amamutdb.no: A relational database for MAN2B1 allelic variants that compiles genotypes, clinical phenotypes, and biochemical and structural data of mutant MAN2B1 in α-mannosidosis.

    PubMed

    Riise Stensland, Hilde Monica Frostad; Frantzen, Gabrio; Kuokkanen, Elina; Buvang, Elisabeth Kjeldsen; Klenow, Helle Bagterp; Heikinheimo, Pirkko; Malm, Dag; Nilssen, Øivind

    2015-06-01

    α-Mannosidosis is an autosomal recessive lysosomal storage disorder caused by mutations in the MAN2B1 gene, encoding lysosomal α-mannosidase. The disorder is characterized by a range of clinical phenotypes of which the major manifestations are mental impairment, hearing impairment, skeletal changes, and immunodeficiency. Here, we report an α-mannosidosis mutation database, amamutdb.no, which has been constructed as a publicly accessible online resource for recording and analyzing MAN2B1 variants (http://amamutdb.no). Our aim has been to offer structured and relational information on MAN2B1 mutations and genotypes along with associated clinical phenotypes. Classifying missense mutations, as pathogenic or benign, is a challenge. Therefore, they have been given special attention as we have compiled all available data that relate to their biochemical, functional, and structural properties. The α-mannosidosis mutation database is comprehensive and relational in the sense that information can be retrieved and compiled across datasets; hence, it will facilitate diagnostics and increase our understanding of the clinical and molecular aspects of α-mannosidosis. We believe that the amamutdb.no structure and architecture will be applicable for the development of databases for any monogenic disorder. © 2015 WILEY PERIODICALS, INC.

  2. Systematic comparison of variant calling pipelines using gold standard personal exome variants

    PubMed Central

    Hwang, Sohyun; Kim, Eiru; Lee, Insuk; Marcotte, Edward M.

    2015-01-01

    The success of clinical genomics using next generation sequencing (NGS) requires the accurate and consistent identification of personal genome variants. Assorted variant calling methods have been developed, which show low concordance between their calls. Hence, a systematic comparison of the variant callers could give important guidance to NGS-based clinical genomics. Recently, a set of high-confident variant calls for one individual (NA12878) has been published by the Genome in a Bottle (GIAB) consortium, enabling performance benchmarking of different variant calling pipelines. Based on the gold standard reference variant calls from GIAB, we compared the performance of thirteen variant calling pipelines, testing combinations of three read aligners—BWA-MEM, Bowtie2, and Novoalign—and four variant callers—Genome Analysis Tool Kit HaplotypeCaller (GATK-HC), Samtools mpileup, Freebayes and Ion Proton Variant Caller (TVC), for twelve data sets for the NA12878 genome sequenced by different platforms including Illumina2000, Illumina2500, and Ion Proton, with various exome capture systems and exome coverage. We observed different biases toward specific types of SNP genotyping errors by the different variant callers. The results of our study provide useful guidelines for reliable variant identification from deep sequencing of personal genomes. PMID:26639839
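
    Benchmarking a pipeline against a gold-standard call set usually comes down to comparing two sets of variants. The sketch below is a minimal illustration of that comparison, not the study's actual evaluation procedure; the function name, the variant representation and the toy data are assumptions.

```python
def benchmark_calls(called, truth):
    """Compare a pipeline's variant calls with a gold-standard truth set."""
    called, truth = set(called), set(truth)
    tp = len(called & truth)   # called and present in the truth set
    fp = len(called - truth)   # called but absent from the truth set
    fn = len(truth - called)   # in the truth set but missed by the pipeline
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"tp": tp, "fp": fp, "fn": fn,
            "precision": precision, "recall": recall, "f1": f1}

# Hypothetical toy data: each variant is (chromosome, position, ref, alt).
truth = [("chr1", 100, "A", "G"), ("chr1", 250, "C", "T"),
         ("chr2", 40, "G", "A"), ("chr2", 900, "T", "C")]
called = [("chr1", 100, "A", "G"), ("chr1", 250, "C", "T"),
          ("chr2", 40, "G", "A"), ("chr5", 77, "A", "C")]

metrics = benchmark_calls(called, truth)
print(metrics)
```

    Real comparisons also need variant normalization and restriction to high-confidence regions, which this sketch leaves out.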

  3. Interaction of birth order, handedness, and sexual orientation in the Kinsey interview data.

    PubMed

    Bogaert, Anthony F; Blanchard, Ray; Crosthwait, Lesley E

    2007-10-01

    Recent evidence indicates that 2 of the most consistently observed correlates of men's sexual orientation--handedness and older brothers--may be linked interactively in their prediction of men's sexual orientation. In this article, the authors studied the relationship among handedness, older brothers, and men's sexual orientation in the large and historically significant database originally compiled by Alfred C. Kinsey and his colleagues (A. C. Kinsey, W. B. Pomeroy, & C. E. Martin, 1948). The results demonstrated that handedness moderates the relationship between older brothers and sexual orientation. Specifically, older brothers increased the odds of homosexuality in right-handers only; in non-righthanders, older brothers did not affect the odds of homosexuality. These results refine the possible biological explanations reported to underlie both the handedness and older brother relationships to men's sexual orientation. These results also suggest that biological explanations of men's sexual orientation are likely relevant across time, as the Kinsey data comprise an older cohort relative to modern samples. (PsycINFO Database Record (c) 2007 APA, all rights reserved).

  4. Saada: A Generator of Astronomical Database

    NASA Astrophysics Data System (ADS)

    Michel, L.

    2011-11-01

    Saada transforms a set of heterogeneous FITS files or VOTables of various categories (images, tables, spectra, etc.) into a powerful database deployed on the Web. Databases are located on your host and stay independent of any external server. This job doesn’t require writing code. Saada can mix data of various categories in multiple collections. Data collections can be linked to one another, creating relevant browsing paths and allowing data-mining oriented queries. Saada supports 4 VO services (spectra, images, sources and TAP). Data collections can be published immediately after the deployment of the Web interface.

  5. An integrated database-pipeline system for studying single nucleotide polymorphisms and diseases.

    PubMed

    Yang, Jin Ok; Hwang, Sohyun; Oh, Jeongsu; Bhak, Jong; Sohn, Tae-Kwon

    2008-12-12

    Studies on the relationship between disease and genetic variations such as single nucleotide polymorphisms (SNPs) are important. Genetic variations can cause disease by influencing important biological regulation processes. Despite the needs for analyzing SNP and disease correlation, most existing databases provide information only on functional variants at specific locations on the genome, or deal with only a few genes associated with disease. There is no combined resource to widely support gene-, SNP-, and disease-related information, and to capture relationships among such data. Therefore, we developed an integrated database-pipeline system for studying SNPs and diseases. To implement the pipeline system for the integrated database, we first unified complicated and redundant disease terms and gene names using the Unified Medical Language System (UMLS) for classification and noun modification, and the HUGO Gene Nomenclature Committee (HGNC) and NCBI gene databases. Next, we collected and integrated representative databases for three categories of information. For genes and proteins, we examined the NCBI mRNA, UniProt, UCSC Table Track and MitoDat databases. For genetic variants we used the dbSNP, JSNP, ALFRED, and HGVbase databases. For disease, we employed OMIM, GAD, and HGMD databases. The database-pipeline system provides a disease thesaurus, including genes and SNPs associated with disease. The search results for these categories are available on the web page http://diseasome.kobic.re.kr/, and a genome browser is also available to highlight findings, as well as to permit the convenient review of potentially deleterious SNPs among genes strongly associated with specific diseases and clinical phenotypes. Our system is designed to capture the relationships between SNPs associated with disease and disease-causing genes. The integrated database-pipeline provides a list of candidate genes and SNP markers for evaluation in both epidemiological and molecular

  6. DOGMA: A Disk-Oriented Graph Matching Algorithm for RDF Databases

    NASA Astrophysics Data System (ADS)

    Bröcheler, Matthias; Pugliese, Andrea; Subrahmanian, V. S.

    RDF is an increasingly important paradigm for the representation of information on the Web. As RDF databases increase in size to approach tens of millions of triples, and as sophisticated graph matching queries expressible in languages like SPARQL become increasingly important, scalability becomes an issue. To date, there is no graph-based indexing method for RDF data where the index was designed in a way that makes it disk-resident. There is therefore a growing need for indexes that can operate efficiently when the index itself resides on disk. In this paper, we first propose the DOGMA index for fast subgraph matching on disk and then develop a basic algorithm to answer queries over this index. This algorithm is then significantly sped up via an optimized algorithm that uses efficient (but correct) pruning strategies when combined with two different extensions of the index. We have implemented a preliminary system and tested it against four existing RDF database systems developed by others. Our experiments show that our algorithm performs very well compared to these systems, with orders of magnitude improvements for complex graph queries.

  7. Controlling Protein Surface Orientation by Strategic Placement of Oligo-Histidine Tags

    PubMed Central

    2017-01-01

    We report oriented immobilization of proteins using the standard hexahistidine (His6)-Ni2+:NTA (nitrilotriacetic acid) methodology, which we systematically tuned to give control of surface coverage. Fluorescence microscopy and surface plasmon resonance measurements of self-assembled monolayers (SAMs) of red fluorescent proteins (TagRFP) showed that binding strength increased by 1 order of magnitude for each additional His6-tag on the TagRFP proteins. All TagRFP variants with His6-tags located on only one side of the barrel-shaped protein yielded a 1.5 times higher surface coverage compared to variants with His6-tags on opposite sides of the so-called β-barrel. Time-resolved fluorescence anisotropy measurements supported by polarized infrared spectroscopy verified that the orientation (and thus coverage and functionality) of proteins on surfaces can be controlled by strategic placement of a His6-tag on the protein. Molecular dynamics simulations show how the differently tagged proteins reside at the surface in “end-on” and “side-on” orientations with each His6-tag contributing to binding. Also, not every dihistidine subunit in a given His6-tag forms a full coordination bond with the Ni2+:NTA SAMs, which varied with the position of the His6-tag on the protein. At equal valency but different tag positions on the protein, differences in binding were caused by probing for Ni2+:NTA moieties and by additional electrostatic interactions between different fractions of the β-barrel structure and charged NTA moieties. Potential of mean force calculations indicate there is no specific single-protein interaction mode that provides a clear preferential surface orientation, suggesting that the experimentally measured preference for the end-on orientation is a supra-protein, not a single-protein, effect. PMID:28850777

  8. CDKL5 variants

    PubMed Central

    Kalscheuer, Vera M.; Hennig, Friederike; Leonard, Helen; Downs, Jenny; Clarke, Angus; Benke, Tim A.; Armstrong, Judith; Pineda, Mercedes; Bailey, Mark E.S.; Cobb, Stuart R.

    2017-01-01

    Objective: To provide new insights into the interpretation of genetic variants in a rare neurologic disorder, CDKL5 deficiency, in the contexts of population sequencing data and an updated characterization of the CDKL5 gene. Methods: We analyzed all known potentially pathogenic CDKL5 variants by combining data from large-scale population sequencing studies with CDKL5 variants from new and all available clinical cohorts and combined this with computational methods to predict pathogenicity. Results: The study has identified several variants that can be reclassified as benign or likely benign. With the addition of novel CDKL5 variants, we confirm that pathogenic missense variants cluster in the catalytic domain of CDKL5 and reclassify a purported missense variant as having a splicing consequence. We provide further evidence that missense variants in the final 3 exons are likely to be benign and not important to disease pathology. We also describe benign splicing and nonsense variants within these exons, suggesting that isoform hCDKL5_5 is likely to have little or no neurologic significance. We also use the available data to make a preliminary estimate of minimum incidence of CDKL5 deficiency. Conclusions: These findings have implications for genetic diagnosis, providing evidence for the reclassification of specific variants previously thought to result in CDKL5 deficiency. Together, these analyses support the view that the predominant brain isoform in humans (hCDKL5_1) is crucial for normal neurodevelopment and that the catalytic domain is the primary functional domain. PMID:29264392

  9. Efficient hemodynamic event detection utilizing relational databases and wavelet analysis

    NASA Technical Reports Server (NTRS)

    Saeed, M.; Mark, R. G.

    2001-01-01

    Development of a temporal query framework for time-oriented medical databases has hitherto been a challenging problem. We describe a novel method for the detection of hemodynamic events in multiparameter trends utilizing wavelet coefficients in a MySQL relational database. Storage of the wavelet coefficients allowed for a compact representation of the trends, and provided robust descriptors for the dynamics of the parameter time series. A data model was developed to allow for simplified queries along several dimensions and time scales. Of particular importance, the data model and wavelet framework allowed for queries to be processed with minimal table-join operations. A web-based search engine was developed to allow for user-defined queries. Typical queries required between 0.01 and 0.02 seconds, with at least two orders of magnitude improvement in speed over conventional queries. This powerful and innovative structure will facilitate research on large-scale time-oriented medical databases.
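
    The storage idea in this abstract can be illustrated with a small sketch. It is not the authors' implementation: it assumes an unnormalized Haar wavelet, uses SQLite in place of MySQL so that it runs without a server, and the table name `wavelet_coeff`, its columns, and the helper functions are all hypothetical. It shows how detail coefficients kept in a single table let an abrupt change in a trend be found with one query and no joins.

```python
import sqlite3

def haar_step(values):
    """One level of an unnormalized Haar transform: pairwise means and half-differences."""
    pairs = list(zip(values[0::2], values[1::2]))
    averages = [(a + b) / 2 for a, b in pairs]
    details = [(a - b) / 2 for a, b in pairs]
    return averages, details

def haar_decompose(values):
    """Return {level: detail coefficients}; level 1 is the finest time scale.
    Assumes len(values) is a power of two."""
    coeffs, level, current = {}, 1, list(values)
    while len(current) > 1:
        current, details = haar_step(current)
        coeffs[level] = details
        level += 1
    return coeffs

def store_coefficients(conn, signal, values):
    """Store every detail coefficient of one trend in a single (hypothetical) table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS wavelet_coeff "
        "(signal TEXT, level INTEGER, position INTEGER, value REAL)"
    )
    for level, details in haar_decompose(values).items():
        conn.executemany(
            "INSERT INTO wavelet_coeff VALUES (?, ?, ?, ?)",
            [(signal, level, i, d) for i, d in enumerate(details)],
        )

def find_events(conn, signal, level, threshold):
    """Single-table query: positions whose detail coefficient is large at a given scale."""
    rows = conn.execute(
        "SELECT position, value FROM wavelet_coeff "
        "WHERE signal = ? AND level = ? AND ABS(value) >= ? ORDER BY position",
        (signal, level, threshold),
    )
    return rows.fetchall()
```

    With an in-memory database (`sqlite3.connect(":memory:")`), storing a made-up blood-pressure trend such as `[80, 80, 80, 80, 80, 60, 60, 60]` and calling `find_events(conn, "abp", 1, 5)` returns the position of the pair that straddles the drop.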

  10. Impact of EML4-ALK Variant on Resistance Mechanisms and Clinical Outcomes in ALK-Positive Lung Cancer.

    PubMed

    Lin, Jessica J; Zhu, Viola W; Yoda, Satoshi; Yeap, Beow Y; Schrock, Alexa B; Dagogo-Jack, Ibiayi; Jessop, Nicholas A; Jiang, Ginger Y; Le, Long P; Gowen, Kyle; Stephens, Philip J; Ross, Jeffrey S; Ali, Siraj M; Miller, Vincent A; Johnson, Melissa L; Lovly, Christine M; Hata, Aaron N; Gainor, Justin F; Iafrate, Anthony J; Shaw, Alice T; Ou, Sai-Hong Ignatius

    2018-04-20

    Purpose: Advanced anaplastic lymphoma kinase (ALK) fusion-positive non-small-cell lung cancers (NSCLCs) are effectively treated with ALK tyrosine kinase inhibitors (TKIs). However, clinical outcomes in these patients vary, and the benefit of TKIs is limited as a result of acquired resistance. Emerging data suggest that the ALK fusion variant may affect clinical outcome, but the molecular basis for this association is unknown. Patients and Methods: We identified 129 patients with ALK-positive NSCLC with known ALK variants. ALK resistance mutations and clinical outcomes on ALK TKIs were retrospectively evaluated according to ALK variant. A Foundation Medicine data set of 577 patients with ALK-positive NSCLC was also examined. Results: The most frequent ALK variants were EML4-ALK variant 1 in 55 patients (43%) and variant 3 in 51 patients (40%). We analyzed 77 tumor biopsy specimens from patients with variants 1 and 3 who had progressed on an ALK TKI. ALK resistance mutations were significantly more common in variant 3 than in variant 1 (57% v 30%; P = .023). In particular, ALK G1202R was more common in variant 3 than in variant 1 (32% v 0%; P < .001). Analysis of the Foundation Medicine database revealed similar associations of variant 3 with ALK resistance mutation and with G1202R (P = .010 and .015, respectively). Among patients treated with the third-generation ALK TKI lorlatinib, variant 3 was associated with a significantly longer progression-free survival than variant 1 (hazard ratio, 0.31; 95% CI, 0.12 to 0.79; P = .011). Conclusion: Specific ALK variants may be associated with the development of ALK resistance mutations, particularly G1202R, and provide a molecular link between variant and clinical outcome. ALK variant thus represents a potentially important factor in the selection of next-generation ALK inhibitors.

  11. SNPdbe: constructing an nsSNP functional impacts database.

    PubMed

    Schaefer, Christian; Meier, Alice; Rost, Burkhard; Bromberg, Yana

    2012-02-15

    Many existing databases annotate experimentally characterized single nucleotide polymorphisms (SNPs). Each non-synonymous SNP (nsSNP) changes one amino acid in the gene product (single amino acid substitution; SAAS). This change can either affect protein function or be neutral in that respect. Most polymorphisms lack experimental annotation of their functional impact. Here, we introduce SNPdbe (SNP database of effects), with predictions of computationally annotated functional impacts of SNPs. Database entries represent nsSNPs in dbSNP and the 1000 Genomes collection, as well as variants from UniProt and PMD. SAASs come from >2600 organisms, 'human' being the most prevalent. The impact of each SAAS on protein function is predicted using the SNAP and SIFT algorithms and augmented with experimentally derived function/structure information and disease associations from PMD, OMIM and UniProt. SNPdbe is consistently updated and easily augmented with new sources of information. The database is available as a MySQL dump and via a web front end that allows searches with any combination of organism names, sequences and mutation IDs. http://www.rostlab.org/services/snpdbe.

  12. MIPS: a database for genomes and protein sequences.

    PubMed Central

    Mewes, H W; Heumann, K; Kaps, A; Mayer, K; Pfeiffer, F; Stocker, S; Frishman, D

    1999-01-01

    The Munich Information Center for Protein Sequences (MIPS-GSF), Martinsried near Munich, Germany, develops and maintains genome oriented databases. It is commonplace that the amount of sequence data available increases rapidly, but not the capacity of qualified manual annotation at the sequence databases. Therefore, our strategy aims to cope with the data stream by the comprehensive application of analysis tools to sequences of complete genomes, the systematic classification of protein sequences and the active support of sequence analysis and functional genomics projects. This report describes the systematic and up-to-date analysis of genomes (PEDANT), a comprehensive database of the yeast genome (MYGD), a database reflecting the progress in sequencing the Arabidopsis thaliana genome (MATD), the database of assembled, annotated human EST clusters (MEST), and the collection of protein sequence data within the framework of the PIR-International Protein Sequence Database (described elsewhere in this volume). MIPS provides access through its WWW server (http://www.mips.biochem.mpg.de) to a spectrum of generic databases, including the above mentioned as well as a database of protein families (PROTFAM), the MITOP database, and the all-against-all FASTA database. PMID:9847138

  13. Integrating 400 million variants from 80,000 human samples with extensive annotations: towards a knowledge base to analyze disease cohorts.

    PubMed

    Hakenberg, Jörg; Cheng, Wei-Yi; Thomas, Philippe; Wang, Ying-Chih; Uzilov, Andrew V; Chen, Rong

    2016-01-08

    Data from a plethora of high-throughput sequencing studies is readily available to researchers, providing genetic variants detected in a variety of healthy and disease populations. While each individual cohort helps gain insights into polymorphic and disease-associated variants, a joint perspective can be more powerful in identifying polymorphisms, rare variants, disease-associations, genetic burden, somatic variants, and disease mechanisms. We have set up a Reference Variant Store (RVS) containing variants observed in a number of large-scale sequencing efforts, such as 1000 Genomes, ExAC, Scripps Wellderly, UK10K; various genotyping studies; and disease association databases. RVS holds extensive annotations pertaining to affected genes, functional impacts, disease associations, and population frequencies. RVS currently stores 400 million distinct variants observed in more than 80,000 human samples. RVS facilitates cross-study analysis to discover novel genetic risk factors, gene-disease associations, potential disease mechanisms, and actionable variants. Due to its large reference populations, RVS can also be employed for variant filtration and gene prioritization. A web interface to public datasets and annotations in RVS is available at https://rvs.u.hpc.mssm.edu/.

  14. [Phenotypic and genotypic spectra of patients with glucose-6-phosphate dehydrogenase deficiency gene known pathogenic variants: a single-center study].

    PubMed

    Chen, X; Yang, L; Wang, H J; Wu, B B; Lu, Y L; Dong, X R; Zhou, W H

    2018-05-02

    Objective: To analyze the hotspots of known pathogenic disease-causing variants of glucose-6-phosphate dehydrogenase (G6PD) and the phenotype spectrum of neonatal patients with known pathogenic disease-causing variants of G6PD. Methods: The known pathogenic disease-causing variants of G6PD were collected from the Human Gene Mutation Database. Screening was performed for these variants among the 7 966 cases (2 357 neonatal, 5 609 non-neonatal) in the sequencing database of the Molecular Diagnosis Center, Children's Hospital of Fudan University. All these samples were from patients with a suspected genetic disorder. The database contained Whole Exon Sequencing data and Clinical Exon Sequencing data. We screened out the patients with known pathogenic disease-causing variants of G6PD, analyzed the hotspot of G6PD and the phenotype spectrum of neonatal patients with known pathogenic disease-causing variants of G6PD. Results: (1) Among the next generation sequencing data of the 7 966 samples, 86 samples (1.1%) were detected as positive for the known pathogenic disease-causing variants of G6PD (positive samples set). In the positive sample set, 51 patients (33 males, 18 females) were newborn babies. Forty-three patients (26 males, 17 females) had the enzyme activity data of G6PD. (2) Among the 86 samples, Arg463His, Arg459Leu, Leu342Phe, Val291Met were the leading 4 disease-causing variants found in 72 samples (84%). (3) Male neonatal patients with the same variants had statistically significant differences in enzyme activity: among 13 patients with Arg463His, enzyme activity of 9 patients was ranked as grade Ⅲ, 1 case ranked as Ⅳ, 3 cases had no activity data; among 10 patients with Arg459Leu, enzyme activity of 4 patients was ranked as Ⅱ, 4 cases ranked as Ⅲ, 2 cases had no activity data; among 2 patients with His32Arg, enzyme activity of one patient was ranked as Ⅱ, another was Ⅲ. Male neonatal patients with the same mutation and enzyme activity also had the

  15. ToTem: a tool for variant calling pipeline optimization.

    PubMed

    Tom, Nikola; Tom, Ondrej; Malcikova, Jitka; Pavlova, Sarka; Kubesova, Blanka; Rausch, Tobias; Kolarik, Miroslav; Benes, Vladimir; Bystry, Vojtech; Pospisilova, Sarka

    2018-06-26

    High-throughput bioinformatics analyses of next generation sequencing (NGS) data often require challenging pipeline optimization. The key problem is choosing appropriate tools and selecting the best parameters for optimal precision and recall. Here we introduce ToTem, a tool for automated pipeline optimization. ToTem is a stand-alone web application with a comprehensive graphical user interface (GUI). ToTem is written in Java and PHP with an underlying connection to a MySQL database. Its primary role is to automatically generate, execute and benchmark different variant calling pipeline settings. Our tool allows an analysis to be started from any level of the process and with the possibility of plugging almost any tool or code. To prevent an over-fitting of pipeline parameters, ToTem ensures the reproducibility of these by using cross validation techniques that penalize the final precision, recall and F-measure. The results are interpreted as interactive graphs and tables allowing an optimal pipeline to be selected, based on the user's priorities. Using ToTem, we were able to optimize somatic variant calling from ultra-deep targeted gene sequencing (TGS) data and germline variant detection in whole genome sequencing (WGS) data. ToTem is a tool for automated pipeline optimization which is freely available as a web application at https://totem.software.
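
    The three benchmark metrics named in this abstract can be written out as a short sketch. It is not ToTem code: the function name `benchmark_calls` and the representation of a variant as a `(chromosome, position, ref, alt)` tuple are assumptions made for illustration, and the cross-validation penalty described in the abstract is not reproduced.

```python
def benchmark_calls(called, truth):
    """Compare one pipeline's variant calls against a truth set.

    Both arguments are iterables of hashable variant keys, here assumed to be
    (chromosome, position, ref, alt) tuples. Returns (precision, recall, F-measure).
    """
    called, truth = set(called), set(truth)
    true_pos = len(called & truth)    # called and present in the truth set
    false_pos = len(called - truth)   # called but absent from the truth set
    false_neg = len(truth - called)   # in the truth set but missed

    precision = true_pos / (true_pos + false_pos) if called else 0.0
    recall = true_pos / (true_pos + false_neg) if truth else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    # F-measure (F1): harmonic mean of precision and recall.
    f_measure = 2 * precision * recall / (precision + recall)
    return precision, recall, f_measure
```

    Running this function once per pipeline setting and ranking the settings by whichever metric matters most mirrors, in a simplified way, the selection "based on the user's priorities" that the abstract mentions.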

  16. Experimental evidence of stress-field-induced selection of variants in Ni-Mn-Ga ferromagnetic shape-memory alloys

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Wang, Y. D.; Key Laboratory for Anisotropy and Texture of Materials; Brown, D. W.

    2007-05-01

    The in situ time-of-flight neutron-diffraction measurements captured well the martensitic transformation behavior of the Ni-Mn-Ga ferromagnetic shape-memory alloys under uniaxial stress fields. We found that a small uniaxial stress applied during phase transformation dramatically disturbed the distribution of variants in the product phase. The observed changes in the distributions of variants may be explained by considering the role of the minimum distortion energy of the Bain transformation in the effective partition among the variants belonging to the same orientation of parent phase. It was also found that transformation kinetics under various stress fields follows the scale law. The present investigations provide the fundamental approach for scaling the evolution of microstructures in martensitic transitions, which is of general interest to the condensed matter community.

  17. Characterization of pathogenic SORL1 genetic variants for association with Alzheimer’s disease: a clinical interpretation strategy

    PubMed Central

    Holstege, Henne; van der Lee, Sven J; Hulsman, Marc; Wong, Tsz Hang; van Rooij, Jeroen GJ; Weiss, Marjan; Louwersheimer, Eva; Wolters, Frank J; Amin, Najaf; Uitterlinden, André G; Hofman, Albert; Ikram, M Arfan; van Swieten, John C; Meijers-Heijboer, Hanne; van der Flier, Wiesje M; Reinders, Marcel JT; van Duijn, Cornelia M; Scheltens, Philip

    2017-01-01

    Accumulating evidence suggests that genetic variants in the SORL1 gene are associated with Alzheimer disease (AD), but a strategy to identify which variants are pathogenic is lacking. In a discovery sample of 115 SORL1 variants detected in 1908 Dutch AD cases and controls, we identified the variant characteristics associated with SORL1 variant pathogenicity. Findings were replicated in an independent sample of 103 SORL1 variants detected in 3193 AD cases and controls. In a combined sample of the discovery and replication samples, comprising 181 unique SORL1 variants, we developed a strategy to classify SORL1 variants into five subtypes ranging from pathogenic to benign. We tested this pathogenicity screen in SORL1 variants reported in two independent published studies. SORL1 variant pathogenicity is defined by the Combined Annotation Dependent Depletion (CADD) score and the minor allele frequency (MAF) reported by the Exome Aggregation Consortium (ExAC) database. Variants predicted strongly damaging (CADD score >30), which are extremely rare (ExAC-MAF <1 × 10^-5) increased AD risk by 12-fold (95% CI 4.2–34.3; P=5 × 10^-9). Protein-truncating SORL1 mutations were all unknown to ExAC and occurred exclusively in AD cases. More common SORL1 variants (ExAC-MAF ≥1 × 10^-5) were not associated with increased AD risk, even when predicted strongly damaging. Findings were independent of gender and the APOE-ε4 allele. High-risk SORL1 variants were observed in a substantial proportion of the AD cases analyzed (2%). Based on their effect size, we propose to consider high-risk SORL1 variants next to variants in APOE, PSEN1, PSEN2 and APP for personalized risk assessments in clinical practice. PMID:28537274
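
    The high-risk criterion quoted in this abstract (CADD score above 30 combined with an ExAC minor allele frequency below 1 × 10^-5) can be expressed as a small predicate. This is a sketch only: the function and constant names are hypothetical, treating a variant that is absent from ExAC as having a frequency of zero is an assumption, and the full five-subtype classification is not reproduced because the abstract does not give its remaining cut-offs.

```python
CADD_STRONGLY_DAMAGING = 30.0     # abstract: "CADD score >30"
EXAC_MAF_EXTREMELY_RARE = 1e-5    # abstract: "ExAC-MAF <1 x 10^-5"

def is_high_risk(cadd_score, exac_maf):
    """Return True when a variant meets both thresholds quoted in the abstract.

    exac_maf may be None for a variant not reported in ExAC; this sketch
    assumes such a variant counts as extremely rare (frequency 0).
    """
    maf = 0.0 if exac_maf is None else exac_maf
    return cadd_score > CADD_STRONGLY_DAMAGING and maf < EXAC_MAF_EXTREMELY_RARE
```

    Both comparisons are strict, matching the ">30" and "<1 × 10^-5" wording, so a variant sitting exactly on either threshold is not flagged.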

  18. Palm-Vein Classification Based on Principal Orientation Features

    PubMed Central

    Zhou, Yujia; Liu, Yaqin; Feng, Qianjin; Yang, Feng; Huang, Jing; Nie, Yixiao

    2014-01-01

    Personal recognition using palm–vein patterns has emerged as a promising alternative for human recognition because of its uniqueness, stability, live body identification, flexibility, and difficulty to cheat. With the expanding application of palm–vein pattern recognition, the corresponding growth of the database has resulted in a long response time. To shorten the response time of identification, this paper proposes a simple and useful classification for palm–vein identification based on principal direction features. In the registration process, the Gaussian-Radon transform is adopted to extract the orientation matrix and then compute the principal direction of a palm–vein image based on the orientation matrix. The database can be classified into six bins based on the value of the principal direction. In the identification process, the principal direction of the test sample is first extracted to ascertain the corresponding bin. One-by-one matching with the training samples is then performed in the bin. To improve recognition efficiency while maintaining better recognition accuracy, two neighborhood bins of the corresponding bin are continuously searched to identify the input palm–vein image. Evaluation experiments are conducted on three different databases, namely, PolyU, CASIA, and the database of this study. Experimental results show that the searching range of one test sample in PolyU, CASIA and our database by the proposed method for palm–vein identification can be reduced to 14.29%, 14.50%, and 14.28%, with retrieval accuracy of 96.67%, 96.00%, and 97.71%, respectively. With 10,000 training samples in the database, the execution time of the identification process by the traditional method is 18.56 s, while that by the proposed approach is 3.16 s. The experimental results confirm that the proposed approach is more efficient than the traditional method, especially for a large database. PMID:25383715
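
    The bin-based search described in this abstract can be sketched as follows. The details are assumptions, not taken from the paper: principal directions are taken to be angles in degrees on [0, 180), the six bins are given equal width, the bins are treated as circular so that the first and last are neighbors, and the names `bin_index`, `build_index` and `candidate_ids` are hypothetical. Extraction of the principal direction itself (the Gaussian-Radon step) is not shown.

```python
NUM_BINS = 6
BIN_WIDTH = 180.0 / NUM_BINS  # 30 degrees per bin under the equal-width assumption

def bin_index(direction_deg):
    """Map a principal direction in degrees to one of the six bins."""
    return int((direction_deg % 180.0) // BIN_WIDTH)

def build_index(templates):
    """Registration: group (sample_id, direction_deg) pairs by bin."""
    index = {b: [] for b in range(NUM_BINS)}
    for sample_id, direction_deg in templates:
        index[bin_index(direction_deg)].append(sample_id)
    return index

def candidate_ids(index, direction_deg):
    """Identification: list the samples to match one by one, starting with the
    test sample's own bin and continuing with its two neighboring bins."""
    own = bin_index(direction_deg)
    search_order = [own, (own - 1) % NUM_BINS, (own + 1) % NUM_BINS]
    return [sample_id for b in search_order for sample_id in index[b]]
```

    Only the templates returned by `candidate_ids` need to be compared with the test image, which is how a bin scheme of this kind shortens the search range relative to matching against the whole database.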

  19. Genetic polymorphisms of pharmacogenomic VIP variants in the Yi population from China.

    PubMed

    Yan, Mengdan; Li, Dianzhen; Zhao, Guige; Li, Jing; Niu, Fanglin; Li, Bin; Chen, Peng; Jin, Tianbo

    2018-03-30

    Drug response and target therapeutic dosage differ among individuals. The variability is largely genetically determined. With the development of pharmacogenetics and pharmacogenomics, widespread research has provided a wealth of information on drug-related genetic polymorphisms, and the very important pharmacogenetic (VIP) variants have been identified for the major populations around the world, whereas less is known regarding minorities in China, including the Yi ethnic group. Our research aims to screen the potential pharmacogenomic variants in the Yi population and provide a theoretical basis for future medication guidance. In the present study, 80 VIP variants (selected from the PharmGKB database) were genotyped in 100 unrelated and healthy Yi adults recruited for our research. Through statistical analysis, we compared the Yi with 11 other populations listed in the HapMap database to detect significant SNPs. Two specific SNPs were subsequently enrolled in an observation on global allele distribution with the frequencies downloaded from ALlele FREquency Database. Moreover, F-statistics (Fst), genetic structure and phylogenetic tree analyses were conducted to determine genetic similarity between the 12 ethnic groups. Using the χ2 tests, rs1128503 (ABCB1), rs7294 (VKORC1), rs9934438 (VKORC1), rs1540339 (VDR) and rs689466 (PTGS2) were identified as the significantly different loci for further analysis. The global allele distribution revealed that the allele "A" of rs1540339 and rs9934438 was more frequent in Yi people, which was consistent with most populations in East Asia. F-statistics (Fst), genetic structure and phylogenetic tree analyses demonstrated that the Yi and CHD shared the closest genetic relationship. Additionally, Yi was considered similar to the Han people from Shaanxi province among the domestic ethnic populations in China. Our results demonstrated significant differences on

  20. BTKbase, mutation database for X-linked agammaglobulinemia (XLA).

    PubMed Central

    Vihinen, M; Brandau, O; Brandén, L J; Kwan, S P; Lappalainen, I; Lester, T; Noordzij, J G; Ochs, H D; Ollila, J; Pienaar, S M; Riikonen, P; Saha, B K; Smith, C I

    1998-01-01

    X-linked agammaglobulinemia (XLA) is an immunodeficiency caused by mutations in the gene coding for Bruton's agammaglobulinemia tyrosine kinase (BTK). A database (BTKbase) of BTK mutations has been compiled and the recent update lists 463 mutation entries from 406 unrelated families showing 303 unique molecular events. In addition to mutations, the database also lists variants or polymorphisms. Each patient is given a unique patient identity number (PIN). Information is included regarding the phenotype including symptoms. Mutations in all the five domains of BTK have been noticed to cause the disease, the most common event being missense mutations. The mutations appear almost uniformly throughout the molecule and frequently affect CpG sites that code for arginine residues. The putative structural implications of all the missense mutations are given in the database. The improved version of the registry having a number of new features is available at http://www.helsinki.fi/science/signal/btkbase.html PMID:9399844

  1. An incremental database access method for autonomous interoperable databases

    NASA Technical Reports Server (NTRS)

    Roussopoulos, Nicholas; Sellis, Timos

    1994-01-01

    We investigated a number of design and performance issues of interoperable database management systems (DBMS's). The major results of our investigation were obtained in the areas of client-server database architectures for heterogeneous DBMS's, incremental computation models, buffer management techniques, and query optimization. We finished a prototype of an advanced client-server workstation-based DBMS which allows access to multiple heterogeneous commercial DBMS's. Experiments and simulations were then run to compare its performance with the standard client-server architectures. The focus of this research was on adaptive optimization methods of heterogeneous database systems. Adaptive buffer management accounts for the random and object-oriented access methods for which no known characterization of the access patterns exists. Adaptive query optimization means that value distributions and selectivities, which play the most significant role in query plan evaluation, are continuously refined to reflect the actual values as opposed to static ones that are computed off-line. Query feedback is a concept that was first introduced to the literature by our group. We employed query feedback for both adaptive buffer management and for computing value distributions and selectivities. For adaptive buffer management, we use the page faults of prior executions to achieve more 'informed' management decisions. For the estimation of the distributions of the selectivities, we use curve-fitting techniques, such as least squares and splines, for regressing on these values.

  2. A polarimetric scattering database for non-spherical ice particles at microwave wavelengths

    NASA Astrophysics Data System (ADS)

    Lu, Yinghui; Jiang, Zhiyuan; Aydin, Kultegin; Verlinde, Johannes; Clothiaux, Eugene E.; Botta, Giovanni

    2016-10-01

    The atmospheric science community has entered a period in which electromagnetic scattering properties at microwave frequencies of realistically constructed ice particles are necessary for making progress on a number of fronts. One front includes retrieval of ice-particle properties and signatures from ground-based, airborne, and satellite-based radar and radiometer observations. Another front is evaluation of model microphysics by application of forward operators to their outputs and comparison to observations during case study periods. Yet a third front is data assimilation, where again forward operators are applied to databases of ice-particle scattering properties and the results compared to observations, with their differences leading to corrections of the model state. Over the past decade investigators have developed databases of ice-particle scattering properties at microwave frequencies and made them openly available. Motivated by and complementing these earlier efforts, a database containing polarimetric single-scattering properties of various types of ice particles at millimeter to centimeter wavelengths is presented. While the database presented here contains only single-scattering properties of ice particles in a fixed orientation, ice-particle scattering properties are computed for many different directions of the radiation incident on them. These results are useful for understanding the dependence of ice-particle scattering properties on ice-particle orientation with respect to the incident radiation. For ice particles that are small compared to the wavelength, the number of incident directions of the radiation is sufficient to compute reasonable estimates of their (randomly) orientation-averaged scattering properties. This database is complementary to earlier ones in that it contains complete (polarimetric) scattering property information for each ice particle - 44 plates, 30 columns, 405 branched planar crystals, 660 aggregates, and 640 conical

  3. iMETHYL: an integrative database of human DNA methylation, gene expression, and genomic variation.

    PubMed

    Komaki, Shohei; Shiwa, Yuh; Furukawa, Ryohei; Hachiya, Tsuyoshi; Ohmomo, Hideki; Otomo, Ryo; Satoh, Mamoru; Hitomi, Jiro; Sobue, Kenji; Sasaki, Makoto; Shimizu, Atsushi

    2018-01-01

    We launched an integrative multi-omics database, iMETHYL (http://imethyl.iwate-megabank.org). iMETHYL provides whole-DNA methylation (~24 million autosomal CpG sites), whole-genome (~9 million single-nucleotide variants), and whole-transcriptome (>14 000 genes) data for CD4+ T-lymphocytes, monocytes, and neutrophils collected from approximately 100 subjects. These data were obtained from whole-genome bisulfite sequencing, whole-genome sequencing, and whole-transcriptome sequencing, making iMETHYL a comprehensive database.

  4. Efficient Privacy-Enhancing Techniques for Medical Databases

    NASA Astrophysics Data System (ADS)

    Schartner, Peter; Schaffer, Martin

    In this paper, we introduce an alternative for using linkable unique health identifiers: locally generated system-wide unique digital pseudonyms. The presented techniques are based on a novel technique called collision-free number generation which is discussed in the introductory part of the article. Afterwards, attention is paid to two specific variants of collision-free number generation: one based on the RSA-Problem and the other one based on the Elliptic Curve Discrete Logarithm Problem. Finally, two applications are sketched: centralized medical records and anonymous medical databases.

  5. Analysis of the Association between Catechol-O-Methyltransferase Val158Met and Male Sexual Orientation.

    PubMed

    Yu, Wei; Tu, Dan; Hong, Fuchang; Wang, Jing; Liu, Xiaoli; Cai, Yumao; Xu, Ruiwei; Zhao, Guanglu; Wang, Feng; Pan, Hong; Wu, Shinan; Feng, Tiejian; Wang, Binbin

    2015-09-01

    Male sexual orientation is thought to have a genetic component. However, previous studies have failed to generate positive results from among candidate genes. Catechol-O-methyltransferase (COMT), located on chromosome 22, has six exons, spans 27 kb, and encodes a protein of 271 amino acids. COMT has an important role in regulating the embryonic levels of catecholamine neurotransmitters (such as dopamine, norepinephrine, and epinephrine) and estrogens. COMT is also thought to be related to sexual orientation. This study aimed to investigate the relationship between the COMT Val158Met variant and male sexual orientation. We performed association analysis of the COMT gene single nucleotide polymorphism, Val158Met, in 409 homosexual cases and 387 heterosexual control Chinese men. COMT polymorphism status was determined using a polymerase chain reaction-based assay. Polymerase chain reaction was performed to genotype the COMT Val158Met polymorphism. The genotype and allele frequency distributions were compared between the male homosexual and control groups. Significant differences, both in genotype and alleles, between male homosexual individuals and controls indicated a genetic component related to male homosexuality. The Val allele recessive model could be an interrelated genetic model of the cause of male homosexuality. The COMT Val158Met variant might be associated with male sexual orientation and a recessive model was suggested. © 2015 International Society for Sexual Medicine.

  6. A service-oriented data access control model

    NASA Astrophysics Data System (ADS)

    Meng, Wei; Li, Fengmin; Pan, Juchen; Song, Song; Bian, Jiali

    2017-01-01

    The development of mobile computing, cloud computing and distributed computing meets growing individual service needs. In complex application systems, ensuring real-time, dynamic, and fine-grained data access control is an urgent problem. By analyzing common data access control models, and building on the mandatory access control model, the paper proposes a service-oriented access control model. By regarding system services as subjects and database data as objects, the model defines access levels and access identification for subjects and objects, and ensures that system services access databases securely.
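
    The abstract does not state the model's decision rule, so the following is only one plausible reading in the spirit of mandatory access control, written to make the subject/object roles concrete. Every name here (`Service`, `DataObject`, `may_access`) and the rule itself (the service's level must be at least the data's level, and the data's identifier must be among those granted to the service) are assumptions for illustration, not the paper's definitions.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Service:
    """Subject: a system service requesting data (hypothetical structure)."""
    name: str
    level: int               # access level of the service
    identifiers: frozenset   # access identifications granted to the service

@dataclass(frozen=True)
class DataObject:
    """Object: a piece of database data (hypothetical structure)."""
    name: str
    level: int               # access level required to touch the data
    identifier: str          # access identification attached to the data

def may_access(service, data):
    """Assumed rule: allow access only when the level check and the
    identification check both pass."""
    level_ok = service.level >= data.level
    identification_ok = data.identifier in service.identifiers
    return level_ok and identification_ok
```

    Keeping the two checks separate allows fine-grained control: a highly privileged service is still refused data whose identification it has not been granted.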

  7. PredictSNP2: A Unified Platform for Accurately Evaluating SNP Effects by Exploiting the Different Characteristics of Variants in Distinct Genomic Regions

    PubMed Central

    Brezovský, Jan

    2016-01-01

    An important message taken from human genome sequencing projects is that the human population exhibits approximately 99.9% genetic similarity. Variations in the remaining parts of the genome determine our identity, trace our history and reveal our heritage. The precise delineation of phenotypically causal variants plays a key role in providing accurate personalized diagnosis, prognosis, and treatment of inherited diseases. Several computational methods for achieving such delineation have been reported recently. However, their ability to pinpoint potentially deleterious variants is limited by the fact that their mechanisms of prediction do not account for the existence of different categories of variants. Consequently, their output is biased towards the variant categories that are most strongly represented in the variant databases. Moreover, most such methods provide numeric scores but not binary predictions of the deleteriousness of variants or confidence scores that would be more easily understood by users. We have constructed three datasets covering different types of disease-related variants, which were divided across five categories: (i) regulatory, (ii) splicing, (iii) missense, (iv) synonymous, and (v) nonsense variants. These datasets were used to develop category-optimal decision thresholds and to evaluate six tools for variant prioritization: CADD, DANN, FATHMM, FitCons, FunSeq2 and GWAVA. This evaluation revealed some important advantages of the category-based approach. The results obtained with the five best-performing tools were then combined into a consensus score. Additional comparative analyses showed that in the case of missense variations, protein-based predictors perform better than DNA sequence-based predictors. A user-friendly web interface was developed that provides easy access to the five tools’ predictions, and their consensus scores, in a user-understandable format tailored to the specific features of different categories of variations

  8. PredictSNP2: A Unified Platform for Accurately Evaluating SNP Effects by Exploiting the Different Characteristics of Variants in Distinct Genomic Regions.

    PubMed

    Bendl, Jaroslav; Musil, Miloš; Štourač, Jan; Zendulka, Jaroslav; Damborský, Jiří; Brezovský, Jan

    2016-05-01

    An important message taken from human genome sequencing projects is that the human population exhibits approximately 99.9% genetic similarity. Variations in the remaining parts of the genome determine our identity, trace our history and reveal our heritage. The precise delineation of phenotypically causal variants plays a key role in providing accurate personalized diagnosis, prognosis, and treatment of inherited diseases. Several computational methods for achieving such delineation have been reported recently. However, their ability to pinpoint potentially deleterious variants is limited by the fact that their mechanisms of prediction do not account for the existence of different categories of variants. Consequently, their output is biased towards the variant categories that are most strongly represented in the variant databases. Moreover, most such methods provide numeric scores but not binary predictions of the deleteriousness of variants or confidence scores that would be more easily understood by users. We have constructed three datasets covering different types of disease-related variants, which were divided across five categories: (i) regulatory, (ii) splicing, (iii) missense, (iv) synonymous, and (v) nonsense variants. These datasets were used to develop category-optimal decision thresholds and to evaluate six tools for variant prioritization: CADD, DANN, FATHMM, FitCons, FunSeq2 and GWAVA. This evaluation revealed some important advantages of the category-based approach. The results obtained with the five best-performing tools were then combined into a consensus score. Additional comparative analyses showed that in the case of missense variations, protein-based predictors perform better than DNA sequence-based predictors. A user-friendly web interface was developed that provides easy access to the five tools' predictions, and their consensus scores, in a user-understandable format tailored to the specific features of different categories of variations. 
To
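
    The category-based approach in the preceding abstract (per-category decision thresholds that turn numeric scores into binary calls, then a consensus across tools) can be illustrated with a minimal sketch. Everything below is assumed for illustration: the tool names, the cutoff values, and the majority-vote style consensus are hypothetical and are not taken from the study.

```python
# Illustrative cutoffs only: {tool: {category: cutoff}}. These are NOT values from the study.
THRESHOLDS = {
    "tool_a": {"missense": 0.70, "synonymous": 0.40},
    "tool_b": {"missense": 0.55, "synonymous": 0.35},
    "tool_c": {"missense": 0.80, "synonymous": 0.50},
}


def binary_calls(scores, category, thresholds=THRESHOLDS):
    """Map each tool's numeric score to True/False using that tool's cutoff for the category."""
    return {tool: score >= thresholds[tool][category] for tool, score in scores.items()}


def consensus_score(calls):
    """Fraction of tools calling the variant deleterious (a simple vote, assumed here)."""
    return sum(calls.values()) / len(calls)


# Hypothetical scores for one variant from three hypothetical tools.
scores = {"tool_a": 0.75, "tool_b": 0.50, "tool_c": 0.90}
calls = binary_calls(scores, "missense")
print(calls, consensus_score(calls))
```

    The same scores can yield different calls under another category's cutoffs, which is the point of keeping thresholds per category rather than global.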

  9. Proteomic characterization of histone variants in the mouse testis by mass spectrometry-based top-down analysis.

    PubMed

    Kwak, Ho-Geun; Dohmae, Naoshi

    2016-11-15

    Various histones, including testis-specific histones, exist during spermatogenesis and some of them have been reported to play a key role in chromatin remodeling. Mass spectrometry (MS)-based characterization has become an important step in understanding histone structures. Although individual histones or partial histone variant groups have been characterized, the comprehensive analysis of histone variants has not yet been conducted in the mouse testis. Here, we present the comprehensive separation and characterization of histone variants from mouse testes by a top-down approach using MS. Histone variants were successfully separated on a reversed phase column using high performance liquid chromatography (HPLC) with an ion-pairing reagent. Increasing concentrations of testis-specific histones were observed in the mouse testis and some somatic histones increased in the epididymis. Specifically, the increase of mass abundance in H3.2 in the epididymis was inversely proportional to the decrease in H3t in the testis, which was approximately 80%. The top-down characterization of intact histone variants in the mouse testis was performed using LC-MS/MS. The masses of separated histone variants and their expected post-translational modifications were calculated by performing deconvolution with information taken from the database. TH2A, TH2B and H3t were characterized by MS/MS fragmentation. Our approach provides comprehensive knowledge for identification of histone variants in the mouse testis that will contribute to the structural and functional research of histone variants during spermatogenesis.

  10. Gender Variance and Sexual Orientation Among Male Spirit Mediums in Myanmar.

    PubMed

    Coleman, Eli; Allen, Mariette Pathy; Ford, Jessie V

    2018-05-01

    This article describes the gender identity, gender expression, and sexual orientation of male spirit mediums in Myanmar. Our analysis is based on ethnographic work, field observation, and 10 semi-structured interviews. These observations were conducted from 2010 to 2015, mostly in Mandalay, with some fieldwork in Yangon and Bagan. The focus of this investigation was specifically on achout (gender variant individuals) who were spirit mediums (nat kadaw). Semi-structured interviews explored the ways that participants understood their gender identity, gender expression, and sexuality in relation to their work as spirit mediums and broader social life. Myanmar remains quite a homophobic and transphobic culture but is undergoing rapid economic and social change. Therefore, it provides an interesting context to study how safe spaces are produced for sexual/gender minorities amidst broader social change. We find that, through the animistic belief structure, there is a growing space for gender nonconforming people, gender variant, and same-sex-oriented individuals (achout) to neutralize their stigmatized status and attain a level of respect and economic advantage. Their ability to become nat kadaw (mediums of spirits) mitigates or trumps their stigmatized status.

  11. IMPACT web portal: oncology database integrating molecular profiles with actionable therapeutics.

    PubMed

    Hintzsche, Jennifer D; Yoo, Minjae; Kim, Jihye; Amato, Carol M; Robinson, William A; Tan, Aik Choon

    2018-04-20

    With the advancement of next generation sequencing technology, researchers are now able to identify important variants and structural changes in DNA and RNA in cancer patient samples. With this information, we can now correlate specific variants and/or structural changes with actionable therapeutics known to inhibit these variants. We introduce the creation of the IMPACT Web Portal, a new online resource that connects molecular profiles of tumors to approved drugs, investigational therapeutics and pharmacogenetics-associated drugs. IMPACT Web Portal contains a total of 776 drugs connected to 1326 target genes and 435 target variants, fusions, and copy number alterations. The online IMPACT Web Portal allows users to search for various genetic alterations and connects them to three levels of actionable therapeutics. The results are categorized into 3 levels: Level 1 contains approved drugs separated into two groups; Level 1A contains approved drugs with variant specific information while Level 1B contains approved drugs with gene level information. Level 2 contains drugs currently in oncology clinical trials. Level 3 provides pharmacogenetic associations between approved drugs and genes. IMPACT Web Portal allows for sequencing data to be linked to actionable therapeutics for translational and drug repurposing research. The IMPACT Web Portal online resource allows users to query genes and variants to approved and investigational drugs. We envision that this resource will be a valuable database for personalized medicine and drug repurposing. IMPACT Web Portal is freely available for non-commercial use at http://tanlab.ucdenver.edu/IMPACT.

  12. AgdbNet – antigen sequence database software for bacterial typing

    PubMed Central

    Jolley, Keith A; Maiden, Martin CJ

    2006-01-01

    Background: Bacterial typing schemes based on the sequences of genes encoding surface antigens require databases that provide a uniform, curated, and widely accepted nomenclature of the variants identified. Due to the differences in typing schemes, imposed by the diversity of genes targeted, creating these databases has typically required the writing of one-off code to link the database to a web interface. Here we describe agdbNet, widely applicable web database software that facilitates simultaneous BLAST querying of multiple loci using either nucleotide or peptide sequences. Results: Databases are described by XML files that are parsed by a Perl CGI script. Each database can have any number of loci, which may be defined by nucleotide and/or peptide sequences. The software is currently in use on at least five public databases for the typing of Neisseria meningitidis, Campylobacter jejuni and Streptococcus equi and can be set up to query internal isolate tables or suitably-configured external isolate databases, such as those used for multilocus sequence typing. The style of the resulting website can be fully configured by modifying stylesheets and through the use of customised header and footer files that surround the output of the script. Conclusion: The software provides a rapid means of setting up customised Internet antigen sequence databases. The flexible configuration options enable typing schemes with differing requirements to be accommodated. PMID:16790057

  13. The Variant p.(Arg183Trp) in SPTLC2 Causes Late-Onset Hereditary Sensory Neuropathy.

    PubMed

    Suriyanarayanan, Saranya; Auranen, Mari; Toppila, Jussi; Paetau, Anders; Shcherbii, Maria; Palin, Eino; Wei, Yu; Lohioja, Tarja; Schlotter-Weigel, Beate; Schön, Ulrike; Abicht, Angela; Rautenstrauss, Bernd; Tyynismaa, Henna; Walter, Maggie C; Hornemann, Thorsten; Ylikallio, Emil

    2016-03-01

    Hereditary sensory and autonomic neuropathy 1 (HSAN1) is an autosomal dominant disorder that can be caused by variants in SPTLC1 or SPTLC2, encoding subunits of serine palmitoyl-CoA transferase. Disease variants alter the enzyme's substrate specificity and lead to accumulation of neurotoxic 1-deoxysphingolipids. We describe two families with autosomal dominant HSAN1C caused by a new variant in SPTLC2, c.547C>T, p.(Arg183Trp). The variant changed a conserved amino acid and was not found in public variant databases. All patients had a relatively mild progressive distal sensory impairment, with onset after age 50. Small fibers were affected early, leading to abnormalities on quantitative sensory testing. Sural biopsy revealed a severe chronic axonal neuropathy with subtotal loss of myelinated axons, relatively preserved number of non-myelinated fibers and no signs for regeneration. Skin biopsy with PGP9.5 labeling showed lack of intraepidermal nerve endings early in the disease. Motor manifestations developed later in the disease course, but there was no evidence of autonomic involvement. Patients had elevated serum 1-deoxysphingolipids, and the variant protein produced elevated amounts of 1-deoxysphingolipids in vitro, which proved the pathogenicity of the variant. Our results expand the genetic spectrum of HSAN1C and provide further detail about the clinical characteristics. Sequencing of SPTLC2 should be considered in all patients presenting with mild late-onset sensory-predominant small or large fiber neuropathy.

  14. Characterization of Novel Missense Variants of SERPINA1 Gene Causing Alpha-1 Antitrypsin Deficiency.

    PubMed

    Matamala, Nerea; Lara, Beatriz; Gomez-Mariano, Gema; Martínez, Selene; Retana, Diana; Fernandez, Taiomara; Silvestre, Ramona Angeles; Belmonte, Irene; Rodriguez-Frias, Francisco; Vilar, Marçal; Sáez, Raquel; Iturbe, Igor; Castillo, Silvia; Molina-Molina, María; Texido, Anna; Tirado-Conde, Gema; Lopez-Campos, Jose Luis; Posada, Manuel; Blanco, Ignacio; Janciauskiene, Sabina; Martinez-Delgado, Beatriz

    2018-06-01

    The SERPINA1 gene is highly polymorphic, with more than 100 variants described in databases. SERPINA1 encodes the alpha-1 antitrypsin (AAT) protein, and severe deficiency of AAT is a major contributor to pulmonary emphysema and liver diseases. In Spanish patients with AAT deficiency, we identified seven new variants of the SERPINA1 gene involving amino acid substitutions in different exons: PiSDonosti (S+Ser14Phe), PiTijarafe (Ile50Asn), PiSevilla (Ala58Asp), PiCadiz (Glu151Lys), PiTarragona (Phe227Cys), PiPuerto Real (Thr249Ala), and PiValencia (Lys328Glu). We examined the characteristics of these variants and the putative association with the disease. Mutant proteins were overexpressed in HEK293T cells, and AAT expression, polymerization, degradation, and secretion, as well as antielastase activity, were analyzed by periodic acid-Schiff staining, Western blotting, pulse-chase, and elastase inhibition assays. When overexpressed, S+S14F, I50N, A58D, F227C, and T249A variants formed intracellular polymers and did not secrete AAT protein. Both the E151K and K328E variants secreted AAT protein and did not form polymers, although K328E showed intracellular retention and reduced antielastase activity. We conclude that deficient variants may be more frequent than previously thought and that their discovery is possible only by the complete sequencing of the gene and subsequent functional characterization. Better knowledge of SERPINA1 variants would improve diagnosis and management of individuals with AAT deficiency.

  15. Seshat: A Web service for accurate annotation, validation, and analysis of TP53 variants generated by conventional and next-generation sequencing.

    PubMed

    Tikkanen, Tuomas; Leroy, Bernard; Fournier, Jean Louis; Risques, Rosa Ana; Malcikova, Jitka; Soussi, Thierry

    2018-07-01

    Accurate annotation of genomic variants in human diseases is essential to allow personalized medicine. Assessment of somatic and germline TP53 alterations has now reached the clinic and is required in several circumstances such as the identification of the most effective cancer therapy for patients with chronic lymphocytic leukemia (CLL). Here, we present Seshat, a Web service for annotating TP53 information derived from sequencing data. A flexible framework allows the use of standard file formats such as Mutation Annotation Format (MAF) or Variant Call Format (VCF), as well as common TXT files. Seshat performs accurate variant annotations using the Human Genome Variation Society (HGVS) nomenclature and the stable TP53 genomic reference provided by the Locus Reference Genomic (LRG). In addition, using the 2017 release of the UMD_TP53 database, Seshat provides multiple statistical information for each TP53 variant including database frequency, functional activity, or pathogenicity. The information is delivered in standardized output tables that minimize errors and facilitate comparison of mutational data across studies. Seshat is a beneficial tool to interpret the ever-growing TP53 sequencing data generated by multiple sequencing platforms and it is freely available via the TP53 Website, http://p53.fr or directly at http://vps338341.ovh.net/. © 2018 Wiley Periodicals, Inc.

  16. Rapid functional analysis of computationally complex rare human IRF6 gene variants using a novel zebrafish model.

    PubMed

    Li, Edward B; Truong, Dawn; Hallett, Shawn A; Mukherjee, Kusumika; Schutte, Brian C; Liao, Eric C

    2017-09-01

    Large-scale sequencing efforts have captured a rapidly growing catalogue of genetic variations. However, the accurate establishment of gene variant pathogenicity remains a central challenge in translating personal genomics information to clinical decisions. Interferon Regulatory Factor 6 (IRF6) gene variants are significant genetic contributors to orofacial clefts. Although approximately three hundred IRF6 gene variants have been documented, their effects on protein functions remain difficult to interpret. Here, we demonstrate the protein functions of human IRF6 missense gene variants could be rapidly assessed in detail by their abilities to rescue the irf6 -/- phenotype in zebrafish through variant mRNA microinjections at the one-cell stage. The results revealed many missense variants previously predicted by traditional statistical and computational tools to be loss-of-function and pathogenic retained partial or full protein function and rescued the zebrafish irf6 -/- periderm rupture phenotype. Through mRNA dosage titration and analysis of the Exome Aggregation Consortium (ExAC) database, IRF6 missense variants were grouped by their abilities to rescue at various dosages into three functional categories: wild type function, reduced function, and complete loss-of-function. This sensitive and specific biological assay was able to address the nuanced functional significances of IRF6 missense gene variants and overcome many limitations faced by current statistical and computational tools in assigning variant protein function and pathogenicity. Furthermore, it unlocked the possibility for characterizing yet undiscovered human IRF6 missense gene variants from orofacial cleft patients, and illustrated a generalizable functional genomics paradigm in personalized medicine.

  17. Group-based variant calling leveraging next-generation supercomputing for large-scale whole-genome sequencing studies.

    PubMed

    Standish, Kristopher A; Carland, Tristan M; Lockwood, Glenn K; Pfeiffer, Wayne; Tatineni, Mahidhar; Huang, C Chris; Lamberth, Sarah; Cherkas, Yauheniya; Brodmerkel, Carrie; Jaeger, Ed; Smith, Lance; Rajagopal, Gunaretnam; Curran, Mark E; Schork, Nicholas J

    2015-09-22

    Next-generation sequencing (NGS) technologies have become much more efficient, allowing whole human genomes to be sequenced faster and cheaper than ever before. However, processing the raw sequence reads associated with NGS technologies requires care and sophistication in order to draw compelling inferences about phenotypic consequences of variation in human genomes. It has been shown that different approaches to variant calling from NGS data can lead to different conclusions. Ensuring appropriate accuracy and quality in variant calling can come at a computational cost. We describe our experience implementing and evaluating a group-based approach to calling variants on large numbers of whole human genomes. We explore the influence of many factors that may impact the accuracy and efficiency of group-based variant calling, including group size, the biogeographical backgrounds of the individuals who have been sequenced, and the computing environment used. We make efficient use of the Gordon supercomputer cluster at the San Diego Supercomputer Center by incorporating job-packing and parallelization considerations into our workflow while calling variants on 437 whole human genomes generated as part of a large association study. We ultimately find that our workflow resulted in high-quality variant calls in a computationally efficient manner. We argue that studies like ours should motivate further investigations combining hardware-oriented advances in computing systems with algorithmic developments to tackle emerging 'big data' problems in biomedical research brought on by the expansion of NGS technologies.

  18. Orienteering: An Annotated Bibliography = Orientierungslauf: Eine kommentierte Bibliographie.

    ERIC Educational Resources Information Center

    Seiler, Roland, Ed.; Hartmann, Wolfgang, Ed.

    1994-01-01

    Annotated bibliography of 220 books, monographs, and journal articles on orienteering published 1984-94, from SPOLIT database of the Federal Institute of Sport Science (Cologne, Germany). Annotations in English or German. Ten sections including psychological, physiological, health, sociological, and environmental aspects; training and coaching;…

  19. C++, objected-oriented programming, and astronomical data models

    NASA Technical Reports Server (NTRS)

    Farris, A.

    1992-01-01

    Contemporary astronomy is characterized by increasingly complex instruments and observational techniques, higher data collection rates, and large data archives, placing severe stress on software analysis systems. The object-oriented paradigm represents a significant new approach to software design and implementation that holds great promise for dealing with this increased complexity. The basic concepts of this approach will be characterized in contrast to more traditional procedure-oriented approaches. The fundamental features of object-oriented programming will be discussed from a C++ programming language perspective, using examples familiar to astronomers. This discussion will focus on objects, classes and their relevance to the data type system; the principle of information hiding; and the use of inheritance to implement generalization/specialization relationships. Drawing on the object-oriented approach, features of a new database model to support astronomical data analysis will be presented.

  20. Comprehensive splicing functional analysis of DNA variants of the BRCA2 gene by hybrid minigenes

    PubMed Central

    2012-01-01

    Introduction: The underlying pathogenic mechanism of a large fraction of DNA variants of disease-causing genes is the disruption of the splicing process. We aimed to investigate the effect on splicing of the BRCA2 variants c.8488-1G>A (exon 20) and c.9026_9030del (exon 23), as well as 41 BRCA2 variants reported in the Breast Cancer Information Core (BIC) mutation database. Methods: DNA variants were analyzed with the splicing prediction programs NNSPLICE and Human Splicing Finder. Functional analyses of candidate variants were performed by lymphocyte RT-PCR and/or hybrid minigene assays. Forty-one BIC variants of exons 19, 20, 23 and 24 were bioinformatically selected and generated by PCR-mutagenesis of the wild type minigenes. Results: Lymphocyte RT-PCR of c.8488-1G>A showed intron 19 retention and a 12-nucleotide deletion in exon 20, whereas c.9026_9030del did not show any splicing anomaly. Minigene analysis of c.8488-1G>A displayed the aforementioned aberrant isoforms but also exon 20 skipping. We further evaluated the splicing outcomes of 41 variants of four BRCA2 exons by minigene analysis. Eighteen variants presented splicing aberrations. Most variants (78.9%) disrupted the natural splice sites, whereas four altered putative enhancers/silencers and had a weak effect. Fluorescent RT-PCR of minigenes accurately detected 14 RNA isoforms generated by cryptic site usage, exon skipping and intron retention events. Fourteen variants showed total splicing disruptions and were predicted to truncate or eliminate essential domains of BRCA2. Conclusions: A relevant proportion of BRCA2 variants are correlated with splicing disruptions, indicating that RNA analysis is a valuable tool to assess the pathogenicity of a particular DNA change. The minigene system is a straightforward and robust approach to detect variants with an impact on splicing and contributes to a better knowledge of this gene expression step. PMID:22632462

  1. Imprecision and Uncertainty in the UFO Database Model.

    ERIC Educational Resources Information Center

    Van Gyseghem, Nancy; De Caluwe, Rita

    1998-01-01

    Discusses how imprecision and uncertainty are dealt with in the UFO (Uncertainty and Fuzziness in an Object-oriented) database model. Such information is expressed by means of possibility distributions, and modeled by means of the proposed concept of "role objects." The role objects model uncertain, tentative information about objects,…

  2. A memory advantage for past-oriented over future-oriented performance feedback.

    PubMed

    Nash, Robert A; Winstone, Naomi E; Gregory, Samantha E A; Papps, Emily

    2018-03-05

    People frequently receive performance feedback that describes how well they achieved in the past, and how they could improve in future. In educational contexts, future-oriented (directive) feedback is often argued to be more valuable to learners than past-oriented (evaluative) feedback; critically, prior research led us to predict that it should also be better remembered. We tested this prediction in six experiments. Subjects read written feedback containing evaluative and directive comments, which supposedly related to essays they had previously written (Experiments 1-2), or to essays another person had written (Experiments 3-6). Subjects then tried to reproduce the feedback from memory after a short delay. In all six experiments, the data strongly revealed the opposite effect to the one we predicted: despite only small differences in wording, evaluative feedback was in fact recalled consistently better than directive feedback. Furthermore, even when adult subjects did recall directive feedback, they frequently misremembered it in an evaluative style. These findings appear at odds with the position that being oriented toward the future is advantageous to memory. They also raise important questions about the possible behavioral effects and generalizability of such biases, in terms of students' academic performance. (PsycINFO Database Record (c) 2018 APA, all rights reserved).

  3. Differences in Transcriptional Activity of Human Papillomavirus Type 6 Molecular Variants in Recurrent Respiratory Papillomatosis

    PubMed Central

    Measso do Bonfim, Caroline; Simão Sobrinho, João; Lacerda Nogueira, Rodrigo; Salgado Kupper, Daniel; Cardoso Pereira Valera, Fabiana; Lacerda Nogueira, Maurício; Villa, Luisa Lina; Rahal, Paula; Sichero, Laura

    2015-01-01

    A significant proportion of recurrent respiratory papillomatosis (RRP) is caused by human papillomavirus type 6 (HPV-6). The long control region (LCR) contains cis-elements for regulation of transcription. Our aim was to characterize LCR HPV-6 variants in RRP cases, compare promoter activity of these isolates and search for cellular transcription factors (TFs) that could explain the differences observed. The complete LCR from 13 RRP was analyzed. Transcriptional activity of 5 variants was compared using luciferase assays. Differences in putative TF binding sites among variants were revealed using the TRANSFAC database. Chromatin immunoprecipitation (ChIP) and luciferase assays were used to evaluate TF binding and impact upon transcription, respectively. Juvenile-onset RRP cases harbored exclusively HPV-6vc related variants, whereas among adult-onset cases HPV-6a variants were more prevalent. The HPV-6vc reference was more transcriptionally active than the HPV-6a reference. Active FOXA1, ELF1 and GATA1 binding sites overlap variable nucleotide positions among isolates and influenced LCR activity. Furthermore, our results support a crucial role for ELF1 on transcriptional downregulation. We identified TFs implicated in the regulation of HPV-6 early gene expression. Many of these factors are mutated in cancer or are putative cancer biomarkers, and must be further studied. PMID:26151558

  4. Principles and Recommendations for Standardizing the Use of the Next-Generation Sequencing Variant File in Clinical Settings.

    PubMed

    Lubin, Ira M; Aziz, Nazneen; Babb, Lawrence J; Ballinger, Dennis; Bisht, Himani; Church, Deanna M; Cordes, Shaun; Eilbeck, Karen; Hyland, Fiona; Kalman, Lisa; Landrum, Melissa; Lockhart, Edward R; Maglott, Donna; Marth, Gabor; Pfeifer, John D; Rehm, Heidi L; Roy, Somak; Tezak, Zivana; Truty, Rebecca; Ullman-Cullere, Mollie; Voelkerding, Karl V; Worthey, Elizabeth A; Zaranek, Alexander W; Zook, Justin M

    2017-05-01

    A national workgroup convened by the Centers for Disease Control and Prevention identified principles and made recommendations for standardizing the description of sequence data contained within the variant file generated during the course of clinical next-generation sequence analysis for diagnosing human heritable conditions. The specifications for variant files were initially developed to be flexible with regard to content representation to support a variety of research applications. This flexibility permits variation with regard to how sequence findings are described and this depends, in part, on the conventions used. For clinical laboratory testing, this poses a problem because these differences can compromise the capability to compare sequence findings among laboratories to confirm results and to query databases to identify clinically relevant variants. To provide for a more consistent representation of sequence findings described within variant files, the workgroup made several recommendations that considered alignment to a common reference sequence, variant caller settings, use of genomic coordinates, and gene and variant naming conventions. These recommendations were considered with regard to the existing variant file specifications presently used in the clinical setting. Adoption of these recommendations is anticipated to reduce the potential for ambiguity in describing sequence findings and facilitate the sharing of genomic data among clinical laboratories and other entities. Copyright © 2017 American Society for Investigative Pathology and the Association for Molecular Pathology. Published by Elsevier Inc. All rights reserved.
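
    One reason the same sequence finding can appear differently in two variant files is that REF and ALT alleles may carry redundant shared bases. A minimal sketch of trimming such bases is given below. It is an illustration of the general idea of consistent representation, not a procedure taken from the workgroup's recommendations; the function name `trim_variant` is hypothetical, and full left-alignment of indels, which requires the reference sequence, is deliberately not attempted.

```python
def trim_variant(pos, ref, alt):
    """Trim bases shared by REF and ALT, keeping at least one base in each allele.

    pos is the 1-based position of the first REF base. This only removes
    redundant context; it does not left-align indels in repeats.
    """
    # Drop shared trailing bases first so the start position is unaffected.
    while len(ref) > 1 and len(alt) > 1 and ref[-1] == alt[-1]:
        ref, alt = ref[:-1], alt[:-1]
    # Then drop shared leading bases, advancing the position each time.
    while len(ref) > 1 and len(alt) > 1 and ref[0] == alt[0]:
        ref, alt = ref[1:], alt[1:]
        pos += 1
    return pos, ref, alt


# A substitution written with flanking context collapses to a single base change.
print(trim_variant(100, "CAG", "CTG"))
# A deletion keeps its anchoring base.
print(trim_variant(100, "CTT", "CT"))
```

    With both laboratories applying the same trimming rule, records for the same change compare as equal on (position, REF, ALT) without further parsing.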

  5. EFHC1 variants in juvenile myoclonic epilepsy: reanalysis according to NHGRI and ACMG guidelines for assigning disease causality.

    PubMed

    Bailey, Julia N; Patterson, Christopher; de Nijs, Laurence; Durón, Reyna M; Nguyen, Viet-Huong; Tanaka, Miyabi; Medina, Marco T; Jara-Prado, Aurelio; Martínez-Juárez, Iris E; Ochoa, Adriana; Molina, Yolli; Suzuki, Toshimitsu; Alonso, María E; Wight, Jenny E; Lin, Yu-Chen; Guilhoto, Laura; Targas Yacubian, Elza Marcia; Machado-Salas, Jesús; Daga, Andrea; Yamakawa, Kazuhiro; Grisar, Thierry M; Lakaye, Bernard; Delgado-Escueta, Antonio V

    2017-02-01

    EFHC1 variants are the most common mutations in inherited myoclonic and grand mal clonic-tonic-clonic (CTC) convulsions of juvenile myoclonic epilepsy (JME). We reanalyzed 54 EFHC1 variants associated with epilepsy from 17 cohorts based on National Human Genome Research Institute (NHGRI) and American College of Medical Genetics and Genomics (ACMG) guidelines for interpretation of sequence variants. We calculated Bayesian LOD scores for variants in coinheritance, unconditional exact tests and odds ratios (OR) in case-control associations, allele frequencies in genome databases, and predictions for conservation/pathogenicity. We reviewed whether variants damage EFHC1 functions, whether efhc1 -/- KO mice recapitulate CTC convulsions and "microdysgenesis" neuropathology, and whether supernumerary synaptic and dendritic phenotypes can be rescued in the fly model when EFHC1 is overexpressed. We rated strengths of evidence and applied ACMG combinatorial criteria for classifying variants. Nine variants were classified as "pathogenic," 14 as "likely pathogenic," 9 as "benign," and 2 as "likely benign." Twenty variants of unknown significance had an insufficient number of ancestry-matched controls, but ORs exceeded 5 when compared with racial/ethnic-matched Exome Aggregation Consortium (ExAC) controls. NHGRI gene-level evidence and variant-level evidence establish EFHC1 as the first non-ion channel microtubule-associated protein whose mutations disturb R-type VDCC and TRPM2 calcium currents in overgrown synapses and dendrites within abnormally migrated dislocated neurons, thus explaining CTC convulsions and "microdysgenesis" neuropathology of JME. Genet Med 19(2), 144-156.
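
    For readers unfamiliar with the case-control odds ratio mentioned in the abstract above, a minimal sketch of the arithmetic follows. The counts are invented for illustration and are not data from the study, and the Wald confidence interval is a common textbook approximation added here as an assumption; the abstract itself reports unconditional exact tests.

```python
import math


def odds_ratio(case_carriers, case_noncarriers, control_carriers, control_noncarriers):
    """Odds of carrying the variant among cases divided by the odds among controls."""
    return (case_carriers * control_noncarriers) / (case_noncarriers * control_carriers)


def wald_ci(a, b, c, d, z=1.96):
    """Approximate 95% interval on the log-odds scale; all four counts must be non-zero."""
    log_or = math.log(odds_ratio(a, b, c, d))
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    return math.exp(log_or - z * se), math.exp(log_or + z * se)


# Hypothetical counts: 10 of 100 cases and 2 of 100 controls carry the variant.
or_value = odds_ratio(10, 90, 2, 98)
low, high = wald_ci(10, 90, 2, 98)
print(round(or_value, 2), round(low, 2), round(high, 2))
```

    With small carrier counts the interval is wide, which is one reason an OR above a cutoff may still leave a variant classified as of unknown significance.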

  6. Pleiotropic Effects of Variants in Dementia Genes in Parkinson Disease.

    PubMed

    Ibanez, Laura; Dube, Umber; Davis, Albert A; Fernandez, Maria V; Budde, John; Cooper, Breanna; Diez-Fairen, Monica; Ortega-Cubero, Sara; Pastor, Pau; Perlmutter, Joel S; Cruchaga, Carlos; Benitez, Bruno A

    2018-01-01

    Background: The prevalence of dementia in Parkinson disease (PD) increases dramatically with advancing age, approaching 80% in patients who survive 20 years with the disease. Increasing evidence suggests clinical, pathological and genetic overlap between Alzheimer disease, dementia with Lewy bodies and frontotemporal dementia with PD. However, the contribution of the dementia-causing genes to PD risk, cognitive impairment and dementia in PD is not fully established. Objective: To assess the contribution of coding variants in Mendelian dementia-causing genes on the risk of developing PD and the effect on cognitive performance of PD patients. Methods: We analyzed the coding regions of the amyloid-beta precursor protein (APP), Presenilin 1 and 2 (PSEN1, PSEN2), and Granulin (GRN) genes from 1,374 PD cases and 973 controls using pooled-DNA targeted sequence, human exome-chip and whole-exome sequencing (WES) data by single-variant and gene-based (SKAT-O and burden tests) analyses. Global cognitive function was assessed using the Mini-Mental State Examination (MMSE) or the Montreal Cognitive Assessment (MoCA). The effect of coding variants in dementia-causing genes on cognitive performance was tested by multiple regression analysis adjusting for gender, disease duration, age at dementia assessment, study site and APOE carrier status. Results: Known AD pathogenic mutations in the PSEN1 (p.A79V) and PSEN2 (p.V148I) genes were found in 0.3% of all PD patients. There was a significant burden of rare, likely damaging variants in the GRN and PSEN1 genes in PD patients when compared with frequencies in the European population from the ExAC database. Multiple regression analysis revealed that PD patients carrying rare variants in the APP, PSEN1, PSEN2, and GRN genes exhibit lower cognitive test scores than non-carrier PD patients (p = 2.0 × 10^-4), independent of age at PD diagnosis, age at evaluation, APOE status or recruitment site.
Conclusions: Pathogenic mutations

  7. An Object-Oriented Collection of Minimum Degree Algorithms: Design, Implementation, and Experiences

    NASA Technical Reports Server (NTRS)

    Kumfert, Gary; Pothen, Alex

    1999-01-01

    The multiple minimum degree (MMD) algorithm and its variants have enjoyed 20+ years of research and progress in generating fill-reducing orderings for sparse, symmetric positive definite matrices. Although conceptually simple, efficient implementations of these algorithms are deceptively complex and highly specialized. In this case study, we present an object-oriented library that implements several recent minimum degree-like algorithms. We discuss how object-oriented design forces us to decompose these algorithms in a different manner than earlier codes and demonstrate how this impacts the flexibility and efficiency of our C++ implementation. We compare the performance of our code against other implementations in C or Fortran.
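
    The basic idea behind minimum degree orderings can be sketched in a few lines. The version below is the plain single-elimination heuristic on an explicit graph, written in Python for brevity; it is not the MMD algorithm or the C++ library described in the abstract, and it omits the techniques (multiple elimination, quotient graphs, approximate degrees) that make real implementations efficient. The function name and the label-based tie-breaking rule are assumptions made for this illustration.

```python
def minimum_degree_ordering(adjacency):
    """Return an elimination order for a symmetric sparsity graph.

    adjacency maps each node to the set of its neighbours (no self-loops).
    At each step the node with the fewest neighbours is eliminated and its
    remaining neighbours are joined into a clique, which models fill-in.
    """
    graph = {v: set(nbrs) for v, nbrs in adjacency.items()}  # work on a copy
    order = []
    while graph:
        # Ties are broken by node label so the result is deterministic.
        v = min(graph, key=lambda n: (len(graph[n]), n))
        nbrs = graph.pop(v)
        for u in nbrs:
            graph[u].discard(v)
            graph[u] |= nbrs - {u}  # connect u to the other neighbours of v
        order.append(v)
    return order


# Star graph: eliminating the hub (node 0) first would create fill among the leaves.
star = {0: {1, 2, 3}, 1: {0}, 2: {0}, 3: {0}}
print(minimum_degree_ordering(star))
```

    On the star graph the leaves are eliminated before the hub's degree drops to match theirs, so no fill edges are created.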

  8. Analysis of RNA-Seq datasets reveals enrichment of tissue-specific splice variants for nuclear envelope proteins.

    PubMed

    Capitanchik, Charlotte; Dixon, Charles; Swanson, Selene K; Florens, Laurence; Kerr, Alastair R W; Schirmer, Eric C

    2018-06-18

    Nuclear envelopathies/laminopathies yield tissue-specific pathologies, yet arise from mutation of ubiquitously-expressed genes. One possible explanation of this tissue specificity is that tissue-specific partners become disrupted from larger complexes, but a little investigated alternate hypothesis is that the mutated proteins themselves have tissue-specific splice variants. Here, we analyze RNA-Seq datasets to identify muscle-specific splice variants of nuclear envelope genes that could be relevant to the study of laminopathies, particularly muscular dystrophies, that are not currently annotated in sequence databases. Notably, we found novel isoforms or tissue-specificity of isoforms for: Lap2, linked to cardiomyopathy; Nesprin 2, linked to Emery-Dreifuss muscular dystrophy and Lmo7, a regulator of the emerin gene that is linked to Emery-Dreifuss muscular dystrophy. Interestingly, the muscle-specific exon in Lmo7 is rich in serine phosphorylation motifs, suggesting an important regulatory function. Evidence for muscle-specific splice variants in non-nuclear envelope proteins linked to other muscular dystrophies was also found. Tissue-specific variants were also indicated for several nucleoporins including Nup54, Nup133, Nup153 and Nup358/RanBP2. We confirmed expression of novel Lmo7 and RanBP2 variants with RT-PCR and found that specific knockdown of the Lmo7 variant caused a reduction in myogenic index during mouse C2C12 myogenesis. Global analysis revealed an enrichment of tissue-specific splice variants for nuclear envelope proteins in general compared to the rest of the genome, suggesting that splice variants contribute to regulating its tissue-specific functions.

  9. Common hyperspectral image database design

    NASA Astrophysics Data System (ADS)

    Tian, Lixun; Liao, Ningfang; Chai, Ali

    2009-11-01

    This paper introduces the Common Hyperspectral Image Database (CHIDB), built with a demand-oriented database design method, which brings together ground-based spectra, standardized hyperspectral cubes, and spectral analysis to serve a range of applications. The paper presents an integrated approach to retrieving spectral and spatial patterns from remotely sensed imagery using state-of-the-art data mining and advanced database technologies; data mining ideas and functions were incorporated into CHIDB to make it better suited to agricultural, geological, and environmental applications. A broad range of data from multiple regions of the electromagnetic spectrum is supported, including ultraviolet, visible, near-infrared, thermal infrared, and fluorescence. CHIDB is based on the .NET Framework and designed with an MVC architecture comprising five main functional modules: Data Importer/Exporter, Image/Spectrum Viewer, Data Processor, Parameter Extractor, and On-line Analyzer. The original data are all stored in SQL Server 2008 for efficient search, query, and update, and advanced spectral image processing techniques are used, such as parallel processing in C#. Finally, an application case in agricultural disease detection is presented.

  10. Genovar: a detection and visualization tool for genomic variants.

    PubMed

    Jung, Kwang Su; Moon, Sanghoon; Kim, Young Jin; Kim, Bong-Jo; Park, Kiejung

    2012-05-08

    Along with single nucleotide polymorphisms (SNPs), copy number variation (CNV) is considered an important source of genetic variation associated with disease susceptibility. Despite the importance of CNV, the tools currently available for its analysis often produce false positive results due to limitations such as low resolution of array platforms, platform specificity, and the type of CNV. To resolve this problem, spurious signals must be separated from true signals by visual inspection. None of the previously reported CNV analysis tools support this function and the simultaneous visualization of comparative genomic hybridization arrays (aCGH) and sequence alignment. The purpose of the present study was to develop a useful program for the efficient detection and visualization of CNV regions that enables the manual exclusion of erroneous signals. A JAVA-based stand-alone program called Genovar was developed. To ascertain whether a detected CNV region is a novel variant, Genovar compares the detected CNV regions with previously reported CNV regions using the Database of Genomic Variants (DGV, http://projects.tcag.ca/variation) and the Single Nucleotide Polymorphism Database (dbSNP). The current version of Genovar is capable of visualizing genomic data from sources such as the aCGH data file and sequence alignment format files. Genovar is freely accessible and provides a user-friendly graphic user interface (GUI) to facilitate the detection of CNV regions. The program also provides comprehensive information to help in the elimination of spurious signals by visual inspection, making Genovar a valuable tool for reducing false positive CNV results. http://genovar.sourceforge.net/.

  11. Heterozygous RFX6 protein truncating variants are associated with MODY with reduced penetrance.

    PubMed

    Patel, Kashyap A; Kettunen, Jarno; Laakso, Markku; Stančáková, Alena; Laver, Thomas W; Colclough, Kevin; Johnson, Matthew B; Abramowicz, Marc; Groop, Leif; Miettinen, Päivi J; Shepherd, Maggie H; Flanagan, Sarah E; Ellard, Sian; Inagaki, Nobuya; Hattersley, Andrew T; Tuomi, Tiinamaija; Cnop, Miriam; Weedon, Michael N

    2017-10-12

    Finding new causes of monogenic diabetes helps understand glycaemic regulation in humans. To find novel genetic causes of maturity-onset diabetes of the young (MODY), we sequenced MODY cases with unknown aetiology and compared variant frequencies to large public databases. From 36 European patients, we identify two probands with novel RFX6 heterozygous nonsense variants. RFX6 protein truncating variants are enriched in the MODY discovery cohort compared to the European control population within ExAC (odds ratio = 131, P = 1 × 10⁻⁴). We find similar results in non-Finnish European (n = 348, odds ratio = 43, P = 5 × 10⁻⁵) and Finnish (n = 80, odds ratio = 22, P = 1 × 10⁻⁶) replication cohorts. RFX6 heterozygotes have reduced penetrance of diabetes compared to common HNF1A and HNF4A-MODY mutations (27, 70 and 55% at 25 years of age, respectively). The hyperglycaemia results from beta-cell dysfunction and is associated with lower fasting and stimulated gastric inhibitory polypeptide (GIP) levels. Our study demonstrates that heterozygous RFX6 protein truncating variants are associated with MODY with reduced penetrance. Maturity-onset diabetes of the young (MODY) is the most common subtype of familial diabetes. Here, Patel et al. use targeted DNA sequencing of MODY patients and large-scale publicly available data to show that RFX6 heterozygous protein truncating variants cause reduced penetrance MODY.

  12. X-Linked Glomerulopathy Due to COL4A5 Founder Variant.

    PubMed

    Barua, Moumita; John, Rohan; Stella, Lorenzo; Li, Weili; Roslin, Nicole M; Sharif, Bedra; Hack, Saidah; Lajoie-Starkell, Ginette; Schwaderer, Andrew L; Becknell, Brian; Wuttke, Matthias; Köttgen, Anna; Cattran, Daniel; Paterson, Andrew D; Pei, York

    2018-03-01

    Alport syndrome is a rare hereditary disorder caused by rare variants in 1 of 3 genes encoding for type IV collagen. Rare variants in COL4A5 on chromosome Xq22 cause X-linked Alport syndrome, which accounts for ∼80% of the cases. Alport syndrome has a variable clinical presentation, including progressive kidney failure, hearing loss, and ocular defects. Exome sequencing performed in 2 affected related males with an undefined X-linked glomerulopathy characterized by global and segmental glomerulosclerosis, mesangial hypercellularity, and vague basement membrane immune complex deposition revealed a COL4A5 sequence variant, a substitution of a thymine by a guanine at nucleotide 665 (c.T665G; rs281874761) of the coding DNA predicted to lead to a cysteine to phenylalanine substitution at amino acid 222, which was not seen in databases cataloguing natural human genetic variation, including dbSNP138, 1000 Genomes Project release version 01-11-2004, Exome Sequencing Project 21-06-2014, or ExAC 01-11-2014. Review of the literature identified 2 additional families with the same COL4A5 variant leading to similar atypical histopathologic features, suggesting a unique pathologic mechanism initiated by this specific rare variant. Homology modeling suggests that the substitution alters the structural and dynamic properties of the type IV collagen trimer. Genetic analysis comparing members of the 3 families indicated a distant relationship with a shared haplotype, implying a founder effect. Crown Copyright © 2017. Published by Elsevier Inc. All rights reserved.

  13. Producing approximate answers to database queries

    NASA Technical Reports Server (NTRS)

    Vrbsky, Susan V.; Liu, Jane W. S.

    1993-01-01

    We have designed and implemented a query processor, called APPROXIMATE, that makes approximate answers available if part of the database is unavailable or if there is not enough time to produce an exact answer. The accuracy of the approximate answers produced improves monotonically with the amount of data retrieved to produce the result. The exact answer is produced if all of the needed data are available and query processing is allowed to continue until completion. The monotone query processing algorithm of APPROXIMATE works within the standard relational algebra framework and can be implemented on a relational database system with little change to the relational architecture. We describe here the approximation semantics of APPROXIMATE that serves as the basis for meaningful approximations of both set-valued and single-valued queries. We show how APPROXIMATE is implemented to make effective use of semantic information, provided by an object-oriented view of the database, and describe the additional overhead required by APPROXIMATE.

  14. TAPAS: tools to assist the targeted protein quantification of human alternative splice variants.

    PubMed

    Yang, Jae-Seong; Sabidó, Eduard; Serrano, Luis; Kiel, Christina

    2014-10-15

    In proteomes of higher eukaryotes, many alternative splice variants can only be detected by their shared peptides. This makes it highly challenging to use peptide-centric mass spectrometry to distinguish and to quantify protein isoforms resulting from alternative splicing events. We have developed two complementary algorithms based on linear mathematical models to efficiently compute a minimal set of shared and unique peptides needed to quantify a set of isoforms and splice variants. Further, we developed a statistical method to estimate the splice variant abundances based on stable isotope labeled peptide quantities. The algorithms and databases are integrated in a web-based tool, and we have experimentally tested the limits of our quantification method using spiked proteins and cell extracts. The TAPAS server is available at http://davinci.crg.es/tapas/. Supplementary data are available at Bioinformatics online. © The Author 2014. Published by Oxford University Press. All rights reserved.

  15. MaizeGDB update: New tools, data, and interface for the maize model organism database

    USDA-ARS's Scientific Manuscript database

    MaizeGDB is a highly curated, community-oriented database and informatics service to researchers focused on the crop plant and model organism Zea mays ssp. mays. Although some form of the maize community database has existed over the last 25 years, there have only been two major releases. In 1991, ...

  16. Specialized microbial databases for inductive exploration of microbial genome sequences

    PubMed Central

    Fang, Gang; Ho, Christine; Qiu, Yaowu; Cubas, Virginie; Yu, Zhou; Cabau, Cédric; Cheung, Frankie; Moszer, Ivan; Danchin, Antoine

    2005-01-01

    Background: The enormous amount of genome sequence data asks for user-oriented databases to manage sequences and annotations. Queries must include search tools permitting function identification through exploration of related objects. Methods: The GenoList package for collecting and mining microbial genome databases has been rewritten using MySQL as the database management system. Functions that were not available in MySQL, such as nested subquery, have been implemented. Results: Inductive reasoning in the study of genomes starts from "islands of knowledge", centered around genes with some known background. With this concept of "neighborhood" in mind, a modified version of the GenoList structure has been used for organizing sequence data from prokaryotic genomes of particular interest in China. GenoChore, a set of 17 specialized end-user-oriented microbial databases (including one instance of Microsporidia, Encephalitozoon cuniculi, a member of Eukarya) has been made publicly available. These databases allow the user to browse genome sequence and annotation data using standard queries. In addition they provide a weekly update of searches against the world-wide protein sequences data libraries, allowing one to monitor annotation updates on genes of interest. Finally, they allow users to search for patterns in DNA or protein sequences, taking into account a clustering of genes into formal operons, as well as providing extra facilities to query sequences using predefined sequence patterns. Conclusion: This growing set of specialized microbial databases organize data created by the first Chinese bacterial genome programs (ThermaList, Thermoanaerobacter tengcongensis; LeptoList, with two different genomes of Leptospira interrogans; and SepiList, Staphylococcus epidermidis) associated to related organisms for comparison. PMID:15698474

  17. Comparative transcriptome analysis of three color variants of the sea cucumber Apostichopus japonicus.

    PubMed

    Jo, Jihoon; Park, Jongsun; Lee, Hyun-Gwan; Kern, Elizabeth M A; Cheon, Seongmin; Jin, Soyeong; Park, Joong-Ki; Cho, Sung-Jin; Park, Chungoo

    2016-08-01

    The sea cucumber Apostichopus japonicus Selenka 1867 represents an important resource in biomedical research, traditional medicine, and the seafood industry. Much of the commercial value of A. japonicus is determined by dorsal/ventral color variation (red, green, and black), yet the taxonomic relationships between these color variants are not clearly understood. We performed the first comparative analysis of de novo assembled transcriptome data from three color variants of A. japonicus. Using the Illumina platform, we sequenced nearly 177,596,774 clean reads representing a total of 18.2Gbp of sea cucumber transcriptome. A comparison of over 0.3 million transcript scaffolds against the Uniprot/Swiss-Prot database yielded 8513, 8602, and 8588 positive matches for green, red, and black body color transcriptomes, respectively. Using the Panther gene classification system, we assessed an extensive and diverse set of expressed genes in three color variants and found that (1) among the three color variants of A. japonicus, genes associated with RNA binding protein, oxidoreductase, nucleic acid binding, transferase, and KRAB box transcription factor were most commonly expressed; and (2) the main protein functional classes are differently regulated in all three color variants (extracellular matrix protein and phosphatase for green color, transporter and potassium channel for red color, and G-protein modulator and enzyme modulator for black color). This work will assist in the discovery and annotation of novel genes that play significant morphological and physiological roles in color variants of A. japonicus, and these sequence data will provide a useful set of resources for the rapidly growing sea cucumber aquaculture industry. Copyright © 2016 Elsevier B.V. All rights reserved.

  18. Variant pathogenicity evaluation in the community-driven Inherited Neuropathy Variant Browser.

    PubMed

    Saghira, Cima; Bis, Dana M; Stanek, David; Strickland, Alleene; Herrmann, David N; Reilly, Mary M; Scherer, Steven S; Shy, Michael E; Züchner, Stephan

    2018-05-01

    Charcot-Marie-Tooth disease (CMT) is an umbrella term for inherited neuropathies affecting an estimated one in 2,500 people. Over 120 CMT and related genes have been identified and clinical gene panels often contain more than 100 genes. Such a large genomic space will invariably yield variants of uncertain clinical significance (VUS) in nearly any person tested. This rise in the number of VUS creates major challenges for genetic counseling. Additionally, fewer individual variants in known genes are being published as the academic merit is decreasing, and most testing now happens in clinical laboratories, which typically do not correlate their variants with clinical phenotypes. For CMT, we aim to encourage and facilitate the global capture of variant data to gain a large collection of alleles in CMT genes, ideally in conjunction with phenotypic information. The Inherited Neuropathy Variant Browser provides user-friendly open access to currently reported variation in CMT genes. Geneticists, physicians, and genetic counselors can enter variants detected by clinical tests or in research studies in addition to genetic variation gathered from published literature, which are then submitted to ClinVar biannually. Active participation of the broader CMT community will provide an advance over existing resources for interpretation of CMT genetic variation. © 2018 Wiley Periodicals, Inc.

  19. Variant terminology. [for aerospace information systems]

    NASA Technical Reports Server (NTRS)

    Buchan, Ronald L.

    1991-01-01

    A system called Variant Terminology Switching (VTS) is set forth that is intended to provide computer-assisted spellings for terms that have American and British versions. VTS is based on the use of brackets, parentheses, and other symbols in conjunction with letters that distinguish American and British spellings. The symbols are used in the systems as indicators of actions such as deleting, adding, and replacing letters as well as replacing entire words and concepts. The system is shown to be useful for the intended purpose and also for the recognition of misspellings and for the standardization of computerized input/output. The VTS system is of interest to the development of international retrieval systems for aerospace and other technical databases that enhance the use by the global scientific community.

  20. Association of Arrhythmia-Related Genetic Variants With Phenotypes Documented in Electronic Medical Records.

    PubMed

    Van Driest, Sara L; Wells, Quinn S; Stallings, Sarah; Bush, William S; Gordon, Adam; Nickerson, Deborah A; Kim, Jerry H; Crosslin, David R; Jarvik, Gail P; Carrell, David S; Ralston, James D; Larson, Eric B; Bielinski, Suzette J; Olson, Janet E; Ye, Zi; Kullo, Iftikhar J; Abul-Husn, Noura S; Scott, Stuart A; Bottinger, Erwin; Almoguera, Berta; Connolly, John; Chiavacci, Rosetta; Hakonarson, Hakon; Rasmussen-Torvik, Laura J; Pan, Vivian; Persell, Stephen D; Smith, Maureen; Chisholm, Rex L; Kitchner, Terrie E; He, Max M; Brilliant, Murray H; Wallace, John R; Doheny, Kimberly F; Shoemaker, M Benjamin; Li, Rongling; Manolio, Teri A; Callis, Thomas E; Macaya, Daniela; Williams, Marc S; Carey, David; Kapplinger, Jamie D; Ackerman, Michael J; Ritchie, Marylyn D; Denny, Joshua C; Roden, Dan M

    2016-01-05

    Large-scale DNA sequencing identifies incidental rare variants in established Mendelian disease genes, but the frequency of related clinical phenotypes in unselected patient populations is not well established. Phenotype data from electronic medical records (EMRs) may provide a resource to assess the clinical relevance of rare variants. To determine the clinical phenotypes from EMRs for individuals with variants designated as pathogenic by expert review in arrhythmia susceptibility genes. This prospective cohort study included 2022 individuals recruited for nonantiarrhythmic drug exposure phenotypes from October 5, 2012, to September 30, 2013, for the Electronic Medical Records and Genomics Network Pharmacogenomics project from 7 US academic medical centers. Variants in SCN5A and KCNH2, disease genes for long QT and Brugada syndromes, were assessed for potential pathogenicity by 3 laboratories with ion channel expertise and by comparison with the ClinVar database. Relevant phenotypes were determined from EMRs, with data available from 2002 (or earlier for some sites) through September 10, 2014. One or more variants designated as pathogenic in SCN5A or KCNH2. Arrhythmia or electrocardiographic (ECG) phenotypes defined by International Classification of Diseases, Ninth Revision (ICD-9) codes, ECG data, and manual EMR review. Among 2022 study participants (median age, 61 years [interquartile range, 56-65 years]; 1118 [55%] female; 1491 [74%] white), a total of 122 rare (minor allele frequency <0.5%) nonsynonymous and splice-site variants in 2 arrhythmia susceptibility genes were identified in 223 individuals (11% of the study cohort). Forty-two variants in 63 participants were designated potentially pathogenic by at least 1 laboratory or ClinVar, with low concordance across laboratories (Cohen κ = 0.26). An ICD-9 code for arrhythmia was found in 11 of 63 (17%) variant carriers vs 264 of 1959 (13%) of those without variants (difference, +4%; 95% CI, -5% to +13

  1. Spectrum of PAH gene variants among a population of Han Chinese patients with phenylketonuria from northern China.

    PubMed

    Liu, Ning; Huang, Qiuying; Li, Qingge; Zhao, Dehua; Li, Xiaole; Cui, Lixia; Bai, Ying; Feng, Yin; Kong, Xiangdong

    2017-10-05

    Phenylketonuria (PKU), which primarily results from a deficiency of phenylalanine hydroxylase (PAH), is one of the most common inherited inborn errors of metabolism that impairs postnatal cognitive development. The incidence of various PAH variations differs by race and ethnicity. The aim of the present study was to characterize the PAH gene variants of a Han population from Northern China. In total, 655 PKU patients and their families were recruited for this study; each proband was diagnosed both clinically and biochemically with phenylketonuria. Subjects were sequentially screened for single-base variants and exon deletions or duplications within PAH via direct Sanger sequencing and multiplex ligation-dependent probe amplification (MLPA). A spectrum of 174 distinct PAH variants was identified: 152 previously documented variants and 22 novel variants. While single-base variants were distributed throughout the 13 exons, they were particularly concentrated in exons 7 (33.3%), 11 (14.2%), 6 (13.2%), 12 (11.0%), 3 (10.4%), and 5 (4.4%). The predominant variant was p.Arg243Gln (17.7%), followed by Ex6-96A > G (8.3%), p.Val399 = (6.4%), p.Arg53His (4.7%), p.Tyr356* (4.7%), p.Arg241Cys (4.6%), p.Arg413Pro (4.6%), p.Arg111* (4.4%), and c.442-1G > A (3.4%). Notably, two patients were also identified as carrying de novo variants. The composition of PAH gene variants in this Han population from Northern China was distinct from those of other ethnic groups. As such, the construction of a PAH gene variant database for Northern China is necessary to lay a foundation for genetic-based diagnoses, prenatal diagnoses, and population screening.

  2. Mutation databases and other online sites as a resource for transfusion medicine: history and attributes.

    PubMed

    Blumenfeld, Olga O

    2002-04-01

    Recent advances in molecular biology and technology have provided evidence, at a molecular level, for long-known observations that the human genome is not unique but is characterized by individual sequence variation. At the present time, documentation of genetic variation occurring in a large number of genes is increasing exponentially. The characterization of alleles that encode a variety of blood group antigens has been particularly fruitful for transfusion medicine. Phenotypic variation, as identified by the serologic study of blood group variants, is required to identify the presence of a variant allele. Many of the other alleles currently recorded have been selected and identified on the basis of inherited disease traits. New approaches document single nucleotide polymorphisms that occur throughout the genome and best show how the DNA sequence varies in the human population. The primary data dealing with variant alleles or more general genomic variation are scattered throughout the scientific literature and only within the last few years has information begun to be organized into databases. This article provides guidance on how to access those databases online as a source of information about genetic variation for purposes of molecular, clinical, and diagnostic medicine, research, and teaching. The attributes of the sites are described. A more detailed view of the database dealing specifically with alleles of genes encoding the blood group antigens includes a brief preliminary analysis of the molecular basis for observed polymorphisms. Other online sites that may be particularly useful to the transfusion medicine readership as well as a brief historical account are also presented. Copyright 2002, Elsevier Science (USA). All rights reserved.

  3. Penicillinase Studies on L-Phase Variants, G-Phase Variants, and Reverted Strains of Staphylococcus aureus

    PubMed Central

    Simon, Harold J.; Yin, Elaine Jong

    1970-01-01

    L-phase variants and small colony (G-phase) variants derived from penicillinase-producing Staphylococcus aureus strains were tested for penicillinase (beta lactamase) production. A refined variation of the modified Gots test for penicillinase was used to demonstrate penicillinase synthesis. Penicillinase synthesis was reduced in L-phase variants and G-phase variants when compared to parental strains. After reversion of variants to vegetative stages had been induced, revertants were tested for production of penicillinase, coagulase, and alpha hemolysin, mannitol fermentation, and pigment production, and comparisons were made between parent and reverted vegetative forms. All revertants of G-phase variants retained penicillinase activity. Most revertants of L-phase variants showed reduction or loss of penicillinase activity. Retention of coagulase activity, alpha hemolysin production, mannitol fermentation, pigmentation, and phage type varied among revertants. PMID:16557890

  4. Group-oriented coordination models for distributed client-server computing

    NASA Technical Reports Server (NTRS)

    Adler, Richard M.; Hughes, Craig S.

    1994-01-01

    This paper describes group-oriented control models for distributed client-server interactions. These models transparently coordinate requests for services that involve multiple servers, such as queries across distributed databases. Specific capabilities include: decomposing and replicating client requests; dispatching request subtasks or copies to independent, networked servers; and combining server results into a single response for the client. The control models were implemented by combining request broker and process group technologies with an object-oriented communication middleware tool. The models are illustrated in the context of a distributed operations support application for space-based systems.
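
    The three capabilities listed in this abstract (decompose or replicate a request, dispatch it to independent servers, combine the results) follow a scatter-gather pattern, which can be sketched as below. This is only an illustrative sketch: the servers are stand-in local functions rather than networked processes, and every name in it (scatter_gather, replicate, merge_rows, make_server) is hypothetical and not taken from the paper's middleware.

```python
from concurrent.futures import ThreadPoolExecutor


def replicate(request, n_servers):
    """Decomposition strategy: send an identical copy to every server."""
    return [request] * n_servers


def merge_rows(partial_results):
    """Combination strategy: concatenate per-server row lists in server order."""
    return [row for rows in partial_results for row in rows]


def scatter_gather(request, servers, decompose, combine):
    """Dispatch subtasks to servers concurrently and combine their answers."""
    subtasks = decompose(request, len(servers))
    with ThreadPoolExecutor(max_workers=max(1, len(servers))) as pool:
        futures = [pool.submit(server, task)
                   for server, task in zip(servers, subtasks)]
        # Collect in submission order so the combined result is deterministic.
        partial_results = [future.result() for future in futures]
    return combine(partial_results)


def make_server(shard):
    """Hypothetical stand-in for a remote database server holding one shard."""
    def server(tag):
        return [row for row in shard if row["tag"] == tag]
    return server
```

    A client would call scatter_gather("a", servers, replicate, merge_rows) and receive one merged list, without needing to know how many servers took part in answering the query.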

  5. BigQ: a NoSQL based framework to handle genomic variants in i2b2.

    PubMed

    Gabetta, Matteo; Limongelli, Ivan; Rizzo, Ettore; Riva, Alberto; Segagni, Daniele; Bellazzi, Riccardo

    2015-12-29

    Precision medicine requires the tight integration of clinical and molecular data. To this end, it is mandatory to define proper technological solutions able to manage the overwhelming amount of high throughput genomic data needed to test associations between genomic signatures and human phenotypes. The i2b2 Center (Informatics for Integrating Biology and the Bedside) has developed a widely internationally adopted framework to use existing clinical data for discovery research that can help the definition of precision medicine interventions when coupled with genetic data. i2b2 can be significantly advanced by designing efficient management solutions of Next Generation Sequencing data. We developed BigQ, an extension of the i2b2 framework, which integrates patient clinical phenotypes with genomic variant profiles generated by Next Generation Sequencing. A visual programming i2b2 plugin allows retrieving variants belonging to the patients in a cohort by applying filters on genomic variant annotations. We report an evaluation of the query performance of our system on more than 11 million variants, showing that the implemented solution scales linearly in terms of query time and disk space with the number of variants. In this paper we describe a new i2b2 web service composed of an efficient and scalable document-based database that manages annotations of genomic variants and of a visual programming plug-in designed to dynamically perform queries on clinical and genetic data. The system therefore allows managing the fast growing volume of genomic variants and can be used to integrate heterogeneous genomic annotations.

  6. The insulin-sensitivity sulphonylurea receptor variant is associated with thyrotoxic paralysis.

    PubMed

    Rolim, Ana Luiza R; Lindsey, Susan C; Kunii, Ilda S; Crispim, Felipe; Moisés, Regina Célia M S; Maciel, Rui M B; Dias-da-Silva, Magnus R

    2014-10-01

    Thyrotoxicosis is the most common cause of the acquired flaccid muscle paralysis in adults called thyrotoxic periodic paralysis (TPP) and is characterised by transient hypokalaemia and hypophosphataemia under high thyroid hormone levels that is frequently precipitated by carbohydrate load. The sulphonylurea receptor 1 (SUR1 (ABCC8)) is an essential regulatory subunit of the β-cell ATP-sensitive K(+) channel that controls insulin secretion after feeding. Additionally, the SUR1 Ala1369Ser variant appears to be associated with insulin sensitivity. We examined the ABCC8 gene at the single nucleotide level using PCR-restriction fragment length polymorphism (RFLP) analysis to determine its allelic variant frequency and calculated the frequency of the Ala1369Ser C-allele variant in a cohort of 36 Brazilian TPP patients in comparison with 32 controls presenting with thyrotoxicosis without paralysis (TWP). We verified that the frequency of the alanine 1369 C-allele was significantly higher in TPP patients than in TWP patients (61.1 vs 34.4%, odds ratio (OR)=3.42, P=0.039) and was significantly more common than the minor allele frequency observed in the general population from the 1000 Genomes database (61.1 vs 29.0%, OR=4.87, P<0.005). Additionally, the C-allele frequency was similar between TWP patients and the general population (34.4 vs 29%, OR=1.42, P=0.325). We have demonstrated that SUR1 alanine 1369 variant is associated with allelic susceptibility to TPP. We suggest that the hyperinsulinaemia that is observed in TPP may be linked to the ATP-sensitive K(+)/SUR1 alanine variant and, therefore, contribute to the major feedforward precipitating factors in the pathophysiology of TPP. © 2014 Society for Endocrinology.

  7. Identification of genomic variants putatively targeted by selection during dog domestication.

    PubMed

    Cagan, Alex; Blass, Torsten

    2016-01-12

    Dogs [Canis lupus familiaris] were the first animal species to be domesticated and continue to occupy an important place in human societies. Recent studies have begun to reveal when and where dog domestication occurred. While much progress has been made in identifying the genetic basis of phenotypic differences between dog breeds, we still know relatively little about the genetic changes underlying the phenotypes that differentiate all dogs from their wild progenitors, wolves [Canis lupus]. In particular, dogs generally show reduced aggression and fear towards humans compared to wolves. Therefore, selection for tameness was likely a necessary prerequisite for dog domestication. With the increasing availability of whole-genome sequence data it is possible to try and directly identify the genetic variants contributing to the phenotypic differences between dogs and wolves. We analyse the largest available database of genome-wide polymorphism data in a global sample of dogs (n = 69) and wolves (n = 7). We perform a scan to identify regions of the genome that are highly differentiated between dogs and wolves. We identify putatively functional genomic variants that are segregating or at high frequency [Fst ≥ 0.75] for alternative alleles between dogs and wolves. A biological pathways analysis of the genes containing these variants suggests that there has been selection on the 'adrenaline and noradrenaline biosynthesis pathway', well known for its involvement in the fight-or-flight response. We identify 11 genes with putatively functional variants fixed for alternative alleles between dogs and wolves. The segregating variants in these genes are strong candidates for having been targets of selection during early dog domestication. We present the first genome-wide analysis of the different categories of putatively functional variants that are fixed or segregating at high frequency between a global sampling of dogs and wolves. We find evidence that selection has been strongest

  8. Study on the crystallographic orientation relationship and formation mechanism of reversed austenite in economical Cr12 super martensitic stainless steel

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Ye, Dong; Li, Shaohong; Li, Jun

    The effect of carbides and the crystallographic orientation relationship on the formation mechanism of reversed austenite in economical Cr12 super martensitic stainless steel (SMSS) has been investigated mainly by transmission electron microscopy (TEM) and electron backscatter diffraction (EBSD). The results indicate that M23C6 precipitation and the formation of reversed austenite interact during the tempering process in SMSS. The reversed austenite forms intensively at the sub-block boundary and the lath boundary within a misorientation range of 0–60°. M23C6 has the same crystallographic orientation relationship with reversed austenite. There are two different formation modes for reversed austenite. One is a nondiffusional shear reversion; the other is a diffusion transformation. Both are strictly limited by the crystallographic orientation relationship. The austenite variants are limited to two kinds within one packet and five kinds within one prior austenite grain. Highlights: • Reversed austenite forms at martensite boundaries with misorientation of 0–60°. • M23C6 precipitation and reversed austenite formation interact. • Two austenite variants with different orientations can be formed inside a packet. • Two reversed austenite formation modes: shear reversion; diffusion transformation.

  9. Using the ICF to develop the capability-oriented database of persons with disabilities: a case study in Nakornpanom province, Thailand.

    PubMed

    Tongsiri, Sirinart; Riewpaiboon, Wachara

    2013-06-01

    This study aims to determine functioning information, rehabilitation needs, and environmental barriers of persons with disabilities (PWDs) using a purpose-built ICF-based questionnaire with a community survey approach in Thailand. A systematic review of the use of ICF and disability surveys from January 2000 to June 2010 was undertaken. A questionnaire was then developed and tested in two pilot studies before being used in face-to-face interviews conducted with legally registered PWDs in Nakornpanom province. Forty-six ICF codes were used in the questionnaire: two second-level codes in body functions, 18 second-level and six third-level codes in activities & participation, and 14 second-level and six third-level codes in environmental factors. Each code had 2-6 qualifiers. One thousand and seven PWDs (56.6% male, mean age = 48.4 ± 0.64 years) were interviewed by 16 trained interviewers. Interview duration was approximately 17 min. The functioning profile could be revealed for both individuals and the population. These profiles reflected the need for rehabilitation. Several cut-off points to identify "disabled persons" were offered. Regarding participation, PWDs were concerned more about environmental barriers. One-fourth of PWDs needed home environment adaptation, almost 13% were uneducated and 23% had limited chance to participate in social activities. The ICF framework and codes can be used to develop a questionnaire to measure the population functioning profile and rehabilitation needs of PWDs by community survey. Results can be used to develop a capability-oriented disability database to identify prevalence of disabilities and rehabilitation needs. Policy makers may use this database to plan, monitor and evaluate rehabilitation service programs and removal of environmental barriers.

  10. VIRUS NOMENCLATURE BELOW THE SPECIES LEVEL: A STANDARDIZED NOMENCLATURE FOR LABORATORY ANIMAL-ADAPTED STRAINS AND VARIANTS OF VIRUSES ASSIGNED TO THE FAMILY FILOVIRIDAE

    PubMed Central

    Kuhn, Jens H.; Bao, Yiming; Bavari, Sina; Becker, Stephan; Bradfute, Steven; Brister, J. Rodney; Bukreyev, Alexander A.; Caì, Yíngyún; Chandran, Kartik; Davey, Robert A.; Dolnik, Olga; Dye, John M.; Enterlein, Sven; Gonzalez, Jean-Paul; Formenty, Pierre; Freiberg, Alexander N.; Hensley, Lisa E.; Honko, Anna N.; Ignatyev, Georgy M.; Jahrling, Peter B.; Johnson, Karl M.; Klenk, Hans-Dieter; Kobinger, Gary; Lackemeyer, Matthew G.; Leroy, Eric M.; Lever, Mark S.; Lofts, Loreen L.; Mühlberger, Elke; Netesov, Sergey V.; Olinger, Gene G.; Palacios, Gustavo; Patterson, Jean L.; Paweska, Janusz T.; Pitt, Louise; Radoshitzky, Sheli R.; Ryabchikova, Elena I.; Saphire, Erica Ollmann; Shestopalov, Aleksandr M.; Smither, Sophie J.; Sullivan, Nancy J.; Swanepoel, Robert; Takada, Ayato; Towner, Jonathan S.; van der Groen, Guido; Volchkov, Viktor E.; Wahl-Jensen, Victoria; Warren, Travis K.; Warfield, Kelly L.; Weidmann, Manfred; Nichol, Stuart T.

    2013-01-01

    The International Committee on Taxonomy of Viruses (ICTV) organizes the classification of viruses into taxa, but is not responsible for the nomenclature for taxa members. International expert groups, such as the ICTV Study Groups, recommend the classification and naming of viruses and their strains, variants, and isolates. The ICTV Filoviridae Study Group has recently introduced an updated classification and nomenclature for filoviruses. Subsequently, and together with numerous other filovirus experts, a consistent nomenclature for their natural genetic variants and isolates was developed that aims at simplifying the retrieval of sequence data from electronic databases. This is a first important step toward a viral genome annotation standard as sought by the US National Center for Biotechnology Information (NCBI). Here, this work is extended to include filoviruses obtained in the laboratory by artificial selection through passage in laboratory hosts. The previously developed template for natural filovirus genetic variant naming (<virus name> <isolation host-suffix>/<country of sampling>/<year of sampling>/<genetic variant designation>-<isolate designation>) is retained, but it is proposed to adapt the type of information added to each field for laboratory animal-adapted variants. For instance, the full-length designation of an Ebola virus Mayinga variant adapted at the State Research Center for Virology and Biotechnology “Vector” to cause disease in guinea pigs after seven passages would be akin to “Ebola virus VECTOR/C.porcellus-lab/COD/1976/Mayinga-GPA-P7”. As was proposed for the names of natural filovirus variants, we suggest using the full-length designation in databases, as well as in the method section of publications. Shortened designations (such as “EBOV VECTOR/C.por/COD/76/May-GPA-P7”) and abbreviations (such as “EBOV/May-GPA-P7”) could be used in the remainder of the text depending on how critical it is to convey information contained in

  11. Lost and forgotten? Orientation versus memory in Alzheimer's disease and frontotemporal dementia.

    PubMed

    Yew, Belinda; Alladi, Suvarna; Shailaja, Mekala; Hodges, John R; Hornberger, Michael

    2013-01-01

    Recent studies suggest that significant memory problems are not specific to Alzheimer's disease (AD) but can also be observed in other neurodegenerative conditions, such as behavioral variant frontotemporal dementia (bvFTD). We investigated whether orientation (spatial & temporal) information is a better diagnostic marker for AD than memory and whether the atrophy correlates of orientation and memory differ. A large sample (n = 190) of AD patients (n = 73), bvFTD patients (n = 54), and healthy controls (n = 63) underwent testing. A subset of the patients (n = 72) underwent structural imaging using voxel-based morphometry analysis of magnetic resonance brain imaging. Orientation and memory scores from the Addenbrooke's Cognitive Examination showed that AD patients had impaired orientation and memory, while bvFTD patients performed at control level for orientation but had impaired memory. A logistic regression showed that 78% of patients could be classified on the basis of orientation and memory scores alone at clinic presentation. Voxel-based morphometry analysis was conducted using orientation and memory scores as covariates, which showed that the neural correlates for orientation and memory also dissociated, with the posterior hippocampus being related to orientation in AD, while the anterior hippocampus was associated with memory performance in the AD and bvFTD patients. Orientation and memory measures discriminate AD and bvFTD to a high degree and tap into different hippocampal regions. Disorientation and posterior hippocampal involvement therefore appear specific to AD and will allow clinicians to discriminate AD patients from patients with other neurodegenerative conditions with similar memory deficits at clinic presentation.

  12. MECP2 variation in Rett syndrome-An overview of current coverage of genetic and phenotype data within existing databases.

    PubMed

    Townend, Gillian S; Ehrhart, Friederike; van Kranen, Henk J; Wilkinson, Mark; Jacobsen, Annika; Roos, Marco; Willighagen, Egon L; van Enckevort, David; Evelo, Chris T; Curfs, Leopold M G

    2018-04-27

    Rett syndrome (RTT) is a monogenic rare disorder that causes severe neurological problems. In most cases, it results from a loss-of-function mutation in the gene encoding methyl-CpG-binding protein 2 (MECP2). Currently, about 900 unique MECP2 variations (benign and pathogenic) have been identified and it is suspected that the different mutations contribute to different levels of disease severity. For researchers and clinicians, it is important that genotype-phenotype information is available to identify disease-causing mutations for diagnosis, to aid in clinical management of the disorder, and to provide counseling for parents. In this study, 13 genotype-phenotype databases were surveyed for their general functionality and availability of RTT-specific MECP2 variation data. For each database, we investigated findability and interoperability alongside practical user functionality, and type and amount of genetic and phenotype data. The main conclusions are that these databases, and the specific MECP2 variants held within them, are challenging to find, and that interoperability is as yet poorly developed, so that searching across databases requires effort. Nevertheless, we found several thousand online database entries for MECP2 variations and their associated phenotypes, diagnosis, or predicted variant effects, which is a good starting point for researchers and clinicians who want to provide, annotate, and use the data. © 2018 The Authors. Human Mutation published by Wiley Periodicals, Inc.

  13. Text Mining Genotype-Phenotype Relationships from Biomedical Literature for Database Curation and Precision Medicine.

    PubMed

    Singhal, Ayush; Simmons, Michael; Lu, Zhiyong

    2016-11-01

    The practice of precision medicine will ultimately require databases of genes and mutations for healthcare providers to reference in order to understand the clinical implications of each patient's genetic makeup. Although the highest quality databases require manual curation, text mining tools can facilitate the curation process, increasing accuracy, coverage, and productivity. However, to date there are no available text mining tools that offer high-accuracy performance for extracting such triplets from biomedical literature. In this paper we propose a high-performance machine learning approach to automate the extraction of disease-gene-variant triplets from biomedical literature. Our approach is unique because we identify the genes and protein products associated with each mutation from not just the local text content, but from a global context as well (from the Internet and from all literature in PubMed). Our approach also incorporates protein sequence validation and disease association using a novel text-mining-based machine learning approach. We extract disease-gene-variant triplets from all abstracts in PubMed related to a set of ten important diseases (breast cancer, prostate cancer, pancreatic cancer, lung cancer, acute myeloid leukemia, Alzheimer's disease, hemochromatosis, age-related macular degeneration (AMD), diabetes mellitus, and cystic fibrosis). We then evaluate our approach in two ways: (1) a direct comparison with the state of the art using benchmark datasets; (2) a validation study comparing the results of our approach with entries in a popular human-curated database (UniProt) for each of the previously mentioned diseases. In the benchmark comparison, our full approach achieves a 28% improvement in F1-measure (from 0.62 to 0.79) over the state-of-the-art results. For the validation study with UniProt Knowledgebase (KB), we present a thorough analysis of the results and errors. Across all diseases, our approach returned 272 triplets (disease-gene-variant

  14. Sleep atlas and multimedia database.

    PubMed

    Penzel, T; Kesper, K; Mayer, G; Zulley, J; Peter, J H

    2000-01-01

    The ENN sleep atlas and database was set up on a dedicated server connected to the internet, thus providing all services such as WWW, ftp and telnet access. The database serves as a platform to promote the goals of the European Neurological Network, to exchange patient cases for second opinion between experts and to create a case-oriented multimedia sleep atlas with descriptive text, images and video-clips of all known sleep disorders. The sleep atlas consists of a small public part and a large private part for members of the consortium. Twenty patient cases were collected and presented with educational information similar to published case reports. Case reports are complemented with images, video-clips and biosignal recordings. A Java-based viewer for biosignals provided in EDF format was installed so that users can move freely within the sleep recordings without the need to download the full recording to the client.

  15. An object-oriented approach to the management of meteorological and hydrological data

    NASA Technical Reports Server (NTRS)

    Graves, S. J.; Williams, S. F.; Criswell, E. A.

    1990-01-01

    An interface to several meteorological and hydrological databases has been developed that enables researchers to access and interrelate data efficiently through a customized menu system. By extending a relational database system with object-oriented concepts, each user or group of users may have different 'views' of the data, allowing access to data in customized ways without altering the organization of the database. Applications to COHMEX and WetNet, two earth science projects within NASA Marshall Space Flight Center's Earth Science and Applications Division, are described.

  16. Intrahaplotypic Variants Differentiate Complex Linkage Disequilibrium within Human MHC Haplotypes

    PubMed Central

    Lam, Tze Hau; Tay, Matthew Zirui; Wang, Bei; Xiao, Ziwei; Ren, Ee Chee

    2015-01-01

    Distinct regions of long-range genetic fixation in the human MHC region, known as conserved extended haplotypes (CEHs), possess unique genomic characteristics and are strongly associated with numerous diseases. While CEHs appear to be homogeneous by SNP analysis, the nature of fine variations within their genomic structure is unknown. Using multiple, MHC-homozygous cell lines, we demonstrate extensive sequence conservation in two common Asian MHC haplotypes: A33-B58-DR3 and A2-B46-DR9. However, characterization of phase-resolved MHC haplotypes revealed unique intra-CEH patterns of variation and uncovered 127 single nucleotide variants (SNVs) which are missing from public databases. We further show that the strong linkage disequilibrium structure within the human MHC that typically confounds precise identification of genetic features can be resolved using intra-CEH variants, as evidenced by rs3129063 and rs448489, which affect expression of ZFP57, a gene important in methylation and epigenetic regulation. This study demonstrates an improved strategy that can be used towards genetic dissection of diseases. PMID:26593880

  17. The Yak genome database: an integrative database for studying yak biology and high-altitude adaption

    PubMed Central

    2012-01-01

    Background: The yak (Bos grunniens) is a long-haired bovine that lives at high altitudes and is an important source of milk, meat, fiber and fuel. The recent sequencing, assembly and annotation of its genome are expected to further our understanding of the means by which it has adapted to life at high altitudes and its ecologically important traits. Description: The Yak Genome Database (YGD) is an internet-based resource that provides access to genomic sequence data and predicted functional information concerning the genes and proteins of Bos grunniens. The curated data stored in the YGD includes genome sequences, predicted genes and associated annotations, non-coding RNA sequences, transposable elements, single nucleotide variants, and three-way whole-genome alignments between human, cattle and yak. YGD offers useful searching and data mining tools, including the ability to search for genes by name or using function keywords as well as GBrowse genome browsers and/or BLAST servers, which can be used to visualize genome regions and identify similar sequences. Sequence data from the YGD can also be downloaded to perform local searches. Conclusions: A new yak genome database (YGD) has been developed to facilitate studies on high-altitude adaption and bovine genomics. The database will be continuously updated to incorporate new information such as transcriptome data and population resequencing data. The YGD can be accessed at http://me.lzu.edu.cn/yak. PMID:23134687

  18. Screening of whole genome sequences identified high-impact variants for stallion fertility.

    PubMed

    Schrimpf, Rahel; Gottschalk, Maren; Metzger, Julia; Martinsson, Gunilla; Sieme, Harald; Distl, Ottmar

    2016-04-14

    Stallion fertility is an economically important trait due to the increase of artificial insemination in horses. The availability of whole genome sequence data facilitates identification of rare high-impact variants contributing to stallion fertility. The aim of our study was to genotype rare high-impact variants retrieved from next-generation sequencing (NGS) data of 11 horses in order to unravel harmful genetic variants in large samples of stallions. Gene ontology (GO) terms and search results from public databases were used to obtain a comprehensive list of human and mouse genes predicted to participate in the regulation of male reproduction. The corresponding equine orthologous genes were searched in whole genome sequence data of seven stallions and four mares and filtered for high-impact genetic variants using SnpEFF, SIFT and Polyphen 2 software. All genetic variants with the missing homozygous mutant genotype were genotyped on 337 fertile stallions of 19 breeds using KASP genotyping assays or PCR-RFLP. Mixed linear model analysis was employed for an association analysis with de-regressed estimated breeding values of the paternal component of the pregnancy rate per estrus (EBV-PAT). We screened next generation sequenced data of whole genomes from 11 horses for equine genetic variants in 1194 human and mouse genes involved in male fertility and linked through common gene ontology (GO) with male reproductive processes. Variants were filtered for high impact on protein structure and validated through SIFT and Polyphen 2. Genetic variants were followed up only when the homozygous mutant genotype was missing in the detection sample comprising 11 horses. After this filtering process, 17 single nucleotide polymorphisms (SNPs) were left. These SNPs were genotyped in 337 fertile stallions of 19 breeds using KASP genotyping assays or PCR-RFLP. An association analysis in 216 Hanoverian stallions revealed a significant association of the splice-site disruption variant

  19. Gamma-aminobutyric acid A receptor, α-2 (GABRA2) variants as individual markers for alcoholism: a meta-analysis.

    PubMed

    Zintzaras, Elias

    2012-08-01

    The genetic association studies (GAS) published to date on the association between variants in the GABRA2 gene and alcoholism have produced inconclusive results. To interpret these results, a meticulous meta-analysis of all available studies was carried out. The PubMed database and the HuGE Navigator were searched for published GAS relating variants in the GABRA2 gene to susceptibility to alcoholism. Then, the GAS were synthesized to decrease the uncertainty of estimated genetic risk effects. The risk effects were estimated on the basis of the odds ratio (OR) of the allele contrast and the generalized odds ratio (OR(G)), a model-free approach. Cumulative and recursive cumulative meta-analyses (CMA) were also carried out to investigate the trend and stability of effect sizes as evidence accumulates. Fourteen variants investigated in eight studies were analyzed. Significant associations were derived for four variants either for the allele contrast or for the OR(G). In particular, the variants rs279858 and rs279845 showed marginal significance for OR(G): OR(G)=1.27 (1.01-1.60) and OR(G)=1.49 (1.02-2.19), respectively. Also, the variants rs567926 and rs279844 showed significance for the allele contrast: OR=1.24 (1.06-1.46) and OR=1.23 (1.08-1.43), respectively; the OR(G) produced similar results. The variant rs279858 produced a large heterogeneity between studies. CMA showed a trend of an association only for the variant rs567926. Recursive CMA indicated that more evidence is needed to conclude on the status of significance of all variants. There is evidence that variants in the GABRA2 gene are associated with alcoholism. However, the present findings should be interpreted with caution.
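
    As an illustration of the allele-contrast odds ratio reported above, the following minimal Python sketch computes an OR and a 95% confidence interval from a single 2x2 table of allele counts using the standard log (Woolf) method. The function name and the counts are hypothetical and are not taken from the meta-analysis; pooling across studies and the generalized odds ratio OR(G) are not covered here.

```python
import math


def allele_odds_ratio(case_risk, case_other, ctrl_risk, ctrl_other, z=1.96):
    """Return (OR, lower, upper) for one 2x2 allele-count table.

    Uses the log (Woolf) method: the standard error of ln(OR) is the
    square root of the sum of reciprocal cell counts.
    """
    odds_ratio = (case_risk * ctrl_other) / (case_other * ctrl_risk)
    se_log_or = math.sqrt(
        1 / case_risk + 1 / case_other + 1 / ctrl_risk + 1 / ctrl_other
    )
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical allele counts for illustration only (not study data).
or_value, ci_low, ci_high = allele_odds_ratio(120, 80, 100, 100)
print(f"OR={or_value:.2f} ({ci_low:.2f}-{ci_high:.2f})")
```

    An interval whose lower bound lies only slightly above 1, as in the OR(G) values quoted in the abstract, is what the authors describe as marginal significance.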

  20. Reactome graph database: Efficient access to complex pathway data

    PubMed Central

    Korninger, Florian; Viteri, Guilherme; Marin-Garcia, Pablo; Ping, Peipei; Wu, Guanming; Stein, Lincoln; D’Eustachio, Peter

    2018-01-01

    Reactome is a free, open-source, open-data, curated and peer-reviewed knowledgebase of biomolecular pathways. One of its main priorities is to provide easy and efficient access to its high quality curated data. At present, biological pathway databases typically store their contents in relational databases. This limits access efficiency because there are performance issues associated with queries traversing highly interconnected data. The same data in a graph database can be queried more efficiently. Here we present the rationale behind the adoption of a graph database (Neo4j) as well as the new ContentService (REST API) that provides access to these data. The Neo4j graph database and its query language, Cypher, provide efficient access to the complex Reactome data model, facilitating easy traversal and knowledge discovery. The adoption of this technology greatly improved query efficiency, reducing the average query time by 93%. The web service built on top of the graph database provides programmatic access to Reactome data by object oriented queries, but also supports more complex queries that take advantage of the new underlying graph-based data storage. By adopting graph database technology we are providing a high performance pathway data resource to the community. The Reactome graph database use case shows the power of NoSQL database engines for complex biological data types. PMID:29377902

  1. Reactome graph database: Efficient access to complex pathway data.

    PubMed

    Fabregat, Antonio; Korninger, Florian; Viteri, Guilherme; Sidiropoulos, Konstantinos; Marin-Garcia, Pablo; Ping, Peipei; Wu, Guanming; Stein, Lincoln; D'Eustachio, Peter; Hermjakob, Henning

    2018-01-01

    Reactome is a free, open-source, open-data, curated and peer-reviewed knowledgebase of biomolecular pathways. One of its main priorities is to provide easy and efficient access to its high quality curated data. At present, biological pathway databases typically store their contents in relational databases. This limits access efficiency because there are performance issues associated with queries traversing highly interconnected data. The same data in a graph database can be queried more efficiently. Here we present the rationale behind the adoption of a graph database (Neo4j) as well as the new ContentService (REST API) that provides access to these data. The Neo4j graph database and its query language, Cypher, provide efficient access to the complex Reactome data model, facilitating easy traversal and knowledge discovery. The adoption of this technology greatly improved query efficiency, reducing the average query time by 93%. The web service built on top of the graph database provides programmatic access to Reactome data by object oriented queries, but also supports more complex queries that take advantage of the new underlying graph-based data storage. By adopting graph database technology we are providing a high performance pathway data resource to the community. The Reactome graph database use case shows the power of NoSQL database engines for complex biological data types.
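
    The point about traversing highly interconnected data can be illustrated with a toy example. The sketch below is plain Python, not Cypher or the Neo4j API, and the pathway names and the `HAS_EVENT` mapping are hypothetical stand-ins rather than the Reactome data model. It shows that once relationships are stored as direct links, collecting everything beneath a pathway is a single traversal, whereas a relational edge table would typically need one self-join per hierarchy level.

```python
from collections import deque

# Hypothetical pathway hierarchy: parent -> children (illustrative names only).
HAS_EVENT = {
    "Pathway A": ["Pathway B", "Reaction 1"],
    "Pathway B": ["Reaction 2", "Reaction 3"],
    "Reaction 1": [],
    "Reaction 2": [],
    "Reaction 3": [],
}


def descendants(graph, start):
    """Breadth-first traversal returning every node reachable from start."""
    found = []
    visited = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for child in graph.get(node, []):
            if child not in visited:
                visited.add(child)
                found.append(child)
                queue.append(child)
    return found


result = descendants(HAS_EVENT, "Pathway A")
print(result)
```

    The traversal visits each link once regardless of how deep the hierarchy is, which is the property that graph databases exploit for this kind of query.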

  2. Beyond Same-Sex Attraction: Gender-Variant-Based Victimization Is Associated with Suicidal Behavior and Substance Use for Other-Sex Attracted Adolescents

    PubMed Central

    Chen, Peter Y.; Cigularov, Konstantin P.; Tomazic, Rocco G.

    2015-01-01

    Gender-variant-based victimization is victimization based on the way others perceive an individual to convey masculine, feminine, and androgynous characteristics through their appearance, mannerisms, and behaviors. Previous work identifies gender-variant-based victimization as a risk factor for health-risking outcomes among same-sex attracted youths. The current study seeks to examine this relationship among other-sex attracted youths and same-sex attracted youth, and determine if gender-variant-based victimization is similarly or differentially associated with poor outcomes between these two groups. Anonymous data from a school-based survey of 2,438 racially diverse middle and high school students in the Eastern U.S. was examined. For other-sex attracted adolescents, gender-variant-based victimization was associated with a higher odds of suicidal thoughts and behaviors, regular use of cigarettes, and drug use. When compared to same-sex attracted adolescents, the harmful relationship between gender-variant-based victimization and each of these outcomes was similar in nature. These findings suggest that gender-variant-based victimization has potentially serious implications for the psychological wellbeing and substance use of other-sex attracted adolescents, not just same-sex attracted adolescents, supporting the need to address gender expression as a basis for victimization separate from sexuality- or gender-minority status. The impact that gender-variant-based victimization has on all adolescents should not be overlooked in research and interventions aimed at addressing sexual orientation-based and gender-variant-based victimization, substance use, and suicide prevention. PMID:26068796

  3. Tiered Human Integrated Sequence Search Databases for Shotgun Proteomics.

    PubMed

    Deutsch, Eric W; Sun, Zhi; Campbell, David S; Binz, Pierre-Alain; Farrah, Terry; Shteynberg, David; Mendoza, Luis; Omenn, Gilbert S; Moritz, Robert L

    2016-11-04

    The results of analysis of shotgun proteomics mass spectrometry data can be greatly affected by the selection of the reference protein sequence database against which the spectra are matched. For many species there are multiple sources from which somewhat different sequence sets can be obtained. This can lead to confusion about which database is best in which circumstances, a problem especially acute in human sample analysis. All sequence databases are genome-based, with sequences for the predicted gene and their protein translation products compiled. Our goal is to create a set of primary sequence databases that comprise the union of sequences from many of the different available sources and make the result easily available to the community. We have compiled a set of four sequence databases of varying sizes, from a small database consisting of only the ∼20,000 primary isoforms plus contaminants to a very large database that includes almost all nonredundant protein sequences from several sources. This set of tiered, increasingly complete human protein sequence databases suitable for mass spectrometry proteomics sequence database searching is called the Tiered Human Integrated Search Proteome set. In order to evaluate the utility of these databases, we have analyzed two different data sets, one from the HeLa cell line and the other from normal human liver tissue, with each of the four tiers of database complexity. The result is that approximately 0.8%, 1.1%, and 1.5% additional peptides can be identified for Tiers 2, 3, and 4, respectively, as compared with the Tier 1 database, at substantially increasing computational cost. This increase in computational cost may be worth bearing if the identification of sequence variants or the discovery of sequences that are not present in the reviewed knowledge base entries is an important goal of the study. We find that it is useful to search a data set against a simpler database, and then check the uniqueness of the

  4. Tiered Human Integrated Sequence Search Databases for Shotgun Proteomics

    PubMed Central

    Deutsch, Eric W.; Sun, Zhi; Campbell, David S.; Binz, Pierre-Alain; Farrah, Terry; Shteynberg, David; Mendoza, Luis; Omenn, Gilbert S.; Moritz, Robert L.

    2016-01-01

    The results of analysis of shotgun proteomics mass spectrometry data can be greatly affected by the selection of the reference protein sequence database against which the spectra are matched. For many species there are multiple sources from which somewhat different sequence sets can be obtained. This can lead to confusion about which database is best in which circumstances – a problem especially acute in human sample analysis. All sequence databases are genome-based, with sequences for the predicted gene and their protein translation products compiled. Our goal is to create a set of primary sequence databases that comprise the union of sequences from many of the different available sources and make the result easily available to the community. We have compiled a set of four sequence databases of varying sizes, from a small database consisting of only the ~20,000 primary isoforms plus contaminants to a very large database that includes almost all non-redundant protein sequences from several sources. This set of tiered, increasingly complete human protein sequence databases suitable for mass spectrometry proteomics sequence database searching is called the Tiered Human Integrated Search Proteome set. In order to evaluate the utility of these databases, we have analyzed two different data sets, one from the HeLa cell line and the other from normal human liver tissue, with each of the four tiers of database complexity. The result is that approximately 0.8%, 1.1%, and 1.5% additional peptides can be identified for Tiers 2, 3, and 4, respectively, as compared with the Tier 1 database, at substantially increasing computational cost. This increase in computational cost may be worth bearing if the identification of sequence variants or the discovery of sequences that are not present in the reviewed knowledge base entries is an important goal of the study. We find that it is useful to search a data set against a simpler database, and then check the uniqueness of the

  5. STCRDab: the structural T-cell receptor database

    PubMed Central

    de Oliveira, Saulo H P; Krawczyk, Konrad

    2018-01-01

    The Structural T-cell Receptor Database (STCRDab; http://opig.stats.ox.ac.uk/webapps/stcrdab) is an online resource that automatically collects and curates TCR structural data from the Protein Data Bank. For each entry, the database provides annotations, such as the α/β or γ/δ chain pairings, major histocompatibility complex details, and where available, antigen binding affinities. In addition, the orientation between the variable domains and the canonical forms of the complementarity-determining region loops are also provided. Users can select, view, and download individual or bulk sets of structures based on these criteria. Where available, STCRDab also finds antibody structures that are similar to TCRs, helping users explore the relationship between TCRs and antibodies. PMID:29087479

  6. NoSQL technologies for the CMS Conditions Database

    NASA Astrophysics Data System (ADS)

    Sipos, Roland

    2015-12-01

    With the restart of the LHC in 2015, the CMS Conditions dataset will continue to grow; the need for consistent and highly available access to the Conditions is therefore a good reason to revisit different aspects of the current data storage solutions. We present a study of alternative data storage backends for the Conditions Databases, evaluating some of the most popular NoSQL databases to support a key-value representation of the CMS Conditions. The definition of the database infrastructure is based on the need to store the conditions as BLOBs. Because of this, each condition can reach a size that may require special treatment (splitting) in these NoSQL databases. As big binary objects may be problematic in several database systems, and also to give an accurate baseline, a testing framework extension was implemented to measure the characteristics of the handling of arbitrary binary data in these databases. Based on the evaluation, prototypes of a document store, using a column-oriented and plain key-value store, are deployed. An adaptation layer to access the backends in the CMS Offline software was developed to provide transparent support for these NoSQL databases in the CMS context. Additional data modelling approaches and considerations in the software layer, deployment and automation of the databases are also covered in the research. In this paper we present the results of the evaluation as well as a performance comparison of the prototypes studied.
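    The splitting of large BLOBs mentioned in this abstract can be illustrated with a minimal sketch. It assumes a plain Python dict standing in for a key-value store; the function names, the key layout (`<key>:meta`, `<key>:<index>`) and the chunk size are hypothetical and are not taken from the CMS software.

```python
def put_blob(store, key, payload, chunk_size):
    """Split payload into chunks of at most chunk_size bytes and store each one.

    Hypothetical layout: "<key>:meta" holds the chunk count,
    "<key>:<index>" holds the chunk bytes.
    """
    chunks = [payload[i:i + chunk_size]
              for i in range(0, len(payload), chunk_size)] or [b""]
    store[f"{key}:meta"] = str(len(chunks)).encode()
    for index, chunk in enumerate(chunks):
        store[f"{key}:{index}"] = chunk
    return len(chunks)


def get_blob(store, key):
    """Reassemble a payload written by put_blob."""
    count = int(store[f"{key}:meta"].decode())
    return b"".join(store[f"{key}:{index}"] for index in range(count))


store = {}                           # stand-in for a real key-value backend
payload = bytes(range(256)) * 10     # 2560 bytes of arbitrary binary data
n_chunks = put_blob(store, "condition-42", payload, chunk_size=1000)
restored = get_blob(store, "condition-42")
```

    A real backend would also have to handle partial writes and per-value size limits, which this sketch ignores.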

  7. In-situ neutron diffraction study of martensitic variant redistribution in polycrystalline Ni-Mn-Ga alloy under cyclic thermo-mechanical treatment

    NASA Astrophysics Data System (ADS)

    Li, Zongbin; Zhang, Yudong; Esling, Claude; Gan, Weimin; Zou, Naifu; Zhao, Xiang; Zuo, Liang

    2014-07-01

    The influences of uniaxial compressive stress on martensitic transformation were studied on a polycrystalline Ni-Mn-Ga bulk alloy prepared by directional solidification. Based upon the integrated in-situ neutron diffraction measurements, direct experimental evidence was obtained on the variant redistribution of seven-layered modulated (7M) martensite, triggered by external uniaxial compression during martensitic transformation. Large anisotropic lattice strain, induced by the cyclic thermo-mechanical treatment, has led to the microstructure modification by forming martensitic variants with a strong ⟨0 1 0⟩7M preferential orientation along the loading axis. As a result, the saturation of magnetization became easier to be reached.

  8. In-situ neutron diffraction study of martensitic variant redistribution in polycrystalline Ni-Mn-Ga alloy under cyclic thermo-mechanical treatment

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Li, Zongbin; Zou, Naifu; Zhao, Xiang

    2014-07-14

    The influences of uniaxial compressive stress on martensitic transformation were studied on a polycrystalline Ni-Mn-Ga bulk alloy prepared by directional solidification. Based upon the integrated in-situ neutron diffraction measurements, direct experimental evidence was obtained on the variant redistribution of seven-layered modulated (7M) martensite, triggered by external uniaxial compression during martensitic transformation. Large anisotropic lattice strain, induced by the cyclic thermo-mechanical treatment, has led to the microstructure modification by forming martensitic variants with a strong ⟨0 1 0⟩7M preferential orientation along the loading axis. As a result, the saturation of magnetization became easier to be reached.

  9. Report of a Novel SHOX Missense Variant in a Boy With Short Stature and His Mother With Leri–Weill Dyschondrosteosis

    PubMed Central

    Lucchetti, Laura; Prontera, Paolo; Mencarelli, Amedea; Sallicandro, Ester; Mencarelli, Annalisa; Cofini, Marta; Leonardi, Alberto; Stangoni, Gabriela; Penta, Laura; Esposito, Susanna

    2018-01-01

    Heterozygous mutations in the SHOX gene or in the upstream and downstream enhancer elements are associated with 2–22% of cases of idiopathic short stature (OMIM #300582) and with 60% of cases of Leri–Weill dyschondrosteosis (OMIM #127300) with which female subjects are generally more severely affected. Approximately 80–90% of SHOX pathogenic variants are deletions or duplications, and the remaining 10–20% are point mutations that primarily give rise to missense variants. The clinical interpretation of novel variants, particularly missense variants, can be challenging and can remain of uncertain significance. Here, we describe a novel missense variant (c.1044 G>T, p.Arg118Met) in a Moroccan boy with a disproportionately short stature and without any radiological traits or bone deformities and in his mother, who had a disproportionately short stature and a Madelung deformity. This variant has not been reported to date in the updated SHOX allelic variant or Human Gene Mutation Databases nor is it listed as a polymorphism in the ExAC browser, dbSNP, or 1000G. This mutation was predicted to be deleterious by three different bioinformatics tools since it modifies an amino acid in a highly conserved DNA-binding domain of the SHOX protein. Based on this evidence, the patient was treated with recombinant human growth hormone. PMID:29692759

  10. Rapid storage and retrieval of genomic intervals from a relational database system using nested containment lists

    PubMed Central

    Wiley, Laura K.; Sivley, R. Michael; Bush, William S.

    2013-01-01

    Efficient storage and retrieval of genomic annotations based on range intervals is necessary, given the amount of data produced by next-generation sequencing studies. The indexing strategies of relational database systems (such as MySQL) greatly inhibit their use in genomic annotation tasks. This has led to the development of stand-alone applications that are dependent on flat-file libraries. In this work, we introduce MyNCList, an implementation of the NCList data structure within a MySQL database. MyNCList enables the storage, update and rapid retrieval of genomic annotations from the convenience of a relational database system. Range-based annotations of 1 million variants are retrieved in under a minute, making this approach feasible for whole-genome annotation tasks. Database URL: https://github.com/bushlab/mynclist PMID:23894185
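    The nested containment list idea named in this abstract can be sketched in memory as follows. This is not the MyNCList MySQL implementation; the class and function names are hypothetical, intervals are assumed to be half-open `[start, end)`, and the sample annotations are invented for illustration.

```python
class Node:
    """One interval [start, end) plus the intervals it fully contains."""
    def __init__(self, start, end, label):
        self.start, self.end, self.label = start, end, label
        self.children = []


def build_nclist(intervals):
    """Nest each interval under the closest earlier interval that contains it."""
    top, stack = [], []
    # Sort by start ascending, then end descending, so containers come first.
    for start, end, label in sorted(intervals, key=lambda iv: (iv[0], -iv[1])):
        node = Node(start, end, label)
        while stack and stack[-1].end < end:   # stack top does not contain node
            stack.pop()
        (stack[-1].children if stack else top).append(node)
        stack.append(node)
    return top


def query(nodes, q_start, q_end, out=None):
    """Collect labels of all intervals overlapping [q_start, q_end)."""
    if out is None:
        out = []
    # Within one list no interval contains another, so ends are ascending:
    # binary-search the first interval whose end lies past q_start.
    lo, hi = 0, len(nodes)
    while lo < hi:
        mid = (lo + hi) // 2
        if nodes[mid].end > q_start:
            hi = mid
        else:
            lo = mid + 1
    while lo < len(nodes) and nodes[lo].start < q_end:
        out.append(nodes[lo].label)
        query(nodes[lo].children, q_start, q_end, out)
        lo += 1
    return out


annotations = [(100, 500, "geneA"), (120, 200, "exon1"), (300, 400, "exon2"),
               (450, 900, "geneB"), (150, 151, "snp")]
nclist = build_nclist(annotations)
hits = query(nclist, 140, 160)
```

    Because children are only visited when their parent overlaps the query, whole subtrees of non-overlapping annotations are skipped.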

  11. Rapid storage and retrieval of genomic intervals from a relational database system using nested containment lists.

    PubMed

    Wiley, Laura K; Sivley, R Michael; Bush, William S

    2013-01-01

    Efficient storage and retrieval of genomic annotations based on range intervals is necessary, given the amount of data produced by next-generation sequencing studies. The indexing strategies of relational database systems (such as MySQL) greatly inhibit their use in genomic annotation tasks. This has led to the development of stand-alone applications that are dependent on flat-file libraries. In this work, we introduce MyNCList, an implementation of the NCList data structure within a MySQL database. MyNCList enables the storage, update and rapid retrieval of genomic annotations from the convenience of a relational database system. Range-based annotations of 1 million variants are retrieved in under a minute, making this approach feasible for whole-genome annotation tasks. Database URL: https://github.com/bushlab/mynclist.

  12. Polarimetric Scattering Database for Non-spherical Ice Particles at Microwave Wavelengths

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Aydin, Kultegin; Verlinde, Johannes; Clothiaux, Eugene

    A database containing polarimetric single-scattering properties of various types of ice particles at millimeter to centimeter wavelengths is presented. This database is complementary to earlier ones in that it contains complete (polarimetric) scattering property information for each ice particle - 44 plates, 30 columns, 405 branched planar crystals, 660 aggregates, and 640 conical graupel - and direction of incident radiation but is limited to four frequencies (W-, Ka-, Ku- and X-bands), does not include temperature dependencies of the single-scattering properties and does not include scattering properties averaged over randomly oriented ice particles. Rules for constructing the morphologies of ice particles from one database to the next often differ; consequently, analyses that incorporate all of the different databases will contain the most variability, while illuminating important differences between them.

  13. Analysis of predicted loss-of-function variants in UK Biobank identifies variants protective for disease.

    PubMed

    Emdin, Connor A; Khera, Amit V; Chaffin, Mark; Klarin, Derek; Natarajan, Pradeep; Aragam, Krishna; Haas, Mary; Bick, Alexander; Zekavat, Seyedeh M; Nomura, Akihiro; Ardissino, Diego; Wilson, James G; Schunkert, Heribert; McPherson, Ruth; Watkins, Hugh; Elosua, Roberto; Bown, Matthew J; Samani, Nilesh J; Baber, Usman; Erdmann, Jeanette; Gupta, Namrata; Danesh, John; Chasman, Daniel; Ridker, Paul; Denny, Joshua; Bastarache, Lisa; Lichtman, Judith H; D'Onofrio, Gail; Mattera, Jennifer; Spertus, John A; Sheu, Wayne H-H; Taylor, Kent D; Psaty, Bruce M; Rich, Stephen S; Post, Wendy; Rotter, Jerome I; Chen, Yii-Der Ida; Krumholz, Harlan; Saleheen, Danish; Gabriel, Stacey; Kathiresan, Sekar

    2018-04-24

    Less than 3% of protein-coding genetic variants are predicted to result in loss of protein function through the introduction of a stop codon, frameshift, or the disruption of an essential splice site; however, such predicted loss-of-function (pLOF) variants provide insight into effector transcript and direction of biological effect. In >400,000 UK Biobank participants, we conduct association analyses of 3759 pLOF variants with six metabolic traits, six cardiometabolic diseases, and twelve additional diseases. We identified 18 new low-frequency or rare (allele frequency < 5%) pLOF variant-phenotype associations. pLOF variants in the gene GPR151 protect against obesity and type 2 diabetes, in the gene IL33 against asthma and allergic disease, and in the gene IFIH1 against hypothyroidism. In the gene PDE3B, pLOF variants associate with elevated height, improved body fat distribution and protection from coronary artery disease. Our findings prioritize genes for which pharmacologic mimics of pLOF variants may lower risk for disease.

  14. Molecular models of NS3 protease variants of the Hepatitis C virus.

    PubMed

    da Silveira, Nelson J F; Arcuri, Helen A; Bonalumi, Carlos E; de Souza, Fátima P; Mello, Isabel M V G C; Rahal, Paula; Pinho, João R R; de Azevedo, Walter F

    2005-01-21

    Hepatitis C virus (HCV) currently infects approximately three percent of the world population. In view of the lack of vaccines against HCV, there is an urgent need for an efficient treatment of the disease by an effective antiviral drug. Rational drug design has not been the primary way for discovering major therapeutics. Nevertheless, there are reports of success in the development of inhibitors using a structure-based approach. One of the possible targets for drug development against HCV is the NS3 protease variants. Based on the three-dimensional structure of these variants we expect to identify new NS3 protease inhibitors. In order to speed up the modeling process all NS3 protease variant models were generated in a Beowulf cluster. The potential of the structural bioinformatics for development of new antiviral drugs is discussed. The atomic coordinates of crystallographic structures 1CU1 and 1DY9 were used as starting models for modeling of the NS3 protease variant structures. The NS3 protease variant structures are composed of six subdomains, which occur in sequence along the polypeptide chain. The protease domain exhibits the dual beta-barrel fold that is common among members of the chymotrypsin serine protease family. The helicase domain contains two structurally related beta-alpha-beta subdomains and a third subdomain of seven helices and three short beta strands. The latter domain is usually referred to as the helicase alpha-helical subdomain. The rmsd value of bond lengths and bond angles, the average G-factor and Verify 3D values are presented for NS3 protease variant structures. This project increases the certainty that homology modeling is a useful tool in structural biology and that it can be very valuable in annotating genome sequence information and contributing to structural and functional genomics from viruses. The structural models will be used to guide future efforts in the structure-based drug design of a new generation of NS3 protease variants

  15. User-oriented views in health care information systems.

    PubMed

    Portoni, Luisa; Combi, Carlo; Pinciroli, Francesco

    2002-12-01

    In this paper, we present the methodology we adopted in designing and developing an object-oriented database system for the management of medical records. The designed system provides technical solutions to important requirements of most clinical information systems, such as 1) the support of tools to create and manage views on data and view schemas, offering to different users specific perspectives on data tailored to their needs; 2) the capability to handle in a suitable way the temporal aspects related to clinical information; and 3) the effective integration of multimedia data. Remote data access for authorized users is also considered. As clinical application, we describe here the prototype of a user-oriented clinical information system for the archiving and the management of multimedia and temporally oriented clinical data related to percutaneous transluminal coronary angioplasty (PTCA) patients. Suitable view schemas for various user roles (cath-lab physician, ward nurse, general practitioner) have been modeled and implemented on the basis of a detailed analysis of the considered clinical environment, carried out by an object-oriented approach.

  16. Genetic risk variants for metabolic traits in Arab populations.

    PubMed

    Hebbar, Prashantha; Elkum, Naser; Alkayal, Fadi; John, Sumi Elsa; Thanaraj, Thangavel Alphonse; Alsmadi, Osama

    2017-01-20

    Despite a high prevalence of metabolic trait related diseases in the Arabian Peninsula, there is a lack of convincingly identified genetic determinants for metabolic traits in this population. Arab populations are underrepresented in global genome-wide association studies. We genotyped 1965 unrelated Arab individuals from Kuwait using Cardio-MetaboChip, and tested SNP associations with 13 metabolic traits. Models based on recessive mode of inheritance identified Chr15:40531386-rs12440118/ZNF106/W->R as a risk variant associated with glycated-hemoglobin at close to 'genome-wide significant' p-value and five other risk variants 'nominally' associated (p-value ≤ 5.45E-07) with fasting plasma glucose (rs7144734/[OTX2-AS1,RPL3P3]) and triglyceride (rs17501809/PLGRKT; rs11143005/LOC105376072; rs900543/[THSD4,NR2E3]; and Chr12:101494770/IGF1). Furthermore, we identified 33 associations (30 SNPs with 12 traits) with 'suggestive' evidence of association (p-value < 1.0E-05); 20 of these operate under recessive mode of inheritance. Two of these 'suggestive' associations (rs1800775-CETP/HDL; and rs9326246-BUD13/TGL) showed evidence at genome-wide significance in previous studies on Euro-centric populations. Involvement of many of the identified loci in mediating metabolic traits was supported by evidence in the literature. The identified loci participate in critical metabolic pathways (such as Ceramide signaling, and Mitogen-Activated Protein Kinase/Extracellular Signal Regulated Kinase signaling). Data from the Genotype-Tissue Expression database affirmed that 7 of the identified variants differentially regulate the up/downstream genes that mediate metabolic traits.

  17. The Histone Database: an integrated resource for histones and histone fold-containing proteins

    PubMed Central

    Mariño-Ramírez, Leonardo; Levine, Kevin M.; Morales, Mario; Zhang, Suiyuan; Moreland, R. Travis; Baxevanis, Andreas D.; Landsman, David

    2011-01-01

    Eukaryotic chromatin is composed of DNA and protein components—core histones—that act to compactly pack the DNA into nucleosomes, the fundamental building blocks of chromatin. These nucleosomes are connected to adjacent nucleosomes by linker histones. Nucleosomes are highly dynamic and, through various core histone post-translational modifications and incorporation of diverse histone variants, can serve as epigenetic marks to control processes such as gene expression and recombination. The Histone Sequence Database is a curated collection of sequences and structures of histones and non-histone proteins containing histone folds, assembled from major public databases. Here, we report a substantial increase in the number of sequences and taxonomic coverage for histone and histone fold-containing proteins available in the database. Additionally, the database now contains an expanded dataset that includes archaeal histone sequences. The database also provides comprehensive multiple sequence alignments for each of the four core histones (H2A, H2B, H3 and H4), the linker histones (H1/H5) and the archaeal histones. The database also includes current information on solved histone fold-containing structures. The Histone Sequence Database is an inclusive resource for the analysis of chromatin structure and function focused on histones and histone fold-containing proteins. Database URL: The Histone Sequence Database is freely available and can be accessed at http://research.nhgri.nih.gov/histones/. PMID:22025671

  18. Genetic variants in IL-6/JAK/STAT3 pathway and the risk of CRC.

    PubMed

    Wang, Shuwei; Zhang, Weidong

    2016-05-01

    Interleukin (IL)-6 and the downstream Janus kinase (JAK)/signal transducer and activator of transcription (STAT) pathway have previously been reported to be important in the development of colorectal cancer (CRC), and several studies have shown the relationship between the polymorphisms of related genes in this pathway with the risk of CRC. However, the findings of these related studies are inconsistent. Moreover, there has been no systematic review and meta-analysis to evaluate the relationship between genetic variants in IL-6/JAK/STAT3 pathway and CRC susceptibility. Hence, we conducted a meta-analysis to explore the relationship between polymorphisms in IL-6/JAK/STAT3 pathway genes and CRC risk. Eighteen eligible studies with a total of 13,795 CRC cases and 18,043 controls were identified by searching PubMed, Web of Science, Embase, and the Cochrane Library databases for the period up to September 15, 2015. Odds ratios (ORs) and their 95 % confidence intervals (CIs) were used to calculate the strength of the association. Our results indicated that IL-6 genetic variants in allele additive model (OR = 1.05, 95 % CI = 1.00, 1.09) and JAK2 genetic variants (OR = 1.40, 95 % CI = 1.15, 1.65) in genotype recessive model were significantly associated with CRC risk. Moreover, the pooled data revealed that IL-6 rs1800795 polymorphism significantly increased the risk of CRC in allele additive model in Europe (OR = 1.07, 95 % CI = 1.01, 1.14). In conclusion, the present findings indicate that IL-6 and JAK2 genetic variants are associated with the increased risk of CRC while STAT3 genetic variants are not. We need more well-designed clinical studies covering more countries and populations to definitively establish the association between genetic variants in IL-6/JAK/STAT3 pathway and CRC susceptibility.
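    For readers unfamiliar with the statistic reported here, the following sketch shows how an odds ratio and its 95 % confidence interval can be computed from a single 2x2 table using the standard log (Woolf) method. The counts are hypothetical and are not taken from the meta-analysis, and the pooling across studies that a meta-analysis performs is not shown.

```python
import math


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio and confidence interval from a 2x2 table.

    a: cases with the variant      b: cases without the variant
    c: controls with the variant   d: controls without the variant
    z: normal quantile (1.96 gives a 95 % interval)
    """
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) under the Woolf approximation.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts, for illustration only.
or_value, ci_low, ci_high = odds_ratio_ci(a=120, b=80, c=100, d=100)
```

    An interval whose lower bound lies above 1 is conventionally read as a statistically significant increase in risk.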

  19. An ECG storage and retrieval system embedded in client server HIS utilizing object-oriented DB.

    PubMed

    Wang, C; Ohe, K; Sakurai, T; Nagase, T; Kaihara, S

    1996-02-01

    In the University of Tokyo Hospital, the improved client server HIS has been applied to clinical practice and physicians can order prescription, laboratory examination, ECG examination and radiographic examination, etc. directly by themselves and read results of these examinations, except medical signal waves, schema and image, on UNIX workstations. Recently, we designed and developed an ECG storage and retrieval system embedded in the client server HIS utilizing object-oriented database to take the first step in dealing with digitized signal, schema and image data and show waves, graphics, and images directly to physicians by the client server HIS. The system was developed based on object-oriented analysis and design, and implemented with object-oriented database management system (OODMS) and C++ programming language. In this paper, we describe the ECG data model, functions of the storage and retrieval system, features of user interface and the result of its implementation in the HIS.

  20. PANTHER-PSEP: predicting disease-causing genetic variants using position-specific evolutionary preservation.

    PubMed

    Tang, Haiming; Thomas, Paul D

    2016-07-15

    PANTHER-PSEP is a new software tool for predicting non-synonymous genetic variants that may play a causal role in human disease. Several previous variant pathogenicity prediction methods have been proposed that quantify evolutionary conservation among homologous proteins from different organisms. PANTHER-PSEP employs a related but distinct metric based on 'evolutionary preservation': homologous proteins are used to reconstruct the likely sequences of ancestral proteins at nodes in a phylogenetic tree, and the history of each amino acid can be traced back in time from its current state to estimate how long that state has been preserved in its ancestors. Here, we describe the PSEP tool, and assess its performance on standard benchmarks for distinguishing disease-associated from neutral variation in humans. On these benchmarks, PSEP outperforms not only previous tools that utilize evolutionary conservation, but also several highly used tools that include multiple other sources of information as well. For predicting pathogenic human variants, the trace back of course starts with a human 'reference' protein sequence, but the PSEP tool can also be applied to predicting deleterious or pathogenic variants in reference proteins from any of the ∼100 other species in the PANTHER database. PANTHER-PSEP is freely available on the web at http://pantherdb.org/tools/csnpScoreForm.jsp. Users can also download the command-line based tool at ftp://ftp.pantherdb.org/cSNP_analysis/PSEP/. Contact: pdthomas@usc.edu. Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
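    The trace-back idea described in this abstract can be illustrated with a deliberately simplified sketch. It assumes that a single reconstructed amino acid is already known at each ancestral node and that branch lengths are given in arbitrary time units; the function name, data layout and values are hypothetical and do not reproduce the PSEP tool's actual scoring.

```python
def preservation_time(current_state, ancestors):
    """Sum branch lengths back in time while the ancestral state is unchanged.

    ancestors: list of (reconstructed_state, branch_length) pairs ordered
    from the nearest ancestor to the most distant one.
    """
    total = 0.0
    for state, branch_length in ancestors:
        if state != current_state:
            break                     # the state changed; stop tracing back
        total += branch_length
    return total


# Hypothetical lineage for one alignment position: arginine (R) today,
# preserved through two ancestors, lysine (K) before that.
lineage = [("R", 30.0), ("R", 60.0), ("K", 100.0), ("R", 200.0)]
preserved = preservation_time("R", lineage)
```

    Under this toy model, a position that has kept its state for longer would be treated as more likely to be functionally important.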

  1. A Virtual "Hello": A Web-Based Orientation to the Library.

    ERIC Educational Resources Information Center

    Borah, Eloisa Gomez

    1997-01-01

    Describes the development of Web-based library services and resources available at the Rosenfeld Library of the Anderson Graduate School of Management at University of California at Los Angeles. Highlights include library orientation sessions; virtual tours of the library; a database of basic business sources; and research strategies, including…

  2. Rare high-impact disease variants: properties and identifications.

    PubMed

    Park, Leeyoung; Kim, Ju Han

    2016-03-21

    Although many genome-wide association studies have been performed, the identification of disease polymorphisms remains important. It is now suspected that many rare disease variants induce the association signal of common variants in linkage disequilibrium (LD). Based on recent development of genetic models, the current study provides explanations of the existence of rare variants with high impacts and common variants with low impacts. Disease variants are neither necessary nor sufficient due to gene-gene or gene-environment interactions. A new method was developed based on theoretical aspects to identify both rare and common disease variants by their genotypes. Common disease variants were identified with relatively small odds ratios and relatively small sample sizes, except for specific situations in which the disease variants were in strong LD with a variant with a higher frequency. Rare disease variants with small impacts were difficult to identify without increasing sample sizes; however, the method was reasonably accurate for rare disease variants with high impacts. For rare variants, dominant variants generally showed better Type II error rates than recessive variants; however, the trend was reversed for common variants. Type II error rates increased in gene regions containing more than two disease variants because the more common variant, rather than both disease variants, was usually identified. The proposed method would be useful for identifying common disease variants with small impacts and rare disease variants with large impacts when disease variants have the same effects on disease presentation.

  3. Beta-glucosidase I variants with improved properties

    DOEpatents

    Bott, Richard R.; Kaper, Thijs; Kelemen, Bradley; Goedegebuur, Frits; Hommes, Ronaldus Wilhelmus; Kralj, Slavko; Kruithof, Paulien; Nikolaev, Igor; Van Der Kley, Wilhelmus Antonious Hendricus; Van Lieshout, Johannes Franciscus Thomas; Van Stigt Thans, Sander

    2016-09-20

    The present disclosure is generally directed to enzymes and in particular beta-glucosidase variants. Also described are nucleic acids encoding beta-glucosidase variants, compositions comprising beta-glucosidase variants, methods of using beta-glucosidase variants, and methods of identifying additional useful beta-glucosidase variants.

  4. Houston Methodist Variant Viewer: An Application to Support Clinical Laboratory Interpretation of Next-generation Sequencing Data for Cancer

    PubMed Central

    Christensen, Paul A.; Ni, Yunyun; Bao, Feifei; Hendrickson, Heather L.; Greenwood, Michael; Thomas, Jessica S.; Long, S. Wesley; Olsen, Randall J.

    2017-01-01

    Introduction: Next-generation-sequencing (NGS) is increasingly used in clinical and research protocols for patients with cancer. NGS assays are routinely used in clinical laboratories to detect mutations bearing on cancer diagnosis, prognosis and personalized therapy. A typical assay may interrogate 50 or more gene targets that encompass many thousands of possible gene variants. Analysis of NGS data in cancer is a labor-intensive process that can become overwhelming to the molecular pathologist or research scientist. Although commercial tools for NGS data analysis and interpretation are available, they are often costly, lack key functionality or cannot be customized by the end user. Methods: To facilitate NGS data analysis in our clinical molecular diagnostics laboratory, we created a custom bioinformatics tool termed Houston Methodist Variant Viewer (HMVV). HMVV is a Java-based solution that integrates sequencing instrument output, bioinformatics analysis, storage resources and end user interface. Results: Compared to the predicate method used in our clinical laboratory, HMVV markedly simplifies the bioinformatics workflow for the molecular technologist and facilitates the variant review by the molecular pathologist. Importantly, HMVV reduces time spent researching the biological significance of the variants detected, standardizes the online resources used to perform the variant investigation and assists generation of the annotated report for the electronic medical record. HMVV also maintains a searchable variant database, including the variant annotations generated by the pathologist, which is useful for downstream quality improvement and research projects. Conclusions: HMVV is a clinical grade, low-cost, feature-rich, highly customizable platform that we have made available for continued development by the pathology informatics community. PMID:29226007

  5. Mental-orientation: A new approach to assessing patients across the Alzheimer's disease spectrum.

    PubMed

    Peters-Founshtein, Gregory; Peer, Michael; Rein, Yanai; Kahana Merhavi, Shlomzion; Meiner, Zeev; Arzy, Shahar

    2018-05-21

    This study aims to assess the role of mental-orientation in the diagnosis of mild cognitive impairment and Alzheimer's disease using a novel task. A behavioral study (Experiment 1) compared the mental-orientation task to standard neuropsychological tests in patients across the Alzheimer's disease spectrum. A functional MRI study (Experiment 2) in young adults compared activations evoked by the mental-orientation and standard-orientation tasks as well as their overlap with brain regions susceptible to Alzheimer's disease pathology. The mental-orientation task differentiated mild cognitively impaired and healthy controls at 95% accuracy, while the Addenbrooke's Cognitive Examination, Mini-Mental State Examination and standard-orientation achieved 74%, 70% and 50% accuracy, respectively. Functional MRI revealed the mental-orientation task to preferentially recruit brain regions exhibiting early Alzheimer's-related atrophy, unlike the standard-orientation test. Mental-orientation is suggested to play a key role in Alzheimer's disease, and consequently in early detection and follow-up of patients along the Alzheimer's disease spectrum. (PsycINFO Database Record (c) 2018 APA, all rights reserved).

  6. Conjunctival malignant melanoma: A rare variant and review of important diagnostic and therapeutic considerations

    PubMed Central

    Albreiki, Danah H.; Gilberg, Steven M.; Farmer, James P.

    2012-01-01

    Malignant melanoma of the conjunctiva is a relatively infrequent neoplasm that can be associated with significant morbidity and cause diagnostic difficulty to both the ophthalmologist and pathologist. We herein describe the first reported case in North American and European databases of a rare variant, signet ring cell melanoma, arising in the background of primary acquired melanosis (PAM) and use this case as a review of important diagnostic and therapeutic considerations when faced with this condition. PMID:23960986

  7. Cardiological database management system as a mediator to clinical decision support.

    PubMed

    Pappas, C; Mavromatis, A; Maglaveras, N; Tsikotis, A; Pangalos, G; Ambrosiadou, V

    1996-03-01

    An object-oriented medical database management system is presented for a typical cardiologic center, facilitating epidemiological trials. Object-oriented analysis and design were used for the system design, offering advantages for the integrity and extendibility of medical information systems. The system was developed using object-oriented design and programming methodology, the C++ language and the Borland Paradox Relational Data Base Management System on an MS-Windows NT environment. Particular attention was paid to system compatibility, portability, the ease of use, and the suitable design of the patient record so as to support the decisions of medical personnel in cardiovascular centers. The system was designed to accept complex, heterogeneous, distributed data in various formats and from different kinds of examinations such as Holter, Doppler and electrocardiography.

  8. Enhancing user privacy in SARG04-based private database query protocols

    NASA Astrophysics Data System (ADS)

    Yu, Fang; Qiu, Daowen; Situ, Haozhen; Wang, Xiaoming; Long, Shun

    2015-11-01

    The well-known SARG04 protocol can be used in a private query application to generate an oblivious key. Using the key, the user can retrieve one out of N items from a database without revealing which one he/she is interested in. However, the existing SARG04-based private query protocols are vulnerable to attacks using faked data from the database since, in its canonical form, the SARG04 protocol lacks means for one party to defend against attacks from the other. Because such attacks can cause significant loss of user privacy, a variant of the SARG04 protocol is proposed in this paper with new mechanisms designed to help the user protect its privacy in private query applications. In the protocol, it is the user who starts the session with the database, trying to learn from it bits of a raw key in an oblivious way. An honesty test is used to detect a cheating database that has transmitted faked data. The whole private query protocol has O(N) communication complexity for conveying at least N encrypted items. Compared with the existing SARG04-based protocols, it is efficient in communication for per-bit learning.
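    The abstract does not spell out how the oblivious key is used once it exists. The sketch below shows one common classical final step in key-based private query schemes, under the assumption that the database holds an N-bit key, the user knows exactly one key bit at a known position, and the user announces a cyclic shift. The quantum key generation and the honesty test are not modelled, and all names and values are hypothetical.

```python
def choose_shift(known_pos, wanted, n):
    """Shift that aligns the user's known key bit with the wanted item."""
    return (known_pos - wanted) % n


def encrypt_database(items, key, shift):
    """Database side: XOR every item bit with the cyclically shifted key."""
    n = len(items)
    return [items[i] ^ key[(i + shift) % n] for i in range(n)]


def recover_item(cipher, wanted, known_bit):
    """User side: only the wanted position can be decrypted."""
    return cipher[wanted] ^ known_bit


# Hypothetical 8-item database of single bits and an 8-bit oblivious key.
items = [1, 0, 1, 1, 0, 0, 1, 0]
key = [0, 1, 1, 0, 1, 0, 0, 1]
known_pos, wanted = 5, 2              # the user knows key[5], wants items[2]

shift = choose_shift(known_pos, wanted, len(items))
cipher = encrypt_database(items, key, shift)
recovered = recover_item(cipher, wanted, key[known_pos])
```

    The database sends all N encrypted items, which matches the O(N) communication cost stated in the abstract; it learns the shift but, in this simplified model, not which key bit the user knows.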

  9. Outcomes of Technical Variant Liver Transplantation versus Whole Liver Transplantation for Pediatric Patients: A Meta-Analysis.

    PubMed

    Ye, Hui; Zhao, Qiang; Wang, Yufang; Wang, Dongping; Zheng, Zhouying; Schroder, Paul Michael; Lu, Yao; Kong, Yuan; Liang, Wenhua; Shang, Yushu; Guo, Zhiyong; He, Xiaoshun

    2015-01-01

    To overcome the shortage of appropriate-sized whole liver grafts for children, technical variant liver transplantation has been practiced for decades. We perform a meta-analysis to compare the survival rates and incidence of surgical complications between pediatric whole liver transplantation and technical variant liver transplantation. To identify relevant studies up to January 2014, we searched PubMed/Medline, Embase, and Cochrane library databases. The primary outcomes measured were patient and graft survival rates, and the secondary outcomes were the incidence of surgical complications. The outcomes were pooled using a fixed-effects model or random-effects model. The one-year, three-year, five-year patient survival rates and one-year, three-year graft survival rates were significantly higher in whole liver transplantation than technical variant liver transplantation (OR = 1.62, 1.90, 1.65, 1.78, and 1.62, respectively, p<0.05). There was no significant difference in five-year graft survival rate between the two groups (OR = 1.47, p = 0.10). The incidence of portal vein thrombosis and biliary complications were significantly lower in the whole liver transplantation group (OR = 0.45 and 0.42, both p<0.05). The incidence of hepatic artery thrombosis was comparable between the two groups (OR = 1.21, p = 0.61). Pediatric whole liver transplantation is associated with better outcomes than technical variant liver transplantation. Continuing efforts should be made to minimize surgical complications to improve the outcomes of technical variant liver transplantation.
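
    The pooling step mentioned in the abstract above can be illustrated with a minimal sketch of inverse-variance fixed-effect pooling of odds ratios. It covers only the fixed-effect case, not the random-effects model, and it is not the authors' code. The function name and the input values in the usage note are hypothetical and are not data from the meta-analysis.

```python
import math

# Two-sided 95% normal quantile, used to recover standard errors from CIs.
Z_95 = 1.959963984540054


def pool_odds_ratios_fixed(studies):
    """Pool odds ratios with an inverse-variance fixed-effect model.

    studies: list of (odds_ratio, ci_low, ci_high) tuples with 95% CIs.
    Returns (pooled_or, pooled_ci_low, pooled_ci_high).
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for odds_ratio, ci_low, ci_high in studies:
        log_or = math.log(odds_ratio)
        # Standard error of log(OR), back-calculated from the CI width.
        se = (math.log(ci_high) - math.log(ci_low)) / (2 * Z_95)
        weight = 1.0 / se ** 2
        weighted_sum += weight * log_or
        weight_total += weight
    pooled_log = weighted_sum / weight_total
    pooled_se = math.sqrt(1.0 / weight_total)
    return (
        math.exp(pooled_log),
        math.exp(pooled_log - Z_95 * pooled_se),
        math.exp(pooled_log + Z_95 * pooled_se),
    )
```

    For example, pool_odds_ratios_fixed([(2.0, 1.0, 4.0), (2.0, 1.0, 4.0)]) keeps the pooled odds ratio at 2.0 while narrowing the confidence interval, since two equally weighted studies carry more information than one.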

  10. Variants of cellobiohydrolases

    DOEpatents

    Bott, Richard R.; Foukaraki, Maria; Hommes, Ronaldus Wilhelmus; Kaper, Thijs; Kelemen, Bradley R.; Kralj, Slavko; Nikolaev, Igor; Sandgren, Mats; Van Lieshout, Johannes Franciscus Thomas; Van Stigt Thans, Sander

    2018-04-10

    Disclosed are a number of homologs and variants of Hypocrea jecorina Cel7A (formerly Trichoderma reesei cellobiohydrolase I or CBH1), nucleic acids encoding the same and methods for producing the same. The homologs and variant cellulases have the amino acid sequence of a glycosyl hydrolase of family 7A wherein one or more amino acid residues are substituted and/or deleted.

  11. In silico prediction of splice-altering single nucleotide variants in the human genome.

    PubMed

    Jian, Xueqiu; Boerwinkle, Eric; Liu, Xiaoming

    2014-12-16

    In silico tools have been developed to predict variants that may have an impact on pre-mRNA splicing. The major limitation of the application of these tools to basic research and clinical practice is the difficulty in interpreting the output. Most tools only predict potential splice sites given a DNA sequence without measuring splicing signal changes caused by a variant. Another limitation is the lack of large-scale evaluation studies of these tools. We compared eight in silico tools on 2959 single nucleotide variants within splicing consensus regions (scSNVs) using receiver operating characteristic analysis. The Position Weight Matrix model and MaxEntScan outperformed other methods. Two ensemble learning methods, adaptive boosting and random forests, were used to construct models that take advantage of individual methods. Both models further improved prediction, with outputs of directly interpretable prediction scores. We applied our ensemble scores to scSNVs from the Catalogue of Somatic Mutations in Cancer database. Analysis showed that predicted splice-altering scSNVs are enriched in recurrent scSNVs and known cancer genes. We pre-computed our ensemble scores for all potential scSNVs across the human genome, providing a whole genome level resource for identifying splice-altering scSNVs discovered from large-scale sequencing studies.
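
    The idea of measuring the splicing signal change caused by a variant, rather than only scoring a sequence, can be sketched with a toy position weight matrix. This is an illustration only, not the scoring used by any of the eight evaluated tools or by the authors' ensemble models. The three-position matrix, the uniform background, and the function names are all hypothetical.

```python
import math

# Assumed uniform background frequency for each base (illustrative only).
BACKGROUND = 0.25

# Hypothetical 3-position matrix of base probabilities; not a real splice-site model.
TOY_PWM = [
    {"A": 0.01, "C": 0.01, "G": 0.97, "T": 0.01},
    {"A": 0.01, "C": 0.01, "G": 0.01, "T": 0.97},
    {"A": 0.50, "C": 0.10, "G": 0.30, "T": 0.10},
]


def pwm_score(seq, pwm):
    """Sum of log2(probability / background) over all positions."""
    if len(seq) != len(pwm):
        raise ValueError("sequence length must match matrix length")
    return sum(math.log2(col[base] / BACKGROUND) for base, col in zip(seq, pwm))


def variant_delta(ref_seq, pos, alt_base, pwm):
    """Score change (alt minus ref) caused by substituting alt_base at pos."""
    alt_seq = ref_seq[:pos] + alt_base + ref_seq[pos + 1:]
    return pwm_score(alt_seq, pwm) - pwm_score(ref_seq, pwm)
```

    With this toy matrix, variant_delta("GTA", 0, "A", TOY_PWM) is negative, meaning the substitution weakens the match to the modeled signal. A real tool would use matrices estimated from annotated splice sites.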

  12. The Innate Immune Database (IIDB)

    PubMed Central

    Korb, Martin; Rust, Aistair G; Thorsson, Vesteinn; Battail, Christophe; Li, Bin; Hwang, Daehee; Kennedy, Kathleen A; Roach, Jared C; Rosenberger, Carrie M; Gilchrist, Mark; Zak, Daniel; Johnson, Carrie; Marzolf, Bruz; Aderem, Alan; Shmulevich, Ilya; Bolouri, Hamid

    2008-01-01

    Background: As part of a National Institute of Allergy and Infectious Diseases funded collaborative project, we have performed over 150 microarray experiments measuring the response of C57/BL6 mouse bone marrow macrophages to toll-like receptor stimuli. These microarray expression profiles are available freely from our project web site. Here, we report the development of a database of computationally predicted transcription factor binding sites and related genomic features for a set of over 2000 murine immune genes of interest. Our database, which includes microarray co-expression clusters and a host of web-based query, analysis and visualization facilities, is available freely via the internet. It provides a broad resource to the research community, and a stepping stone towards the delineation of the network of transcriptional regulatory interactions underlying the integrated response of macrophages to pathogens. Description: We constructed a database indexed on genes and annotations of the immediate surrounding genomic regions. To facilitate both gene-specific and systems biology oriented research, our database provides the means to analyze individual genes or an entire genomic locus. Although our focus to date has been on mammalian toll-like receptor signaling pathways, our database structure is not limited to this subject, and is intended to be broadly applicable to immunology. By focusing on selected immune-active genes, we were able to perform computationally intensive expression and sequence analyses that would currently be prohibitive if applied to the entire genome. Using six complementary computational algorithms and methodologies, we identified transcription factor binding sites based on the Position Weight Matrices available in TRANSFAC. For one example transcription factor (ATF3) for which experimental data is available, over 50% of our predicted binding sites coincide with genome-wide chromatin immunoprecipitation (ChIP-chip) results. Our database can

  13. Comparability of Essay Question Variants

    ERIC Educational Resources Information Center

    Bridgeman, Brent; Trapani, Catherine; Bivens-Tatum, Jennifer

    2011-01-01

    Writing task variants can increase test security in high-stakes essay assessments by substantially increasing the pool of available writing stimuli and by making the specific writing task less predictable. A given prompt (parent) may be used as the basis for one or more different variants. Six variant types based on argument essay prompts from a…

  14. Nurses' hospital orientation and future research challenges: an integrative review.

    PubMed

    Peltokoski, J; Vehviläinen-Julkunen, K; Miettinen, M

    2016-03-01

    This study aimed to describe the research on registered nurses' orientation processes in specialized hospital settings in order to illustrate directions for future research. The complex healthcare environment and the impact of nursing shortage and turnover make the hospital orientation process imperative. There is a growing recognition regarding research interests to meet the needs for evidence-based, effective and economically sound hospital orientation strategies. An integrative literature review was performed on publications from the period 2000 to 2013 included in the CINAHL and PubMed databases. English-language studies were included. Themes guiding the analysis were definition of the hospital orientation process, research topics, data collection and instruments and research evidence. Narrative synthesis was used. Eleven papers met the inclusion criteria. The conceptualization of orientation process reflected the complexity of the phenomenon. Less attention has been paid to designs to establish correlations or relationships between selected variables and hospital orientation process. The outcomes of hospital orientation programmes were limited primarily to retention and job satisfaction. The research evidence therefore cannot be evaluated as strong. The lack of an evidence-based approach makes it difficult to develop a comprehensive orientation process. Further research should explore interventions that will enhance the quality of hospital orientation practices to improve nurses' retention and job satisfaction. To provide a comprehensive hospital orientation process, hospital administrators have to put in place human resource development strategies along with practice implications and research efforts. Comprehensive hospital orientation benefits and outcomes should be visible to policy makers. © 2016 International Council of Nurses.

  15. Variants of beta-glucosidase

    DOEpatents

    Fidantsef, Ana; Lamsa, Michael; Gorre-Clancy, Brian

    2015-07-14

    The present invention relates to variants of a parent beta-glucosidase, comprising a substitution at one or more positions corresponding to positions 142, 183, 266, and 703 of amino acids 1 to 842 of SEQ ID NO: 2 or corresponding to positions 142, 183, 266, and 705 of amino acids 1 to 844 of SEQ ID NO: 70, wherein the variant has beta-glucosidase activity. The present invention also relates to nucleotide sequences encoding the variant beta-glucosidases and to nucleic acid constructs, vectors, and host cells comprising the nucleotide sequences.

  16. Variants of beta-glucosidases

    DOEpatents

    Fidantsef, Ana; Lamsa, Michael; Gorre-Clancy, Brian

    2014-10-07

    The present invention relates to variants of a parent beta-glucosidase, comprising a substitution at one or more positions corresponding to positions 142, 183, 266, and 703 of amino acids 1 to 842 of SEQ ID NO: 2 or corresponding to positions 142, 183, 266, and 705 of amino acids 1 to 844 of SEQ ID NO: 70, wherein the variant has beta-glucosidase activity. The present invention also relates to nucleotide sequences encoding the variant beta-glucosidases and to nucleic acid constructs, vectors, and host cells comprising the nucleotide sequences.

  17. Variants of beta-glucosidase

    DOEpatents

    Fidantsef, Ana [Davis, CA]; Lamsa, Michael [Davis, CA]; Gorre-Clancy, Brian [Elk Grove, CA]

    2009-12-29

    The present invention relates to variants of a parent beta-glucosidase, comprising a substitution at one or more positions corresponding to positions 142, 183, 266, and 703 of amino acids 1 to 842 of SEQ ID NO: 2 or corresponding to positions 142, 183, 266, and 705 of amino acids 1 to 844 of SEQ ID NO: 70, wherein the variant has beta-glucosidase activity. The present invention also relates to nucleotide sequences encoding the variant beta-glucosidases and to nucleic acid constructs, vectors, and host cells comprising the nucleotide sequences.

  18. Mapping genetic variations to three-dimensional protein structures to enhance variant interpretation: a proposed framework.

    PubMed

    Glusman, Gustavo; Rose, Peter W; Prlić, Andreas; Dougherty, Jennifer; Duarte, José M; Hoffman, Andrew S; Barton, Geoffrey J; Bendixen, Emøke; Bergquist, Timothy; Bock, Christian; Brunk, Elizabeth; Buljan, Marija; Burley, Stephen K; Cai, Binghuang; Carter, Hannah; Gao, JianJiong; Godzik, Adam; Heuer, Michael; Hicks, Michael; Hrabe, Thomas; Karchin, Rachel; Leman, Julia Koehler; Lane, Lydie; Masica, David L; Mooney, Sean D; Moult, John; Omenn, Gilbert S; Pearl, Frances; Pejaver, Vikas; Reynolds, Sheila M; Rokem, Ariel; Schwede, Torsten; Song, Sicheng; Tilgner, Hagen; Valasatava, Yana; Zhang, Yang; Deutsch, Eric W

    2017-12-18

    The translation of personal genomics to precision medicine depends on the accurate interpretation of the multitude of genetic variants observed for each individual. However, even when genetic variants are predicted to modify a protein, their functional implications may be unclear. Many diseases are caused by genetic variants affecting important protein features, such as enzyme active sites or interaction interfaces. The scientific community has catalogued millions of genetic variants in genomic databases and thousands of protein structures in the Protein Data Bank. Mapping mutations onto three-dimensional (3D) structures enables atomic-level analyses of protein positions that may be important for the stability or formation of interactions; these may explain the effect of mutations and in some cases even open a path for targeted drug development. To accelerate progress in the integration of these data types, we held a two-day Gene Variation to 3D (GVto3D) workshop to report on the latest advances and to discuss unmet needs. The overarching goal of the workshop was to address the question: what can be done together as a community to advance the integration of genetic variants and 3D protein structures that could not be done by a single investigator or laboratory? Here we describe the workshop outcomes, review the state of the field, and propose the development of a framework with which to promote progress in this arena. The framework will include a set of standard formats, common ontologies, a common application programming interface to enable interoperation of the resources, and a Tool Registry to make it easy to find and apply the tools to specific analysis problems. Interoperability will enable integration of diverse data sources and tools and collaborative development of variant effect prediction methods.

  19. Genetic risk variants for metabolic traits in Arab populations

    PubMed Central

    Hebbar, Prashantha; Elkum, Naser; Alkayal, Fadi; John, Sumi Elsa; Thanaraj, Thangavel Alphonse; Alsmadi, Osama

    2017-01-01

    Despite a high prevalence of metabolic trait related diseases in the Arabian Peninsula, there is a lack of convincingly identified genetic determinants for metabolic traits in this population. Arab populations are underrepresented in global genome-wide association studies. We genotyped 1965 unrelated Arab individuals from Kuwait using Cardio-MetaboChip, and tested SNP associations with 13 metabolic traits. Models based on a recessive mode of inheritance identified Chr15:40531386-rs12440118/ZNF106/W->R as a risk variant associated with glycated-hemoglobin at close to ‘genome-wide significant’ p-value and five other risk variants ‘nominally’ associated (p-value ≤ 5.45E-07) with fasting plasma glucose (rs7144734/[OTX2-AS1,RPL3P3]) and triglyceride (rs17501809/PLGRKT; rs11143005/LOC105376072; rs900543/[THSD4,NR2E3]; and Chr12:101494770/IGF1). Furthermore, we identified 33 associations (30 SNPs with 12 traits) with ‘suggestive’ evidence of association (p-value < 1.0E-05); 20 of these operate under a recessive mode of inheritance. Two of these ‘suggestive’ associations (rs1800775-CETP/HDL; and rs9326246-BUD13/TGL) showed evidence at genome-wide significance in previous studies on Euro-centric populations. Involvement of many of the identified loci in mediating metabolic traits was supported by evidence from the literature. The identified loci participate in critical metabolic pathways (such as Ceramide signaling, and Mitogen-Activated Protein Kinase/Extracellular Signal Regulated Kinase signaling). Data from the Genotype-Tissue Expression database affirmed that 7 of the identified variants differentially regulate the up/downstream genes that mediate metabolic traits. PMID:28106113

  20. Retrieving high-resolution images over the Internet from an anatomical image database

    NASA Astrophysics Data System (ADS)

    Strupp-Adams, Annette; Henderson, Earl

    1999-12-01

    The Visible Human Data set is an important contribution to the national collection of anatomical images. To enhance the availability of these images, the National Library of Medicine has supported the design and development of a prototype object-oriented image database which imports, stores, and distributes high resolution anatomical images in both pixel and voxel formats. One of the key database modules is its client-server Internet interface. This Web interface provides a query engine with retrieval access to high-resolution anatomical images that range in size from 100KB for browser viewable rendered images, to 1GB for anatomical structures in voxel file formats. The Web query and retrieval client-server system is composed of applet GUIs, servlets, and RMI application modules which communicate with each other to allow users to query for specific anatomical structures, and retrieve image data as well as associated anatomical images from the database. Selected images can be downloaded individually as single files via HTTP or downloaded in batch-mode over the Internet to the user's machine through an applet that uses Netscape's Object Signing mechanism. The image database uses ObjectDesign's object-oriented DBMS, ObjectStore, which has a Java interface. The query and retrieval system has been tested with a Java-CDE window system, and on the x86 architecture using Windows NT 4.0. This paper describes the Java applet client search engine that queries the database; the Java client module that enables users to view anatomical images online; the Java application server interface to the database which organizes data returned to the user; and its distribution engine that allows users to download image files individually and/or in batch-mode.

  1. Object-oriented analysis and design of an ECG storage and retrieval system integrated with an HIS.

    PubMed

    Wang, C; Ohe, K; Sakurai, T; Nagase, T; Kaihara, S

    1996-03-01

    For a hospital information system, object-oriented methodology plays an increasingly important role, especially for the management of digitized data, e.g., the electrocardiogram, electroencephalogram, electromyogram, spirogram, X-ray, CT and histopathological images, which are not yet computerized in most hospitals. As a first step in an object-oriented approach to hospital information management and storing medical data in an object-oriented database, we connected electrocardiographs to a hospital network and established the integration of ECG storage and retrieval systems with a hospital information system. In this paper, the object-oriented analysis and design of the ECG storage and retrieval systems is reported.

  2. Missing data in substance abuse research? Researchers’ reporting practices of sexual orientation and gender identity

    PubMed Central

    Bacca, Cristina L.; Cochran, Bryan N.

    2014-01-01

    Background: Lesbian, gay, bisexual, and transgender individuals are at higher risk for substance use and substance use disorders than heterosexual individuals and are more likely to seek substance use treatment, yet sexual orientation and gender identity are frequently not reported in the research literature. The purpose of this study was to identify if sexual orientation and gender identity are being reported in the recent substance use literature, and if this has changed over time. Method: The PsycINFO and PubMed databases were searched for articles released in 2007 and 2012 using the term “substance abuse” and 200 articles were randomly selected from each time period and database. Articles were coded for the presence or absence of sexual orientation and gender identity information. Results: Participants’ sexual orientation was reported in 3.0% and 4.9% of the 2007 and 2.3% and 6.5% of the 2012 sample, in PsycINFO and PubMed sample articles, respectively, while non-binary gender identity was reported in 0% and 1.0% of the 2007 sample and 2.3% and 1.9% of the 2012 PsycINFO and PubMed sample articles. There were no differences in rates of reporting over time. Conclusions: Sexual orientation and gender identity are rarely reported in the substance abuse literature, and there has not been a change in reporting practices between 2007 and 2012. Recommendations for future investigators in reporting sexual orientation and gender identity are included. PMID:25496705

  3. Missing data in substance abuse research? Researchers' reporting practices of sexual orientation and gender identity.

    PubMed

    Flentje, Annesa; Bacca, Cristina L; Cochran, Bryan N

    2015-02-01

    Lesbian, gay, bisexual, and transgender individuals are at higher risk for substance use and substance use disorders than heterosexual individuals and are more likely to seek substance use treatment, yet sexual orientation and gender identity are frequently not reported in the research literature. The purpose of this study was to identify if sexual orientation and gender identity are being reported in the recent substance use literature, and if this has changed over time. The PsycINFO and PubMed databases were searched for articles released in 2007 and 2012 using the term "substance abuse" and 200 articles were randomly selected from each time period and database. Articles were coded for the presence or absence of sexual orientation and gender identity information. Participants' sexual orientation was reported in 3.0% and 4.9% of the 2007 and 2.3% and 6.5% of the 2012 sample, in PsycINFO and PubMed sample articles, respectively, while non-binary gender identity was reported in 0% and 1.0% of the 2007 sample and 2.3% and 1.9% of the 2012 PsycINFO and PubMed sample articles. There were no differences in rates of reporting over time. Sexual orientation and gender identity are rarely reported in the substance abuse literature, and there has not been a change in reporting practices between 2007 and 2012. Recommendations for future investigators in reporting sexual orientation and gender identity are included. Copyright © 2014 Elsevier Ireland Ltd. All rights reserved.

  4. LymPHOS 2.0: an update of a phosphosite database of primary human T cells

    PubMed Central

    Nguyen, Tien Dung; Vidal-Cortes, Oriol; Gallardo, Oscar; Abian, Joaquin; Carrascal, Montserrat

    2015-01-01

    LymPHOS is a web-oriented database containing peptide and protein sequences and spectrometric information on the phosphoproteome of primary human T-Lymphocytes. Current release 2.0 contains 15 566 phosphorylation sites from 8273 unique phosphopeptides and 4937 proteins, which correspond to a 45-fold increase over the original database description. It now includes quantitative data on phosphorylation changes after time-dependent treatment with activators of the TCR-mediated signal transduction pathway. Sequence data quality has also been improved with the use of multiple search engines for database searching. LymPHOS can be publicly accessed at http://www.lymphos.org. Database URL: http://www.lymphos.org. PMID:26708986

  5. Orientation and mobility training for partially-sighted older adults using an identification cane: a systematic review

    PubMed Central

    Ballemans, Judith; Kempen, Gertrudis IJM; Zijlstra, GA Rixt

    2011-01-01

    Objective: This study aimed to provide an overview of the development, content, feasibility, and effectiveness of existing orientation and mobility training programmes in the use of the identification cane. Data sources: A systematic bibliographic database search in PubMed, PsychInfo, ERIC, CINAHL and the Cochrane Library was performed, in combination with the expert consultation (n = 42; orientation and mobility experts), and hand-searching of reference lists. Review methods: Selection criteria included a description of the development, the content, the feasibility, or the effectiveness of orientation and mobility training in the use of the identification cane. Two reviewers independently agreed on eligibility and methodological quality. A narrative/qualitative data analysis method was applied to extract data from obtained documents. Results: The sensitive database search and hand-searching of reference lists revealed 248 potentially relevant abstracts. None met the eligibility criteria. Expert consultation resulted in the inclusion of six documents in which the information presented on the orientation and mobility training in the use of the identification cane was incomplete and of low methodological quality. Conclusion: Our review of the literature showed a lack of well-described protocols and studies on orientation and mobility training in identification cane use. PMID:21795405

  6. Orientation Modeling for Amateur Cameras by Matching Image Line Features and Building Vector Data

    NASA Astrophysics Data System (ADS)

    Hung, C. H.; Chang, W. C.; Chen, L. C.

    2016-06-01

    With the popularity of geospatial applications, database updating is becoming important because of environmental changes over time. Imagery provides a low-cost and efficient way to update the database. Three-dimensional objects can be measured by space intersection using conjugate image points and the orientation parameters of cameras. However, precise orientation parameters are not always available for light amateur cameras because precision GPS and IMU units are costly and heavy. To automate data updating, the correspondence between object vector data and the image may be built to improve the accuracy of direct georeferencing. This study contains four major parts: (1) back-projection of object vector data, (2) extraction of image feature lines, (3) object-image feature line matching, and (4) line-based orientation modeling. In order to construct the correspondence of features between an image and a building model, the building vector features were back-projected onto the image using the initial camera orientation from GPS and IMU. Image line features were extracted from the imagery. Afterwards, the matching procedure was done by assessing the similarity between the extracted image features and the back-projected ones. Then, the fourth part utilized line features in orientation modeling. The line-based orientation modeling was performed by the integration of line parametric equations into collinearity condition equations. The experimental data included images with 0.06 m resolution acquired by a Canon EOS 5D Mark II camera on a Microdrones MD4-1000 UAV. Experimental results indicate that 2.1 pixel accuracy may be reached, which is equivalent to 0.12 m in the object space.
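
    The back-projection step described in the abstract above rests on the collinearity condition, which can be sketched for a single object point as follows. This is a minimal illustration under assumed conventions (rotation matrix mapping object-space offsets into the camera frame, image coordinates in the same unit as the focal length), not the authors' line-based model. The function and parameter names are hypothetical.

```python
def back_project(point, camera_center, rotation, focal_length,
                 principal_point=(0.0, 0.0)):
    """Project an object-space point into image coordinates.

    Uses the standard collinearity equations:
        x = x0 - f * (r1 . d) / (r3 . d)
        y = y0 - f * (r2 . d) / (r3 . d)
    where d is the offset from the camera center to the point and
    r1, r2, r3 are the rows of the rotation matrix (an assumed convention).
    """
    offset = [p - c for p, c in zip(point, camera_center)]
    u = sum(r * d for r, d in zip(rotation[0], offset))
    v = sum(r * d for r, d in zip(rotation[1], offset))
    w = sum(r * d for r, d in zip(rotation[2], offset))
    if w == 0:
        raise ValueError("point lies in the plane of the projection center")
    x0, y0 = principal_point
    return (x0 - focal_length * u / w, y0 - focal_length * v / w)
```

    For a nadir-looking camera with an identity rotation, 100 m above the ground and with a 0.05 m focal length, the ground point (10, 5, 0) maps to image coordinates (0.005, 0.0025) in metres on the sensor. Back-projecting both endpoints of a building edge in this way yields the projected line that would then be compared with extracted image lines.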

  7. Low Frequency Variants, Collapsed Based on Biological Knowledge, Uncover Complexity of Population Stratification in 1000 Genomes Project Data

    PubMed Central

    Moore, Carrie B.; Wallace, John R.; Wolfe, Daniel J.; Frase, Alex T.; Pendergrass, Sarah A.; Weiss, Kenneth M.; Ritchie, Marylyn D.

    2013-01-01

    Analyses investigating low frequency variants have the potential for explaining additional genetic heritability of many complex human traits. However, the natural frequencies of rare variation between human populations strongly confound genetic analyses. We have applied a novel collapsing method to identify biological features with low frequency variant burden differences in thirteen populations sequenced by the 1000 Genomes Project. Our flexible collapsing tool utilizes expert biological knowledge from multiple publicly available database sources to direct feature selection. Variants were collapsed according to genetically driven features, such as evolutionarily conserved regions, regulatory regions, genes, and pathways. We have conducted an extensive comparison of low frequency variant burden differences (MAF<0.03) between populations from 1000 Genomes Project Phase I data. We found that on average 26.87% of gene bins, 35.47% of intergenic bins, 42.85% of pathway bins, 14.86% of ORegAnno regulatory bins, and 5.97% of evolutionarily conserved regions show statistically significant differences in low frequency variant burden across populations from the 1000 Genomes Project. The proportion of bins with significant differences in low frequency burden depends on the ancestral similarity of the two populations compared and types of features tested. Even closely related populations had notable differences in low frequency burden, but fewer differences than populations from different continents. Furthermore, conserved or functionally relevant regions had fewer significant differences in low frequency burden than regions under less evolutionary constraint. This degree of low frequency variant differentiation across diverse populations and feature elements highlights the critical importance of considering population stratification in the new era of DNA sequencing and low frequency variant genomic analyses. PMID:24385916
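
    The collapsing of low frequency variants into feature bins can be sketched in a few lines. This is a simplified illustration of the general idea, not the authors' tool. The tuple layout and function name are hypothetical, and excluding variants with a frequency of zero is an assumption made here; only the MAF<0.03 cutoff comes from the abstract.

```python
from collections import defaultdict

# Low-frequency cutoff stated in the abstract (MAF < 0.03).
MAF_THRESHOLD = 0.03


def collapse_low_frequency(variants, threshold=MAF_THRESHOLD):
    """Group low-frequency variants by the feature (bin) they fall in.

    variants: iterable of (variant_id, feature, maf) tuples.
    Returns a dict mapping feature -> list of variant ids with 0 < maf < threshold.
    Variants with maf == 0 are skipped (an assumption: not observed in this population).
    """
    bins = defaultdict(list)
    for variant_id, feature, maf in variants:
        if 0 < maf < threshold:
            bins[feature].append(variant_id)
    return dict(bins)
```

    The per-bin counts produced this way for two populations could then be compared with a statistical test of burden difference; the choice of test is outside this sketch.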

  8. Cellobiohydrolase variants and polynucleotides encoding same

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Wogulis, Mark

    The present invention relates to variants of a parent cellobiohydrolase II. The present invention also relates to polynucleotides encoding the variants; nucleic acid constructs, vectors, and host cells comprising the polynucleotides; and methods of using the variants.

  9. Characterization of form variants of Xenorhabdus luminescens.

    PubMed Central

    Gerritsen, L J; de Raay, G; Smits, P H

    1992-01-01

    From Xenorhabdus luminescens XE-87.3 four variants were isolated. One, which produced a red pigment and antibiotics, was luminescent, and could take up dye from culture media, was considered the primary form (XE-red). A pink-pigmented variant (XE-pink) differed from the primary form only in pigmentation and uptake of dye. Of the two other variants, one produced a yellow pigment and fewer antibiotics (XE-yellow), while the other did not produce a pigment or antibiotics (XE-white). Both were less luminescent, did not take up dye, and had small cell and colony sizes. These two variants were very unstable and shifted to the primary form after 3 to 5 days. It was not possible to separate the primary form and the white variant completely; subcultures of one colony always contained a few colonies of the other variant. The white variant was also found in several other X. luminescens strains. DNA fingerprints showed that all four variants are genetically identical and are therefore derivatives of the same parent. Protein patterns revealed a few differences among the four variants. None of the variants could be considered the secondary form. The pathogenicity of the variants decreased in the following order: XE-red, XE-pink, XE-yellow, and XE-white. The mechanism and function of this variability are discussed. PMID:1622273

  10. X-Linked and Autosomal Recessive Alport Syndrome: Pathogenic Variant Features and Further Genotype-Phenotype Correlations

    PubMed Central

    Savige, Judith; Storey, Helen; Il Cheong, Hae; Gyung Kang, Hee; Park, Eujin; Hilbert, Pascale; Persikov, Anton; Torres-Fernandez, Carmen; Ars, Elisabet; Torra, Roser; Hertz, Jens Michael; Thomassen, Mads; Shagam, Lev; Wang, Dongmao; Wang, Yanyan; Flinter, Frances; Nagel, Mato

    2016-01-01

    Alport syndrome results from mutations in the COL4A5 (X-linked) or COL4A3/COL4A4 (recessive) genes. This study examined 754 previously unpublished variants in these genes from individuals referred for genetic testing in 12 accredited diagnostic laboratories worldwide, in addition to all published COL4A5, COL4A3 and COL4A4 variants in the LOVD databases. It also determined genotype-phenotype correlations for variants where clinical data were available. Individuals were referred for genetic testing where Alport syndrome was suspected clinically or on biopsy (renal failure, hearing loss, retinopathy, lamellated glomerular basement membrane), variant pathogenicity was assessed using currently-accepted criteria, and variants were examined for gene location, and age at renal failure onset. Results were compared using Fisher’s exact test (DNA Stata). Altogether 754 new DNA variants were identified, an increase of 25%, predominantly in people of European background. Of the 1168 COL4A5 variants, 504 (43%) were missense mutations, 273 (23%) splicing variants, 73 (6%) nonsense mutations, 169 (14%) short deletions and 76 (7%) complex or large deletions. Only 135 of the 432 Gly residues in the collagenous sequence were substituted (31%), which means that fewer than 10% of all possible variants have been identified. Both missense and nonsense mutations in COL4A5 were not randomly distributed but more common at the 70 CpG sequences (p<10^-41 and p<0.001 respectively). Gly>Ala substitutions were underrepresented in all three genes (p<0.0001) probably because of an association with a milder phenotype. The average age at end-stage renal failure was the same for all mutations in COL4A5 (24.4 ± 7.8 years), COL4A3 (23.3 ± 9.3) and COL4A4 (25.4 ± 10.3) (COL4A5 and COL4A3, p = 0.45; COL4A5 and COL4A4, p = 0.55; COL4A3 and COL4A4, p = 0.41). For COL4A5, renal failure occurred sooner with non-missense than missense variants (p<0.01). For the COL4A3 and COL4A4 genes, age at renal

  11. X-Linked and Autosomal Recessive Alport Syndrome: Pathogenic Variant Features and Further Genotype-Phenotype Correlations.

    PubMed

    Savige, Judith; Storey, Helen; Il Cheong, Hae; Gyung Kang, Hee; Park, Eujin; Hilbert, Pascale; Persikov, Anton; Torres-Fernandez, Carmen; Ars, Elisabet; Torra, Roser; Hertz, Jens Michael; Thomassen, Mads; Shagam, Lev; Wang, Dongmao; Wang, Yanyan; Flinter, Frances; Nagel, Mato

    2016-01-01

    Alport syndrome results from mutations in the COL4A5 (X-linked) or COL4A3/COL4A4 (recessive) genes. This study examined 754 previously unpublished variants in these genes from individuals referred for genetic testing in 12 accredited diagnostic laboratories worldwide, in addition to all published COL4A5, COL4A3 and COL4A4 variants in the LOVD databases. It also determined genotype-phenotype correlations for variants where clinical data were available. Individuals were referred for genetic testing where Alport syndrome was suspected clinically or on biopsy (renal failure, hearing loss, retinopathy, lamellated glomerular basement membrane). Variant pathogenicity was assessed using currently accepted criteria, and variants were examined for gene location and age at renal failure onset. Results were compared using Fisher's exact test (DNA Stata). Altogether 754 new DNA variants were identified, an increase of 25%, predominantly in people of European background. Of the 1168 COL4A5 variants, 504 (43%) were missense mutations, 273 (23%) splicing variants, 73 (6%) nonsense mutations, 169 (14%) short deletions and 76 (7%) complex or large deletions. Only 135 of the 432 Gly residues in the collagenous sequence were substituted (31%), which means that fewer than 10% of all possible variants have been identified. Both missense and nonsense mutations in COL4A5 were not randomly distributed but more common at the 70 CpG sequences (p<10^-41 and p<0.001 respectively). Gly>Ala substitutions were underrepresented in all three genes (p<0.0001) probably because of an association with a milder phenotype. The average age at end-stage renal failure was the same for all mutations in COL4A5 (24.4 ± 7.8 years), COL4A3 (23.3 ± 9.3) and COL4A4 (25.4 ± 10.3) (COL4A5 and COL4A3, p = 0.45; COL4A5 and COL4A4, p = 0.55; COL4A3 and COL4A4, p = 0.41). For COL4A5, renal failure occurred sooner with non-missense than missense variants (p<0.01). For the COL4A3 and COL4A4 genes, age at renal failure

  12. An iterated local search algorithm for the team orienteering problem with variable profits

    NASA Astrophysics Data System (ADS)

    Gunawan, Aldy; Ng, Kien Ming; Kendall, Graham; Lai, Junhan

    2018-07-01

    The orienteering problem (OP) is a routing problem that has numerous applications in various domains such as logistics and tourism. The objective is to determine a subset of vertices to visit for a vehicle so that the total collected score is maximized and a given time budget is not exceeded. The extensive application of the OP has led to many different variants, including the team orienteering problem (TOP) and the team orienteering problem with time windows. The TOP extends the OP by considering multiple vehicles. In this article, the team orienteering problem with variable profits (TOPVP) is studied. The main characteristic of the TOPVP is that the amount of score collected from a visited vertex depends on the duration of stay on that vertex. A mathematical programming model for the TOPVP is first presented and an algorithm based on iterated local search (ILS) that is able to solve modified benchmark instances is then proposed. It is concluded that ILS produces solutions which are comparable to those obtained by the commercial solver CPLEX for smaller instances. For the larger instances, ILS obtains good-quality solutions that have significantly better objective value than those found by CPLEX under reasonable computational times.
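
    The objective described in this abstract can be illustrated with a minimal sketch. It is not the authors' model or their ILS algorithm. It assumes, purely for illustration, Euclidean travel times, a route that starts and ends at a depot, and a score that grows linearly with the duration of stay (a hypothetical `profit_rate` per vertex); none of these details come from the abstract. The code only evaluates whether a set of routes fits the time budget and what score it collects.

```python
import math

def travel_time(a, b):
    """Euclidean distance used as travel time (an assumption for illustration)."""
    return math.hypot(a[0] - b[0], a[1] - b[1])

def route_value(depot, stops, budget):
    """Score of one vehicle's route, or None if it exceeds the time budget.

    stops is a list of (coord, stay_duration, profit_rate) tuples.
    The collected score at a stop is profit_rate * stay_duration, a
    hypothetical linear stand-in for the duration-dependent profit.
    """
    elapsed = 0.0
    score = 0.0
    position = depot
    for coord, stay, rate in stops:
        elapsed += travel_time(position, coord) + stay
        score += rate * stay
        position = coord
    elapsed += travel_time(position, depot)  # assumed return to the depot
    return score if elapsed <= budget else None

def team_value(depot, routes, budget):
    """Total score over all vehicles; None if any single route is infeasible."""
    values = [route_value(depot, stops, budget) for stops in routes]
    if any(v is None for v in values):
        return None
    return sum(values)
```

    A search method such as ILS would repeatedly perturb the routes and stay durations and keep the candidate with the highest `team_value`; that search loop is deliberately left out here.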

  13. Design and Implementation of an Interface Editor for the Amadeus Multi-Relational Database Front-end System

    DTIC Science & Technology

    1993-03-25

    application of Object-Oriented Programming (OOP) and Human-Computer Interface (HCI) design principles. Knowledge gained from each topic has been incorporated...through the application of Object-Oriented Programming (OOP) and Human-Computer Interface (HCI) design principles. Knowledge gained from each topic has...programming and Human-Computer Interface (HCI) design. Knowledge gained from each is applied to the design of a Form-based interface for database data

  14. The 24th annual Nucleic Acids Research database issue: a look back and upcoming changes

    PubMed Central

    Rigden, Daniel J

    2017-01-01

    This year's Database Issue of Nucleic Acids Research contains 152 papers that include descriptions of 54 new databases and update papers on 98 databases, of which 16 have not been previously featured in NAR. As always, these databases cover a broad range of molecular biology subjects, including genome structure, gene expression and its regulation, proteins, protein domains, and protein–protein interactions. Following the recent trend, an increasing number of new and established databases deal with the issues of human health, from cancer-causing mutations to drugs and drug targets. In accordance with this trend, three recently compiled databases that have been selected by NAR reviewers and editors as ‘breakthrough’ contributions, denovo-db, the Monarch Initiative, and Open Targets, cover human de novo gene variants, disease-related phenotypes in model organisms, and a bioinformatics platform for therapeutic target identification and validation, respectively. We expect these databases to attract the attention of numerous researchers working in various areas of genetics and genomics. Looking back at the past 12 years, we present here the ‘golden set’ of databases that have consistently served as authoritative, comprehensive, and convenient data resources widely used by the entire community and offer some lessons on what makes a successful database. The Database Issue is freely available online at the https://academic.oup.com/nar web site. An updated version of the NAR Molecular Biology Database Collection is available at http://www.oxfordjournals.org/nar/database/a/. PMID:28053160

  15. Crystallographic features of the martensitic transformation and their impact on variant organization in the intermetallic compound Ni50Mn38Sb12 studied by SEM/EBSD.

    PubMed

    Zhang, Chunyang; Zhang, Yudong; Esling, Claude; Zhao, Xiang; Zuo, Liang

    2017-09-01

    The mechanical and magnetic properties of Ni-Mn-Sb intermetallic compounds are closely related to the martensitic transformation and martensite variant organization. However, studies of these issues are very limited. Thus, a thorough crystallographic investigation of the martensitic transformation orientation relationship (OR), the transformation deformation and their impact on the variant organization of an Ni50Mn38Sb12 alloy using scanning electron microscopy/electron backscatter diffraction (SEM/EBSD) was conducted in this work. It is shown that the martensite variants are hierarchically organized into plates, each possessing four distinct twin-related variants, and the plates into plate colonies, each containing four distinct plates delimited by compatible and incompatible plate interfaces. Such a characteristic organization is produced by the martensitic transformation. It is revealed that the transformation obeys the Pitsch relation ({0[Formula: see text]} A // {2[Formula: see text]} M and 〈0[Formula: see text]1〉 A // 〈[Formula: see text]2〉 M ; the subscripts A and M refer to austenite and martensite, respectively). The type I twinning plane K1 of the intra-plate variants and the compatible plate interface plane correspond to the respective orientation relationship planes {0[Formula: see text]} A and {0[Formula: see text]} A of austenite. The three {0[Formula: see text]} A planes possessed by each pair of compatible plates, one corresponding to the compatible plate interface and the other two to the variants in the two plates, are interrelated by 60° and belong to a single 〈11[Formula: see text]〉 A axis zone. The {0[Formula: see text]} A planes representing the two pairs of compatible plates in each plate colony belong to two 〈11[Formula: see text]〉 A axis zones having one {0[Formula: see text]} A plane in common. This common plane defines the compatible plate interfaces of the two pairs of plates. The transformation strains to form the

  16. Reconciling newborn screening and a novel splice variant in BTD associated with partial biotinidase deficiency: A BabySeq Project case report.

    PubMed

    Murry, Jaclyn B; Machini, Kalotina; Ceyhan-Birsoy, Ozge; Kritzer, Amy; Krier, Joel B; Lebo, Matthew S; Fayer, Shawn; Genetti, Casie A; Vannoy, Grace E; Yu, Timothy W; Agrawal, Pankaj B; Parad, Richard B; Holm, Ingrid A; McGuire, Amy L; Green, Robert C; Beggs, Alan H; Rehm, Heidi L; Project, The BabySeq

    2018-05-04

    Here, we report a newborn female infant from the well-baby cohort of the BabySeq Project who was identified with compound heterozygous BTD gene variants. The two identified variants included a well-established pathogenic variant (c.1612C>T, p.Arg538Cys) that causes profound biotinidase deficiency (BTD) in homozygosity. In addition, a novel splice variant (c.44+1G>A, p.?) was identified in the invariant splice donor region of intron 1, potentially predictive of loss of function. The novel variant was predicted to impact splicing of exon 1; however, given the absence of any reported pathogenic variants in exon 1 and the presence of alternative splicing with exon 1 absent in most tissues in the GTEx database, we assigned an initial classification of uncertain significance. Follow-up medical record review of state mandated newborn screen (NBS) results revealed an initial out-of-range biotinidase activity level. Levels from a repeat NBS sample barely passed cut-off into the normal range. To determine whether the infant was biotinidase deficient, subsequent diagnostic enzyme activity testing was performed, confirming partial BTD, and resulted in a change of management for this patient. This led to reclassification of the novel splice variant based on these results. In conclusion, combining the genetic and NBS results together prompted clinical follow-up that confirmed partial biotinidase deficiency, and informed this novel splice site's reclassification emphasizing the importance of combining iterative genetic and phenotypic evaluations. Cold Spring Harbor Laboratory Press.

  17. Random Plant Viral Variants Attain Temporal Advantages During Systemic Infections and in Turn Resist other Variants of the Same Virus.

    PubMed

    Zhang, Xiao-Feng; Guo, Jiangbo; Zhang, Xiuchun; Meulia, Tea; Paul, Pierce; Madden, Laurence V; Li, Dawei; Qu, Feng

    2015-10-20

    Infection of plants with viruses containing multiple variants frequently leads to dominance by a few random variants in the systemically infected leaves (SLs), for which a plausible explanation is lacking. We show here that SL dominance by a given viral variant is adequately explained by its fortuitous lead in systemic spread, coupled with its resistance to superinfection by other variants. We analyzed the fate of a multi-variant turnip crinkle virus (TCV) population in Arabidopsis and N. benthamiana plants. Both wild-type and RNA silencing-defective plants displayed a similar pattern of random dominance by a few variant genotypes, thus discounting a prominent role for RNA silencing. When introduced to plants sequentially as two subpopulations, a twelve-hour head-start was sufficient for the first set to dominate. Finally, SLs of TCV-infected plants became highly resistant to secondary invasions of another TCV variant. We propose that random distribution of variant foci on inoculated leaves allows different variants to lead systemic movement in different plants. The leading variants then colonize large areas of SLs, and resist the superinfection of lagging variants in the same areas. In conclusion, superinfection resistance is the primary driver of random enrichment of viral variants in systemically infected plants.

  18. Proposal for Implementing Multi-User Database (MUD) Technology in an Academic Library.

    ERIC Educational Resources Information Center

    Filby, A. M. Iliana

    1996-01-01

    Explores the use of MOO (multi-user object oriented) virtual environments in academic libraries to enhance reference services. Highlights include the development of multi-user database (MUD) technology from gaming to non-recreational settings; programming issues; collaborative MOOs; MOOs as distinguished from other types of virtual reality; audio…

  19. Omitted data in randomized controlled trials for anxiety and depression: A systematic review of the inclusion of sexual orientation and gender identity.

    PubMed

    Heck, Nicholas C; Mirabito, Lucas A; LeMaire, Kelly; Livingston, Nicholas A; Flentje, Annesa

    2017-01-01

    The current study examined the frequency with which randomized controlled trials (RCTs) of behavioral and psychological interventions for anxiety and depression include data pertaining to participant sexual orientation and nonbinary gender identities. Using systematic review methodology, the databases PubMed and PsycINFO were searched to identify RCTs published in 2004, 2009, and 2014. Random selections of 400 articles per database per year (2,400 articles in total) were considered for inclusion in the review. Articles meeting inclusion criteria were read and coded by the research team to identify whether the trial reported data pertaining to participant sexual orientation and nonbinary gender identities. Additional trial characteristics were also identified and indexed in our database (e.g., sample size, funding source). Of the 232 articles meeting inclusion criteria, only 1 reported participants' sexual orientation, and zero articles included nonbinary gender identities. A total of 52,769 participants were represented in the trials, 93 of which were conducted in the United States, and 43 acknowledged the National Institutes of Health as a source of funding. Despite known mental health disparities on the basis of sexual orientation and nonbinary gender identification, researchers evaluating interventions for anxiety and depression are not reporting on these important demographic characteristics. Reporting practices must change to ensure that our interventions generalize to lesbian, gay, bisexual, and transgender persons. (PsycINFO Database Record (c) 2017 APA, all rights reserved).

  20. Multi-species Identification of Polymorphic Peptide Variants via Propagation in Spectral Networks

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Na, Seungjin; Payne, Samuel H.; Bandeira, Nuno

    The spectral networks approach enables the detection of pairs of spectra from related peptides and thus allows for the propagation of annotations from identified peptides to unidentified spectra. Beyond allowing for unbiased discovery of unexpected post-translational modifications, spectral networks are also applicable to multi-species comparative proteomics or metaproteomics to identify numerous orthologous versions of a protein. We present algorithmic and statistical advances in spectral networks that have made it possible to rigorously assess the statistical significance of spectral pairs and accurately estimate the error rate of identifications via propagation. In the analysis of three related Cyanothece species, a model organism for biohydrogen production, spectral networks identified peptides with highly divergent sequences with up to dozens of variants per peptide, including many novel peptides in species that lack a sequenced genome. Furthermore, spectral networks strongly suggested the presence of novel peptides even in genomically characterized species (i.e. missing from databases) in that a significant portion of unidentified multi-species networks included at least two polymorphic peptide variants.

  1. Growth mechanism of extension twin variants during annealing of pure magnesium: An ‘ex situ’ electron backscattered diffraction investigation

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Sabat, R.K.

    Pure magnesium was subjected to plastic deformation through CSM (continuous stiffness measurement) indentation followed by annealing at 200 °C for 30 min. After annealing, no nucleation of new grains was observed either at the twin–twin intersections or at the multiple twin variants of a grain. Significant growth of off-basal twin orientation compared to basal twin orientation was observed in the sample after annealing and is attributed to the partial coherent nature of twin boundary in the latter case. Further, growth of twins was independent of the strain distribution between parent and twinned grains. - Highlights: • An ‘ex situ’ EBSD of pure Mg during annealing was investigated. • Nucleation of no new grains was observed. • Significant growth of off-basal twin orientation was observed. • Growth of twins may be attributed to the partial coherent nature of twin boundary.

  2. Evaluation of relational and NoSQL database architectures to manage genomic annotations.

    PubMed

    Schulz, Wade L; Nelson, Brent G; Felker, Donn K; Durant, Thomas J S; Torres, Richard

    2016-12-01

    While the adoption of next generation sequencing has rapidly expanded, the informatics infrastructure used to manage the data generated by this technology has not kept pace. Historically, relational databases have provided much of the framework for data storage and retrieval. Newer technologies based on NoSQL architectures may provide significant advantages in storage and query efficiency, thereby reducing the cost of data management. But their relative advantage when applied to biomedical data sets, such as genetic data, has not been characterized. To this end, we compared the storage, indexing, and query efficiency of a common relational database (MySQL), a document-oriented NoSQL database (MongoDB), and a relational database with NoSQL support (PostgreSQL). When used to store genomic annotations from the dbSNP database, we found the NoSQL architectures to outperform traditional, relational models for speed of data storage, indexing, and query retrieval in nearly every operation. These findings strongly support the use of novel database technologies to improve the efficiency of data management within the biological sciences. Copyright © 2016 Elsevier Inc. All rights reserved.

  3. An Extensible "SCHEMA-LESS" Database Framework for Managing High-Throughput Semi-Structured Documents

    NASA Technical Reports Server (NTRS)

    Maluf, David A.; Tran, Peter B.

    2003-01-01

    Object-Relational database management system is an integrated hybrid cooperative approach to combine the best practices of both the relational model utilizing SQL queries and the object-oriented, semantic paradigm for supporting complex data creation. In this paper, a highly scalable, information on demand database framework, called NETMARK, is introduced. NETMARK takes advantage of the Oracle 8i object-relational database using physical addresses data types for very efficient keyword search of records spanning across both context and content. NETMARK was originally developed in early 2000 as a research and development prototype to handle the vast amounts of unstructured and semi-structured documents existing within NASA enterprises. Today, NETMARK is a flexible, high-throughput open database framework for managing, storing, and searching unstructured or semi-structured arbitrary hierarchical models, such as XML and HTML.

  4. An Extensible Schema-less Database Framework for Managing High-throughput Semi-Structured Documents

    NASA Technical Reports Server (NTRS)

    Maluf, David A.; Tran, Peter B.; La, Tracy; Clancy, Daniel (Technical Monitor)

    2002-01-01

    Object-Relational database management system is an integrated hybrid cooperative approach to combine the best practices of both the relational model utilizing SQL queries and the object-oriented, semantic paradigm for supporting complex data creation. In this paper, a highly scalable, information on demand database framework, called NETMARK, is introduced. NETMARK takes advantage of the Oracle 8i object-relational database using physical addresses data types for very efficient keyword searches of records for both context and content. NETMARK was originally developed in early 2000 as a research and development prototype to handle the vast amounts of unstructured and semi-structured documents existing within NASA enterprises. Today, NETMARK is a flexible, high-throughput open database framework for managing, storing, and searching unstructured or semi-structured arbitrary hierarchical models, XML and HTML.

  5. αIIbβ3 variants defined by next-generation sequencing: Predicting variants likely to cause Glanzmann thrombasthenia

    PubMed Central

    Buitrago, Lorena; Rendon, Augusto; Liang, Yupu; Simeoni, Ilenia; Negri, Ana; Filizola, Marta; Ouwehand, Willem H.; Coller, Barry S.; Alessi, Marie-Christine; Ballmaier, Matthias; Bariana, Tadbir; Bellissimo, Daniel; Bertoli, Marta; Bray, Paul; Bury, Loredana; Carrell, Robin; Cattaneo, Marco; Collins, Peter; French, Deborah; Favier, Remi; Freson, Kathleen; Furie, Bruce; Germeshausen, Manuela; Ghevaert, Cedric; Gomez, Keith; Goodeve, Anne; Gresele, Paolo; Guerrero, Jose; Hampshire, Dan J.; Hadinnapola, Charaka; Heemskerk, Johan; Henskens, Yvonne; Hill, Marian; Hogg, Nancy; Johnsen, Jill; Kahr, Walter; Kerr, Ron; Kunishima, Shinji; Laffan, Michael; Natwani, Amit; Neerman-Arbez, Marguerite; Nurden, Paquita; Nurden, Alan; Ormiston, Mark; Othman, Maha; Ouwehand, Willem; Perry, David; Vilk, Shoshana Ravel; Reitsma, Pieter; Rondina, Matthew; Simeoni, Ilenia; Smethurst, Peter; Stephens, Jonathan; Stevenson, William; Szkotak, Artur; Turro, Ernest; Van Geet, Christel; Vries, Minka; Ward, June; Waye, John; Westbury, Sarah; Whiteheart, Sidney; Wilcox, David; Zhang, Bi

    2015-01-01

    Next-generation sequencing is transforming our understanding of human genetic variation but assessing the functional impact of novel variants presents challenges. We analyzed missense variants in the integrin αIIbβ3 receptor subunit genes ITGA2B and ITGB3 identified by whole-exome or -genome sequencing in the ThromboGenomics project, comprising ∼32,000 alleles from 16,108 individuals. We analyzed the results in comparison with 111 missense variants in these genes previously reported as being associated with Glanzmann thrombasthenia (GT), 20 associated with alloimmune thrombocytopenia, and 5 associated with aniso/macrothrombocytopenia. We identified 114 novel missense variants in ITGA2B (affecting ∼11% of the amino acids) and 68 novel missense variants in ITGB3 (affecting ∼9% of the amino acids). Of the variants, 96% had minor allele frequencies (MAF) < 0.1%, indicating their rarity. Based on sequence conservation, MAF, and location on a complete model of αIIbβ3, we selected three novel variants that affect amino acids previously associated with GT for expression in HEK293 cells. αIIb P176H and β3 C547G severely reduced αIIbβ3 expression, whereas αIIb P943A partially reduced αIIbβ3 expression and had no effect on fibrinogen binding. We used receiver operating characteristic curves of combined annotation-dependent depletion, Polyphen 2-HDIV, and sorting intolerant from tolerant to estimate the percentage of novel variants likely to be deleterious. At optimal cut-off values, which had 69–98% sensitivity in detecting GT mutations, between 27% and 71% of the novel αIIb or β3 missense variants were predicted to be deleterious. Our data have implications for understanding the evolutionary pressure on αIIbβ3 and highlight the challenges in predicting the clinical significance of novel missense variants. PMID:25827233

  6. Autosomal-Recessive Hearing Impairment Due to Rare Missense Variants within S1PR2

    PubMed Central

    Santos-Cortez, Regie Lyn P.; Faridi, Rabia; Rehman, Atteeq U.; Lee, Kwanghyuk; Ansar, Muhammad; Wang, Xin; Morell, Robert J.; Isaacson, Rivka; Belyantseva, Inna A.; Dai, Hang; Acharya, Anushree; Qaiser, Tanveer A.; Muhammad, Dost; Ali, Rana Amjad; Shams, Sulaiman; Hassan, Muhammad Jawad; Shahzad, Shaheen; Raza, Syed Irfan; Bashir, Zil-e-Huma; Smith, Joshua D.; Nickerson, Deborah A.; Bamshad, Michael J.; Riazuddin, Sheikh; Ahmad, Wasim; Friedman, Thomas B.; Leal, Suzanne M.

    2016-01-01

    The sphingosine-1-phosphate receptors (S1PRs) are a well-studied class of transmembrane G protein-coupled sphingolipid receptors that mediate multiple cellular processes. However, S1PRs have not been previously reported to be involved in the genetic etiology of human traits. S1PR2 lies within the autosomal-recessive nonsyndromic hearing impairment (ARNSHI) locus DFNB68 on 19p13.2. From exome sequence data we identified two pathogenic S1PR2 variants, c.323G>C (p.Arg108Pro) and c.419A>G (p.Tyr140Cys). Each of these variants co-segregates with congenital profound hearing impairment in consanguineous Pakistani families with maximum LOD scores of 6.4 for family DEM4154 and 3.3 for family PKDF1400. Neither S1PR2 missense variant was reported among ∼120,000 chromosomes in the Exome Aggregation Consortium database, in 76 unrelated Pakistani exomes, or in 720 Pakistani control chromosomes. Both DNA variants affect highly conserved residues of S1PR2 and are predicted to be damaging by multiple bioinformatics tools. Molecular modeling predicts that these variants affect binding of sphingosine-1-phosphate (p.Arg108Pro) and G protein docking (p.Tyr140Cys). In the previously reported S1pr2−/− mice, stria vascularis abnormalities, organ of Corti degeneration, and profound hearing loss were observed. Additionally, hair cell defects were seen in both knockout mice and morphant zebrafish. Family PKDF1400 presents with ARNSHI, which is consistent with the lack of gross malformations in S1pr2−/− mice, whereas family DEM4154 has lower limb malformations in addition to hearing loss. Our findings suggest the possibility of developing therapies against hair cell damage (e.g., from ototoxic drugs) through targeted stimulation of S1PR2. PMID:26805784

  7. Neanderthal and Denisova tooth protein variants in present-day humans

    PubMed Central

    Zanolli, Clément; Hourset, Mathilde; Esclassan, Rémi

    2017-01-01

    Environment parameters, diet and genetic factors interact to shape tooth morphostructure. In the human lineage, archaic and modern hominins show differences in dental traits, including enamel thickness, but variability also exists among living populations. Several polymorphisms, in particular in the non-collagenous extracellular matrix proteins of the tooth hard tissues, like enamelin, are involved in dental structure variation and defects and may be associated with dental disorders or susceptibility to caries. To gain insights into the relationships between tooth protein polymorphisms and dental structural morphology and defects, we searched for non-synonymous polymorphisms in tooth proteins from Neanderthal and Denisova hominins. The objective was to identify archaic-specific missense variants that may explain the dental morphostructural variability between extinct and modern humans, and to explore their putative impact on present-day dental phenotypes. Thirteen non-collagenous extracellular matrix proteins specific to hard dental tissues have been selected, searched in the publicly available sequence databases of Neanderthal and Denisova individuals and compared with modern human genome data. A total of 16 non-synonymous polymorphisms were identified in 6 proteins (ameloblastin, amelotin, cementum protein 1, dentin matrix acidic phosphoprotein 1, enamelin and matrix Gla protein). Most of them are encoded by dentin and enamel genes located on chromosome 4, previously reported to show signs of archaic introgression within Africa. Among the variants shared with modern humans, two are ancestral (common with apes) and one is the derived enamelin major variant, T648I (rs7671281), associated with a thinner enamel and specific to the Homo lineage. All the others are specific to Neanderthals and Denisova, and are found at a very low frequency in modern Africans or East and South Asians, suggesting that they may be related to particular dental traits or disease

  8. Variant Interpretation: Functional Assays to the Rescue.

    PubMed

    Starita, Lea M; Ahituv, Nadav; Dunham, Maitreya J; Kitzman, Jacob O; Roth, Frederick P; Seelig, Georg; Shendure, Jay; Fowler, Douglas M

    2017-09-07

    Classical genetic approaches for interpreting variants, such as case-control or co-segregation studies, require finding many individuals with each variant. Because the overwhelming majority of variants are present in only a few living humans, this strategy has clear limits. Fully realizing the clinical potential of genetics requires that we accurately infer pathogenicity even for rare or private variation. Many computational approaches to predicting variant effects have been developed, but they can identify only a small fraction of pathogenic variants with the high confidence that is required in the clinic. Experimentally measuring a variant's functional consequences can provide clearer guidance, but individual assays performed only after the discovery of the variant are both time and resource intensive. Here, we discuss how multiplex assays of variant effect (MAVEs) can be used to measure the functional consequences of all possible variants in disease-relevant loci for a variety of molecular and cellular phenotypes. The resulting large-scale functional data can be combined with machine learning and clinical knowledge for the development of "lookup tables" of accurate pathogenicity predictions. A coordinated effort to produce, analyze, and disseminate large-scale functional data generated by multiplex assays could be essential to addressing the variant-interpretation crisis. Copyright © 2017 American Society of Human Genetics. Published by Elsevier Inc. All rights reserved.

  9. Predicting helix orientation for coiled-coil dimers

    PubMed Central

    Apgar, James R.; Gutwin, Karl N.; Keating, Amy E.

    2008-01-01

    The alpha-helical coiled coil is a structurally simple protein oligomerization or interaction motif consisting of two or more alpha helices twisted into a supercoiled bundle. Coiled coils can differ in their stoichiometry, helix orientation and axial alignment. Because of the near degeneracy of many of these variants, coiled coils pose a challenge to fold recognition methods for structure prediction. Whereas distinctions between some protein folds can be discriminated on the basis of hydrophobic/polar patterning or secondary structure propensities, the sequence differences that encode important details of coiled-coil structure can be subtle. This is emblematic of a larger problem in the field of protein structure and interaction prediction: that of establishing specificity between closely similar structures. We tested the behavior of different computational models on the problem of recognizing the correct orientation - parallel vs. antiparallel - of pairs of alpha helices that can form a dimeric coiled coil. For each of 131 examples of known structure, we constructed a large number of both parallel and antiparallel structural models and used these to assess the ability of five energy functions to recognize the correct fold. We also developed and tested three sequence-based approaches that make use of varying degrees of implicit structural information. The best structural methods performed similarly to the best sequence methods, correctly categorizing ∼81% of dimers. Steric compatibility with the fold was important for some coiled coils we investigated. For many examples, the correct orientation was determined by smaller energy differences between parallel and antiparallel structures distributed over many residues and energy components. Prediction methods that used structure but incorporated varying approximations and assumptions showed quite different behaviors when used to investigate energetic contributions to orientation preference. Sequence based methods were

  10. Rotation-invariant features for multi-oriented text detection in natural images.

    PubMed

    Yao, Cong; Zhang, Xin; Bai, Xiang; Liu, Wenyu; Ma, Yi; Tu, Zhuowen

    2013-01-01

    Texts in natural scenes carry rich semantic information, which can be used to assist a wide range of applications, such as object recognition, image/video retrieval, mapping/navigation, and human computer interaction. However, most existing systems are designed to detect and recognize horizontal (or near-horizontal) texts. Due to the increasing popularity of mobile-computing devices and applications, detecting texts of varying orientations from natural images under less controlled conditions has become an important but challenging task. In this paper, we propose a new algorithm to detect texts of varying orientations. Our algorithm is based on a two-level classification scheme and two sets of features specially designed for capturing the intrinsic characteristics of texts. To better evaluate the proposed method and compare it with the competing algorithms, we generate a comprehensive dataset with various types of texts in diverse real-world scenes. We also propose a new evaluation protocol, which is more suitable for benchmarking algorithms for detecting texts in varying orientations. Experiments on benchmark datasets demonstrate that our system compares favorably with the state-of-the-art algorithms when handling horizontal texts and achieves significantly enhanced performance on variant texts in complex natural scenes.

  11. RareVariantVis: new tool for visualization of causative variants in rare monogenic disorders using whole genome sequencing data.

    PubMed

    Stokowy, Tomasz; Garbulowski, Mateusz; Fiskerstrand, Torunn; Holdhus, Rita; Labun, Kornel; Sztromwasser, Pawel; Gilissen, Christian; Hoischen, Alexander; Houge, Gunnar; Petersen, Kjell; Jonassen, Inge; Steen, Vidar M

    2016-10-01

    The search for causative genetic variants in rare diseases of presumed monogenic inheritance has been boosted by the implementation of whole exome (WES) and whole genome (WGS) sequencing. In many cases, WGS seems to be superior to WES, but the analysis and visualization of the vast amounts of data is demanding. To address this challenge, we have developed a new tool, RareVariantVis, for analysis of genome sequence data (including non-coding regions) for both germ line and somatic variants. It visualizes variants along their respective chromosomes, providing information about exact chromosomal position, zygosity and frequency, with point-and-click information regarding dbSNP IDs, gene association and variant inheritance. Rare variants as well as de novo variants can be flagged in different colors. We show the performance of the RareVariantVis tool in the Genome in a Bottle WGS data set. Availability: https://www.bioconductor.org/packages/3.3/bioc/html/RareVariantVis.html Contact: tomasz.stokowy@k2.uib.no Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press.

  12. Swine Influenza/Variant Influenza Viruses

    MedlinePlus


  13. A Python package for parsing, validating, mapping and formatting sequence variants using HGVS nomenclature.

    PubMed

    Hart, Reece K; Rico, Rudolph; Hare, Emily; Garcia, John; Westbrook, Jody; Fusaro, Vincent A

    2015-01-15

    Biological sequence variants are commonly represented in scientific literature, clinical reports and databases of variation using the mutation nomenclature guidelines endorsed by the Human Genome Variation Society (HGVS). Despite the widespread use of the standard, no freely available and comprehensive programming libraries are available. Here we report an open-source and easy-to-use Python library that facilitates the parsing, manipulation, formatting and validation of variants according to the HGVS specification. The current implementation focuses on the subset of the HGVS recommendations that precisely describe sequence-level variation relevant to the application of high-throughput sequencing to clinical diagnostics. The package is released under the Apache 2.0 open-source license. Source code, documentation and issue tracking are available at http://bitbucket.org/hgvs/hgvs/. Python packages are available at PyPI (https://pypi.python.org/pypi/hgvs). Supplementary data are available at Bioinformatics online. © The Author 2014. Published by Oxford University Press.
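
    To make the parse, validate and format cycle described above concrete, here is a minimal standard-library sketch. It is not the API of the hgvs package reported in this record; the names SimpleVariant, parse_simple_substitution and format_variant are hypothetical, the pattern covers only simple single-base substitutions, and the accession in the usage note is a placeholder.

```python
import re
from typing import NamedTuple


class SimpleVariant(NamedTuple):
    """Hypothetical container for one single-base substitution."""
    accession: str   # versioned reference sequence, e.g. "NM_000000.0" (placeholder)
    seq_type: str    # one-letter coordinate type such as "c" or "g"
    position: int
    ref: str
    alt: str


# Assumption: only strings of the form ACCESSION.VERSION:TYPE.POSREF>ALT are handled.
_SUBSTITUTION = re.compile(
    r"^(?P<ac>[A-Z]{2}_\d+\.\d+):(?P<type>[cgmnr])\.(?P<pos>\d+)"
    r"(?P<ref>[ACGT])>(?P<alt>[ACGT])$"
)


def parse_simple_substitution(text: str) -> SimpleVariant:
    """Parse and validate a simple substitution string, raising ValueError otherwise."""
    match = _SUBSTITUTION.match(text)
    if match is None:
        raise ValueError(f"not a simple substitution: {text!r}")
    return SimpleVariant(
        match["ac"], match["type"], int(match["pos"]), match["ref"], match["alt"]
    )


def format_variant(variant: SimpleVariant) -> str:
    """Format the variant back into the same textual form."""
    return (
        f"{variant.accession}:{variant.seq_type}."
        f"{variant.position}{variant.ref}>{variant.alt}"
    )
```

    Calling parse_simple_substitution("NM_000000.0:c.76A>T") and passing the result to format_variant reproduces the input string, while a string without an accession is rejected with ValueError. Intronic offsets, indels, and mapping between coordinate systems are deliberately out of scope for this sketch.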

  14. A Python package for parsing, validating, mapping and formatting sequence variants using HGVS nomenclature

    PubMed Central

    Hart, Reece K.; Rico, Rudolph; Hare, Emily; Garcia, John; Westbrook, Jody; Fusaro, Vincent A.

    2015-01-01

    Summary: Biological sequence variants are commonly represented in scientific literature, clinical reports and databases of variation using the mutation nomenclature guidelines endorsed by the Human Genome Variation Society (HGVS). Despite the widespread use of the standard, no freely available and comprehensive programming libraries are available. Here we report an open-source and easy-to-use Python library that facilitates the parsing, manipulation, formatting and validation of variants according to the HGVS specification. The current implementation focuses on the subset of the HGVS recommendations that precisely describe sequence-level variation relevant to the application of high-throughput sequencing to clinical diagnostics. Availability and implementation: The package is released under the Apache 2.0 open-source license. Source code, documentation and issue tracking are available at http://bitbucket.org/hgvs/hgvs/. Python packages are available at PyPI (https://pypi.python.org/pypi/hgvs). Contact: reecehart@gmail.com Supplementary information: Supplementary data are available at Bioinformatics online. PMID:25273102

  15. Structural determinants of phosphoinositide selectivity in splice variants of Grp1 family PH domains

    PubMed Central

    Cronin, Thomas C; DiNitto, Jonathan P; Czech, Michael P; Lambright, David G

    2004-01-01

    The pleckstrin homology (PH) domains of the homologous proteins Grp1 (general receptor for phosphoinositides), ARNO (Arf nucleotide binding site opener), and Cytohesin-1 bind phosphatidylinositol (PtdIns) 3,4,5-trisphosphate with unusually high selectivity. Remarkably, splice variants that differ only by the insertion of a single glycine residue in the β1/β2 loop exhibit dual specificity for PtdIns(3,4,5)P3 and PtdIns(4,5)P2. The structural basis for this dramatic specificity switch is not apparent from the known modes of phosphoinositide recognition. Here, we report crystal structures for dual specificity variants of the Grp1 and ARNO PH domains in either the unliganded form or in complex with the head groups of PtdIns(4,5)P2 and PtdIns(3,4,5)P3. Loss of contacts with the β1/β2 loop with no significant change in head group orientation accounts for the significant decrease in PtdIns(3,4,5)P3 affinity observed for the dual specificity variants. Conversely, a small increase rather than decrease in affinity for PtdIns(4,5)P2 is explained by a novel binding mode, in which the glycine insertion alleviates unfavorable interactions with the β1/β2 loop. These observations are supported by a systematic mutational analysis of the determinants of phosphoinositide recognition. PMID:15359279

  16. Beta-glucosidase variants and polynucleotides encoding same

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Wogulis, Mark; Harris, Paul; Osborn, David

    The present invention relates to beta-glucosidase variants, e.g. beta-glucosidase variants of a parent Family GH3A beta-glucosidase from Aspergillus fumigatus. The present invention also relates to polynucleotides encoding the beta-glucosidase variants; nucleic acid constructs, vectors, and host cells comprising the polynucleotides; and methods of using the beta-glucosidase variants.

  17. Device Oriented Project Controller

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Dalesio, Leo; Kraimer, Martin

    2013-11-20

    This proposal is directed at the issue of developing control systems for very large HEP projects. A de-facto standard in accelerator control is the Experimental Physics and Industrial Control System (EPICS), which has been applied successfully to many physics projects. EPICS is a channel based system that requires that each channel of each device be configured and controlled. In Phase I, the feasibility of a device oriented extension to the distributed channel database was demonstrated by prototyping a device aware version of an EPICS I/O controller that functions with the current version of the channel access communication protocol. Extensions have been made to the grammar to define the database. Only a multi-stage position controller with limit switches was developed in the demonstration, but the grammar should support a full range of functional record types. In Phase II, a full set of record types will be developed to support all existing record types, a set of process control functions for closed loop control, and support for experimental beam line control. A tool to configure these records will be developed. A communication protocol will be developed or extensions will be made to Channel Access to support introspection of components of a device. Performance benchmarks will be made on both communication protocol and the database. After these records and performance tests are under way, a second of the grammar will be undertaken.

  18. Automated correction of improperly rotated diffusion gradient orientations in diffusion weighted MRI.

    PubMed

    Jeurissen, Ben; Leemans, Alexander; Sijbers, Jan

    2014-10-01

    Ensuring one is using the correct gradient orientations in a diffusion MRI study can be a challenging task. As different scanners, file formats and processing tools use different coordinate frame conventions, in practice, users can end up with improperly oriented gradient orientations. Using such wrongly oriented gradient orientations for subsequent diffusion parameter estimation will invalidate all rotationally variant parameters and fiber tractography results. While large misalignments can be detected by visual inspection, small rotations of the gradient table (e.g. due to angulation of the acquisition plane), are much more difficult to detect. In this work, we propose an automated method to align the coordinate frame of the gradient orientations with that of the corresponding diffusion weighted images, using a metric based on whole brain fiber tractography. By transforming the gradient table and measuring the average fiber trajectory length, we search for the transformation that results in the best global 'connectivity'. To ensure a fast calculation of the metric we included a range of algorithmic optimizations in our tractography routine. To make the optimization routine robust to spurious local maxima, we use a stochastic optimization routine that selects a random set of seed points on each evaluation. Using simulations, we show that our method can recover the correct gradient orientations with high accuracy and precision. In addition, we demonstrate that our technique can successfully recover rotated gradient tables on a wide range of clinically realistic data sets. As such, our method provides a practical and robust solution to an often overlooked pitfall in the processing of diffusion MRI. Copyright © 2014 Elsevier B.V. All rights reserved.
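
    The search over gradient-table transformations described above can be sketched as follows. This is a simplified illustration, not the authors' method: it assumes the candidate transformations are limited to the 48 signed axis permutations (axis swaps and flips), and it takes the quality metric as a caller-supplied function, since the tractography-based metric of the paper is not reproduced here. All function names are hypothetical.

```python
from itertools import permutations, product


def signed_permutations():
    """Yield all 48 signed 3x3 permutation matrices as tuples of rows."""
    for perm in permutations(range(3)):
        for signs in product((1, -1), repeat=3):
            yield tuple(
                tuple(signs[row] if col == perm[row] else 0 for col in range(3))
                for row in range(3)
            )


def apply_transform(matrix, vector):
    """Multiply a 3x3 matrix (tuple of rows) by a 3-vector."""
    return tuple(
        sum(matrix[row][col] * vector[col] for col in range(3)) for row in range(3)
    )


def best_transform(gradients, score):
    """Return the candidate transform whose transformed gradient table scores highest.

    `score` is a hypothetical callable standing in for a connectivity metric;
    it receives the list of transformed gradient vectors and returns a number.
    """
    return max(
        signed_permutations(),
        key=lambda m: score([apply_transform(m, g) for g in gradients]),
    )
```

    In a toy setting where a reference table is known, the score can simply measure agreement with that reference up to sign, because a diffusion gradient and its negation encode the same direction. In practice no reference exists, which is why the abstract relies on a metric computed from the data themselves.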

  19. Gender variance in childhood and sexual orientation in adulthood: a prospective study.

    PubMed

    Steensma, Thomas D; van der Ende, Jan; Verhulst, Frank C; Cohen-Kettenis, Peggy T

    2013-11-01

    Several retrospective and prospective studies have reported on the association between childhood gender variance and sexual orientation and gender discomfort in adulthood. In most of the retrospective studies, samples were drawn from the general population. The samples in the prospective studies consisted of clinically referred children. In understanding the extent to which the association applies for the general population, prospective studies using random samples are needed. This prospective study examined the association between childhood gender variance, and sexual orientation and gender discomfort in adulthood in the general population. In 1983, we measured childhood gender variance, in 406 boys and 473 girls. In 2007, sexual orientation and gender discomfort were assessed. Childhood gender variance was measured with two items from the Child Behavior Checklist/4-18. Sexual orientation was measured for four parameters of sexual orientation (attraction, fantasy, behavior, and identity). Gender discomfort was assessed by four questions (unhappiness and/or uncertainty about one's gender, wish or desire to be of the other gender, and consideration of living in the role of the other gender). For both men and women, the presence of childhood gender variance was associated with homosexuality for all four parameters of sexual orientation, but not with bisexuality. The report of adulthood homosexuality was 8 to 15 times higher for participants with a history of gender variance (10.2% to 12.2%), compared to participants without a history of gender variance (1.2% to 1.7%). The presence of childhood gender variance was not significantly associated with gender discomfort in adulthood. This study clearly showed a significant association between childhood gender variance and a homosexual sexual orientation in adulthood in the general population. In contrast to the findings in clinically referred gender-variant children, the presence of a homosexual sexual orientation in

  20. Image classification independent of orientation and scale

    NASA Astrophysics Data System (ADS)

    Arsenault, Henri H.; Parent, Sebastien; Moisan, Sylvain

    1998-04-01

    The recognition of targets independently of orientation has become fairly well developed in recent years for in-plane rotation. The out-of-plane rotation problem is much less advanced. When both out-of-plane rotations and changes of scale are present, the problem becomes very difficult. In this paper we describe our research on the combined out-of-plane rotation problem and the scale invariance problem. The rotations were limited to rotations about an axis perpendicular to the line of sight. The objects to be classified were three kinds of military vehicles. The inputs used were infrared imagery and photographs. We used a variation of a method proposed by Neiberg and Casasent, where a neural network is trained with a subset of the database and minimum distances from lines in feature space are used for classification instead of nearest neighbors. Each line in the feature space corresponds to one class of objects, and points on one line correspond to different orientations of the same target. We found that the training samples needed to be closer for some orientations than for others, and that the most difficult orientations are where the target is head-on to the observer. By means of some additional training of the neural network, we were able to achieve 100% correct classification for 360 degree rotation and a range of scales over a factor of five.
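
    The classification rule mentioned above, assigning a feature vector to the class whose line in feature space is closest, can be illustrated with a short sketch. It assumes each class is represented by a single straight line given by a point and a direction, which may be simpler than the representation used in the paper; the function names and class labels are hypothetical.

```python
import math


def distance_to_line(point, origin, direction):
    """Euclidean distance from `point` to the infinite line origin + t * direction."""
    diff = [p - o for p, o in zip(point, origin)]
    norm_sq = sum(d * d for d in direction)
    # Parameter of the orthogonal projection of `point` onto the line.
    t = sum(a * d for a, d in zip(diff, direction)) / norm_sq
    residual = [a - t * d for a, d in zip(diff, direction)]
    return math.sqrt(sum(r * r for r in residual))


def classify(point, class_lines):
    """Return the label of the class line closest to `point`.

    `class_lines` maps a label to an (origin, direction) pair.
    """
    return min(
        class_lines,
        key=lambda label: distance_to_line(point, *class_lines[label]),
    )
```

    With two parallel class lines, for instance {"class_a": ((0, 0), (1, 0)), "class_b": ((0, 5), (1, 0))}, the point (3, 1) lies at distance 1 from the first line and 4 from the second, so it is assigned to "class_a". Unlike a nearest-neighbor rule, the decision depends on the whole line and not on individual training samples.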

  1. Lack of association between the P413L variant of chromogranin B and ALS risk or age at onset: a meta-analysis.

    PubMed

    Yang, Xinglong; Li, Shimei; Xing, Dongmei; Li, Peiyun; Li, Ci; Qi, Ling; Xu, Yanming; Ren, Hui

    2018-02-01

    Amyotrophic lateral sclerosis (ALS), the most common motor neuron disease, is thought to result from interaction of genetic and environmental risk factors. Whether the potentially functional exonic P413L variant in the chromogranin B gene influences ALS risk and age at onset is controversial. We meta-analysed studies assessing the association between the P413L variant and ALS risk or age at ALS onset indexed in Web of Science, PubMed, Embase, Chinese National Knowledge Infrastructure, Wanfang, and SinoMed databases. Five case-control studies were analysed, involving 2639 patients with sporadic ALS, 201 with familial ALS and 3381 controls. No association was detected between risk of either ALS type and the CT + TT genotype or T-allele of the P413L variant. Age at ALS onset was similar between carriers and non-carriers of the T-allele. The available evidence suggests that the P413L variant of chromogranin B is not associated with ALS risk or age at ALS onset. These results should be validated in large, well-designed studies.

  2. Information model construction of MES oriented to mechanical blanking workshop

    NASA Astrophysics Data System (ADS)

    Wang, Jin-bo; Wang, Jin-ye; Yue, Yan-fang; Yao, Xue-min

    2016-11-01

    Manufacturing Execution System (MES) is one of the crucial technologies for implementing informatization management in manufacturing enterprises, and the construction of its information model is the basis of MES database development. Based on an analysis of the manufacturing process information in a mechanical blanking workshop and the information requirements of each MES function module, the IDEF1X method was adopted to construct the information model of an MES oriented to the mechanical blanking workshop. A detailed description of the data structure features of each MES function module and their logical relationships was given from the point of view of information relationships, which laid the foundation for the design of the MES database.

  3. Comprehensive Analysis of Non-Synonymous Natural Variants of G Protein-Coupled Receptors.

    PubMed

    Kim, Hee Ryung; Duc, Nguyen Minh; Chung, Ka Young

    2018-03-01

    G protein-coupled receptors (GPCRs) are the largest superfamily of transmembrane receptors and have vital signaling functions in various organs. Because of their critical roles in physiology and pathology, GPCRs are the most commonly used therapeutic target. It has been suggested that GPCRs undergo massive genetic variations such as genetic polymorphisms and DNA insertions or deletions. Among these genetic variations, non-synonymous natural variations change the amino acid sequence and could thus alter GPCR functions such as expression, localization, signaling, and ligand binding, which may be involved in disease development and altered responses to GPCR-targeting drugs. Despite the clinical importance of GPCRs, studies on the genotype-phenotype relationship of GPCR natural variants have been limited to a few GPCRs such as β-adrenergic receptors and opioid receptors. Comprehensive understanding of non-synonymous natural variations within GPCRs would help to predict the unknown genotype-phenotype relationship and yet-to-be-discovered natural variants. Here, we analyzed the non-synonymous natural variants of all non-olfactory GPCRs available from a public database, UniProt. The results suggest that non-synonymous natural variations occur extensively within the GPCR superfamily, especially in the N-terminus and transmembrane domains. Within the transmembrane domains, natural variations were observed more frequently in the conserved residues, which leads to disruption of the receptor function. Our analysis also suggests that only a few non-synonymous natural variations have been studied in efforts to link the variations with functional consequences.

  4. SISSY: An example of a multi-threaded, networked, object-oriented databased application

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Scipioni, B.; Liu, D.; Song, T.

    1993-05-01

    The Systems Integration Support SYstem (SISSY) is presented and its capabilities and techniques are discussed. It is a fully automated data collection and analysis system supporting the SSCL's systems analysis activities as they relate to the Physics Detector and Simulation Facility (PDSF). SISSY itself is a paradigm of effective computing on the PDSF. It uses home-grown code (C++), network programming (RPC, SNMP), relational (SYBASE) and object-oriented (ObjectStore) DBMSs, UNIX operating system services (IRIX threads, cron, system utilities, shell scripts, etc.), and third party software applications (NetCentral Station, Wingz, DataLink), all of which act together as a single application to monitor and analyze the PDSF.

  5. Cloud-Based NoSQL Open Database of Pulmonary Nodules for Computer-Aided Lung Cancer Diagnosis and Reproducible Research.

    PubMed

    Ferreira Junior, José Raniery; Oliveira, Marcelo Costa; de Azevedo-Marques, Paulo Mazzoncini

    2016-12-01

    Lung cancer is the leading cause of cancer-related deaths in the world, and its main manifestation is pulmonary nodules. Detection and classification of pulmonary nodules are challenging tasks that must be done by qualified specialists, but image interpretation errors make those tasks difficult. In order to aid radiologists in those hard tasks, it is important to integrate the computer-based tools with the lesion detection, pathology diagnosis, and image interpretation processes. However, computer-aided diagnosis research faces the problem of not having enough shared medical reference data for the development, testing, and evaluation of computational methods for diagnosis. In order to minimize this problem, this paper presents a public nonrelational document-oriented cloud-based database of pulmonary nodules characterized by 3D texture attributes, identified by experienced radiologists and classified in nine different subjective characteristics by the same specialists. Our goal with the development of this database is to improve computer-aided lung cancer diagnosis and pulmonary nodule detection and classification research through the deployment of this database in a cloud Database as a Service framework. Pulmonary nodule data was provided by the Lung Image Database Consortium and Image Database Resource Initiative (LIDC-IDRI), image descriptors were acquired by a volumetric texture analysis, and the database schema was developed using a document-oriented Not only Structured Query Language (NoSQL) approach. The proposed database currently holds 379 exams, 838 nodules, and 8237 images, of which 4029 are CT scans and 4208 are manually segmented nodules, and it is hosted in a MongoDB instance on a cloud infrastructure.
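
    As an illustration of what a document-oriented record for one nodule might look like, the sketch below builds a nested document as a plain Python dictionary and filters a list of such documents. Every field name and value is a hypothetical placeholder chosen for illustration, not the schema or data of the database described above, and the filter runs in memory and does not connect to MongoDB.

```python
import json


def make_nodule_document(exam_id, nodule_id, texture, ratings, images):
    """Build one self-contained nodule document (hypothetical field names)."""
    return {
        "exam_id": exam_id,
        "nodule_id": nodule_id,
        "texture_attributes_3d": list(texture),   # volumetric texture descriptors
        "radiologist_ratings": dict(ratings),     # subjective characteristics
        "images": list(images),                   # references to scans and masks
    }


def find_by_min_rating(documents, characteristic, minimum):
    """Return documents whose rating for `characteristic` is at least `minimum`."""
    return [
        doc for doc in documents
        if doc["radiologist_ratings"].get(characteristic, 0) >= minimum
    ]


def to_json(document):
    """Serialize a document the way it could be sent to a document store."""
    return json.dumps(document, sort_keys=True)
```

    The point of the nested layout is that everything known about a nodule travels in one document, so no join across exam, nodule and image tables is needed to retrieve it. In a relational design the same information would typically be spread over several tables linked by keys.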

  6. Variant Humicola grisea CBH1.1

    DOEpatents

    Goedegebuur, Frits; Gualfetti, Peter; Mitchinson, Colin; Larenas, Edmund

    2013-02-19

    Disclosed are variants of Humicola grisea Cel7A (CBH1.1), H. jecorina CBH1 variant or S. thermophilium CBH1, nucleic acids encoding the same and methods for producing the same. The variant cellulases have the amino acid sequence of a glycosyl hydrolase of family 7A wherein one or more amino acid residues are substituted.

  7. Variant Humicola grisea CBH1.1

    DOEpatents

    Goedegebuur, Frits [Vlaardingen, NL]; Gualfetti, Peter [San Francisco, CA]; Mitchinson, Colin [Half Moon Bay, CA]; Larenas, Edmund [Moss Beach, CA]

    2011-05-31

    Disclosed are variants of Humicola grisea Cel7A (CBH1.1), H. jecorina CBH1 variant or S. thermophilium CBH1, nucleic acids encoding the same and methods for producing the same. The variant cellulases have the amino acid sequence of a glycosyl hydrolase of family 7A wherein one or more amino acid residues are substituted.

  8. Variant Humicola grisea CBH1.1

    DOEpatents

    Goedegebuur, Frits; Gualfetti, Peter; Mitchinson, Colin; Larenas, Edmund

    2014-09-09

    Disclosed are variants of Humicola grisea Cel7A (CBH1.1), H. jecorina CBH1 variant or S. thermophilium CBH1, nucleic acids encoding the same and methods for producing the same. The variant cellulases have the amino acid sequence of a glycosyl hydrolase of family 7A wherein one or more amino acid residues are substituted.

  9. Variant Humicola grisea CBH1.1

    DOEpatents

    Goedegebuur, Frits; Gualfetti, Peter; Mitchinson, Colin; Larenas, Edmund

    2014-03-18

    Disclosed are variants of Humicola grisea Cel7A (CBH1.1), H. jecorina CBH1 variant or S. thermophilium CBH1, nucleic acids encoding the same and methods for producing the same. The variant cellulases have the amino acid sequence of a glycosyl hydrolase of family 7A wherein one or more amino acid residues are substituted.

  10. Variant Humicola grisea CBH1.1

    DOEpatents

    Goedegebuur, Frits; Gualfetti, Peter; Mitchinson, Colin; Larenas, Edmund

    2017-05-09

    Disclosed are variants of Humicola grisea Cel7A (CBH1.1), H. jecorina CBH1 variant or S. thermophilium CBH1, nucleic acids encoding the same and methods for producing the same. The variant cellulases have the amino acid sequence of a glycosyl hydrolase of family 7A wherein one or more amino acid residues are substituted.

  11. Variant Humicola grisea CBH1.1

    DOEpatents

    Goedegebuur, Frits [Vlaardingen, NL]; Gualfetti, Peter [San Francisco, CA]; Mitchinson, Colin [Half Moon Bay, CA]; Larenas, Edmund [Moss Beach, CA]

    2011-08-16

    Disclosed are variants of Humicola grisea Cel7A (CBH1.1), H. jecorina CBH1 variant or S. thermophilium CBH1, nucleic acids encoding the same and methods for producing the same. The variant cellulases have the amino acid sequence of a glycosyl hydrolase of family 7A wherein one or more amino acid residues are substituted.

  12. Variant Humicola grisea CBH1.1

    DOEpatents

    Goedegebuur, Frits [Vlaardingen, NL]; Gualfetti, Peter [San Francisco, CA]; Mitchinson, Colin [Half Moon Bay, CA]; Larenas, Edmund [Moss Beach, CA]

    2012-08-07

    Disclosed are variants of Humicola grisea Cel7A (CBH1.1), H. jecorina CBH1 variant or S. thermophilium CBH1, nucleic acids encoding the same and methods for producing the same. The variant cellulases have the amino acid sequence of a glycosyl hydrolase of family 7A wherein one or more amino acid residues are substituted.

  13. Variant Humicola grisea CBH1.1

    DOEpatents

    Goedegebuur, Frits [Vlaardingen, NL]; Gualfetti, Peter [San Francisco, CA]; Mitchinson, Colin [Half Moon Bay, CA]; Larenas, Edmund [Moss Beach, CA]

    2008-12-02

    Disclosed are variants of Humicola grisea Cel7A (CBH1.1), H. jecorina CBH1 variant or S. thermophilium CBH1, nucleic acids encoding the same and methods for producing the same. The variant cellulases have the amino acid sequence of a glycosyl hydrolase of family 7A wherein one or more amino acid residues are substituted.

  14. OBSIFRAC: database-supported software for 3D modeling of rock mass fragmentation

    NASA Astrophysics Data System (ADS)

    Empereur-Mot, Luc; Villemin, Thierry

    2003-03-01

    Under stress, fractures in rock masses tend to form fully connected networks. The mass can thus be thought of as a 3D series of blocks produced by fragmentation processes. A numerical model has been developed that uses a relational database to describe such a mass. The model, which assumes the fractures to be plane, allows data from natural networks to test theories concerning fragmentation processes. In the model, blocks are bordered by faces that are composed of edges and vertices. A fracture can originate from a seed point, its orientation being controlled by the stress field specified by an orientation matrix. Alternatively, it can be generated from a discrete set of given orientations and positions. Both kinds of fracture can occur together in a model. From an original simple block, a given fracture produces two simple polyhedral blocks, and the original block becomes compound. Compound and simple blocks created throughout fragmentation are stored in the database. Several fragmentation processes have been studied. In one scenario, a constant proportion of blocks is fragmented at each step of the process. The resulting distribution appears to be fractal, although seed points are random in each fragmented block. In a second scenario, division affects only one random block at each stage of the process, and gives a Weibull volume distribution law. This software can be used for a large number of other applications.
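
    The bookkeeping described above, where a fractured block becomes compound and its two fragments are stored as new simple blocks, can be sketched with an in-memory relational table. This is only an illustration of the parent and child relationship under assumed table and column names; the faces, edges, vertices and fracture geometry of the actual model are omitted, and nothing here is taken from the OBSIFRAC software itself.

```python
import sqlite3

# Hypothetical single-table schema: each block points to the block it came from.
SCHEMA = """
CREATE TABLE block (
    id INTEGER PRIMARY KEY,
    parent_id INTEGER REFERENCES block(id),
    kind TEXT NOT NULL CHECK (kind IN ('simple', 'compound'))
);
"""


def make_db():
    """Create an in-memory database holding one original simple block (id 1)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.execute("INSERT INTO block (parent_id, kind) VALUES (NULL, 'simple')")
    conn.commit()
    return conn


def split_block(conn, block_id):
    """Mark a block as compound and store its two simple fragments."""
    cur = conn.cursor()
    cur.execute("UPDATE block SET kind = 'compound' WHERE id = ?", (block_id,))
    children = []
    for _ in range(2):
        cur.execute(
            "INSERT INTO block (parent_id, kind) VALUES (?, 'simple')", (block_id,)
        )
        children.append(cur.lastrowid)
    conn.commit()
    return children


def count_blocks(conn, kind):
    """Count stored blocks of the given kind."""
    return conn.execute(
        "SELECT COUNT(*) FROM block WHERE kind = ?", (kind,)
    ).fetchone()[0]
```

    Each call to split_block adds two simple blocks and converts one simple block into a compound one, so after n splits the table holds n compound and n + 1 simple blocks. Keeping the compound blocks means the whole fragmentation history can be replayed from the table.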

  15. Advancements in web-database applications for rabies surveillance.

    PubMed

    Rees, Erin E; Gendron, Bruno; Lelièvre, Frédérick; Coté, Nathalie; Bélanger, Denise

    2011-08-02

    Protection of public health from rabies is informed by the analysis of surveillance data from human and animal populations. In Canada, public health, agricultural and wildlife agencies at the provincial and federal level are responsible for rabies disease control, and this has led to multiple agency-specific data repositories. Aggregation of agency-specific data into one database application would enable more comprehensive data analyses and effective communication among participating agencies. In Québec, RageDB was developed to house surveillance data for the raccoon rabies variant, representing the next generation in web-based database applications that provide a key resource for the protection of public health. RageDB incorporates data from, and grants access to, all agencies responsible for the surveillance of raccoon rabies in Québec. Technological advancements of RageDB to rabies surveillance databases include (1) automatic integration of multi-agency data and diagnostic results on a daily basis; (2) a web-based data editing interface that enables authorized users to add, edit and extract data; and (3) an interactive dashboard to help visualize data simply and efficiently, in table, chart, and cartographic formats. Furthermore, RageDB stores data from citizens who voluntarily report sightings of rabies suspect animals. We also discuss how sightings data can indicate public perception of the risk of raccoon rabies and thus aid in directing the allocation of disease control resources for protecting public health. RageDB provides an example in the evolution of spatio-temporal database applications for the storage, analysis and communication of disease surveillance data. The database was fast and inexpensive to develop by using open-source technologies, simple and efficient design strategies, and shared web hosting. The database increases communication among agencies collaborating to protect human health from raccoon rabies. 
Furthermore, health agencies have real

  16. Advancements in web-database applications for rabies surveillance

    PubMed Central

    2011-01-01

    Background: Protection of public health from rabies is informed by the analysis of surveillance data from human and animal populations. In Canada, public health, agricultural and wildlife agencies at the provincial and federal level are responsible for rabies disease control, and this has led to multiple agency-specific data repositories. Aggregation of agency-specific data into one database application would enable more comprehensive data analyses and effective communication among participating agencies. In Québec, RageDB was developed to house surveillance data for the raccoon rabies variant, representing the next generation in web-based database applications that provide a key resource for the protection of public health. Results: RageDB incorporates data from, and grants access to, all agencies responsible for the surveillance of raccoon rabies in Québec. Technological advancements of RageDB to rabies surveillance databases include 1) automatic integration of multi-agency data and diagnostic results on a daily basis; 2) a web-based data editing interface that enables authorized users to add, edit and extract data; and 3) an interactive dashboard to help visualize data simply and efficiently, in table, chart, and cartographic formats. Furthermore, RageDB stores data from citizens who voluntarily report sightings of rabies suspect animals. We also discuss how sightings data can indicate public perception of the risk of raccoon rabies and thus aid in directing the allocation of disease control resources for protecting public health. Conclusions: RageDB provides an example in the evolution of spatio-temporal database applications for the storage, analysis and communication of disease surveillance data. The database was fast and inexpensive to develop by using open-source technologies, simple and efficient design strategies, and shared web hosting. The database increases communication among agencies collaborating to protect human health from raccoon rabies

  17. NETMARK: A Schema-less Extension for Relational Databases for Managing Semi-structured Data Dynamically

    NASA Technical Reports Server (NTRS)

    Maluf, David A.; Tran, Peter B.

    2003-01-01

    An object-relational database management system is an integrated hybrid cooperative approach that combines the best practices of both the relational model utilizing SQL queries and the object-oriented, semantic paradigm for supporting complex data creation. In this paper, a highly scalable, information on demand database framework, called NETMARK, is introduced. NETMARK takes advantage of the Oracle 8i object-relational database using physical address data types for very efficient keyword search of records spanning across both context and content. NETMARK was originally developed in early 2000 as a research and development prototype to handle the vast amounts of unstructured and semi-structured documents existing within NASA enterprises. Today, NETMARK is a flexible, high-throughput open database framework for managing, storing, and searching unstructured or semi-structured arbitrary hierarchical models, such as XML and HTML.

  18. DHAD variants and methods of screening

    DOEpatents

    Kelly, Kristen J.; Ye, Rick W.

    2017-02-28

    Methods of screening for dihydroxy-acid dehydratase (DHAD) variants that display increased DHAD activity are disclosed, along with DHAD variants identified by these methods. Such enzymes can result in increased production of compounds from DHAD requiring biosynthetic pathways. Also disclosed are isolated nucleic acids encoding the DHAD variants, recombinant host cells comprising the isolated nucleic acid molecules, and methods of producing butanol.

  19. Integration of NASA/GSFC and USGS Rock Magnetic Databases.

    NASA Astrophysics Data System (ADS)

    Nazarova, K. A.; Glen, J. M.

    2004-05-01

    A global Magnetic Petrology Database (MPDB) was developed and continues to be updated at NASA/Goddard Space Flight Center. The purpose of this database is to provide the geomagnetic community with a comprehensive and user-friendly method of accessing magnetic petrology data via the Internet for a more realistic interpretation of satellite (as well as aeromagnetic and ground) lithospheric magnetic anomalies. The MPDB contains data on rocks from localities around the world (about 19,000 samples) including the Ukrainian and Baltic Shields, Kamchatka, Iceland, Ural Mountains, etc. The MPDB is designed, managed and presented on the web as a research oriented database. Several database applications have been specifically developed for data manipulation and analysis of the MPDB. The geophysics unit at the USGS in Menlo Park has over 17,000 rock-property data, largely from sites within the western U.S. This database contains rock-density and rock-magnetic parameters collected for use in gravity and magnetic field modeling, and paleomagnetic studies. Most of these data were taken from surface outcrops and together they span a broad range of rock types. Measurements were made either in-situ at the outcrop, or in the laboratory on hand samples and paleomagnetic cores acquired in the field. The USGS and NASA/GSFC data will be integrated as part of an effort to provide public access to a single, uniformly maintained database. Due to the large number of data and the very large area sampled, the database can yield rock-property statistics on a broad range of rock types; it is thus applicable to study areas beyond the geographic scope of the database. The intent of this effort is to provide incentive for others to further contribute to the database, and a tool with which the geophysical community can entertain studies formerly precluded.

  20. Clinician-Oriented Access to Data - C.O.A.D.: A Natural Language Interface to a VA DHCP Database

    PubMed Central

    Levy, Christine; Rogers, Elizabeth

    1995-01-01

    Hospitals collect enormous amounts of data related to the on-going care of patients. Unfortunately, a clinician's access to the data is limited by complexities of the database structure and/or programming skills required to access the database. The COAD project attempts to bridge the gap between the clinical user's need for specific information from the database, and the wealth of data residing in the hospital information system. The project design includes a natural language interface to data contained in a VA DHCP database. We have developed a prototype which links natural language software to certain DHCP data elements, including patient demographics, prescriptions, diagnoses, laboratory data, and provider information. English queries can be typed into the system, and answers to the questions are returned. Future work includes refinement of natural language/DHCP connections to enable more sophisticated queries, and optimization of the system to reduce response time to user questions.

  1. Interactive Scene Analysis Module - A sensor-database fusion system for telerobotic environments

    NASA Technical Reports Server (NTRS)

    Cooper, Eric G.; Vazquez, Sixto L.; Goode, Plesent W.

    1992-01-01

    Accomplishing a task with telerobotics typically involves a combination of operator control/supervision and a 'script' of preprogrammed commands. These commands usually assume that the locations of various objects in the task space conform to some internal representation (database) of that task space. The ability to quickly and accurately verify the task environment against the internal database would improve the robustness of these preprogrammed commands. In addition, the on-line initialization and maintenance of a task space database is difficult for operators using Cartesian coordinates alone. This paper describes the Interactive Scene Analysis Module (ISAM) developed to provide taskspace database initialization and verification utilizing 3-D graphic overlay modelling, video imaging, and laser radar based range imaging. Through the fusion of taskspace database information and image sensor data, a verifiable taskspace model is generated providing location and orientation data for objects in a task space. This paper also describes applications of the ISAM in the Intelligent Systems Research Laboratory (ISRL) at NASA Langley Research Center, and discusses its performance relative to representation accuracy and operator interface efficiency.

  2. Cellobiohydrolase variants and polynucleotides encoding the same

    DOEpatents

    Wogulis, Mark

    2014-09-09

    The present invention relates to variants of a parent cellobiohydrolase. The present invention also relates to polynucleotides encoding the cellobiohydrolase variants; nucleic acid constructs, vectors, and host cells comprising the polynucleotides; and methods of using the cellobiohydrolase variants.

  3. Brain calcifications and PCDH12 variants

    PubMed Central

    Nicolas, Gaël; Sanchez-Contreras, Monica; Ramos, Eliana Marisa; Lemos, Roberta R.; Ferreira, Joana; Moura, Denis; Sobrido, Maria J.; Richard, Anne-Claire; Lopez, Alma Rosa; Legati, Andrea; Deleuze, Jean-François; Boland, Anne; Quenez, Olivier; Krystkowiak, Pierre; Favrole, Pascal; Geschwind, Daniel H.; Aran, Adi; Segel, Reeval; Levy-Lahad, Ephrat; Dickson, Dennis W.; Coppola, Giovanni; Rademakers, Rosa

    2017-01-01

    Objective: To assess the potential connection between PCDH12 and brain calcifications in a patient carrying a homozygous nonsense variant in PCDH12 and in adult patients with brain calcifications. Methods: We performed a CT scan in 1 child with a homozygous PCDH12 nonsense variant. We screened DNA samples from 53 patients with primary familial brain calcification (PFBC) and 26 patients with brain calcification of unknown cause (BCUC). Results: We identified brain calcifications in subcortical and perithalamic regions in the patient with a homozygous PCDH12 nonsense variant. The calcification pattern was different from what has been observed in PFBC and more similar to what is described in in utero infections. In patients with PFBC or BCUC, we found no protein-truncating variant and 3 rare (minor allele frequency <0.001) PCDH12 predicted damaging missense heterozygous variants in 3 unrelated patients, albeit with no segregation data available. Conclusions: Brain calcifications should be added to the phenotypic spectrum associated with PCDH12 biallelic loss of function, in the context of severe cerebral developmental abnormalities. A putative role for PCDH12 variants remains to be determined in PFBC. PMID:28804758

  4. MoonProt: a database for proteins that are known to moonlight

    PubMed Central

    Mani, Mathew; Chen, Chang; Amblee, Vaishak; Liu, Haipeng; Mathur, Tanu; Zwicke, Grant; Zabad, Shadi; Patel, Bansi; Thakkar, Jagravi; Jeffery, Constance J.

    2015-01-01

    Moonlighting proteins comprise a class of multifunctional proteins in which a single polypeptide chain performs multiple biochemical functions that are not due to gene fusions, multiple RNA splice variants or pleiotropic effects. The known moonlighting proteins perform a variety of diverse functions in many different cell types and species, and information about their structures and functions is scattered in many publications. We have constructed the manually curated, searchable, internet-based MoonProt Database (http://www.moonlightingproteins.org) with information about the over 200 proteins that have been experimentally verified to be moonlighting proteins. The availability of this organized information provides a more complete picture of what is currently known about moonlighting proteins. The database will also aid researchers in other fields, including determining the functions of genes identified in genome sequencing projects, interpreting data from proteomics projects and annotating protein sequence and structural databases. In addition, information about the structures and functions of moonlighting proteins can be helpful in understanding how novel protein functional sites evolved on an ancient protein scaffold, which can also help in the design of proteins with novel functions. PMID:25324305

  5. Toward a mtDNA locus-specific mutation database using the LOVD platform.

    PubMed

    Elson, Joanna L; Sweeney, Mary G; Procaccio, Vincent; Yarham, John W; Salas, Antonio; Kong, Qing-Peng; van der Westhuizen, Francois H; Pitceathly, Robert D S; Thorburn, David R; Lott, Marie T; Wallace, Douglas C; Taylor, Robert W; McFarland, Robert

    2012-09-01

    The Human Variome Project (HVP) is a global effort to collect and curate all human genetic variation affecting health. Mutations of mitochondrial DNA (mtDNA) are an important cause of neurogenetic disease in humans; however, identification of the pathogenic mutations responsible can be problematic. In this article, we provide explanations as to why and suggest how such difficulties might be overcome. We put forward a case in support of a new Locus Specific Mutation Database (LSDB) implemented using the Leiden Open-source Variation Database (LOVD) system that will not only list primary mutations, but also present the evidence supporting their role in disease. Critically, we feel that this new database should have the capacity to store information on the observed phenotypes alongside the genetic variation, thereby facilitating our understanding of the complex and variable presentation of mtDNA disease. LOVD supports fast queries of both seen and hidden data and allows storage of sequence variants from high-throughput sequence analysis. The LOVD platform will allow construction of a secure mtDNA database; one that can fully utilize currently available data, as well as that being generated by high-throughput sequencing, to link genotype with phenotype enhancing our understanding of mitochondrial disease, with a view to providing better prognostic information. © 2012 Wiley Periodicals, Inc.

  6. Toward a mtDNA Locus-Specific Mutation Database Using the LOVD Platform

    PubMed Central

    Elson, Joanna L.; Sweeney, Mary G.; Procaccio, Vincent; Yarham, John W.; Salas, Antonio; Kong, Qing-Peng; van der Westhuizen, Francois H.; Pitceathly, Robert D.S.; Thorburn, David R.; Lott, Marie T.; Wallace, Douglas C.; Taylor, Robert W.; McFarland, Robert

    2015-01-01

    The Human Variome Project (HVP) is a global effort to collect and curate all human genetic variation affecting health. Mutations of mitochondrial DNA (mtDNA) are an important cause of neurogenetic disease in humans; however, identification of the pathogenic mutations responsible can be problematic. In this article, we provide explanations as to why and suggest how such difficulties might be overcome. We put forward a case in support of a new Locus Specific Mutation Database (LSDB) implemented using the Leiden Open-source Variation Database (LOVD) system that will not only list primary mutations, but also present the evidence supporting their role in disease. Critically, we feel that this new database should have the capacity to store information on the observed phenotypes alongside the genetic variation, thereby facilitating our understanding of the complex and variable presentation of mtDNA disease. LOVD supports fast queries of both seen and hidden data and allows storage of sequence variants from high-throughput sequence analysis. The LOVD platform will allow construction of a secure mtDNA database; one that can fully utilize currently available data, as well as that being generated by high-throughput sequencing, to link genotype with phenotype enhancing our understanding of mitochondrial disease, with a view to providing better prognostic information. PMID:22581690

  7. Technical Aspects of Interfacing MUMPS to an External SQL Relational Database Management System

    PubMed Central

    Kuzmak, Peter M.; Walters, Richard F.; Penrod, Gail

    1988-01-01

    This paper describes an interface connecting InterSystems MUMPS (M/VX) to an external relational DBMS, the SYBASE Database Management System. The interface enables MUMPS to operate in a relational environment and gives the MUMPS language full access to a complete set of SQL commands. MUMPS generates SQL statements as ASCII text and sends them to the RDBMS. The RDBMS executes the statements and returns ASCII results to MUMPS. The interface suggests that the language features of MUMPS make it an attractive tool for use in the relational database environment. The approach described in this paper separates MUMPS from the relational database. Positioning the relational database outside of MUMPS promotes data sharing and permits a number of different options to be used for working with the data. Other languages like C, FORTRAN, and COBOL can access the RDBMS database. Advanced tools provided by the relational database vendor can also be used. SYBASE is an advanced high-performance transaction-oriented relational database management system for the VAX/VMS and UNIX operating systems. SYBASE is designed using a distributed open-systems architecture, and is relatively easy to interface with MUMPS.
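
    The text-in, text-out exchange described above can be sketched as follows. This is only an illustration under stated assumptions: Python stands in for MUMPS, the standard-library sqlite3 module stands in for SYBASE, and the helper name run_sql_text, the patient table, and the pipe-delimited result format are hypothetical and not taken from the paper.

```python
import sqlite3


def run_sql_text(conn, statement):
    """Send one SQL statement as plain text and return each result row as a text line.

    Hypothetical helper: the client only ever builds and reads strings,
    mirroring the ASCII-in / ASCII-out exchange described in the abstract.
    """
    cursor = conn.execute(statement)
    rows = cursor.fetchall()
    # Render every row as a pipe-delimited line of text.
    return ["|".join(str(value) for value in row) for row in rows]


# sqlite3 in-memory database used as a stand-in for the external RDBMS.
conn = sqlite3.connect(":memory:")
run_sql_text(conn, "CREATE TABLE patient (id INTEGER, name TEXT)")
run_sql_text(conn, "INSERT INTO patient VALUES (1, 'DOE,JOHN')")
result_lines = run_sql_text(conn, "SELECT id, name FROM patient ORDER BY id")
```

    Because the database sits behind a plain-text boundary, any other client able to produce the same statement strings could share the data, which is the separation the abstract argues for.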

  8. Rare genetic variants and the risk of cancer.

    PubMed

    Bodmer, Walter; Tomlinson, Ian

    2010-06-01

    There are good reasons to expect that common genetic variants do not explain all of the inherited risk of the common cancers, not least of these being the relatively low proportion of familial relative risk that common cancer SNPs currently explain. One promising source of the unexplained risk is rare, low-penetrance genetic variants, a class that ranges from low-frequency polymorphisms (allele frequency < 5%) through subpolymorphic variants (frequency 0.1-1.0%) to very low frequency or 'private' variants with frequencies of 0.1% or less. Examples of rare cancer variants include breast cancer susceptibility loci CHEK2, BRIP1 and PALB2. There are considerable challenges associated with the discovery and testing of rare predisposition alleles, many of which are illustrated by the issues associated with variants of unknown significance in the Mendelian cancer predisposition genes. However, whilst cost constraints remain, the technological barriers to rare variant discovery and large-scale genotyping no longer exist. If each individual carries many disease-causing rare variants, the so-called missing heritability of cancer might largely be explained. Whether or not rare variants do end up filling the heritability gap, it is imperative to look for them alongside common variants.
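
    The frequency classes named above can be written as a small lookup. This is a sketch, not the authors' definition: the function name frequency_class is hypothetical, the "common" label for frequencies of 5% or more is an added assumption, and because the quoted ranges meet at 0.1% and leave 1-5% implicit, the exact boundary handling below is also an assumption.

```python
def frequency_class(allele_frequency):
    """Map an allele frequency (a fraction between 0 and 1) to a frequency class.

    Boundary handling is an assumption: exactly 0.1% is treated as 'private'
    and exactly 1% as 'subpolymorphic'.
    """
    if not 0.0 <= allele_frequency <= 1.0:
        raise ValueError("allele_frequency must be a fraction between 0 and 1")
    if allele_frequency <= 0.001:   # 0.1% or less
        return "private"
    if allele_frequency <= 0.01:    # above 0.1%, up to 1.0%
        return "subpolymorphic"
    if allele_frequency < 0.05:     # above 1.0%, below 5%
        return "low-frequency polymorphism"
    return "common"                 # 5% or more (label assumed here)
```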

  9. UniPROBE, update 2011: expanded content and search tools in the online database of protein-binding microarray data on protein-DNA interactions.

    PubMed

    Robasky, Kimberly; Bulyk, Martha L

    2011-01-01

    The Universal PBM Resource for Oligonucleotide-Binding Evaluation (UniPROBE) database is a centralized repository of information on the DNA-binding preferences of proteins as determined by universal protein-binding microarray (PBM) technology. Each entry for a protein (or protein complex) in UniPROBE provides the quantitative preferences for all possible nucleotide sequence variants ('words') of length k ('k-mers'), as well as position weight matrix (PWM) and graphical sequence logo representations of the k-mer data. In this update, we describe >130% expansion of the database content, incorporation of a protein BLAST (blastp) tool for finding protein sequence matches in UniPROBE, the introduction of UniPROBE accession numbers and additional database enhancements. The UniPROBE database is available at http://uniprobe.org.
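
    To make "all possible nucleotide sequence variants of length k" concrete, the sketch below enumerates every DNA word of a given length; there are 4^k of them. The function name all_kmers is hypothetical, and the code says nothing about how UniPROBE itself stores or scores k-mers.

```python
from itertools import product

NUCLEOTIDES = "ACGT"


def all_kmers(k):
    """Return every DNA word of length k in lexicographic order (4**k words)."""
    if k < 1:
        raise ValueError("k must be a positive integer")
    # product() yields tuples of letters; join each tuple into one word.
    return ["".join(letters) for letters in product(NUCLEOTIDES, repeat=k)]


eight_mers = all_kmers(8)  # 4**8 = 65536 words
```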

  10. Palaeo-sea-level and palaeo-ice-sheet databases: problems, strategies, and perspectives

    NASA Astrophysics Data System (ADS)

    Düsterhus, André; Rovere, Alessio; Carlson, Anders E.; Horton, Benjamin P.; Klemann, Volker; Tarasov, Lev; Barlow, Natasha L. M.; Bradwell, Tom; Clark, Jorie; Dutton, Andrea; Gehrels, W. Roland; Hibbert, Fiona D.; Hijma, Marc P.; Khan, Nicole; Kopp, Robert E.; Sivan, Dorit; Törnqvist, Torbjörn E.

    2016-04-01

    Sea-level and ice-sheet databases have driven numerous advances in understanding the Earth system. We describe the challenges and offer best strategies that can be adopted to build self-consistent and standardised databases of geological and geochemical information used to archive palaeo-sea-levels and palaeo-ice-sheets. There are three phases in the development of a database: (i) measurement, (ii) interpretation, and (iii) database creation. Measurement should include the objective description of the position and age of a sample, description of associated geological features, and quantification of uncertainties. Interpretation of the sample may have a subjective component, but it should always include uncertainties and alternative or contrasting interpretations, with any exclusion of existing interpretations requiring a full justification. During the creation of a database, an approach based on accessibility, transparency, trust, availability, continuity, completeness, and communication of content (ATTAC3) must be adopted. It is essential to consider the community that creates and benefits from a database. We conclude that funding agencies should not only consider the creation of original data in specific research-question-oriented projects, but also include the possibility of using part of the funding for IT-related and database creation tasks, which are essential to guarantee accessibility and maintenance of the collected data.

  11. OVA: integrating molecular and physical phenotype data from multiple biomedical domain ontologies with variant filtering for enhanced variant prioritization.

    PubMed

    Antanaviciute, Agne; Watson, Christopher M; Harrison, Sally M; Lascelles, Carolina; Crinnion, Laura; Markham, Alexander F; Bonthron, David T; Carr, Ian M

    2015-12-01

    Exome sequencing has become a de facto standard method for Mendelian disease gene discovery in recent years, yet identifying disease-causing mutations among thousands of candidate variants remains a non-trivial task. Here we describe a new variant prioritization tool, OVA (ontology variant analysis), in which user-provided phenotypic information is exploited to infer deeper biological context. OVA combines a knowledge-based approach with a variant-filtering framework. It reduces the number of candidate variants by considering genotype and predicted effect on protein sequence, and scores the remainder on biological relevance to the query phenotype. We take advantage of several ontologies in order to bridge knowledge across multiple biomedical domains and facilitate computational analysis of annotations pertaining to genes, diseases, phenotypes, tissues and pathways. In this way, OVA combines information regarding molecular and physical phenotypes and integrates both human and model organism data to effectively prioritize variants. By assessing performance on both known and novel disease mutations, we show that OVA performs biologically meaningful candidate variant prioritization and can be more accurate than another recently published candidate variant prioritization tool. OVA is freely accessible at http://dna2.leeds.ac.uk:8080/OVA/index.jsp. Supplementary data are available at Bioinformatics online. umaan@leeds.ac.uk. © The Author 2015. Published by Oxford University Press.

  12. Single nucleotide variants and InDels identified from whole-genome re-sequencing of Guzerat, Gyr, Girolando and Holstein cattle breeds.

    PubMed

    Stafuzza, Nedenia Bonvino; Zerlotini, Adhemar; Lobo, Francisco Pereira; Yamagishi, Michel Eduardo Beleza; Chud, Tatiane Cristina Seleguim; Caetano, Alexandre Rodrigues; Munari, Danísio Prado; Garrick, Dorian J; Machado, Marco Antonio; Martins, Marta Fonseca; Carvalho, Maria Raquel; Cole, John Bruce; Barbosa da Silva, Marcos Vinicius Gualberto

    2017-01-01

    Whole-genome re-sequencing, alignment and annotation analyses were undertaken for 12 sires representing four important cattle breeds in Brazil: Guzerat (multi-purpose), Gyr, Girolando and Holstein (dairy production). A total of approximately 4.3 billion reads from an Illumina HiSeq 2000 sequencer generated 10.7- to 16.4-fold genome coverage for each animal. A total of 27,441,279 single nucleotide variations (SNVs) and 3,828,041 insertions/deletions (InDels) were detected in the samples, of which 2,557,670 SNVs and 883,219 InDels were novel. The submission of these genetic variants to the dbSNP database significantly increased the number of known variants, particularly for the indicine genome. The concordance rate between genotypes obtained using the Bovine HD BeadChip array and the same variants identified by sequencing was about 99.05%. The annotation of variants identified numerous non-synonymous SNVs and frameshift InDels which could affect phenotypic variation. Functional enrichment analysis was performed and revealed that variants in the olfactory transduction pathway were overrepresented in all four cattle breeds, while the ECM-receptor interaction pathway was overrepresented in the Girolando and Guzerat breeds, the ABC transporters pathway was overrepresented only in the Holstein breed, and the metabolic pathways were overrepresented only in the Gyr breed. The genetic variants discovered here provide a rich resource to help identify potential genomic markers and their associated molecular mechanisms that impact economically important traits for Gyr, Girolando, Guzerat and Holstein breeding programs.

  13. Single nucleotide variants and InDels identified from whole-genome re-sequencing of Guzerat, Gyr, Girolando and Holstein cattle breeds

    PubMed Central

    Lobo, Francisco Pereira; Yamagishi, Michel Eduardo Beleza; Chud, Tatiane Cristina Seleguim; Caetano, Alexandre Rodrigues; Munari, Danísio Prado; Garrick, Dorian J.; Machado, Marco Antonio; Martins, Marta Fonseca; Carvalho, Maria Raquel; Cole, John Bruce; Barbosa da Silva, Marcos Vinicius Gualberto

    2017-01-01

    Whole-genome re-sequencing, alignment and annotation analyses were undertaken for 12 sires representing four important cattle breeds in Brazil: Guzerat (multi-purpose), Gyr, Girolando and Holstein (dairy production). A total of approximately 4.3 billion reads from an Illumina HiSeq 2000 sequencer generated 10.7- to 16.4-fold genome coverage for each animal. A total of 27,441,279 single nucleotide variations (SNVs) and 3,828,041 insertions/deletions (InDels) were detected in the samples, of which 2,557,670 SNVs and 883,219 InDels were novel. The submission of these genetic variants to the dbSNP database significantly increased the number of known variants, particularly for the indicine genome. The concordance rate between genotypes obtained using the Bovine HD BeadChip array and the same variants identified by sequencing was about 99.05%. The annotation of variants identified numerous non-synonymous SNVs and frameshift InDels which could affect phenotypic variation. Functional enrichment analysis was performed and revealed that variants in the olfactory transduction pathway were overrepresented in all four cattle breeds, while the ECM-receptor interaction pathway was overrepresented in the Girolando and Guzerat breeds, the ABC transporters pathway was overrepresented only in the Holstein breed, and the metabolic pathways were overrepresented only in the Gyr breed. The genetic variants discovered here provide a rich resource to help identify potential genomic markers and their associated molecular mechanisms that impact economically important traits for Gyr, Girolando, Guzerat and Holstein breeding programs. PMID:28323836

  14. Selecting a Persistent Data Support Environment for Object-Oriented Applications

    DTIC Science & Technology

    1998-03-01

    key features of most object DBMS products is contained in the DBMS Needs Assessment for Objects from Barry and Associates. The developer should...data structure and behavior in a self-contained module enhances maintainability of the system and promotes reuse of modules for similar domains...considered together, represent a survey of commercial object-oriented database management systems. These references contain detailed information needed

  15. DR-GAS: a database of functional genetic variants and their phosphorylation states in human DNA repair systems.

    PubMed

    Sehgal, Manika; Singh, Tiratha Raj

    2014-04-01

    We present DR-GAS, a unique, consolidated and comprehensive DNA repair genetic association studies database of the human DNA repair system. It presents information on repair genes, assorted mechanisms of DNA repair, linkage disequilibrium, haplotype blocks, nsSNPs, phosphorylation sites, associated diseases, and pathways involved in repair systems. DNA repair is an intricate process which plays an essential role in maintaining the integrity of the genome by eradicating the damaging effect of internal and external changes in the genome. Hence, it is crucial to extensively understand the intact process of DNA repair, genes involved, non-synonymous SNPs which perhaps affect the function, phosphorylated residues and other related genetic parameters. All the corresponding entries for DNA repair genes, such as proteins, OMIM IDs, literature references and pathways are cross-referenced to their respective primary databases. DNA repair genes and their associated parameters are represented either in tabular or in graphical form through images elucidated by computational and statistical analyses. It is believed that the database will assist molecular biologists, biotechnologists, therapeutic developers and others in the scientific community to encounter biologically meaningful information, and meticulous contribution of genetic level information towards treacherous diseases in human DNA repair systems. DR-GAS is freely available for academic and research purposes at: http://www.bioinfoindia.org/drgas. Copyright © 2014 Elsevier B.V. All rights reserved.

  16. Rare Variant Association Test with Multiple Phenotypes

    PubMed Central

    Lee, Selyeong; Won, Sungho; Kim, Young Jin; Kim, Yongkang; Kim, Bong-Jo; Park, Taesung

    2016-01-01

    Although genome-wide association studies (GWAS) have now discovered thousands of genetic variants associated with common traits, such variants cannot explain the large degree of “missing heritability,” likely due to rare variants. The advent of next generation sequencing technology has allowed rare variant detection and association with common traits, often by investigating specific genomic regions for rare variant effects on a trait. Although multiple correlated phenotypes are often concurrently observed in GWAS, most studies analyze only single phenotypes, which may lessen statistical power. To increase power, multivariate analyses, which consider correlations between multiple phenotypes, can be used. However, few existing multivariate analyses can identify rare variants for assessing multiple phenotypes. Here, we propose Multivariate Association Analysis using Score Statistics (MAAUSS), to identify rare variants associated with multiple phenotypes, based on the widely used Sequence Kernel Association Test (SKAT) for a single phenotype. We applied MAAUSS to Whole Exome Sequencing (WES) data from a Korean population of 1,058 subjects, to discover genes associated with multiple traits of liver function. We then assessed validation of those genes by a replication study, using an independent dataset of 3,445 individuals. Notably, we detected the gene ZNF620 among five significant genes. We then performed a simulation study to compare MAAUSS's performance with existing methods. Overall, MAAUSS successfully conserved type 1 error rates and in many cases, had a higher power than the existing methods. This study illustrates a feasible and straightforward approach for identifying rare variants correlated with multiple phenotypes, with likely relevance to missing heritability. PMID:28039885

  17. Crystal Structure of an Activated Variant of Small Heat Shock Protein Hsp16.5

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Mchaourab, Hassane S.; Lin, Yi-Lun; Spiller, Benjamin W.

    How does the sequence of a single small heat shock protein (sHSP) assemble into oligomers of different sizes? To gain insight into the underlying structural mechanism, we determined the crystal structure of an engineered variant of Methanocaldococcus jannaschii Hsp16.5 wherein a 14 amino acid peptide from human heat shock protein 27 (Hsp27) was inserted at the junction of the N-terminal region and the α-crystallin domain. In response to this insertion, the oligomer shell expands from 24 to 48 subunits while maintaining octahedral symmetry. Oligomer rearrangement does not alter the fold of the conserved α-crystallin domain nor does it disturb the interface holding the dimeric building block together. Rather, the flexible C-terminal tail of Hsp16.5 changes its orientation relative to the α-crystallin domain which enables alternative packing of dimers. This change in orientation preserves a peptide-in-groove interaction of the C-terminal tail with an adjacent β-sandwich, thereby holding the assembly together. The interior of the expanded oligomer, where substrates presumably bind, retains its predominantly nonpolar character relative to the outside surface. New large windows in the outer shell provide increased access to these substrate-binding regions, thus accounting for the higher affinity of this variant to substrates. Oligomer polydispersity regulates sHSPs chaperone activity in vitro and has been implicated in their physiological roles. The structural mechanism of Hsp16.5 oligomer flexibility revealed here, which is likely to be highly conserved across the sHSP superfamily, explains the relationship between oligomer expansion observed in disease-linked mutants and changes in chaperone activity.

  18. Crystal structure of an activated variant of small heat shock protein Hsp16.5.

    PubMed

    McHaourab, Hassane S; Lin, Yi-Lun; Spiller, Benjamin W

    2012-06-26

    How does the sequence of a single small heat shock protein (sHSP) assemble into oligomers of different sizes? To gain insight into the underlying structural mechanism, we determined the crystal structure of an engineered variant of Methanocaldococcus jannaschii Hsp16.5 wherein a 14 amino acid peptide from human heat shock protein 27 (Hsp27) was inserted at the junction of the N-terminal region and the α-crystallin domain. In response to this insertion, the oligomer shell expands from 24 to 48 subunits while maintaining octahedral symmetry. Oligomer rearrangement does not alter the fold of the conserved α-crystallin domain nor does it disturb the interface holding the dimeric building block together. Rather, the flexible C-terminal tail of Hsp16.5 changes its orientation relative to the α-crystallin domain which enables alternative packing of dimers. This change in orientation preserves a peptide-in-groove interaction of the C-terminal tail with an adjacent β-sandwich, thereby holding the assembly together. The interior of the expanded oligomer, where substrates presumably bind, retains its predominantly nonpolar character relative to the outside surface. New large windows in the outer shell provide increased access to these substrate-binding regions, thus accounting for the higher affinity of this variant to substrates. Oligomer polydispersity regulates sHSPs chaperone activity in vitro and has been implicated in their physiological roles. The structural mechanism of Hsp16.5 oligomer flexibility revealed here, which is likely to be highly conserved across the sHSP superfamily, explains the relationship between oligomer expansion observed in disease-linked mutants and changes in chaperone activity.

  19. Rare variants and cardiovascular disease.

    PubMed

    Wain, Louise V

    2014-09-01

    Cardiovascular disease (CVD) is a leading cause of mortality and morbidity in the Western world. Large genome-wide association studies (GWASs) of coronary artery disease, myocardial infarction, stroke and dilated cardiomyopathy have identified a number of common genetic variants with modest effects on disease risk. Similarly, studies of important modifiable risk factors of CVD have identified a large number of predominantly common variant associations, for example, with blood pressure and blood lipid levels. In each case, despite the often large numbers of loci identified, only a small proportion of the phenotypic variance is explained. It has been hypothesised that rare variants with large effects may account for some of the missing variance but large-scale studies of rare variation are in their infancy for cardiovascular traits and have yet to produce fruitful results. Studies of monogenic CVDs, inherited disorders believed to be entirely driven by individual rare mutations, have highlighted genes that play a key role in disease aetiology. In this review, we discuss how findings from studies of rare variants in monogenic disease and GWAS of predominantly common variants are converging to provide further insight into biological disease mechanisms. © The Author 2014. Published by Oxford University Press. All rights reserved. For permissions, please email: journals.permissions@oup.com.

  20. TREM2 Variants in Alzheimer's Disease

    PubMed Central

    Guerreiro, Rita; Wojtas, Aleksandra; Bras, Jose; Carrasquillo, Minerva; Rogaeva, Ekaterina; Majounie, Elisa; Cruchaga, Carlos; Sassi, Celeste; Kauwe, John S.K.; Younkin, Steven; Hazrati, Lilinaz; Collinge, John; Pocock, Jennifer; Lashley, Tammaryn; Williams, Julie; Lambert, Jean-Charles; Amouyel, Philippe; Goate, Alison; Rademakers, Rosa; Morgan, Kevin; Powell, John; St. George-Hyslop, Peter; Singleton, Andrew; Hardy, John

    2013-01-01

    BACKGROUND Homozygous loss-of-function mutations in TREM2, encoding the triggering receptor expressed on myeloid cells 2 protein, have previously been associated with an autosomal recessive form of early-onset dementia. METHODS We used genome, exome, and Sanger sequencing to analyze the genetic variability in TREM2 in a series of 1092 patients with Alzheimer's disease and 1107 controls (the discovery set). We then performed a meta-analysis on imputed data for the TREM2 variant rs75932628 (predicted to cause a R47H substitution) from three genomewide association studies of Alzheimer's disease and tested for the association of the variant with disease. We genotyped the R47H variant in an additional 1887 cases and 4061 controls. We then assayed the expression of TREM2 across different regions of the human brain and identified genes that are differentially expressed in a mouse model of Alzheimer's disease and in control mice. RESULTS We found significantly more variants in exon 2 of TREM2 in patients with Alzheimer's disease than in controls in the discovery set (P = 0.02). There were 22 variant alleles in 1092 patients with Alzheimer's disease and 5 variant alleles in 1107 controls (P<0.001). The most commonly associated variant, rs75932628 (encoding R47H), showed highly significant association with Alzheimer's disease (P<0.001). Meta-analysis of rs75932628 genotypes imputed from genomewide association studies confirmed this association (P = 0.002), as did direct genotyping of an additional series of 1887 patients with Alzheimer's disease and 4061 controls (P<0.001). Trem2 expression differed between control mice and a mouse model of Alzheimer's disease. CONCLUSIONS Heterozygous rare variants in TREM2 are associated with a significant increase in the risk of Alzheimer's disease. (Funded by Alzheimer's Research UK and others.) PMID:23150934

  1. Lynx: a database and knowledge extraction engine for integrative medicine.

    PubMed

    Sulakhe, Dinanath; Balasubramanian, Sandhya; Xie, Bingqing; Feng, Bo; Taylor, Andrew; Wang, Sheng; Berrocal, Eduardo; Dave, Utpal; Xu, Jinbo; Börnigen, Daniela; Gilliam, T Conrad; Maltsev, Natalia

    2014-01-01

    We have developed Lynx (http://lynx.ci.uchicago.edu)--a web-based database and a knowledge extraction engine, supporting annotation and analysis of experimental data and generation of weighted hypotheses on molecular mechanisms contributing to human phenotypes and disorders of interest. Its underlying knowledge base (LynxKB) integrates various classes of information from >35 public databases and private collections, as well as manually curated data from our group and collaborators. Lynx provides advanced search capabilities and a variety of algorithms for enrichment analysis and network-based gene prioritization to assist the user in extracting meaningful knowledge from LynxKB and experimental data, whereas its service-oriented architecture provides public access to LynxKB and its analytical tools via user-friendly web services and interfaces.

  2. Lynx: a database and knowledge extraction engine for integrative medicine

    PubMed Central

    Sulakhe, Dinanath; Balasubramanian, Sandhya; Xie, Bingqing; Feng, Bo; Taylor, Andrew; Wang, Sheng; Berrocal, Eduardo; Dave, Utpal; Xu, Jinbo; Börnigen, Daniela; Gilliam, T. Conrad; Maltsev, Natalia

    2014-01-01

    We have developed Lynx (http://lynx.ci.uchicago.edu)—a web-based database and a knowledge extraction engine, supporting annotation and analysis of experimental data and generation of weighted hypotheses on molecular mechanisms contributing to human phenotypes and disorders of interest. Its underlying knowledge base (LynxKB) integrates various classes of information from >35 public databases and private collections, as well as manually curated data from our group and collaborators. Lynx provides advanced search capabilities and a variety of algorithms for enrichment analysis and network-based gene prioritization to assist the user in extracting meaningful knowledge from LynxKB and experimental data, whereas its service-oriented architecture provides public access to LynxKB and its analytical tools via user-friendly web services and interfaces. PMID:24270788

  3. ANTIGENIC VARIANTS OF INFLUENZA A VIRUS (PR8 STRAIN)

    PubMed Central

    Hamre, Dorothy; Loosli, Clayton G.; Gerber, Paul

    1958-01-01

    Seven variant strains of influenza A PR8-S virus, each derived from the previous one by serial passage in the lungs of mice immunized with the homologous agent have been produced. With the H.I. and neutralization procedures these variants showed a progressive serological deviation from the parent PR8-S virus. The seven variants provoked antibodies in varying titers to the preceding variants and the parent virus but not in relation to their position in the series. Thus, the seventh variant provoked significantly more antibody to the PR8-S virus than did the fifth variant. A possible explanation for this is presented. The first four variant viruses showed progressively less ability to react with antisera of the preceding variants and the PR8-S virus, and the three most recently derived variants showed essentially no ability to react with PR8-S and first variant antisera. The variant viruses remained antigenically stable through numerous lung passages in normal mice. Cross absorption tests revealed common antigenic components among the variant viruses and also individual characteristics which classify them as being different from one another. The implications of these findings in relation to studies by others have been discussed. PMID:13539308

  4. Characteristics of MUTYH variants in Japanese colorectal polyposis patients.

    PubMed

    Takao, Misato; Yamaguchi, Tatsuro; Eguchi, Hidetaka; Tada, Yuhki; Kohda, Masakazu; Koizumi, Koichi; Horiguchi, Shin-Ichiro; Okazaki, Yasushi; Ishida, Hideyuki

    2018-06-01

    The base excision repair gene MUTYH is the causative gene of colorectal polyposis syndrome, which is an autosomal recessive disorder associated with a high risk of colorectal cancer. Since few studies have investigated the genotype-phenotype association in Japanese patients with MUTYH variants, the aim of this study was to clarify the clinicopathological findings in Japanese patients with MUTYH gene variants who were detected by screening causative genes associated with hereditary colorectal polyposis. After obtaining informed consent, genetic testing was performed using target enrichment sequencing of 26 genes, including MUTYH. Of the 31 Japanese patients with suspected hereditary colorectal polyposis, eight MUTYH variants were detected in five patients. MUTYH hotspot variants known for Caucasians, namely p.G396D and p.Y179D, were not among the detected variants. Of five patients, two with biallelic MUTYH variants were diagnosed with MUTYH-associated polyposis, while two others had monoallelic MUTYH variants. One patient had the p.P18L and p.G25D variants on the same allele; however, supportive data for considering these two variants 'pathogenic' were lacking. Two patients with biallelic MUTYH variants and two others with monoallelic MUTYH variants were identified among Japanese colorectal polyposis patients. Hotspot variants of the MUTYH gene for Caucasians were not hotspots for Japanese patients.

  5. The PROTICdb database for 2-DE proteomics.

    PubMed

    Langella, Olivier; Zivy, Michel; Joets, Johann

    2007-01-01

    PROTICdb is a web-based database mainly designed to store and analyze plant proteome data obtained by 2D polyacrylamide gel electrophoresis (2D PAGE) and mass spectrometry (MS). The goals of PROTICdb are (1) to store, track, and query information related to proteomic experiments, i.e., from tissue sampling to protein identification and quantitative measurements; and (2) to integrate information from the user's own expertise and other sources into a knowledge base, used to support data interpretation (e.g., for the determination of allelic variants or products of posttranslational modifications). Data insertion into the relational database of PROTICdb is achieved either by uploading outputs from Mélanie, PDQuest, IM2d, ImageMaster(tm) 2D Platinum v5.0, Progenesis, Sequest, MS-Fit, and Mascot software, or by filling in web forms (experimental design and methods). 2D PAGE-annotated maps can be displayed, queried, and compared through the GelBrowser. Quantitative data can be easily exported in a tabulated format for statistical analyses with any third-party software. PROTICdb is based on the Oracle or the PostgreSQL Database Management System (DBMS) and is freely available upon request at http://cms.moulon.inra.fr/content/view/14/44/.

  6. Astronomical Orientations of Bora Ceremonial Grounds in Southeast Australia

    NASA Astrophysics Data System (ADS)

    Fuller, Robert S.; Hamacher, Duane W.; Norris, Ray P.

    2013-12-01

    Ethnographic evidence indicates that bora (initiation) ceremonial sites in southeast Australia, which typically comprise a pair of circles connected by a pathway, are symbolically reflected in the Milky Way as the 'Sky Bora'. This evidence also indicates that the position of the Sky Bora signifies the time of year when initiation ceremonies are held. We use archaeological data to test the hypothesis that southeast Australian bora grounds have a preferred orientation to the position of the Milky Way in the night sky in August, when the plane of the galaxy from Crux to Sagittarius is roughly vertical in the evening sky to the south-southwest. We accomplish this by measuring the orientations of 68 bora grounds using a combination of data from the archaeological literature, and site cards in the New South Wales Aboriginal Heritage Information Management System database. We find that bora grounds have a preferred orientation to the south and southwest, consistent with the Sky Bora hypothesis. Monte Carlo statistics show that these preferences were not the result of chance alignments, but were deliberate.
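
    The Monte Carlo step mentioned in this abstract can be illustrated with a minimal sketch. It assumes a simple null model in which each site's azimuth is drawn uniformly from 0 to 360 degrees; the function name, the 90-degree sector, and the observed count of 30 are hypothetical and are not taken from the paper. Only the number of sites (68) comes from the abstract.

```python
import random


def monte_carlo_p_value(n_sites, observed_in_sector, sector_width_deg,
                        n_trials=10_000, seed=0):
    """Estimate how often uniformly random azimuths place at least
    `observed_in_sector` of `n_sites` sites inside a sector of the given width."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_trials):
        # Draw one random azimuth per site and count those inside the sector.
        in_sector = sum(
            1 for _ in range(n_sites)
            if rng.uniform(0.0, 360.0) < sector_width_deg
        )
        if in_sector >= observed_in_sector:
            hits += 1
    return hits / n_trials


# 68 sites as in the abstract; the count of 30 in a 90-degree sector is made up.
p = monte_carlo_p_value(n_sites=68, observed_in_sector=30, sector_width_deg=90.0)
print(f"estimated chance probability: {p:.4f}")
```

    A small estimated probability under this null model would indicate that the observed clustering is unlikely to be a chance alignment, which is the kind of conclusion the abstract reports.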

  7. International contributions to IAEA-NEA heat transfer databases for supercritical fluids

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Leung, L. K. H.; Yamada, K.

    2012-07-01

    An IAEA Coordinated Research Project on 'Heat Transfer Behaviour and Thermohydraulics Code Testing for SCWRs' is being conducted to facilitate collaboration and interaction among participants from 15 organizations. While the project covers several key technology areas relevant to the development of SCWR concepts, it focuses mainly on the heat transfer aspect, which has been identified as the most challenging. Through the collaborating effort, large heat-transfer databases have been compiled for supercritical water and surrogate fluids in tubes, annuli, and bundle subassemblies of various orientations over a wide range of flow conditions. Assessments of several supercritical heat-transfer correlations were performed using the compiled databases. The assessment results are presented. (authors)

  8. Germline variant FGFR4 p.G388R exposes a membrane-proximal STAT3 binding site.

    PubMed

    Ulaganathan, Vijay K; Sperl, Bianca; Rapp, Ulf R; Ullrich, Axel

    2015-12-24

    Variant rs351855-G/A is a commonly occurring single-nucleotide polymorphism of coding regions in exon 9 of the fibroblast growth factor receptor FGFR4 (CD334) gene (c.1162G>A). It results in an amino-acid change at codon 388 from glycine to arginine (p.Gly388Arg) in the transmembrane domain of the receptor. Despite compelling genetic evidence for the association of this common variant with cancers of the bone, breast, colon, prostate, skin, lung, head and neck, as well as soft-tissue sarcomas and non-Hodgkin lymphoma, the underlying biological mechanism has remained elusive. Here we show that substitution of the conserved glycine 388 residue to a charged arginine residue alters the transmembrane spanning segment and exposes a membrane-proximal cytoplasmic signal transducer and activator of transcription 3 (STAT3) binding site Y(390)-(P)XXQ(393). We demonstrate that such membrane-proximal STAT3 binding motifs in the germline of type I membrane receptors enhance STAT3 tyrosine phosphorylation by recruiting STAT3 proteins to the inner cell membrane. Remarkably, such germline variants frequently co-localize with somatic mutations in the Catalogue of Somatic Mutations in Cancer (COSMIC) database. Using Fgfr4 single nucleotide polymorphism knock-in mice and transgenic mouse models for breast and lung cancers, we validate the enhanced STAT3 signalling induced by the FGFR4 Arg388-variant in vivo. Thus, our findings elucidate the molecular mechanism behind the genetic association of rs351855 with accelerated cancer progression and suggest that germline variants of cell-surface molecules that recruit STAT3 to the inner cell membrane are a significant risk for cancer prognosis and disease progression.

  9. Complex phenotype of dyskeratosis congenita and mood dysregulation with novel homozygous RTEL1 and TPH1 variants.

    PubMed

    Ungar, Rachel A; Giri, Neelam; Pao, Maryland; Khincha, Payal P; Zhou, Weiyin; Alter, Blanche P; Savage, Sharon A

    2018-06-01

    Dyskeratosis congenita (DC) is an inherited bone marrow failure syndrome caused by germline mutations in telomere biology genes. Patients have extremely short telomeres for their age and a complex phenotype including oral leukoplakia, abnormal skin pigmentation, and dysplastic nails in addition to bone marrow failure, pulmonary fibrosis, stenosis of the esophagus, lacrimal ducts and urethra, developmental anomalies, and high risk of cancer. We evaluated a patient with features of DC, mood dysregulation, diabetes, and lack of pubertal development. Family history was not available but genome-wide genotyping was consistent with consanguinity. Whole exome sequencing identified 82 variants of interest in 80 genes based on the following criteria: homozygous, <0.1% minor allele frequency in public and in-house databases, nonsynonymous, and predicted deleterious by multiple in silico prediction programs. Six genes were identified likely contributory to the clinical presentation. The cause of DC is likely due to homozygous splice site variants in regulator of telomere elongation helicase 1, a known DC and telomere biology gene. A homozygous, missense variant in tryptophan hydroxylase 1 may be clinically important as this gene encodes the rate limiting step in serotonin biosynthesis, a biologic pathway connected with mood disorders. Four additional genes (SCN4A, LRP4, GDAP1L1, and SPTBN5) had rare, missense homozygous variants that we speculate may contribute to portions of the clinical phenotype. This case illustrates the value of conducting detailed clinical and genomic evaluations on rare patients in order to identify new areas of research into the functional consequences of rare variants and their contribution to human disease. © 2018 Wiley Periodicals, Inc.
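
    The four filtering criteria listed in this abstract (homozygous, <0.1% minor allele frequency in public and in-house databases, nonsynonymous, predicted deleterious by multiple programs) can be sketched as a simple predicate. This is an illustration only: the record layout, the set of consequence labels treated as nonsynonymous, and the reading of "multiple" as at least two tools are all assumptions, and the sample records are invented.

```python
from dataclasses import dataclass

MAF_CUTOFF = 0.001           # 0.1% minor allele frequency, as stated in the abstract
MIN_DELETERIOUS_CALLS = 2    # assumption: "multiple" in silico programs means >= 2
# Assumption: consequence labels counted as nonsynonymous in this sketch.
NONSYNONYMOUS = frozenset({"missense", "nonsense", "frameshift", "splice_site"})


@dataclass(frozen=True)
class Variant:
    gene: str
    homozygous: bool
    public_maf: float        # frequency in public databases
    inhouse_maf: float       # frequency in the in-house database
    consequence: str
    deleterious_calls: int   # number of tools predicting a deleterious effect


def is_of_interest(v: Variant) -> bool:
    """Apply all four criteria; a variant must pass every one."""
    return (
        v.homozygous
        and v.public_maf < MAF_CUTOFF
        and v.inhouse_maf < MAF_CUTOFF
        and v.consequence in NONSYNONYMOUS
        and v.deleterious_calls >= MIN_DELETERIOUS_CALLS
    )


# Invented records for illustration; gene names are placeholders.
candidates = [
    Variant("GENE_A", True, 0.0002, 0.0, "missense", 3),    # passes
    Variant("GENE_B", True, 0.02, 0.01, "missense", 3),     # too common
    Variant("GENE_C", False, 0.0, 0.0, "missense", 4),      # heterozygous
    Variant("GENE_D", True, 0.0, 0.0, "synonymous", 0),     # synonymous
]
selected = [v.gene for v in candidates if is_of_interest(v)]
print(selected)
```

    Keeping the cutoffs as named constants makes it easy to see how the size of the candidate list would change if a criterion were relaxed.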

  10. Pathogenic Variants in Complement Genes and Risk of Atypical Hemolytic Uremic Syndrome Relapse after Eculizumab Discontinuation.

    PubMed

    Fakhouri, Fadi; Fila, Marc; Provôt, François; Delmas, Yahsou; Barbet, Christelle; Châtelet, Valérie; Rafat, Cédric; Cailliez, Mathilde; Hogan, Julien; Servais, Aude; Karras, Alexandre; Makdassi, Raifah; Louillet, Feriell; Coindre, Jean-Philippe; Rondeau, Eric; Loirat, Chantal; Frémeaux-Bacchi, Véronique

    2017-01-06

    The complement inhibitor eculizumab has dramatically improved the outcome of atypical hemolytic uremic syndrome. However, the optimal duration of eculizumab treatment in atypical hemolytic uremic syndrome remains debated. We report on the French atypical hemolytic uremic syndrome working group's first 2-year experience with eculizumab discontinuation in patients with atypical hemolytic uremic syndrome. Using the French atypical hemolytic uremic syndrome registry database, we retrospectively identified all dialysis-free patients with atypical hemolytic uremic syndrome who discontinued eculizumab between 2010 and 2014 and reviewed their relevant clinical and biologic data. The decision to discontinue eculizumab was made by the clinician in charge of the patient. All patients were closely monitored by regular urine dipsticks and blood tests. Eculizumab was rapidly (24-48 hours) restarted in case of relapse. Among 108 patients treated with eculizumab, 38 patients (nine children and 29 adults) discontinued eculizumab (median treatment duration of 17.5 months). Twenty-one patients (55%) carried novel or rare complement genes variants. Renal recovery under eculizumab was equally good in patients with and those without complement gene variants detected. After a median follow-up of 22 months, 12 patients (31%) experienced atypical hemolytic uremic syndrome relapse. Eight of 11 patients (72%) with complement factor H variants, four of eight patients (50%) with membrane cofactor protein variants, and zero of 16 patients with no rare variant detected relapsed. In relapsing patients, early reintroduction (≤48 hours) of eculizumab led to rapid (<7 days) hematologic remission and a return of serum creatinine to baseline level in a median time of 26 days. At last follow-up, renal function remained unchanged in nonrelapsing and relapsing patients compared with baseline values before eculizumab discontinuation. Pathogenic variants in complement genes were associated with higher

  11. An image database management system for conducting CAD research

    NASA Astrophysics Data System (ADS)

    Gruszauskas, Nicholas; Drukker, Karen; Giger, Maryellen L.

    2007-03-01

    The development of image databases for CAD research is not a trivial task. The collection and management of images and their related metadata from multiple sources is a time-consuming but necessary process. By standardizing and centralizing the methods in which these data are maintained, one can generate subsets of a larger database that match the specific criteria needed for a particular research project in a quick and efficient manner. A research-oriented management system of this type is highly desirable in a multi-modality CAD research environment. An online, web-based database system for the storage and management of research-specific medical image metadata was designed for use with four modalities of breast imaging: screen-film mammography, full-field digital mammography, breast ultrasound and breast MRI. The system was designed to consolidate data from multiple clinical sources and provide the user with the ability to anonymize the data. Input concerning the type of data to be stored as well as desired searchable parameters was solicited from researchers in each modality. The backbone of the database was created using MySQL. A robust and easy-to-use interface for entering, removing, modifying and searching information in the database was created using HTML and PHP. This standardized system can be accessed using any modern web-browsing software and is fundamental for our various research projects on computer-aided detection, diagnosis, cancer risk assessment, multimodality lesion assessment, and prognosis. Our CAD database system stores large amounts of research-related metadata and successfully generates subsets of cases that match the user's desired search criteria.

  12. Promising Variants of Initiation of Martensitic γ - α Transformation in Iron Alloys by a Couple of Elastic Waves

    NASA Astrophysics Data System (ADS)

    Kashchenko, M. P.; Chashchina, V. G.

    2016-01-01

    Variants of initiation of growth of crystals of α-martensite by couples of elastic waves propagating in directions <001>γ and <110>γ in single crystals of Fe31Ni are suggested. The dynamic theory is used to show that the expected orientations of habit planes {110}γ, {001}γ and {559}γ differ from the typical {3 10 15}γ. Possible features of tetragonality of martensite crystals are discussed. The power of the sources of ultrasound required for initiation of γ - α martensitic transformation is estimated.

  13. Secondary iris recognition method based on local energy-orientation feature

    NASA Astrophysics Data System (ADS)

    Huo, Guang; Liu, Yuanning; Zhu, Xiaodong; Dong, Hongxing

    2015-01-01

    This paper proposes a secondary iris recognition method based on local features. The energy-orientation feature (EOF) is first extracted from the iris with a two-dimensional Gabor filter and used for a first recognition by a similarity threshold, which divides the whole iris database into two categories: a correctly recognized class and a class to be recognized. The former are accepted, and the latter are transformed by histogram to obtain an energy-orientation histogram feature (EOHF), which is followed by a second recognition with the chi-square distance. The experiment has proved that the proposed method, because of its higher correct recognition rate, could be designated as the most efficient and effective among its companion studies in iris recognition algorithms.
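
    The second-stage comparison described here relies on a chi-square distance between histograms. The sketch below shows one common form of that distance, half the sum of (a - b)^2 / (a + b) over bins; the abstract does not state which normalization the authors use, so the 0.5 factor, the function names, and the toy histograms are assumptions for illustration.

```python
def chi_square_distance(h1, h2):
    """Chi-square distance between two equal-length histograms.

    Uses the form 0.5 * sum((a - b)^2 / (a + b)); bins where both
    histograms are zero contribute nothing and are skipped.
    """
    if len(h1) != len(h2):
        raise ValueError("histograms must have the same number of bins")
    total = 0.0
    for a, b in zip(h1, h2):
        if a + b > 0:
            total += (a - b) ** 2 / (a + b)
    return 0.5 * total


def nearest_template(probe, templates):
    """Return the label of the enrolled histogram closest to the probe."""
    return min(templates, key=lambda label: chi_square_distance(probe, templates[label]))


# Toy histograms, invented for illustration.
templates = {
    "iris_A": [4, 1, 0, 3],
    "iris_B": [0, 5, 2, 1],
}
probe = [3, 1, 0, 4]
best = nearest_template(probe, templates)
print(best)
```

    A smaller distance means more similar histograms, so a second-stage matcher of this kind would accept the closest enrolled template, possibly subject to a maximum-distance threshold.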

  14. Influence of Hydrogen and Number of Particle Variants on Ordinary and Two-Way Shape Memory Effects in Ti-Ni Single Crystals

    NASA Astrophysics Data System (ADS)

    Kireeva, I. V.; Platonova, Yu. N.; Chumlyakov, Yu. I.

    2017-02-01

    The ordinary and two-way shape memory effects (SMEs) are investigated for [1̄12] single crystals of Ti-51.3Ni (at.%) alloy aged at 823 K for 1.5 h in free state and under tensile stress of 150 MPa without hydrogen and after saturation by hydrogen. It is established that without hydrogen in [1̄12] single crystals with one and four variants of Ti3Ni4 particles the maximum magnitude of the ordinary SME is 1.9-2.6% under the external stress σext = 250 MPa. Under σext > 250 MPa, crystals are destroyed. The magnitude of the two-way SME caused by the B2-R-B19' MT equal to 1.1% at σext = 0 is observed in [1̄12] single crystals with one variant of Ti3Ni4 particles. The physical reason for the observed two-way SME is the internal compressive stresses oriented along the [1̄12] directions arising from one variant of Ti3Ni4 particles as a result of aging under tensile stress of 150 MPa. It is established that hydrogen does not influence the TR temperature, reduces the plasticity, and suppresses the two-way SME. The suppression of two-way SME in the [1̄12] single crystals of the Ti-51.3Ni (at.%) alloy with one variant of Ti3Ni4 particles is caused by shielding of stress fields from one variant of Ti3Ni4 particles and multiple nucleation of R- and B19' martensite variants under loading with saturation by hydrogen.

  15. Spatially Resolved Mid-IR Spectra from Meteorites; Linking Composition, Crystallographic Orientation and Spectra on the Micro-Scale

    NASA Astrophysics Data System (ADS)

    Stephen, N. R.

    2016-08-01

    IR spectroscopy is used to infer composition of extraterrestrial bodies, comparing bulk spectra to databases of separate mineral phases. We extract spatially resolved meteorite-specific spectra from achondrites with respect to zonation and orientation.

  16. CRIMEtoYHU: a new web tool to develop yeast-based functional assays for characterizing cancer-associated missense variants.

    PubMed

    Mercatanti, Alberto; Lodovichi, Samuele; Cervelli, Tiziana; Galli, Alvaro

    2017-12-01

    Evaluation of the functional impact of cancer-associated missense variants is more difficult than for protein-truncating mutations and consequently standard guidelines for the interpretation of sequence variants have been recently proposed. A number of algorithms and software products were developed to predict the impact of cancer-associated missense mutations on protein structure and function. Importantly, direct assessment of the variants using high-throughput functional assays using simple genetic systems can help in speeding up the functional evaluation of newly identified cancer-associated variants. We developed the web tool CRIMEtoYHU (CTY) to help geneticists in the evaluation of the functional impact of cancer-associated missense variants. Humans and the yeast Saccharomyces cerevisiae share thousands of protein-coding genes although they have diverged for a billion years. Therefore, yeast humanization can be helpful in deciphering the functional consequences of human genetic variants found in cancer and give information on the pathogenicity of missense variants. To humanize specific positions within yeast genes, human and yeast genes have to share functional homology. If a mutation in a specific residue is associated with a particular phenotype in humans, a similar substitution in the yeast counterpart may reveal its effect at the organism level. CTY simultaneously finds yeast homologous genes, identifies the corresponding variants and determines the transferability of human variants to yeast counterparts by assigning a reliability score (RS) that may be predictive for the validity of a functional assay. CTY analyzes newly identified mutations or retrieves mutations reported in the COSMIC database, provides information about the functional conservation between yeast and human and shows the mutation distribution in human genes. CTY analyzes also newly found mutations and aborts when no yeast homologue is found. Then, on the basis of the protein domain

  17. A Bioinformatics Approach to the Identification of Variants Associated with Type 1 and Type 2 Diabetes Mellitus that Reside in Functionally Validated miRNAs Binding Sites.

    PubMed

    Ghaedi, Hamid; Bastami, Milad; Jahani, Mohammad Mehdi; Alipoor, Behnam; Tabasinezhad, Maryam; Ghaderi, Omar; Nariman-Saleh-Fam, Ziba; Mirfakhraie, Reza; Movafagh, Abolfazl; Omrani, Mir Davood; Masotti, Andrea

    2016-06-01

    The present work is aimed at finding variants associated with Type 1 and Type 2 diabetes mellitus (DM) that reside in functionally validated miRNAs binding sites and that can have a functional role in determining diabetes and related pathologies. Using bioinformatics analyses we obtained a database of validated polymorphic miRNA binding sites which has been intersected with genes related to DM or to variants associated and/or in linkage disequilibrium (LD) with it and is reported in genome-wide association studies (GWAS). The workflow we followed allowed us to find variants associated with DM that also reside in functional miRNA binding sites. These data have been demonstrated to have a functional role by impairing the functions of genes implicated in biological processes linked to DM. In conclusion, our work emphasized the importance of SNPs located in miRNA binding sites. The results discussed in this work may constitute the basis of further works aimed at finding functional candidates and variants affecting protein structure and function, transcription factor binding sites, and non-coding epigenetic variants, contributing to widen the knowledge about the pathogenesis of this important disease.

  18. Identification of missing variants by combining multiple analytic pipelines.

    PubMed

    Ren, Yingxue; Reddy, Joseph S; Pottier, Cyril; Sarangi, Vivekananda; Tian, Shulan; Sinnwell, Jason P; McDonnell, Shannon K; Biernacka, Joanna M; Carrasquillo, Minerva M; Ross, Owen A; Ertekin-Taner, Nilüfer; Rademakers, Rosa; Hudson, Matthew; Mainzer, Liudmila Sergeevna; Asmann, Yan W

    2018-04-16

    After decades of identifying risk factors using array-based genome-wide association studies (GWAS), genetic research of complex diseases has shifted to sequencing-based rare variants discovery. This requires large sample sizes for statistical power and has brought up questions about whether the current variant calling practices are adequate for large cohorts. It is well-known that there are discrepancies between variants called by different pipelines, and that using a single pipeline always misses true variants exclusively identifiable by other pipelines. Nonetheless, it is common practice today to call variants by one pipeline due to computational cost and assume that false negative calls are a small percent of total. We analyzed 10,000 exomes from the Alzheimer's Disease Sequencing Project (ADSP) using multiple analytic pipelines consisting of different read aligners and variant calling strategies. We compared variants identified by using two aligners in 50, 100, 200, 500, 1000, and 1952 samples; and compared variants identified by adding single-sample genotyping to the default multi-sample joint genotyping in 50, 100, 500, 2000, 5000 and 10,000 samples. We found that using a single pipeline missed increasing numbers of high-quality variants correlated with sample sizes. By combining two read aligners and two variant calling strategies, we rescued 30% of pass-QC variants at sample size of 2000, and 56% at 10,000 samples. The rescued variants had higher proportions of low frequency (minor allele frequency [MAF] 1-5%) and rare (MAF < 1%) variants, which are the very type of variants of interest. In 660 Alzheimer's disease cases with earlier onset ages of ≤65, 4 out of 13 (31%) previously-published rare pathogenic and protective mutations in APP, PSEN1, and PSEN2 genes were undetected by the default one-pipeline approach but recovered by the multi-pipeline approach. Identification of the complete variant set from sequencing data is the prerequisite of genetic

  19. Incorporating client-server database architecture and graphical user interface into outpatient medical records.

    PubMed Central

    Fiacco, P. A.; Rice, W. H.

    1991-01-01

    Computerized medical record systems require structured database architectures for information processing. However, the data must be transferable across heterogeneous platforms and software systems. Client-server architecture allows for distributed processing of information among networked computers and provides the flexibility needed to link diverse systems together effectively. We have incorporated this client-server model with a graphical user interface into an outpatient medical record system, known as SuperChart, for the Department of Family Medicine at SUNY Health Science Center at Syracuse. SuperChart was developed using SuperCard and Oracle. SuperCard uses modern object-oriented programming to support a hypermedia environment. Oracle is a powerful relational database management system that incorporates a client-server architecture. This provides both a distributed database and distributed processing, which improves performance. PMID:1807732

  20. Histone H3 Variants in Trichomonas vaginalis

    PubMed Central

    Zubáčová, Zuzana; Hostomská, Jitka

    2012-01-01

    The parabasalid protist Trichomonas vaginalis is a widespread parasite that affects humans, frequently causing vaginitis in infected women. Trichomonad mitosis is marked by the persistence of the nuclear membrane and the presence of an asymmetric extranuclear spindle with no obvious direct connection to the chromosomes. No centromeric markers have been described in T. vaginalis, which has prevented a detailed analysis of mitotic events in this organism. In other eukaryotes, nucleosomes of centromeric chromatin contain the histone H3 variant CenH3. The principal aim of this work was to identify a CenH3 homolog in T. vaginalis. We performed a screen of the T. vaginalis genome to retrieve sequences of canonical and variant H3 histones. Three variant histone H3 proteins were identified, and the subcellular localization of their epitope-tagged variants was determined. The localization of the variant TVAG_185390 could not be distinguished from that of the canonical H3 histone. The sequence of the variant TVAG_087830 closely resembled that of histone H3. The tagged protein colocalized with sites of active transcription, indicating that the variant TVAG_087830 represented H3.3 in T. vaginalis. The third H3 variant (TVAG_224460) was localized to 6 or 12 distinct spots at the periphery of the nucleus, corresponding to the number of chromosomes in G1 phase and G2 phase, respectively. We propose that this variant represents the centromeric marker CenH3 and thus can be employed as a tool to study mitosis in T. vaginalis. Furthermore, we suggest that the peripheral distribution of CenH3 within the nucleus results from the association of centromeres with the nuclear envelope throughout the cell cycle. PMID:22408228

  1. NAHR-mediated copy-number variants in a clinical population: mechanistic insights into both genomic disorders and Mendelizing traits.

    PubMed

    Dittwald, Piotr; Gambin, Tomasz; Szafranski, Przemyslaw; Li, Jian; Amato, Stephen; Divon, Michael Y; Rodríguez Rojas, Lisa Ximena; Elton, Lindsay E; Scott, Daryl A; Schaaf, Christian P; Torres-Martinez, Wilfredo; Stevens, Abby K; Rosenfeld, Jill A; Agadi, Satish; Francis, David; Kang, Sung-Hae L; Breman, Amy; Lalani, Seema R; Bacino, Carlos A; Bi, Weimin; Milosavljevic, Aleksandar; Beaudet, Arthur L; Patel, Ankita; Shaw, Chad A; Lupski, James R; Gambin, Anna; Cheung, Sau Wai; Stankiewicz, Pawel

    2013-09-01

    We delineated and analyzed directly oriented paralogous low-copy repeats (DP-LCRs) in the most recent version of the human haploid reference genome. The computationally defined DP-LCRs were cross-referenced with our chromosomal microarray analysis (CMA) database of 25,144 patients subjected to genome-wide assays. This computationally guided approach to the empirically derived large data set allowed us to investigate genomic rearrangement relative frequencies and identify new loci for recurrent nonallelic homologous recombination (NAHR)-mediated copy-number variants (CNVs). The most commonly observed recurrent CNVs were NPHP1 duplications (233), CHRNA7 duplications (175), and 22q11.21 deletions (DiGeorge/velocardiofacial syndrome, 166). In the ∼25% of CMA cases for which parental studies were available, we identified 190 de novo recurrent CNVs. In this group, the most frequently observed events were deletions of 22q11.21 (48), 16p11.2 (autism, 34), and 7q11.23 (Williams-Beuren syndrome, 11). Several features of DP-LCRs, including length, distance between NAHR substrate elements, DNA sequence identity (fraction matching), GC content, and concentration of the homologous recombination (HR) hot spot motif 5'-CCNCCNTNNCCNC-3', correlate with the frequencies of the recurrent CNVs events. Four novel adjacent DP-LCR-flanked and NAHR-prone regions, involving 2q12.2q13, were elucidated in association with novel genomic disorders. Our study quantitates genome architectural features responsible for NAHR-mediated genomic instability and further elucidates the role of NAHR in human disease.
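
    As a hedged illustration of how the HR hot spot motif 5'-CCNCCNTNNCCNC-3' named above could be located in a sequence, the sketch below converts the motif to a regular expression, reading N as any of A, C, G or T (its usual IUPAC meaning). The function names and the toy sequence are assumptions made for illustration and are not the authors' pipeline; only the forward strand is scanned.

```python
import re

# Motif quoted in the abstract; N stands for any nucleotide.
MOTIF = "CCNCCNTNNCCNC"


def motif_to_regex(motif):
    """Translate a motif with N wildcards into a regex string."""
    return "".join("[ACGT]" if base == "N" else base for base in motif)


def find_motif(seq, motif=MOTIF):
    """Return 0-based start positions of all (possibly overlapping) matches."""
    # A lookahead group lets the regex engine report overlapping hits.
    pattern = re.compile("(?=(" + motif_to_regex(motif) + "))")
    return [m.start() for m in pattern.finditer(seq.upper())]


if __name__ == "__main__":
    toy = "AAACCTCCGTAACCACGGG"  # hypothetical sequence, one hit at index 3
    print(find_motif(toy))
```

    Counting hits per kilobase over a repeat region would give a simple "concentration" figure of the kind the abstract mentions, although the exact definition used in the study is not stated here.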

  2. Ontology-Oriented Programming for Biomedical Informatics.

    PubMed

    Lamy, Jean-Baptiste

    2016-01-01

    Ontologies are now widely used in the biomedical domain. However, it is difficult to manipulate ontologies in a computer program and, consequently, it is not easy to integrate ontologies with databases or websites. Two main approaches have been proposed for accessing ontologies in a computer program: traditional API (Application Programming Interface) and ontology-oriented programming, either static or dynamic. In this paper, we review these approaches and discuss their appropriateness for biomedical ontologies. We also report our experience of integrating an ontology into a software application during the VIIIP research project. Finally, we present OwlReady, the solution we developed.

  3. The effect of grain orientation on nanoindentation behavior of model austenitic alloy Fe-20Cr-25Ni

    DOE PAGES

    Chen, Tianyi; Tan, Lizhen; Lu, Zizhe; ...

    2017-07-26

    Instrumented nanoindentation was used in this paper to investigate the hardness, elastic modulus, and creep behavior of an austenitic Fe-20Cr-25Ni model alloy at room temperature, with the indented grain orientation being the variant. The samples indented close to the {111} surfaces exhibited the highest hardness and modulus. However, nanoindentation creep tests showed the greatest tendency for creep in the {111} indented samples, compared with the samples indented close to the {001} and {101} surfaces. Scanning electron microscopy and cross-sectional transmission electron microscopy revealed slip bands and dislocations in all samples. The slip band patterns on the indented surfaces were influenced by the grain orientations. Deformation twinning was observed only under the {001} indented surfaces. Finally, microstructural analysis and molecular dynamics modeling correlated the anisotropic nanoindentation-creep behavior with the different dislocation substructures formed during indentation, which resulted from the dislocation reactions of certain active slip systems that are determined by the indented grain orientations.

  4. Spectrum of genetic variants of BRCA1 and BRCA2 in a German single center study.

    PubMed

    Meisel, Cornelia; Sadowski, Carolin Eva; Kohlstedt, Daniela; Keller, Katja; Stäritz, Franziska; Grübling, Nannette; Becker, Kerstin; Mackenroth, Luisa; Rump, Andreas; Schröck, Evelin; Arnold, Norbert; Wimberger, Pauline; Kast, Karin

    2017-05-01

    Determination of mutation status of BRCA1 and BRCA2 has become part of the clinical routine. However, the spectrum of genetic variants differs between populations. The aim of this study was to deliver a comprehensive description of all detected variants. In families fulfilling one of the German Consortium for Hereditary Breast and Ovarian Cancer (GC-HBOC) criteria for genetic testing, one affected individual was chosen for analysis. DNA of blood lymphocytes was amplified by PCR and prescreened by DHPLC. Aberrant fragments were sequenced. All coding exons and splice sites of BRCA1 and BRCA2 were analyzed. Screening for large rearrangements in both genes was performed by MLPA. Of 523 index patients, 121 (23.1%) were found to carry a pathogenic or likely pathogenic (class 4/5) mutation. A variant of unknown significance (VUS) was detected in 73/523 patients (13.9%). Two mutations, p.Gln1756Profs*74 and p.Cys61Gly, comprised 42.3% (n = 33/78) of all detected pathogenic mutations in BRCA1. Most of the other mutations were unique mutations. The most frequently detected mutation in BRCA2 was p.Val1283Lys (13.9%; n = 6/43). Altogether, 101 different neutral genetic variants were counted in BRCA1 (n = 35) and in BRCA2 (n = 66). The two most frequently detected mutations are founder mutations in Poland and Czech Republic. More similarities seem to be shared with our direct neighbor countries compared to other European countries. For comparison of the extended genotype, a shared database is needed.

  5. Concentrations of indoor pollutants (CIP) database user's manual (Version 4.0)

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Apte, M.G.; Brown, S.R.; Corradi, C.A.

    1990-10-01

    This is the latest release of the database and the user manual. The user manual is a tutorial and reference for utilizing the CIP Database system. An installation guide is included to cover various hardware configurations. Numerous examples and explanations of the dialogue between the user and the database program are provided. It is hoped that this resource will, along with on-line help and the menu-driven software, make for a quick and easy learning curve. For the purposes of this manual, it is assumed that the user is acquainted with the goals of the CIP Database, which are: (1) to collect existing measurements of concentrations of indoor air pollutants in a user-oriented database and (2) to provide a repository of references citing measured field results openly accessible to a wide audience of researchers, policy makers, and others interested in the issues of indoor air quality. The database software, as distinct from the data, is contained in two files, CIP.EXE and PFIL.COM. CIP.EXE is made up of a number of programs written in dBase III command code and compiled using Clipper into a single, executable file. PFIL.COM is a program written in Turbo Pascal that handles the output of summary text files and is called from CIP.EXE. Version 4.0 of the CIP Database is current through March 1990.

  6. A particle swarm optimization variant with an inner variable learning strategy.

    PubMed

    Wu, Guohua; Pedrycz, Witold; Ma, Manhao; Qiu, Dishan; Li, Haifeng; Liu, Jin

    2014-01-01

    Although Particle Swarm Optimization (PSO) has demonstrated competitive performance in solving global optimization problems, it exhibits some limitations when dealing with optimization problems with high dimensionality and complex landscape. In this paper, we integrate some problem-oriented knowledge into the design of a certain PSO variant. The resulting novel PSO algorithm with an inner variable learning strategy (PSO-IVL) is particularly efficient for optimizing functions with symmetric variables. Symmetric variables of the optimized function have to satisfy a certain quantitative relation. Based on this knowledge, the inner variable learning (IVL) strategy helps the particle to inspect the relation among its inner variables, determine the exemplar variable for all other variables, and then make each variable learn from the exemplar variable in terms of their quantitative relations. In addition, we design a new trap detection and jumping out strategy to help particles escape from local optima. The trap detection operation is employed at the level of individual particles whereas the trap jumping out strategy is adaptive in its nature. Experimental simulations completed for some representative optimization functions demonstrate the excellent performance of PSO-IVL. The effectiveness of PSO-IVL underscores the usefulness of augmenting evolutionary algorithms with problem-oriented domain knowledge.
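
    For orientation, the sketch below shows a plain global-best PSO, i.e. the baseline that variants such as PSO-IVL build on. It is not PSO-IVL: the inner variable learning step and the trap detection strategy are not specified in enough detail in the abstract to reproduce. The parameter values (inertia w, acceleration coefficients c1 and c2, swarm size, bounds) and the sphere test function are common textbook choices assumed here for illustration.

```python
import random


def sphere(x):
    """Simple convex test function with its minimum of 0 at the origin."""
    return sum(v * v for v in x)


def pso(f, dim, bounds=(-5.0, 5.0), n_particles=30, iters=200,
        w=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimise f with a basic global-best particle swarm."""
    rng = random.Random(seed)
    lo, hi = bounds
    pos = [[rng.uniform(lo, hi) for _ in range(dim)]
           for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]            # best position seen by each particle
    pbest_val = [f(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]   # best position of the swarm

    for _ in range(iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Inertia + pull toward personal best + pull toward global best.
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                # Move and clamp to the search bounds.
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
            val = f(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val


if __name__ == "__main__":
    best, best_val = pso(sphere, dim=2)
    print(best, best_val)
```

    A variant in the spirit of the abstract would add a step, after the position update, in which each coordinate is adjusted toward an exemplar coordinate according to the known relation among symmetric variables; how the exemplar is chosen is left open here.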

  7. Variant adrenal venous anatomy in 546 laparoscopic adrenalectomies.

    PubMed

    Scholten, Anouk; Cisco, Robin M; Vriens, Menno R; Shen, Wen T; Duh, Quan-Yang

    2013-04-01

    Knowing the types and frequency of adrenal vein variants would help surgeons identify and control the adrenal vein during laparoscopic adrenalectomy. To establish the surgical anatomy of the main vein and its variants for laparoscopic adrenalectomy and to analyze the relationship between variant adrenal venous anatomy and tumor size, pathologic diagnosis, and operative outcomes. In a retrospective review of patients at a tertiary referral hospital, 506 patients underwent 546 consecutive laparoscopic adrenalectomies between April 22, 1993, and October 21, 2011. Patients with variant adrenal venous anatomy were compared with patients with normal adrenal venous anatomy regarding preoperative variables (patient and tumor characteristics [size and location] and clinical diagnosis), intraoperative variables (details on the main adrenal venous drainage, any variant venous anatomy, duration of operation, rate of conversion to hand-assisted or open procedure, and estimated blood loss), and postoperative variables (transfusion requirement, reoperation for bleeding, duration of hospital stay, and histologic diagnosis). Laparoscopic adrenalectomy. Prevalence of variant adrenal venous anatomy and its relationship to tumor characteristics, pathologic diagnosis, and operative outcomes. Variant venous anatomy was encountered in 70 of 546 adrenalectomies (13%). Variants included no main adrenal vein identifiable (n = 18), 1 main adrenal vein with additional small veins (n = 11), 2 adrenal veins (n = 20), more than 2 adrenal veins (n = 14), and variants of the adrenal vein drainage to the inferior vena cava and hepatic vein or of the inferior phrenic vein (n = 7). Variants occurred more often on the right side than on the left side (42 of 250 glands [17%] vs. 28 of 296 glands [9%], respectively; P = .02). Patients with variant anatomy compared with those with normal anatomy had larger tumors (mean, 5.1 vs 3.3 cm, respectively; P < .001), more pheochromocytomas (24 of 70 [35%] vs

  8. How preserved is episodic memory in behavioral variant frontotemporal dementia?

    PubMed

    Hornberger, M; Piguet, O; Graham, A J; Nestor, P J; Hodges, J R

    2010-02-09

    Studies have shown variable memory performance in patients with behavioral variant frontotemporal dementia (bvFTD). Our study investigated whether this variability is due to the admixture of patients with true bvFTD and phenocopy patients. We also sought to compare performance of patients with bvFTD and patients with Alzheimer disease (AD). We analyzed neuropsychological memory performance in patients with a clinical diagnosis of bvFTD divided into those who progressed (n = 50) and those who remained stable (n = 39), patients with AD (n = 64), and healthy controls (n = 64). Patients with progressive bvFTD were impaired on most memory tests to a similar level to that of patients with early AD. Findings from a subset of patients with progressive bvFTD with confirmed FTLD pathology (n = 10) corroborated these findings. By contrast, patients with phenocopy bvFTD performed significantly better than progressors and patients with AD. Logistic regression revealed that patients with bvFTD can be distinguished to a high degree (85%) on the immediate recall score of a word list learning test (Rey Auditory Verbal Learning Test). Our results provide evidence for an underlying memory deficit in "real" or progressive behavioral variant frontotemporal dementia (bvFTD) similar to Alzheimer disease, though the groups differ in orientation scores, with patients with bvFTD being intact. Exclusion solely based on impaired neuropsychological memory performance can potentially lead to an underdiagnosis of FTD.

  9. Rare variants and autoimmune disease.

    PubMed

    Massey, Jonathan; Eyre, Steve

    2014-09-01

    The study of rare variants in monogenic forms of autoimmune disease has offered insight into the aetiology of more complex pathologies. Research in complex autoimmune disease initially focused on sequencing candidate genes, with some early successes, notably in uncovering low-frequency variation associated with Type 1 diabetes mellitus. However, other early examples have proved difficult to replicate, and a recent study across six autoimmune diseases, re-sequencing 25 autoimmune disease-associated genes in large sample sizes, failed to find any associated rare variants. The study of rare and low-frequency variation in autoimmune diseases has been made accessible by the inclusion of such variants on custom genotyping arrays (e.g. Immunochip and Exome arrays). Whole-exome sequencing approaches are now also being utilised to uncover the contribution of rare coding variants to disease susceptibility, severity and treatment response. Other sequencing strategies are starting to uncover the role of regulatory rare variation. © The Author 2014. Published by Oxford University Press. All rights reserved. For permissions, please email: journals.permissions@oup.com.

  10. Variant Review with the Integrative Genomics Viewer.

    PubMed

    Robinson, James T; Thorvaldsdóttir, Helga; Wenger, Aaron M; Zehir, Ahmet; Mesirov, Jill P

    2017-11-01

    Manual review of aligned reads for confirmation and interpretation of variant calls is an important step in many variant calling pipelines for next-generation sequencing (NGS) data. Visual inspection can greatly increase the confidence in calls, reduce the risk of false positives, and help characterize complex events. The Integrative Genomics Viewer (IGV) was one of the first tools to provide NGS data visualization, and it currently provides a rich set of tools for inspection, validation, and interpretation of NGS datasets, as well as other types of genomic data. Here, we present a short overview of IGV's variant review features for both single-nucleotide variants and structural variants, with examples from both cancer and germline datasets. IGV is freely available at https://www.igv.org. Cancer Res; 77(21); e31-34. ©2017 American Association for Cancer Research.

  11. Histological variants of cutaneous Kaposi sarcoma

    PubMed Central

    Grayson, Wayne; Pantanowitz, Liron

    2008-01-01

    This review provides a comprehensive overview of the broad clinicopathologic spectrum of cutaneous Kaposi sarcoma (KS) lesions. Variants discussed include: usual KS lesions associated with disease progression (i.e. patch, plaque and nodular stage); morphologic subtypes alluded to in the older literature such as anaplastic and telangiectatic KS, as well as several lymphedematous variants; and numerous recently described variants including hyperkeratotic, keloidal, micronodular, pyogenic granuloma-like, ecchymotic, and intravascular KS. Involuting lesions as a result of treatment related regression are also presented. PMID:18655700

  12. A trans-acting Variant within the Transcription Factor RIM101 Interacts with Genetic Background to Determine its Regulatory Capacity.

    PubMed

    Read, Timothy; Richmond, Phillip A; Dowell, Robin D

    2016-01-01

    Most genetic variants associated with disease occur within regulatory regions of the genome, underscoring the importance of defining the mechanisms underlying differences in regulation of gene expression between individuals. We discovered a pair of co-regulated, divergently oriented transcripts, AQY2 and ncFRE6, that are expressed in one strain of Saccharomyces cerevisiae, Σ1278b, but not in another, S288c. By combining classical genetics techniques with high-throughput sequencing, we identified a trans-acting single nucleotide polymorphism within the transcription factor RIM101 that causes the background-dependent expression of both transcripts. Subsequent RNA-seq experiments revealed that RIM101 regulates many more targets in S288c than in Σ1278b and that deletion of RIM101 in both backgrounds abrogates the majority of differential expression between the strains. Strikingly, only three transcripts undergo a significant change in expression after swapping RIM101 alleles between backgrounds, implying that the differences in the RIM101 allele lead to a remarkably focused transcriptional response. However, hundreds of RIM101-dependent targets undergo a subtle but consistent shift in expression in the S288c RIM101-swapped strain, but not its Σ1278b counterpart. We conclude that Σ1278b may harbor a variant(s) that buffers against widespread transcriptional dysregulation upon introduction of a non-native RIM101 allele, emphasizing the importance of accounting for genetic background when assessing the impact of a regulatory variant.

  13. SCN5A (NaV1.5) Variant Functional Perturbation and Clinical Presentation: Variants of a Certain Significance.

    PubMed

    Kroncke, Brett M; Glazer, Andrew M; Smith, Derek K; Blume, Jeffrey D; Roden, Dan M

    2018-05-01

    Accurately predicting the impact of rare nonsynonymous variants on disease risk is an important goal in precision medicine. Variants in the cardiac sodium channel SCN5A (protein NaV1.5; voltage-dependent cardiac Na+ channel) are associated with multiple arrhythmia disorders, including Brugada syndrome and long QT syndrome. Rare SCN5A variants also occur in ≈1% of unaffected individuals. We hypothesized that in vitro electrophysiological functional parameters explain a statistically significant portion of the variability in disease penetrance. From a comprehensive literature review, we quantified the number of carriers presenting with and without disease for 1712 reported SCN5A variants. For 356 variants, data were also available for 5 NaV1.5 electrophysiological parameters: peak current, late/persistent current, steady-state V1/2 of activation and inactivation, and recovery from inactivation. We found that peak and late current significantly associate with Brugada syndrome (P<0.001; ρ=-0.44; Spearman rank test) and long QT syndrome disease penetrance (P<0.001; ρ=0.37). Steady-state V1/2 activation and recovery from inactivation associate significantly with Brugada syndrome and long QT syndrome penetrance, respectively. Continuous estimates of disease penetrance align with the current American College of Medical Genetics classification paradigm. NaV1.5 in vitro electrophysiological parameters are correlated with Brugada syndrome and long QT syndrome disease risk. Our data emphasize the value of in vitro electrophysiological characterization and incorporating counts of affected and unaffected carriers to aid variant classification. This quantitative analysis of the electrophysiological literature should aid the interpretation of NaV1.5 variant electrophysiological abnormalities and help improve NaV1.5 variant classification. © 2018 American Heart Association, Inc.

  14. Pathogenic Germline Variants in 10,389 Adult Cancers.

    PubMed

    Huang, Kuan-Lin; Mashl, R Jay; Wu, Yige; Ritter, Deborah I; Wang, Jiayin; Oh, Clara; Paczkowska, Marta; Reynolds, Sheila; Wyczalkowski, Matthew A; Oak, Ninad; Scott, Adam D; Krassowski, Michal; Cherniack, Andrew D; Houlahan, Kathleen E; Jayasinghe, Reyka; Wang, Liang-Bo; Zhou, Daniel Cui; Liu, Di; Cao, Song; Kim, Young Won; Koire, Amanda; McMichael, Joshua F; Hucthagowder, Vishwanathan; Kim, Tae-Beom; Hahn, Abigail; Wang, Chen; McLellan, Michael D; Al-Mulla, Fahd; Johnson, Kimberly J; Lichtarge, Olivier; Boutros, Paul C; Raphael, Benjamin; Lazar, Alexander J; Zhang, Wei; Wendl, Michael C; Govindan, Ramaswamy; Jain, Sanjay; Wheeler, David; Kulkarni, Shashikant; Dipersio, John F; Reimand, Jüri; Meric-Bernstam, Funda; Chen, Ken; Shmulevich, Ilya; Plon, Sharon E; Chen, Feng; Ding, Li

    2018-04-05

    We conducted the largest investigation of predisposition variants in cancer to date, discovering 853 pathogenic or likely pathogenic variants in 8% of 10,389 cases from 33 cancer types. Twenty-one genes showed single or cross-cancer associations, including novel associations of SDHA in melanoma and PALB2 in stomach adenocarcinoma. The 659 predisposition variants and 18 additional large deletions in tumor suppressors, including ATM, BRCA1, and NF1, showed low gene expression and frequent (43%) loss of heterozygosity or biallelic two-hit events. We also discovered 33 such variants in oncogenes, including missense variants in MET, RET, and PTPN11 associated with high gene expression. We nominated 47 additional predisposition variants from prioritized VUSs supported by multiple lines of evidence involving case-control frequency, loss of heterozygosity, expression effect, and co-localization with mutations and modified residues. Our integrative approach links rare predisposition variants to functional consequences, informing future guidelines of variant classification and germline genetic testing in cancer. Copyright © 2018 The Authors. Published by Elsevier Inc. All rights reserved.

  15. ScaffoldScaffolder: solving contig orientation via bidirected to directed graph reduction.

    PubMed

    Bodily, Paul M; Fujimoto, M Stanley; Snell, Quinn; Ventura, Dan; Clement, Mark J

    2016-01-01

    The contig orientation problem, which we formally define as the MAX-DIR problem, has at times been addressed cursorily and at times using various heuristics. In setting forth a linear-time reduction from the MAX-CUT problem to the MAX-DIR problem, we prove the latter is NP-complete. We compare the relative performance of a novel greedy approach with several other heuristic solutions. Our results suggest that our greedy heuristic algorithm not only works well but also outperforms the other algorithms due to the nature of scaffold graphs. Our results also demonstrate a novel method for identifying inverted repeats and inversion variants, both of which contradict the basic single-orientation assumption. Such inversions have previously been noted as being difficult to detect and are directly involved in the genetic mechanisms of several diseases. Availability: http://bioresearch.byu.edu/scaffoldscaffolder. Contact: paulmbodily@gmail.com. Supplementary data are available at Bioinformatics online. © The Author 2015. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.

  16. G6PDdb, an integrated database of glucose-6-phosphate dehydrogenase (G6PD) mutations.

    PubMed

    Kwok, Colin J; Martin, Andrew C R; Au, Shannon W N; Lam, Veronica M S

    2002-03-01

    G6PDdb (http://www.rubic.rdg.ac.uk/g6pd/ or http://www.bioinf.org.uk/g6pd/) is a newly created web-accessible locus-specific mutation database for the human Glucose-6-phosphate dehydrogenase (G6PD) gene. The relational database integrates up-to-date mutational and structural data from various databanks (GenBank, Protein Data Bank, etc.) with biochemically characterized variants and their associated phenotypes obtained from published literature and the Favism website. An automated analysis of the mutations likely to have a significant impact on the structure of the protein has been performed using a recently developed procedure. The database may be queried online and the full results of the analysis of the structural impact of mutations are available. The web page provides a form for submitting additional mutation data and is linked to resources such as the Favism website, OMIM, HGMD, HGVBASE, and the PDB. This database provides insights into the molecular aspects and clinical significance of G6PD deficiency for researchers and clinicians and the web page functions as a knowledge base relevant to the understanding of G6PD deficiency and its management. Copyright 2002 Wiley-Liss, Inc.

  17. Visualizing the geography of genetic variants.

    PubMed

    Marcus, Joseph H; Novembre, John

    2017-02-15

    One of the key characteristics of any genetic variant is its geographic distribution. The geographic distribution can shed light on where an allele first arose, what populations it has spread to, and in turn on how migration, genetic drift, and natural selection have acted. The geographic distribution of a genetic variant can also be of great utility for medical/clinical geneticists and collectively many genetic variants can reveal population structure. Here we develop an interactive visualization tool for rapidly displaying the geographic distribution of genetic variants. Through a REST API and dynamic front-end, the Geography of Genetic Variants (GGV) browser (http://popgen.uchicago.edu/ggv/) provides maps of allele frequencies in populations distributed across the globe. GGV is implemented as a website (http://popgen.uchicago.edu/ggv/) which employs an API to access frequency data (http://popgen.uchicago.edu/freq_api/). Python and JavaScript source code for the website and the API are available at http://github.com/NovembreLab/ggv/ and http://github.com/NovembreLab/ggv-api/. Contact: jnovembre@uchicago.edu. Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press.

  18. Variants of glycoside hydrolases

    DOEpatents

    Teter, Sarah; Ward, Connie; Cherry, Joel; Jones, Aubrey; Harris, Paul; Yi, Jung

    2013-02-26

    The present invention relates to variants of a parent glycoside hydrolase, comprising a substitution at one or more positions corresponding to positions 21, 94, 157, 205, 206, 247, 337, 350, 373, 383, 438, 455, 467, and 486 of amino acids 1 to 513 of SEQ ID NO: 2, and optionally further comprising a substitution at one or more positions corresponding to positions 8, 22, 41, 49, 57, 113, 193, 196, 226, 227, 246, 251, 255, 259, 301, 356, 371, 411, and 462 of amino acids 1 to 513 of SEQ ID NO: 2, wherein the variants have glycoside hydrolase activity. The present invention also relates to nucleotide sequences encoding the variant glycoside hydrolases and to nucleic acid constructs, vectors, and host cells comprising the nucleotide sequences.

  19. Variants of glycoside hydrolases

    DOEpatents

    Teter, Sarah [Davis, CA]; Ward, Connie [Hamilton, MT]; Cherry, Joel [Davis, CA]; Jones, Aubrey [Davis, CA]; Harris, Paul [Carnation, WA]; Yi, Jung [Sacramento, CA]

    2011-04-26

    The present invention relates to variants of a parent glycoside hydrolase, comprising a substitution at one or more positions corresponding to positions 21, 94, 157, 205, 206, 247, 337, 350, 373, 383, 438, 455, 467, and 486 of amino acids 1 to 513 of SEQ ID NO: 2, and optionally further comprising a substitution at one or more positions corresponding to positions 8, 22, 41, 49, 57, 113, 193, 196, 226, 227, 246, 251, 255, 259, 301, 356, 371, 411, and 462 of amino acids 1 to 513 of SEQ ID NO: 2, wherein the variants have glycoside hydrolase activity. The present invention also relates to nucleotide sequences encoding the variant glycoside hydrolases and to nucleic acid constructs, vectors, and host cells comprising the nucleotide sequences.

  20. Variants of glycoside hydrolases

    DOEpatents

    Teter, Sarah; Ward, Connie; Cherry, Joel; Jones, Aubrey; Harris, Paul; Yi, Jung

    2017-07-11

    The present invention relates to variants of a parent glycoside hydrolase, comprising a substitution at one or more positions corresponding to positions 21, 94, 157, 205, 206, 247, 337, 350, 373, 383, 438, 455, 467, and 486 of amino acids 1 to 513 of SEQ ID NO: 2, and optionally further comprising a substitution at one or more positions corresponding to positions 8, 22, 41, 49, 57, 113, 193, 196, 226, 227, 246, 251, 255, 259, 301, 356, 371, 411, and 462 of amino acids 1 to 513 of SEQ ID NO: 2, wherein the variants have glycoside hydrolase activity. The present invention also relates to nucleotide sequences encoding the variant glycoside hydrolases and to nucleic acid constructs, vectors, and host cells comprising the nucleotide sequences.

  1. Twelve novel HGD gene variants identified in 99 alkaptonuria patients: focus on 'black bone disease' in Italy.

    PubMed

    Nemethova, Martina; Radvanszky, Jan; Kadasi, Ludevit; Ascher, David B; Pires, Douglas E V; Blundell, Tom L; Porfirio, Berardino; Mannoni, Alessandro; Santucci, Annalisa; Milucci, Lia; Sestini, Silvia; Biolcati, Gianfranco; Sorge, Fiammetta; Aurizi, Caterina; Aquaron, Robert; Alsbou, Mohammed; Lourenço, Charles Marques; Ramadevi, Kanakasabapathi; Ranganath, Lakshminarayan R; Gallagher, James A; van Kan, Christa; Hall, Anthony K; Olsson, Birgitta; Sireau, Nicolas; Ayoob, Hana; Timmis, Oliver G; Sang, Kim-Hanh Le Quan; Genovese, Federica; Imrich, Richard; Rovensky, Jozef; Srinivasaraghavan, Rangan; Bharadwaj, Shruthi K; Spiegel, Ronen; Zatkova, Andrea

    2016-01-01

    Alkaptonuria (AKU) is an autosomal recessive disorder caused by mutations in the homogentisate-1,2-dioxygenase (HGD) gene leading to the deficiency of HGD enzyme activity. The DevelopAKUre project is underway to test nitisinone as a specific treatment to counteract this derangement of the phenylalanine-tyrosine catabolic pathway. We analysed DNA of 40 AKU patients enrolled for SONIA1, the first study in DevelopAKUre, and of 59 other AKU patients sent to our laboratory for molecular diagnostics. We identified 12 novel DNA variants: one each was identified in patients from Brazil (c.557T>A), Slovakia (c.500C>T) and France (c.440T>C), three in patients from India (c.469+6T>C, c.650-85A>G, c.158G>A), and six in patients from Italy (c.742A>G, c.614G>A, c.1057A>C, c.752G>A, c.119A>C, c.926G>T). Thus, the total number of potential AKU-causing variants found in 380 patients reported in the HGD mutation database is now 129. Using mCSM and DUET, computational approaches based on the protein 3D structure, the novel missense variants are predicted to affect the activity of the enzyme by three mechanisms: decrease of stability of individual protomers, disruption of protomer-protomer interactions or modification of residues in the region of the active site. We also present an overview of AKU in Italy, where so far about 60 AKU cases are known and DNA analysis has been reported for 34 of them. In this rather small group, 26 different HGD variants affecting function were described, indicating rather high heterogeneity. Twelve of these variants seem to be specific for Italy.

  2. UbSRD: The Ubiquitin Structural Relational Database.

    PubMed

    Harrison, Joseph S; Jacobs, Tim M; Houlihan, Kevin; Van Doorslaer, Koenraad; Kuhlman, Brian

    2016-02-22

    The structurally defined ubiquitin-like homology fold (UBL) can engage in several unique protein-protein interactions and many of these complexes have been characterized with high-resolution techniques. Using Rosetta's structural classification tools, we have created the Ubiquitin Structural Relational Database (UbSRD), an SQL database of features for all 509 UBL-containing structures in the PDB, allowing users to browse these structures by protein-protein interaction and providing a platform for quantitative analysis of structural features. We used UbSRD to define the recognition features of ubiquitin (UBQ) and SUMO observed in the PDB and the orientation of the UBQ tail while interacting with certain types of proteins. While some of the interaction surfaces on UBQ and SUMO overlap, each molecule has distinct features that aid in molecular discrimination. Additionally, we find that the UBQ tail is malleable and can adopt a variety of conformations upon binding. UbSRD is accessible as an online resource at rosettadesign.med.unc.edu/ubsrd. Copyright © 2015 Elsevier Ltd. All rights reserved.

  3. Variability extraction and modeling for product variants.

    PubMed

    Linsbauer, Lukas; Lopez-Herrejon, Roberto Erick; Egyed, Alexander

    2017-01-01

    Fast-changing hardware and software technologies in addition to larger and more specialized customer bases demand software tailored to meet very diverse requirements. Software development approaches that aim at capturing this diversity on a single consolidated platform often require large upfront investments, e.g., time or budget. Alternatively, companies resort to developing one variant of a software product at a time by reusing as much as possible from already-existing product variants. However, identifying and extracting the parts to reuse is an error-prone and inefficient task compounded by the typically large number of product variants. Hence, more disciplined and systematic approaches are needed to cope with the complexity of developing and maintaining sets of product variants. Such approaches require detailed information about the product variants, the features they provide and their relations. In this paper, we present an approach to extract such variability information from product variants. It identifies traces from features and feature interactions to their implementation artifacts, and computes their dependencies. This work can be useful in many scenarios ranging from ad hoc development approaches such as clone-and-own to systematic reuse approaches such as software product lines. We applied our variability extraction approach to six case studies and provide a detailed evaluation. The results show that the extracted variability information is consistent with the variability in our six case study systems given by their variability models and available product variants.

  4. Improving coeliac disease risk prediction by testing non-HLA variants additional to HLA variants.

    PubMed

    Romanos, Jihane; Rosén, Anna; Kumar, Vinod; Trynka, Gosia; Franke, Lude; Szperl, Agata; Gutierrez-Achury, Javier; van Diemen, Cleo C; Kanninga, Roan; Jankipersadsing, Soesma A; Steck, Andrea; Eisenbarth, Georges; van Heel, David A; Cukrowska, Bozena; Bruno, Valentina; Mazzilli, Maria Cristina; Núñez, Concepcion; Bilbao, Jose Ramon; Mearin, M Luisa; Barisani, Donatella; Rewers, Marian; Norris, Jill M; Ivarsson, Anneli; Boezen, H Marieke; Liu, Edwin; Wijmenga, Cisca

    2014-03-01

    The majority of coeliac disease (CD) patients are not being properly diagnosed and therefore remain untreated, leading to a greater risk of developing CD-associated complications. The major genetic risk heterodimer, HLA-DQ2 and DQ8, is already used clinically to help exclude disease. However, approximately 40% of the population carry these alleles and the majority never develop CD. We explored whether CD risk prediction can be improved by adding non-HLA-susceptible variants to common HLA testing. We developed an average weighted genetic risk score with 10, 26 and 57 single nucleotide polymorphisms (SNP) in 2675 cases and 2815 controls and assessed the improvement in risk prediction provided by the non-HLA SNP. Moreover, we assessed the transferability of the genetic risk model with 26 non-HLA variants to a nested case-control population (n=1709) and a prospective cohort (n=1245) and then tested how well this model predicted CD outcome for 985 independent individuals. Adding 57 non-HLA variants to HLA testing showed a statistically significant improvement compared to scores from models based on HLA only, HLA plus 10 SNP and HLA plus 26 SNP. With 57 non-HLA variants, the area under the receiver operator characteristic curve reached 0.854 compared to 0.823 for HLA only, and 11.1% of individuals were reclassified to a more accurate risk group. We show that the risk model with HLA plus 26 SNP is useful in independent populations. Predicting risk with 57 additional non-HLA variants improved the identification of potential CD patients. This demonstrates a possible role for combined HLA and non-HLA genetic testing in diagnostic work for CD.
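
    The average weighted genetic risk score described above can be sketched in a few lines of Python. This is a minimal illustration and not the authors' model: the SNP names, odds ratios and genotypes are invented placeholders, and weighting each risk-allele count by the natural log of its odds ratio is a common convention assumed here, since the abstract does not state the weights used.

```python
import math


def weighted_grs(risk_allele_counts, odds_ratios):
    """Average weighted genetic risk score for one individual.

    risk_allele_counts: dict mapping SNP id -> number of risk alleles (0, 1 or 2).
    odds_ratios: dict mapping SNP id -> per-allele odds ratio (hypothetical values).
    Each count is weighted by ln(OR); the sum is divided by the number of SNPs.
    """
    if not risk_allele_counts:
        raise ValueError("at least one SNP is required")
    total = 0.0
    for snp, count in risk_allele_counts.items():
        if count not in (0, 1, 2):
            raise ValueError(f"invalid risk-allele count for {snp}: {count}")
        total += count * math.log(odds_ratios[snp])
    return total / len(risk_allele_counts)


# Hypothetical example: two placeholder SNPs with made-up odds ratios.
example_counts = {"snp_a": 2, "snp_b": 0}
example_ors = {"snp_a": math.e, "snp_b": 2.0}
example_score = weighted_grs(example_counts, example_ors)
```

    Scores of this kind would then be compared between cases and controls, for example through the area under the receiver operator characteristic curve reported in the abstract.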

  5. Visibility of medical informatics regarding bibliometric indices and databases

    PubMed Central

    2011-01-01

    Background The quantitative study of the publication output (bibliometrics) deeply influences how scientific work is perceived (bibliometric visibility). Recently, new bibliometric indices and databases have been established, which may change the visibility of disciplines, institutions and individuals. This study examines the effects of the new indices on the visibility of Medical Informatics. Methods By objective criteria, three sets of journals are chosen, two representing Medical Informatics and a third addressing Internal Medicine as a benchmark. The availability of index data (index coverage) and the aggregate scores of these corpora are compared for journal-related (Journal impact factor, Eigenfactor metrics, SCImago journal rank) and author-related indices (Hirsch-index, Egghe's g-index). Correlation analysis compares the dependence of author-related indices. Results The bibliometric visibility depended on the research focus and the citation database: Scopus covers more journals relevant for Medical Informatics than ISI/Thomson Reuters. Journals focused on Medical Informatics' methodology were negatively affected by the Eigenfactor metrics, while the visibility profited from an interdisciplinary research focus. The correlation between Hirsch-indices computed on citation databases and the Internet was strong. Conclusions The visibility of smaller technology-oriented disciplines like Medical Informatics is changed by the new bibliometric indices and databases possibly leading to suitably changed publication strategies. Freely accessible author-related indices enable an easy and adequate individual assessment. PMID:21496230
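
    The two author-related indices named above have short standard definitions, sketched here in Python. The citation counts are invented, the function names are illustrative, and the g-index is capped at the number of papers, which is one common convention assumed here; the abstract does not say which variant the study used.

```python
def h_index(citations):
    """Largest h such that h papers have at least h citations each."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, cites in enumerate(ranked, start=1):
        if cites >= rank:
            h = rank
        else:
            break
    return h


def g_index(citations):
    """Largest g such that the g most-cited papers together have at least g*g citations.

    Assumption: g cannot exceed the number of papers.
    """
    ranked = sorted(citations, reverse=True)
    g = 0
    running_total = 0
    for rank, cites in enumerate(ranked, start=1):
        running_total += cites
        if running_total >= rank * rank:
            g = rank
    return g


# Hypothetical citation record for one author.
example_citations = [10, 8, 5, 4, 3]
example_h = h_index(example_citations)
example_g = g_index(example_citations)
```

    Because the g-index uses cumulative citations, it is never smaller than the h-index for the same record, which is one reason the two can rank authors differently.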

  6. Chromosome Gene Orientation Inversion Networks (GOINs) of Plasmodium Proteome.

    PubMed

    Quevedo-Tumailli, Viviana F; Ortega-Tenezaca, Bernabé; González-Díaz, Humbert

    2018-03-02

    The spatial distribution of genes in chromosomes seems not to be random. For instance, only 10% of genes are transcribed from bidirectional promoters in humans, and many more are organized into larger clusters. This raises intriguing questions previously asked by different authors. We would like to add a few more questions in this context, related to gene orientation inversions. Does gene orientation (inversion) follow a random pattern? Is it relevant to biological activity somehow? We define a new kind of network coined as the gene orientation inversion network (GOIN). GOIN's complex network encodes short- and long-range patterns of inversion of the orientation of pairs of genes in the chromosome. We selected Plasmodium falciparum as a case study due to the high relevance of this parasite to public health (causal agent of malaria). We constructed here for the first time all of the GOINs for the genome of this parasite. These networks have an average of 383 nodes (genes in one chromosome) and 1314 links (pairs of genes with inverse orientation). We calculated node centralities and other parameters of these networks. These numerical parameters were used to study different properties of gene inversion patterns, for example, distribution, local communities, similarity to Erdős-Rényi random networks, randomness, and so on. We find clues that seem to indicate that gene orientation inversion does not follow a random pattern. We noted that some gene communities in the GOINs tend to group genes encoding for RIFIN-related proteins in the proteome of the parasite. RIFIN-like proteins are a second family of clonally variant proteins expressed on the surface of red cells infected with Plasmodium falciparum. Consequently, we used these centralities as input of machine learning (ML) models to predict the RIFIN-like activity of 5365 proteins in the proteome of Plasmodium sp. The best linear ML model found discriminates RIFIN-like from other proteins with sensitivity and

  7. Culture and social hierarchy: Self- and other-oriented correlates of socioeconomic status across cultures.

    PubMed

    Miyamoto, Yuri; Yoo, Jiah; Levine, Cynthia S; Park, Jiyoung; Boylan, Jennifer Morozink; Sims, Tamara; Markus, Hazel Rose; Kitayama, Shinobu; Kawakami, Norito; Karasawa, Mayumi; Coe, Christopher L; Love, Gayle D; Ryff, Carol D

    2018-05-17

    Current theorizing on socioeconomic status (SES) focuses on the availability of resources and the freedom they afford as a key determinant of the association between high SES and stronger orientation toward the self and, by implication, weaker orientation toward others. However, this work relies nearly exclusively on data from Western countries where self-orientation is strongly sanctioned. In the present work, we predicted and found that especially in East Asian countries, where other-orientation is strongly sanctioned, high SES is associated with stronger other-orientation as well as with self-orientation. We first examined both psychological attributes (Study 1, N = 2,832) and socialization values (Study 2a, N = 4,675) in Japan and the United States. In line with the existent evidence, SES was associated with greater self-oriented psychological attributes and socialization values in both the U.S. and Japan. Importantly, however, higher SES was associated with greater other orientation in Japan, whereas this association was weaker or even reversed in the United States. Study 2b (N = 85,296) indicated that the positive association between SES and self-orientation is found, overall, across 60 nations. Further, Study 2b showed that the positive association between SES and other-orientation in Japan can be generalized to other Confucian cultures, whereas the negative association between SES and other-orientation in the U.S. can be generalized to other Frontier cultures. Implications of the current findings for modernization and globalization are discussed. (PsycINFO Database Record (c) 2018 APA, all rights reserved).

  8. Development of a Dynamically Configurable, Object-Oriented Framework for Distributed, Multi-modal Computational Aerospace Systems Simulation

    NASA Technical Reports Server (NTRS)

    Afjeh, Abdollah A.; Reed, John A.

    2003-01-01

    The following reports are presented on this project: A first year progress report on: Development of a Dynamically Configurable, Object-Oriented Framework for Distributed, Multi-modal Computational Aerospace Systems Simulation; A second year progress report on: Development of a Dynamically Configurable, Object-Oriented Framework for Distributed, Multi-modal Computational Aerospace Systems Simulation; An Extensible, Interchangeable and Sharable Database Model for Improving Multidisciplinary Aircraft Design; Interactive, Secure Web-enabled Aircraft Engine Simulation Using XML Databinding Integration; and Improving the Aircraft Design Process Using Web-based Modeling and Simulation.

  9. The Ruby UCSC API: accessing the UCSC genome database using Ruby.

    PubMed

    Mishima, Hiroyuki; Aerts, Jan; Katayama, Toshiaki; Bonnal, Raoul J P; Yoshiura, Koh-ichiro

    2012-09-21

    The University of California, Santa Cruz (UCSC) genome database is among the most used sources of genomic annotation in human and other organisms. The database offers an excellent web-based graphical user interface (the UCSC genome browser) and several means for programmatic queries. A simple application programming interface (API) in a scripting language aimed at the biologist was however not yet available. Here, we present the Ruby UCSC API, a library to access the UCSC genome database using Ruby. The API is designed as a BioRuby plug-in and built on the ActiveRecord 3 framework for the object-relational mapping, making writing SQL statements unnecessary. The current version of the API supports databases of all organisms in the UCSC genome database including human, mammals, vertebrates, deuterostomes, insects, nematodes, and yeast. The API uses the bin index, if available, when querying for genomic intervals. The API also supports genomic sequence queries using locally downloaded *.2bit files that are not stored in the official MySQL database. The API is implemented in pure Ruby and is therefore available in different environments and with different Ruby interpreters (including JRuby). Assisted by the straightforward object-oriented design of Ruby and ActiveRecord, the Ruby UCSC API will facilitate biologists to query the UCSC genome database programmatically. The API is available through the RubyGem system. Source code and documentation are available at https://github.com/misshie/bioruby-ucsc-api/ under the Ruby license. Feedback and help is provided via the website at http://rubyucscapi.userecho.com/.
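
    The bin index mentioned above refers to the hierarchical binning scheme that UCSC tables use to speed up interval queries. The following Python sketch shows how a bin number is derived from a genomic interval under the standard scheme (five levels, smallest bins of 128 kb, coordinates below 512 Mb). It is an illustration only and is not taken from the Ruby UCSC API; the function name is hypothetical, and the extended scheme for longer sequences is left out.

```python
# Offsets of the first bin at each level, from the finest (128 kb) to the coarsest (512 Mb).
BIN_OFFSETS = (512 + 64 + 8 + 1, 64 + 8 + 1, 8 + 1, 1, 0)
BIN_FIRST_SHIFT = 17  # 2**17 = 128 kb, the size of the smallest bin
BIN_NEXT_SHIFT = 3    # each level up is 8 times larger


def bin_from_range(start, end):
    """Return the smallest standard bin that fully contains [start, end).

    start is 0-based and end is exclusive, as in UCSC tables.
    """
    if not 0 <= start < end:
        raise ValueError("expected 0 <= start < end")
    start_bin = start >> BIN_FIRST_SHIFT
    end_bin = (end - 1) >> BIN_FIRST_SHIFT
    for offset in BIN_OFFSETS:
        if start_bin == end_bin:
            return offset + start_bin
        start_bin >>= BIN_NEXT_SHIFT
        end_bin >>= BIN_NEXT_SHIFT
    raise ValueError("interval does not fit the standard 512 Mb binning scheme")


# An interval inside the first 128 kb falls in the first finest-level bin.
example_bin = bin_from_range(1000, 2000)
```

    A query for features overlapping an interval can then restrict the search to the handful of bins that could contain them, instead of scanning every row for the chromosome.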

  10. The Ruby UCSC API: accessing the UCSC genome database using Ruby

    PubMed Central

    2012-01-01

    Background The University of California, Santa Cruz (UCSC) genome database is among the most used sources of genomic annotation in human and other organisms. The database offers an excellent web-based graphical user interface (the UCSC genome browser) and several means for programmatic queries. A simple application programming interface (API) in a scripting language aimed at the biologist was however not yet available. Here, we present the Ruby UCSC API, a library to access the UCSC genome database using Ruby. Results The API is designed as a BioRuby plug-in and built on the ActiveRecord 3 framework for the object-relational mapping, making writing SQL statements unnecessary. The current version of the API supports databases of all organisms in the UCSC genome database including human, mammals, vertebrates, deuterostomes, insects, nematodes, and yeast. The API uses the bin index—if available—when querying for genomic intervals. The API also supports genomic sequence queries using locally downloaded *.2bit files that are not stored in the official MySQL database. The API is implemented in pure Ruby and is therefore available in different environments and with different Ruby interpreters (including JRuby). Conclusions Assisted by the straightforward object-oriented design of Ruby and ActiveRecord, the Ruby UCSC API will facilitate biologists to query the UCSC genome database programmatically. The API is available through the RubyGem system. Source code and documentation are available at https://github.com/misshie/bioruby-ucsc-api/ under the Ruby license. Feedback and help is provided via the website at http://rubyucscapi.userecho.com/. PMID:22994508

  11. JICST Factual Database JICST DNA Database

    NASA Astrophysics Data System (ADS)

    Shirokizawa, Yoshiko; Abe, Atsushi

    The Japan Information Center of Science and Technology (JICST) started an online DNA database service in October 1988. This database is composed of the EMBL Nucleotide Sequence Library and the Genetic Sequence Data Bank. The authors outline the database system, data items and search commands. Examples of retrieval sessions are presented.

  12. Identification of lung cancer histology-specific variants applying Bayesian framework variant prioritization approaches within the TRICL and ILCCO consortia

    PubMed Central

    Brenner, Darren R.; Amos, Christopher I.; Brhane, Yonathan; Timofeeva, Maria N.; Caporaso, Neil; Wang, Yufei; Christiani, David C.; Bickeböller, Heike; Yang, Ping; Albanes, Demetrius; Stevens, Victoria L.; Gapstur, Susan; McKay, James; Boffetta, Paolo; Zaridze, David; Szeszenia-Dabrowska, Neonilia; Lissowska, Jolanta; Rudnai, Peter; Fabianova, Eleonora; Mates, Dana; Bencko, Vladimir; Foretova, Lenka; Janout, Vladimir; Krokan, Hans E.; Skorpen, Frank; Gabrielsen, Maiken E.; Vatten, Lars; Njølstad, Inger; Chen, Chu; Goodman, Gary; Lathrop, Mark; Vooder, Tõnu; Välk, Kristjan; Nelis, Mari; Metspalu, Andres; Broderick, Peter; Eisen, Timothy; Wu, Xifeng; Zhang, Di; Chen, Wei; Spitz, Margaret R.; Wei, Yongyue; Su, Li; Xie, Dong; She, Jun; Matsuo, Keitaro; Matsuda, Fumihiko; Ito, Hidemi; Risch, Angela; Heinrich, Joachim; Rosenberger, Albert; Muley, Thomas; Dienemann, Hendrik; Field, John K.; Raji, Olaide; Chen, Ying; Gosney, John; Liloglou, Triantafillos; Davies, Michael P.A.; Marcus, Michael; McLaughlin, John; Orlow, Irene; Han, Younghun; Li, Yafang; Zong, Xuchen; Johansson, Mattias; Liu, Geoffrey; Tworoger, Shelley S.; Le Marchand, Loic; Henderson, Brian E.; Wilkens, Lynne R.; Dai, Juncheng; Shen, Hongbing; Houlston, Richard S.; Landi, Maria T.; Brennan, Paul; Hung, Rayjean J.

    2015-01-01

    Large-scale genome-wide association studies (GWAS) have likely uncovered all common variants at the GWAS significance level. Additional variants within the suggestive range (0.0001 > P > 5×10⁻⁸) are, however, still of interest for identifying causal associations. This analysis aimed to apply novel variant prioritization approaches to identify additional lung cancer variants that may not reach the GWAS level. Effects were combined across studies with a total of 33456 controls and 6756 adenocarcinoma (AC; 13 studies), 5061 squamous cell carcinoma (SCC; 12 studies) and 2216 small cell lung cancer cases (9 studies). Based on prior information such as variant physical properties and functional significance, we applied stratified false discovery rates, hierarchical modeling and Bayesian false discovery probabilities for variant prioritization. We conducted a fine mapping analysis as validation of our methods by examining top-ranking novel variants in six independent populations with a total of 3128 cases and 2966 controls. Three novel loci in the suggestive range were identified based on our Bayesian framework analyses: KCNIP4 at 4p15.2 (rs6448050, P = 4.6×10⁻⁷) and MTMR2 at 11q21 (rs10501831, P = 3.1×10⁻⁶) with SCC, as well as GAREM at 18q12.1 (rs11662168, P = 3.4×10⁻⁷) with AC. Use of our prioritization methods validated two of the top three loci associated with SCC (P = 1.05×10⁻⁴ for KCNIP4, represented by rs9799795) and AC (P = 2.16×10⁻⁴ for GAREM, represented by rs3786309) in the independent fine mapping populations. This study highlights the utility of using prior functional data for sequence variants in prioritization analyses to search for robust signals in the suggestive range. PMID:26363033
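
    One of the prioritization tools named above, the Bayesian false discovery probability, can be illustrated with a short Python sketch. It assumes the widely used formulation built on an approximate Bayes factor with a normal prior on the effect size; the abstract does not say which formulation or priors the authors used, so the function names, prior standard deviation and prior probability of the null below are all hypothetical.

```python
import math


def approximate_bayes_factor(beta_hat, std_err, prior_sd):
    """Approximate Bayes factor for H0 (no effect) against H1.

    Assumes beta_hat ~ N(beta, std_err**2) and a N(0, prior_sd**2) prior under H1.
    Values below 1 favour an association.
    """
    if std_err <= 0 or prior_sd <= 0:
        raise ValueError("std_err and prior_sd must be positive")
    v = std_err ** 2
    w = prior_sd ** 2
    z = beta_hat / std_err
    return math.sqrt((v + w) / v) * math.exp(-0.5 * z * z * w / (v + w))


def bfdp(beta_hat, std_err, prior_sd, prior_null):
    """Bayesian false discovery probability: posterior probability of H0."""
    if not 0.0 < prior_null < 1.0:
        raise ValueError("prior_null must lie strictly between 0 and 1")
    prior_odds_null = prior_null / (1.0 - prior_null)
    posterior_odds_null = approximate_bayes_factor(beta_hat, std_err, prior_sd) * prior_odds_null
    return posterior_odds_null / (1.0 + posterior_odds_null)


# Hypothetical log-odds-ratio estimate with a sceptical prior on association.
example_bfdp = bfdp(beta_hat=0.25, std_err=0.05, prior_sd=0.2, prior_null=0.99)
```

    Raising the prior probability of the null for variants with little functional support, and lowering it for well-annotated ones, is one way prior information can shift the ranking of variants in the suggestive range.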

  13. A variant selection model for predicting the transformation texture of deformed austenite

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Butron-Guillen, M.P.; Jonas, J.J.; Da Costa Viana, C.S.

    1997-09-01

    The occurrence of variant selection during the transformation of deformed austenite is examined, together with its effect on the product texture. A new prediction method is proposed based on the morphology of the austenite grains, on slip activity, and on the residual stresses remaining in the material after rolling. The aspect ratio of pancaked grains is demonstrated to play an important role in favoring selection of the transformed copper ({311}<011> and {211}<011>) components. The extent of shear on active slip planes during prior rolling is shown to promote the formation of the transformed brass ({332}<113> and {211}<113>) components. Finally, the residual stresses remaining in the material after rolling play an essential part by preventing growth of the {110}<110> and {100} orientations selected by the grain shape and slip activity rules. With the aid of these three variant selection criteria combined, it is possible to reproduce all the features of the transformation textures observed experimentally. The criteria also explain why the intensities of the transformed copper components are sensitive to the pancaking strain, while those of the transformed brass are a function of the cooling rate employed after hot rolling.

  14. Palaeo sea-level and ice-sheet databases: problems, strategies and perspectives

    NASA Astrophysics Data System (ADS)

    Rovere, Alessio; Düsterhus, André; Carlson, Anders; Barlow, Natasha; Bradwell, Tom; Dutton, Andrea; Gehrels, Roland; Hibbert, Fiona; Hijma, Marc; Horton, Benjamin; Klemann, Volker; Kopp, Robert; Sivan, Dorit; Tarasov, Lev; Törnqvist, Torbjorn

    2016-04-01

    Databases of palaeoclimate data have driven many major developments in understanding the Earth system. The measurement and interpretation of palaeo sea-level and ice-sheet data that form such databases pose considerable challenges to the scientific communities that use them for further analyses. In this paper, we build on the experience of the PALSEA (PALeo constraints on SEA level rise) community, which is a working group inside the PAGES (Past Global Changes) project, to describe the challenges and best strategies that can be adopted to build a self-consistent and standardised database of geological and geochemical data related to palaeo sea levels and ice sheets. Our aim in this paper is to identify key points that need attention and subsequent funding when undertaking the task of database creation. We conclude that any sea-level or ice-sheet database must be divided into three instances: i) measurement; ii) interpretation; iii) database creation. Measurement should include position, age, description of geological features, and quantification of uncertainties. All must be described as objectively as possible. Interpretation can be subjective, but it should always include uncertainties and include all the possible interpretations, without unjustified a priori exclusions. We propose that, in the creation of a database, an approach based on Accessibility, Transparency, Trust, Availability, Continued updating, Completeness and Communication of content (ATTAC3) must be adopted. Also, it is essential to consider the structure of the community that creates and benefits from a database. We conclude that funding sources should consider addressing not only the creation of original data in specific research-question-oriented projects, but also the possibility of using part of the funding for IT-related and database creation tasks, which are essential to guarantee accessibility and maintenance of the collected data.

  15. Guillain-Barré Syndrome and Variants

    PubMed Central

    Barohn, Richard J.

    2014-01-01

    Synopsis Guillain-Barré syndrome (GBS) is characterized by rapidly evolving ascending weakness, mild sensory loss and hypo- or areflexia, progressing to a nadir over up to four weeks. Cerebrospinal fluid evaluation demonstrates albuminocytologic dissociation in 90% of cases. Acute inflammatory demyelinating polyneuropathy (AIDP) was the first to be recognized over a century ago and is the most common form of GBS. In AIDP, the immune attack is directed at peripheral nerve myelin with secondary by-stander axon loss. Axonal motor and sensorimotor variants have been described in the last 3 decades and are mediated by molecular mimicry targeting peripheral nerve motor axons. Besides the Miller-Fisher syndrome (MFS) and descending weakness, other rare phenotypic variants have been recently described with pure sensory variant, restricted autonomic manifestations and the pharyngeal-cervical-brachial pattern. It is important to recognize GBS and its variants due to the availability of equally effective therapies in the form of plasmapheresis and intravenous immunoglobulins. PMID:23642721

  16. Processing of No-Release Variants in Connected Speech

    ERIC Educational Resources Information Center

    LoCasto, Paul C.; Connine, Cynthia M.

    2011-01-01

    The cross modal repetition priming paradigm was used to investigate how potential lexically ambiguous no-release variants are processed. In particular we focus on segmental regularities that affect the variant's frequency of occurrence (voicing of the critical segment) and phonological context in which the variant occurs (status of the following…

  17. Early-Onset Progressive Retinal Atrophy Associated with an IQCB1 Variant in African Black-Footed Cats (Felis nigripes)

    PubMed Central

    Oh, Annie; Pearce, Jacqueline W.; Gandolfi, Barbara; Creighton, Erica K.; Suedmeyer, William K.; Selig, Michael; Bosiack, Ann P.; Castaner, Leilani J.; Whiting, Rebecca E. H.; Belknap, Ellen B.; Lyons, Leslie A.; Aderdein, Danielle; Alves, Paulo C.; Barsh, Gregory S.; Beale, Holly C.; Boyko, Adam R.; Castelhano, Marta G.; Chan, Patricia; Ellinwood, N. Matthew; Garrick, Dorian J.; Helps, Christopher R.; Kaelin, Christopher B.; Leeb, Tosso; Lohi, Hannes; Longeri, Maria; Malik, Richard; Montague, Michael J.; Munday, John S.; Murphy, William J.; Pedersen, Niels C.; Rothschild, Max F.; Swanson, William F.; Terio, Karen A.; Todhunter, Rory J.; Warren, Wesley C.

    2017-01-01

    African black-footed cats (Felis nigripes) are endangered wild felids. One male and full-sibling female African black-footed cat developed vision deficits and mydriasis as early as 3 months of age. The diagnosis of early-onset progressive retinal atrophy (PRA) was supported by reduced direct and consensual pupillary light reflexes, phenotypic presence of retinal degeneration, and a non-recordable electroretinogram with negligible amplitudes in both eyes. Whole genome sequencing data from two unaffected parents and one affected offspring were compared to a variant database from 51 domestic cats and a Pallas cat, revealing 50 candidate variants that segregated concordantly with the PRA phenotype. Testing in additional affected cats confirmed that cats homozygous for a 2 base pair (bp) deletion within IQ calmodulin-binding motif-containing protein-1 (IQCB1), the gene that encodes nephrocystin-5 (NPHP5), had vision loss. The variant segregated concordantly in other related individuals within the pedigree, supporting the identification of a recessively inherited early-onset feline PRA. Analysis of the black-footed cat studbook suggests additional captive cats are at risk. Genetic testing for IQCB1 and avoidance of matings between carriers should be added to the species survival plan for captive management. PMID:28322220

  18. Mutation extraction tools can be combined for robust recognition of genetic variants in the literature

    PubMed Central

    Jimeno Yepes, Antonio; Verspoor, Karin

    2014-01-01

    As the cost of genomic sequencing continues to fall, the amount of data being collected and studied for the purpose of understanding the genetic basis of disease is increasing dramatically. Much of the source information relevant to such efforts is available only from unstructured sources such as the scientific literature, and significant resources are expended in manually curating and structuring the information in the literature. As such, there have been a number of systems developed to target automatic extraction of mutations and other genetic variation from the literature using text mining tools. We have performed a broad survey of the existing publicly available tools for extraction of genetic variants from the scientific literature. We consider not just one tool but a number of different tools, individually and in combination, and apply the tools in two scenarios. First, they are compared in an intrinsic evaluation context, where the tools are tested for their ability to identify specific mentions of genetic variants in a corpus of manually annotated papers, the Variome corpus. Second, they are compared in an extrinsic evaluation context based on our previous study of text mining support for curation of the COSMIC and InSiGHT databases. Our results demonstrate that no single tool covers the full range of genetic variants mentioned in the literature. Rather, several tools have complementary coverage and can be used together effectively. In the intrinsic evaluation on the Variome corpus, the combined performance is above 0.95 in F-measure, while in the extrinsic evaluation the combined recall performance is above 0.71 for COSMIC and above 0.62 for InSiGHT, a substantial improvement over the performance of any individual tool. Based on the analysis of these results, we suggest several directions for the improvement of text mining tools for genetic variant extraction from the literature. PMID:25285203

  19. Variants on 8q24 and prostate cancer risk in Chinese population: a meta-analysis.

    PubMed

    Ren, Xiao-Qiang; Zhang, Jian-Guo; Xin, Shi-Yong; Cheng, Tao; Li, Liang; Ren, Wei-Hua

    2015-01-01

    Previous studies have identified 8q24 as an important region for prostate cancer (PCa) susceptibility. The aim of this study was to investigate the role of six genetic variants on 8q24 (rs1447295, A; rs6983267, G; rs6983561, C; rs7837688, T; rs10090154, T and rs16901979, A) in PCa risk in the Chinese population. Online electronic databases were searched to retrieve articles, published between 2000 and 2014, on the association between 8q24 variants and PCa risk in Chinese men. Odds ratios (ORs) with 95% confidence intervals (CIs) were used to assess the strength of association. A total of eleven case-control studies were selected, including 2624 PCa patients and 2438 healthy controls. Our results showed that three risk alleles on 8q24, rs1447295 A (OR=1.35, 95% CI=1.19-1.53, P<0.00001), rs6983561 C (C vs. A: OR=1.41, 95% CI=1.21-1.63, P<0.00001) and rs10090154 T (T vs. C: OR=1.48, 95% CI=1.22-1.80, P<0.00001), were significantly associated with PCa risk in the Chinese population. Furthermore, genotypes of rs1447295, AA+AC; rs6983561, CC+AC and CC; rs10090154, TT+TC; and rs16901979, AA were associated with PCa as well (P<0.01). No association was found between rs6983267 or rs7837688 and PCa risk. In conclusion, variants including rs1447295, rs6983561, rs10090154 and rs16901979 on 8q24 might be associated with PCa risk in the Chinese population, indicating that these four variants may contribute to the risk of this disease. This meta-analysis was the first study to assess the role of 8q24 variants in PCa risk in the Chinese population.
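
    The reported odds ratios and confidence intervals can be sanity-checked against the stated P values. The sketch below assumes the interval is a symmetric Wald interval on the log-odds scale (a common convention, but not stated in the abstract), recovers the standard error of log(OR) from the interval width, and derives a two-sided normal P value. The function name is illustrative.

```python
import math


def z_and_p_from_or_ci(odds_ratio, ci_low, ci_high, z_crit=1.96):
    """Recover SE(log OR) from a 95% CI, then the z statistic and two-sided p."""
    log_or = math.log(odds_ratio)
    # A Wald interval spans log(OR) +/- z_crit * SE on the log scale.
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)
    z = log_or / se
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided tail of the standard normal
    return se, z, p


# rs1447295 A allele: OR=1.35, 95% CI=1.19-1.53, as reported in the abstract.
se, z, p = z_and_p_from_or_ci(1.35, 1.19, 1.53)
print(f"SE(log OR)={se:.3f}, z={z:.2f}, p={p:.1e}")
```

    Under that assumption the derived P value falls below 0.00001, consistent with the figure quoted for this allele.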

  20. NMNAT1 variants cause cone and cone-rod dystrophy.

    PubMed

    Nash, Benjamin M; Symes, Richard; Goel, Himanshu; Dinger, Marcel E; Bennetts, Bruce; Grigg, John R; Jamieson, Robyn V

    2018-03-01

    Cone and cone-rod dystrophies (CD and CRD, respectively) are degenerative retinal diseases that predominantly affect the cone photoreceptors. The underlying disease gene is not known in approximately 75% of autosomal recessive cases. Variants in NMNAT1 cause a severe, early-onset retinal dystrophy called Leber congenital amaurosis (LCA). We report two patients where clinical phenotyping indicated diagnoses of CD and CRD, respectively. NMNAT1 variants were identified, with Case 1 showing an extremely rare homozygous variant c.[271G > A] p.(Glu91Lys) and Case 2 compound heterozygous variants c.[53 A > G];[769G > A] p.(Asn18Ser);(Glu257Lys). The detailed variant analysis, in combination with the observation of an associated macular atrophy phenotype, indicated that these variants were disease-causing. This report demonstrates that the variants in NMNAT1 may cause CD or CRD associated with macular atrophy. Genetic investigations of the patients with CD or CRD should include NMNAT1 in the genes examined.

  1. Identifying Causal Variants at Loci with Multiple Signals of Association

    PubMed Central

    Hormozdiari, Farhad; Kostem, Emrah; Kang, Eun Yong; Pasaniuc, Bogdan; Eskin, Eleazar

    2014-01-01

    Although genome-wide association studies have successfully identified thousands of risk loci for complex traits, only a handful of the biologically causal variants, responsible for association at these loci, have been successfully identified. Current statistical methods for identifying causal variants at risk loci either use the strength of the association signal in an iterative conditioning framework or estimate probabilities for variants to be causal. A main drawback of existing methods is that they rely on the simplifying assumption of a single causal variant at each risk locus, which is typically invalid at many risk loci. In this work, we propose a new statistical framework that allows for the possibility of an arbitrary number of causal variants when estimating the posterior probability of a variant being causal. A direct benefit of our approach is that we predict a set of variants for each locus that under reasonable assumptions will contain all of the true causal variants with a high confidence level (e.g., 95%) even when the locus contains multiple causal variants. We use simulations to show that our approach provides 20–50% improvement in our ability to identify the causal variants compared to the existing methods at loci harboring multiple causal variants. We validate our approach using empirical data from an expression QTL study of CHI3L2 to identify new causal variants that affect gene expression at this locus. CAVIAR is publicly available online at http://genetics.cs.ucla.edu/caviar/. PMID:25104515

  2. Examining rare and low-frequency genetic variants previously associated with lone or familial forms of atrial fibrillation in an electronic medical record system: a cautionary note.

    PubMed

    Weeke, Peter; Denny, Joshua C; Basterache, Lisa; Shaffer, Christian; Bowton, Erica; Ingram, Christie; Darbar, Dawood; Roden, Dan M

    2015-02-01

    Studies in individuals or small kindreds have implicated rare variants in 25 different genes in lone and familial atrial fibrillation (AF) using linkage and segregation analysis, functional characterization, and rarity in public databases. Here, we used a cohort of 20 204 patients of European or African ancestry with electronic medical records and exome chip data to compare the frequency of AF among carriers and noncarriers of these rare variants. The exome chip included 19 of 115 rare variants, in 9 genes, previously associated with lone or familial AF. Using validated algorithms querying a combination of clinical notes, structured billing codes, ECG reports, and procedure codes, we identified 1056 AF cases (>18 years) and 19 148 non-AF controls (>50 years) with available genotype data on the Illumina HumanExome BeadChip v.1.0 in the Vanderbilt electronic medical record-linked DNA repository, BioVU. Known correlations between AF and common variants at 4q25 were replicated. None of the 19 variants previously associated with AF were over-represented among AF cases (P>0.1 for all), and the frequency of variant carriers among non-AF controls was >0.1% for 14 of 19. Repeat analyses using non-AF controls aged >60 (n=14 904), >70 (n=9670), and >80 (n=4729) years did not influence these findings. Rare variants previously implicated in lone or familial forms of AF present on the exome chip are detected at low frequencies in a general population but are not associated with AF. These findings emphasize the need for caution when ascribing variants as pathogenic or causative. © 2014 American Heart Association, Inc.

  3. Nonlinear unitary transformations of space-variant polarized light fields from self-induced geometric-phase optical elements

    NASA Astrophysics Data System (ADS)

    Kravets, Nina; Brasselet, Etienne

    2018-01-01

    We propose to couple the optical orientational nonlinearities of liquid crystals with their ability to self-organize to tailor them to control space-variant-polarized optical fields in a nonlinear manner. Experimental demonstration is made using a liquid crystal light valve that behaves like a light-driven geometric phase optical element. We also unveil two original nonlinear optical processes, namely self-induced separability and nonseparability. These results contribute to the advancement of nonlinear singular optics that is still in its infancy despite 25 years of effort, which may foster the development of nonlinear protocols to manipulate high-dimensional optical information both in the classical and quantum regimes.

  4. Twelve novel HGD gene variants identified in 99 alkaptonuria patients: focus on ‘black bone disease' in Italy

    PubMed Central

    Nemethova, Martina; Radvanszky, Jan; Kadasi, Ludevit; Ascher, David B; Pires, Douglas E V; Blundell, Tom L; Porfirio, Berardino; Mannoni, Alessandro; Santucci, Annalisa; Milucci, Lia; Sestini, Silvia; Biolcati, Gianfranco; Sorge, Fiammetta; Aurizi, Caterina; Aquaron, Robert; Alsbou, Mohammed; Marques Lourenço, Charles; Ramadevi, Kanakasabapathi; Ranganath, Lakshminarayan R; Gallagher, James A; van Kan, Christa; Hall, Anthony K; Olsson, Birgitta; Sireau, Nicolas; Ayoob, Hana; Timmis, Oliver G; Le Quan Sang, Kim-Hanh; Genovese, Federica; Imrich, Richard; Rovensky, Jozef; Srinivasaraghavan, Rangan; Bharadwaj, Shruthi K; Spiegel, Ronen; Zatkova, Andrea

    2016-01-01

    Alkaptonuria (AKU) is an autosomal recessive disorder caused by mutations in the homogentisate-1,2-dioxygenase (HGD) gene leading to the deficiency of HGD enzyme activity. The DevelopAKUre project is underway to test nitisinone as a specific treatment to counteract this derangement of the phenylalanine-tyrosine catabolic pathway. We analysed DNA of 40 AKU patients enrolled for SONIA1, the first study in DevelopAKUre, and of 59 other AKU patients sent to our laboratory for molecular diagnostics. We identified 12 novel DNA variants: one each in patients from Brazil (c.557T>A), Slovakia (c.500C>T) and France (c.440T>C), three in patients from India (c.469+6T>C, c.650–85A>G, c.158G>A), and six in patients from Italy (c.742A>G, c.614G>A, c.1057A>C, c.752G>A, c.119A>C, c.926G>T). Thus, the total number of potential AKU-causing variants found in 380 patients reported in the HGD mutation database is now 129. Using mCSM and DUET, computational approaches based on the protein 3D structure, the novel missense variants are predicted to affect the activity of the enzyme by three mechanisms: decrease of stability of individual protomers, disruption of protomer-protomer interactions or modification of residues in the region of the active site. We also present an overview of AKU in Italy, where so far about 60 AKU cases are known and DNA analysis has been reported for 34 of them. In this rather small group, 26 different HGD variants affecting function were described, indicating rather high heterogeneity. Twelve of these variants seem to be specific for Italy. PMID:25804398

  5. Visualization and manipulating the image of a formal data structure (FDS)-based database

    NASA Astrophysics Data System (ADS)

    Verdiesen, Franc; de Hoop, Sylvia; Molenaar, Martien

    1994-08-01

    A vector map is a terrain representation with a vector-structured geometry. Molenaar formulated an object-oriented formal data structure (FDS) for 3D single-valued vector maps. This FDS is implemented in a database (Oracle). In this study we describe a methodology for visualizing an FDS-based database and manipulating the image. A data set retrieved by querying the database is converted into an import file for a drawing application. An objective of this study is that an end-user can alter and add terrain objects in the image. The drawing application creates an export file, which is compared with the import file. Differences between these files result in an update of the database, which involves consistency checks. In this study Autocad is used for visualizing and manipulating the image of the data set. A computer program has been written for the data exchange and conversion between Oracle and Autocad. The data structure of the FDS is compared to the data structure of Autocad, and the FDS data are converted into an equivalent Autocad structure.

  6. Amino acid changes in disease-associated variants differ radically from variants observed in the 1000 genomes project dataset.

    PubMed

    de Beer, Tjaart A P; Laskowski, Roman A; Parks, Sarah L; Sipos, Botond; Goldman, Nick; Thornton, Janet M

    2013-01-01

    The 1000 Genomes Project data provides a natural background dataset for amino acid germline mutations in humans. Since the direction of mutation is known, the amino acid exchange matrix generated from the observed nucleotide variants is asymmetric and the mutabilities of the different amino acids are very different. These differences predominantly reflect preferences for nucleotide mutations in the DNA (especially the high mutation rate of the CpG dinucleotide, which makes arginine mutability very much higher than other amino acids) rather than selection imposed by protein structure constraints, although there is evidence for the latter as well. The variants occur predominantly on the surface of proteins (82%), with a slight preference for sites which are more exposed and less well conserved than random. Mutations to functional residues occur about half as often as expected by chance. The disease-associated amino acid variant distributions in OMIM are radically different from those expected on the basis of the 1000 Genomes dataset. The disease-associated variants preferentially occur in more conserved sites, compared to 1000 Genomes mutations. Many of the amino acid exchange profiles appear to exhibit an anti-correlation, with common exchanges in one dataset being rare in the other. Disease-associated variants exhibit more extreme differences in amino acid size and hydrophobicity. More modelling of the mutational processes at the nucleotide level is needed, but these observations should contribute to an improved prediction of the effects of specific variants in humans.

  7. Facet orientation in the thoracolumbar spine: three-dimensional anatomic and biomechanical analysis.

    PubMed

    Masharawi, Youssef; Rothschild, Bruce; Dar, Gali; Peleg, Smadar; Robinson, Dror; Been, Ella; Hershkovitz, Israel

    2004-08-15

    Thoracolumbar facet orientations were measured and analyzed. To establish a comprehensive database for facet orientation in the thoracolumbar vertebrae and to determine the normal human condition. Most studies on facet orientation have based their conclusions on two-dimensional measurements, in small samples or isolated vertebrae. The amount of normal asymmetry in facet orientation is poorly addressed. Transverse and longitudinal facet angles were measured directly from 240 human vertebral columns (males/females, blacks/whites). The specimens' osteologic material is part of the Hamann-Todd Osteological Collection housed at the Cleveland Museum of Natural History (Cleveland, OH). A total of 4,080 vertebrae (T1-L5) from the vertebral columns of individuals 20 to 80 years of age were measured, using a Microscribe three-dimensional apparatus (Immersion Co., San Jose, CA). Data were recorded directly on computer software. Statistical analysis included paired t tests and analysis of variance. RESULTS.: Facet orientation is independent of gender, age, and ethnic group. Asymmetry in facet orientation is found in the thorax. All thoracolumbar facets are positioned in an oblique plane. In the transverse plane, all facets from T1 to T11 are positioned with an anterior inclination of approximately 25 degrees to 30 degrees from the frontal plane. The facets of T12-L2 are oriented closer to the midsagittal plane of the vertebral body (mean range, 25.89 degrees-33.87 degrees), while the facets of L3-L5 are oriented away from that plane (mean range, 40.40 degrees-56.30 degrees). Facet transverse orientation at the thoracolumbar junction is highly variable (approximately 80% with approximately 101 degrees and approximately 20% with 35 degrees). All facets are oriented more vertically from T1 (approximately 150 degrees) to L5 (approximately 170 degrees). The facet sagittal orientations of the lumbar zygoapophyseal joints are not equivalent. CONCLUSIONS.: Asymmetry in facet

  8. The variant call format and VCFtools.

    PubMed

    Danecek, Petr; Auton, Adam; Abecasis, Goncalo; Albers, Cornelis A; Banks, Eric; DePristo, Mark A; Handsaker, Robert E; Lunter, Gerton; Marth, Gabor T; Sherry, Stephen T; McVean, Gilean; Durbin, Richard

    2011-08-01

    The variant call format (VCF) is a generic format for storing DNA polymorphism data such as SNPs, insertions, deletions and structural variants, together with rich annotations. VCF is usually stored in a compressed manner and can be indexed for fast data retrieval of variants from a range of positions on the reference genome. The format was developed for the 1000 Genomes Project, and has also been adopted by other projects such as UK10K, dbSNP and the NHLBI Exome Project. VCFtools is a software suite that implements various utilities for processing VCF files, including validation, merging, comparing and also provides a general Perl API. http://vcftools.sourceforge.net
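
    To make the format concrete, here is a minimal sketch that parses a single VCF data line using only the Python standard library. It relies on the eight fixed, tab-separated columns of VCF (CHROM, POS, ID, REF, ALT, QUAL, FILTER, INFO); the function name and the returned dictionary layout are illustrative choices, and the sketch ignores header lines (those starting with "#") and per-sample genotype columns. It is not a substitute for VCFtools.

```python
def parse_vcf_line(line):
    """Split one tab-delimited VCF data line into its eight fixed columns."""
    fields = line.rstrip("\n").split("\t")
    chrom, pos, var_id, ref, alt, qual, filt, info = fields[:8]

    # INFO is a semicolon-separated list of KEY=VALUE pairs or bare flags.
    info_dict = {}
    for item in info.split(";"):
        if item in ("", "."):
            continue
        key, sep, value = item.partition("=")
        info_dict[key] = value if sep else True

    return {
        "chrom": chrom,
        "pos": int(pos),            # 1-based position on the reference
        "id": var_id,
        "ref": ref,
        "alt": alt.split(","),      # several ALT alleles are comma-separated
        "qual": None if qual == "." else float(qual),
        "filter": filt,
        "info": info_dict,
    }


# Illustrative data line in the style of the VCF specification examples.
example = "20\t14370\trs6054257\tG\tA\t29\tPASS\tNS=3;DP=14;AF=0.5;DB;H2"
record = parse_vcf_line(example)
print(record["chrom"], record["pos"], record["ref"], record["alt"], record["info"]["AF"])
```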

  9. An Introduction to Database Structure and Database Machines.

    ERIC Educational Resources Information Center

    Detweiler, Karen

    1984-01-01

    Enumerates principal management objectives of database management systems (data independence, quality, security, multiuser access, central control) and criteria for comparison (response time, size, flexibility, other features). Conventional database management systems, relational databases, and database machines used for backend processing are…

  10. The MPI Emotional Body Expressions Database for Narrative Scenarios

    PubMed Central

    Volkova, Ekaterina; de la Rosa, Stephan; Bülthoff, Heinrich H.; Mohler, Betty

    2014-01-01

    Emotion expression in human-human interaction takes place via various types of information, including body motion. Research on the perceptual-cognitive mechanisms underlying the processing of natural emotional body language can benefit greatly from datasets of natural emotional body expressions that facilitate stimulus manipulation and analysis. The existing databases have so far focused on few emotion categories which display predominantly prototypical, exaggerated emotion expressions. Moreover, many of these databases consist of video recordings which limit the ability to manipulate and analyse the physical properties of these stimuli. We present a new database consisting of a large set (over 1400) of natural emotional body expressions typical of monologues. To achieve close-to-natural emotional body expressions, amateur actors were narrating coherent stories while their body movements were recorded with motion capture technology. The resulting 3-dimensional motion data recorded at a high frame rate (120 frames per second) provides fine-grained information about body movements and allows the manipulation of movement on a body joint basis. For each expression it gives the positions and orientations in space of 23 body joints for every frame. We report the results of physical motion properties analysis and of an emotion categorisation study. The reactions of observers from the emotion categorisation study are included in the database. Moreover, we recorded the intended emotion expression for each motion sequence from the actor to allow for investigations regarding the link between intended and perceived emotions. The motion sequences along with the accompanying information are made available in a searchable MPI Emotional Body Expression Database. We hope that this database will enable researchers to study expression and perception of naturally occurring emotional body expressions in greater depth. PMID:25461382

  11. PTGER4 modulating variants in Crohn's disease.

    PubMed

    Prager, Matthias; Büttner, Janine; Büning, Carsten

    2014-08-01

    Variants modulating expression of the prostaglandin receptor 4 (PTGER4) have been reported to be associated with Crohn's disease (CD), but the clinical impact remains to be elucidated. We analyzed these variants in a large German inflammatory bowel disease (IBD) cohort and searched for a potential phenotype association. The variants rs4495224 and rs7720838 were studied in adult German IBD patients (CD, n = 475; ulcerative colitis (UC), n = 293) and healthy controls (HC, n = 467). Data were correlated to results from NOD2 genotyping and to clinical characteristics. We found a significant association for the rs7720838 variant with overrepresentation of the T allele to CD (p = 0.0058; OR 0.7703, 95 % CI 0.641-0.926) but not to UC. Furthermore, logistic regression analysis revealed that the presence of the T allele was associated with stricturing disease behavior in CD patients (p = 0.03; OR 1.84, 95 % CI 1.07-3.16). Interestingly, the chance for developing stricturing disease behavior was enhanced if mutant alleles in both rs7720838 and NOD2 were present (OR 2.87, 95 % CI 1.42-5.81; p = 0.003). No overall association to CD or UC was found for the rs4495224 variant. The PTGER4 modulating variant rs7720838 increases susceptibility for CD and might represent a risk factor for stricturing disease behavior.

  12. Identifying causal variants at loci with multiple signals of association.

    PubMed

    Hormozdiari, Farhad; Kostem, Emrah; Kang, Eun Yong; Pasaniuc, Bogdan; Eskin, Eleazar

    2014-10-01

    Although genome-wide association studies have successfully identified thousands of risk loci for complex traits, only a handful of the biologically causal variants, responsible for association at these loci, have been successfully identified. Current statistical methods for identifying causal variants at risk loci either use the strength of the association signal in an iterative conditioning framework or estimate probabilities for variants to be causal. A main drawback of existing methods is that they rely on the simplifying assumption of a single causal variant at each risk locus, which is typically invalid at many risk loci. In this work, we propose a new statistical framework that allows for the possibility of an arbitrary number of causal variants when estimating the posterior probability of a variant being causal. A direct benefit of our approach is that we predict a set of variants for each locus that under reasonable assumptions will contain all of the true causal variants with a high confidence level (e.g., 95%) even when the locus contains multiple causal variants. We use simulations to show that our approach provides 20-50% improvement in our ability to identify the causal variants compared to the existing methods at loci harboring multiple causal variants. We validate our approach using empirical data from an expression QTL study of CHI3L2 to identify new causal variants that affect gene expression at this locus. CAVIAR is publicly available online at http://genetics.cs.ucla.edu/caviar/. Copyright © 2014 by the Genetics Society of America.

  13. The effect of wild card designations and rare alleles in forensic DNA database searches.

    PubMed

    Tvedebrink, Torben; Bright, Jo-Anne; Buckleton, John S; Curran, James M; Morling, Niels

    2015-05-01

    Forensic DNA databases are powerful tools used for the identification of persons of interest in criminal investigations. Typically, they consist of two parts: (1) a database containing DNA profiles of known individuals and (2) a database of DNA profiles associated with crime scenes. The risk of adventitious or chance matches between crimes and innocent people increases as the number of profiles within a database grows and more data is shared between various forensic DNA databases, e.g. from different jurisdictions. The DNA profiles obtained from crime scenes are often partial because crime samples may be compromised in quantity or quality. When an individual's profile cannot be resolved from a DNA mixture, ambiguity is introduced. A wild card, F, may be used in place of an allele that has dropped out or when an ambiguous profile is resolved from a DNA mixture. Variant alleles that do not correspond to any marker in the allelic ladder or appear above or below the extent of the allelic ladder range are assigned the allele designation R for rare allele. R alleles are position specific with respect to the observed/unambiguous allele. The F and R designations are made when the exact genotype has not been determined. The F and R designation are treated as wild cards for searching, which results in increased chance of adventitious matches. We investigated the probability of adventitious matches given these two types of wild cards. Copyright © 2014 Elsevier Ireland Ltd. All rights reserved.
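
    The effect of wild cards on searching can be sketched in a few lines. The code below is a simplified illustration, not the authors' method: it assumes each locus holds an unordered pair of allele labels, treats both F and R as matching any allele (ignoring the position-specific nature of R described above), and compares only loci present in both profiles. Locus names and profiles are hypothetical.

```python
WILDCARDS = {"F", "R"}


def allele_match(a, b):
    """Two allele labels match if they are equal or either is a wild card."""
    return a in WILDCARDS or b in WILDCARDS or a == b


def locus_match(genotype_1, genotype_2):
    """Compare two unordered allele pairs, trying both pairings."""
    (a1, a2), (b1, b2) = genotype_1, genotype_2
    straight = allele_match(a1, b1) and allele_match(a2, b2)
    crossed = allele_match(a1, b2) and allele_match(a2, b1)
    return straight or crossed


def profiles_match(profile_1, profile_2):
    """Profiles match if every shared locus matches (and at least one is shared)."""
    shared = profile_1.keys() & profile_2.keys()
    return bool(shared) and all(
        locus_match(profile_1[locus], profile_2[locus]) for locus in shared
    )


# Hypothetical profiles: the crime-scene profile has one dropped-out allele (F).
crime = {"D3S1358": ("15", "F"), "vWA": ("17", "18")}
person_a = {"D3S1358": ("15", "16"), "vWA": ("17", "18")}
person_b = {"D3S1358": ("14", "16"), "vWA": ("18", "17")}

print(profiles_match(crime, person_a))
print(profiles_match(crime, person_b))
```

    Because a wild card accepts any allele at its position, every F or R in a search profile widens the set of database profiles that can match, which is the source of the increased adventitious-match risk the abstract investigates.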

  14. Association of genetic variants of GRIN2B with autism.

    PubMed

    Pan, Yongcheng; Chen, Jingjing; Guo, Hui; Ou, Jianjun; Peng, Yu; Liu, Qiong; Shen, Yidong; Shi, Lijuan; Liu, Yalan; Xiong, Zhimin; Zhu, Tengfei; Luo, Sanchuan; Hu, Zhengmao; Zhao, Jingping; Xia, Kun

    2015-02-06

    Autism (MIM 209850) is a complex neurodevelopmental disorder characterized by social communication impairments and restricted repetitive behaviors. It has a high heritability, although much remains unclear. To evaluate genetic variants of GRIN2B in autism etiology, we performed a system association study of common and rare variants of GRIN2B and autism in cohorts from a Chinese population, involving a total sample of 1,945 subjects. Meta-analysis of a triad family cohort and a case-control cohort identified significant associations of multiple common variants and autism risk (Pmin = 1.73 × 10^-4). Significantly, the haplotype involved with the top common variants also showed significant association (P = 1.78 × 10^-6). Sanger sequencing of 275 probands from a triad cohort identified several variants in coding regions, including four common variants and seven rare variants. Two of the common coding variants were located in the autism-related linkage disequilibrium (LD) block, and both were significantly associated with autism (P < 9 × 10^-3) using an independent control cohort. Burden analysis and case-only analysis of rare coding variants identified by Sanger sequencing did not find this association. Our study for the first time reveals that common variants and related haplotypes of GRIN2B are associated with autism risk.

  15. Mesh Oriented datABase

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Tautges, Timothy J.

    MOAB is a component for representing and evaluating mesh data. MOAB can store structured and unstructured mesh, consisting of elements in the finite element "zoo". The functional interface to MOAB is simple yet powerful, allowing the representation of many types of metadata commonly found on the mesh. MOAB is optimized for efficiency in space and time, based on access to mesh in chunks rather than through individual entities, while also versatile enough to support individual entity access. The MOAB data model consists of a mesh interface instance, mesh entities (vertices and elements), sets, and tags. Entities are addressed through handles rather than pointers, to allow the underlying representation of an entity to change without changing the handle to that entity. Sets are arbitrary groupings of mesh entities and other sets. Sets also support parent/child relationships as a relation distinct from sets containing other sets. The directed graph provided by set parent/child relationships is useful for modeling topological relations from a geometric model or other metadata. Tags are named data which can be assigned to the mesh as a whole, individual entities, or sets. Tags are a mechanism for attaching data to individual entities and sets are a mechanism for describing relations between entities; the combination of these two mechanisms is a powerful yet simple interface for representing metadata or application-specific data. For example, sets and tags can be used together to describe geometric topology, boundary condition, and inter-processor interface groupings in a mesh. MOAB is used in several ways in various applications. MOAB serves as the underlying mesh data representation in the VERDE mesh verification code. MOAB can also be used as a mesh input mechanism, using mesh readers included with MOAB, or as a translator between mesh formats, using readers and writers included with MOAB.
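
    The data model described above (entities addressed by handles, sets with separate containment and parent/child relations, and named tags) can be pictured with a small toy model. The class below is purely illustrative: `ToyMesh` and all of its method names are invented for this sketch and are not the MOAB API.

```python
from itertools import count


class ToyMesh:
    """Toy handle-based mesh store; NOT the MOAB interface."""

    def __init__(self):
        self._handles = count(1)   # handles are opaque integers, not pointers
        self.entities = {}         # handle -> (kind, data)
        self.sets = {}             # set handle -> member handles (entities or sets)
        self.children = {}         # set handle -> child set handles
        self.tags = {}             # tag name -> {handle: value}

    def add_vertex(self, xyz):
        handle = next(self._handles)
        self.entities[handle] = ("vertex", tuple(xyz))
        return handle

    def add_element(self, kind, vertex_handles):
        handle = next(self._handles)
        self.entities[handle] = (kind, tuple(vertex_handles))
        return handle

    def create_set(self, members=()):
        handle = next(self._handles)
        self.sets[handle] = set(members)
        self.children[handle] = set()
        return handle

    def add_child(self, parent, child):
        # Parent/child is a relation distinct from set containment.
        self.children[parent].add(child)

    def tag_set(self, name, handle, value):
        self.tags.setdefault(name, {})[handle] = value

    def tag_get(self, name, handle):
        return self.tags[name][handle]


mesh = ToyMesh()
verts = [mesh.add_vertex(p) for p in [(0, 0, 0), (1, 0, 0), (0, 1, 0)]]
tri = mesh.add_element("tri", verts)

surface = mesh.create_set([tri])               # set grouping an element
edge = mesh.create_set([verts[0], verts[1]])   # set grouping two vertices
mesh.add_child(surface, edge)                  # topological relation, not containment
mesh.tag_set("boundary_condition", edge, "fixed")

print(mesh.tag_get("boundary_condition", edge), mesh.children[surface] == {edge})
```

    Keeping containment (`sets`) and parent/child links (`children`) in separate maps reflects the distinction the abstract draws between the two relations; how MOAB itself stores them is not described in the record.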

  16. Quantitative transmission electron microscopy analysis of multi-variant grains in present L1₀-FePt based heat assisted magnetic recording media

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Ho, Hoan, E-mail: hoan.ho@wdc.com; Department of Materials Science and Engineering, Carnegie Mellon University, Pittsburgh, Pennsylvania 15213; Zhu, Jingxi, E-mail: jingxiz@andrew.cmu.edu

    2014-11-21

    We present a study on atomic ordering within individual grains in granular L1₀-FePt thin films using transmission electron microscopy techniques. The film, used as a medium for heat assisted magnetic recording, consists of a single layer of FePt grains separated by non-magnetic grain boundaries and is grown on an MgO underlayer. Using convergent-beam techniques, diffraction patterns of individual grains are obtained for a large number of crystallites. The study found that although the majority of grains are ordered in the perpendicular direction, more than 15% of them are multi-variant, or of in-plane c-axis orientation, or disordered fcc. It was also found that these multi-variant and in-plane grains have always grown across MgO grain boundaries separating two or more MgO grains of the underlayer. The in-plane ordered portion within a multi-variant L1₀-FePt grain always lacks atomic coherence with the MgO directly underneath it, whereas the perpendicularly ordered portion is always coherent with the underlying MgO grain. Since the existence of multi-variant and in-plane ordered grains is severely detrimental to high density data storage capability, the understanding of their formation mechanism obtained here should make a significant impact on the future development of hard disk drive technology.

  17. Embodied memory allows accurate and stable perception of hidden objects despite orientation change.

    PubMed

    Pan, Jing Samantha; Bingham, Ned; Bingham, Geoffrey P

    2017-07-01

    Rotating a scene in a frontoparallel plane (rolling) yields a change in orientation of constituent images. When using only information provided by static images to perceive a scene after orientation change, identification performance typically decreases (Rock & Heimer, 1957). However, rolling generates optic flow information that relates the discrete, static images (before and after the change) and forms an embodied memory that aids recognition. The embodied memory hypothesis predicts that upon detecting a continuous spatial transformation of image structure, or in other words, seeing the continuous rolling process and objects undergoing rolling, observers should accurately perceive objects during and after motion. Thus, in this case, orientation change should not affect performance. We tested this hypothesis in three experiments and found that (a) using combined optic flow and image structure, participants identified locations of previously perceived but currently occluded targets with great accuracy and stability (Experiment 1); (b) using combined optic flow and image structure information, participants identified hidden targets equally well with or without 30° orientation changes (Experiment 2); and (c) when the rolling was unseen, identification of hidden targets after orientation change became worse (Experiment 3). Furthermore, when rolling was unseen, although target identification was better when participants were told about the orientation change than when they were not told, performance was still worse than when there was no orientation change. Therefore, combined optic flow and image structure information, not mere knowledge about the rolling, enables accurate and stable perception despite orientation change. (PsycINFO Database Record (c) 2017 APA, all rights reserved).

  18. MRPrimerV: a database of PCR primers for RNA virus detection

    PubMed Central

    Kim, Hyerin; Kang, NaNa; An, KyuHyeon; Kim, Doyun; Koo, JaeHyung; Kim, Min-Soo

    2017-01-01

    Many infectious diseases are caused by viral infections, and in particular by RNA viruses such as MERS, Ebola and Zika. To understand viral disease, detection and identification of these viruses are essential. Although PCR is widely used for rapid virus identification due to its low cost and high sensitivity and specificity, very few online database resources have compiled PCR primers for RNA viruses. To effectively detect viruses, the MRPrimerV database (http://MRPrimerV.com) contains 152 380 247 PCR primer pairs for detection of 1818 viruses, covering 7144 coding sequences (CDSs), representing 100% of the RNA viruses in the most up-to-date NCBI RefSeq database. Due to rigorous similarity testing against all human and viral sequences, every primer in MRPrimerV is highly target-specific. Because MRPrimerV ranks CDSs by the penalty scores of their best primer, users need only use the first primer pair for a single-phase PCR or the first two primer pairs for two-phase PCR. Moreover, MRPrimerV provides the list of genome neighbors that can be detected using each primer pair, covering 22 192 variants of 532 RefSeq RNA viruses. We believe that the public availability of MRPrimerV will facilitate viral metagenomics studies aimed at evaluating the variability of viruses, as well as other scientific tasks. PMID:27899620

  19. Identification of lung cancer histology-specific variants applying Bayesian framework variant prioritization approaches within the TRICL and ILCCO consortia.

    PubMed

    Brenner, Darren R; Amos, Christopher I; Brhane, Yonathan; Timofeeva, Maria N; Caporaso, Neil; Wang, Yufei; Christiani, David C; Bickeböller, Heike; Yang, Ping; Albanes, Demetrius; Stevens, Victoria L; Gapstur, Susan; McKay, James; Boffetta, Paolo; Zaridze, David; Szeszenia-Dabrowska, Neonilia; Lissowska, Jolanta; Rudnai, Peter; Fabianova, Eleonora; Mates, Dana; Bencko, Vladimir; Foretova, Lenka; Janout, Vladimir; Krokan, Hans E; Skorpen, Frank; Gabrielsen, Maiken E; Vatten, Lars; Njølstad, Inger; Chen, Chu; Goodman, Gary; Lathrop, Mark; Vooder, Tõnu; Välk, Kristjan; Nelis, Mari; Metspalu, Andres; Broderick, Peter; Eisen, Timothy; Wu, Xifeng; Zhang, Di; Chen, Wei; Spitz, Margaret R; Wei, Yongyue; Su, Li; Xie, Dong; She, Jun; Matsuo, Keitaro; Matsuda, Fumihiko; Ito, Hidemi; Risch, Angela; Heinrich, Joachim; Rosenberger, Albert; Muley, Thomas; Dienemann, Hendrik; Field, John K; Raji, Olaide; Chen, Ying; Gosney, John; Liloglou, Triantafillos; Davies, Michael P A; Marcus, Michael; McLaughlin, John; Orlow, Irene; Han, Younghun; Li, Yafang; Zong, Xuchen; Johansson, Mattias; Liu, Geoffrey; Tworoger, Shelley S; Le Marchand, Loic; Henderson, Brian E; Wilkens, Lynne R; Dai, Juncheng; Shen, Hongbing; Houlston, Richard S; Landi, Maria T; Brennan, Paul; Hung, Rayjean J

    2015-11-01

    Large-scale genome-wide association studies (GWAS) have likely uncovered all common variants at the GWAS significance level. Additional variants within the suggestive range (0.0001 > P > 5×10^-8) are, however, still of interest for identifying causal associations. This analysis aimed to apply novel variant prioritization approaches to identify additional lung cancer variants that may not reach the GWAS level. Effects were combined across studies with a total of 33456 controls and 6756 adenocarcinoma (AC; 13 studies), 5061 squamous cell carcinoma (SCC; 12 studies) and 2216 small cell lung cancer cases (9 studies). Based on prior information such as variant physical properties and functional significance, we applied stratified false discovery rates, hierarchical modeling and Bayesian false discovery probabilities for variant prioritization. We conducted a fine mapping analysis as validation of our methods by examining top-ranking novel variants in six independent populations with a total of 3128 cases and 2966 controls. Three novel loci in the suggestive range were identified based on our Bayesian framework analyses: KCNIP4 at 4p15.2 (rs6448050, P = 4.6×10^-7) and MTMR2 at 11q21 (rs10501831, P = 3.1×10^-6) with SCC, as well as GAREM at 18q12.1 (rs11662168, P = 3.4×10^-7) with AC. Use of our prioritization methods validated two of the top three loci associated with SCC (P = 1.05×10^-4 for KCNIP4, represented by rs9799795) and AC (P = 2.16×10^-4 for GAREM, represented by rs3786309) in the independent fine mapping populations. This study highlights the utility of using prior functional data for sequence variants in prioritization analyses to search for robust signals in the suggestive range. © The Author 2015. Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com.

  20. Base Excision Repair Variants in Cancer

    PubMed Central

    Marsden, Carolyn G.; Dragon, Julie A.; Wallace, Susan S.; Sweasy, Joann B.

    2018-01-01

    Base excision repair (BER) is a key genome maintenance pathway that removes endogenously damaged DNA bases that arise in cells at very high levels on a daily basis. Failure to remove these damaged DNA bases leads to increased levels of mutagenesis and chromosomal instability, which have the potential to drive carcinogenesis. Next-generation sequencing efforts of the germline and tumor genomes of thousands of individuals have uncovered many rare mutations in BER genes. Given that BER is critical for genome maintenance, it is important to determine whether BER genomic variants have functional phenotypes. In this chapter we present our in silico methods for the identification and prioritization of BER variants for further study. We also provide detailed instructions and commentary on the initial cellular assays we employ to dissect potentially important phenotypes of human BER variants and highlight the strengths and weaknesses of our approaches. BER variants possessing interesting functional phenotypes can then be studied in more detail to provide important mechanistic insights regarding the role of aberrant BER in carcinogenesis. PMID:28645367

  1. Kinematic model for the space-variant image motion of star sensors under dynamical conditions

    NASA Astrophysics Data System (ADS)

    Liu, Chao-Shan; Hu, Lai-Hong; Liu, Guang-Bin; Yang, Bo; Li, Ai-Jun

    2015-06-01

    A kinematic description of a star spot in the focal plane is presented for star sensors under dynamical conditions, which involves all necessary parameters such as the image motion, velocity, and attitude parameters of the vehicle. Stars at different locations of the focal plane correspond to the slightly different orientation and extent of motion blur, which characterize the space-variant point spread function. Finally, the image motion, the energy distribution, and centroid extraction are numerically investigated using the kinematic model under dynamic conditions. A centroid error of eight successive iterations <0.002 pixel is used as the termination criterion for the Richardson-Lucy deconvolution algorithm. The kinematic model of a star sensor is useful for evaluating the compensation algorithms of motion-blurred images.
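
    The termination rule mentioned in this abstract can be illustrated with a minimal sketch. It assumes a single, spatially invariant PSF (the paper itself concerns a space-variant PSF), circular FFT convolution, and one reading of the criterion: stop once the centroid has moved by less than 0.002 pixel in each of eight successive iterations. The function names and parameters (`richardson_lucy`, `tol`, `patience`) are hypothetical and are not taken from the paper.

```python
import numpy as np

def centroid(img):
    """Intensity-weighted centroid (row, col) of a non-negative image."""
    total = img.sum()
    rows, cols = np.indices(img.shape)
    return np.array([(rows * img).sum() / total, (cols * img).sum() / total])

def _conv_fft(img, otf):
    """Circular convolution via FFT with a precomputed transfer function."""
    return np.real(np.fft.ifft2(np.fft.fft2(img) * otf))

def richardson_lucy(observed, psf, tol=0.002, patience=8, max_iter=500):
    """Richardson-Lucy deconvolution with a centroid-stability stopping rule.

    Assumptions (not from the paper): `psf` has the same shape as `observed`
    and is centered at index (n // 2, m // 2); boundaries are circular.
    Returns (estimate, centroid, iterations_run).
    """
    psf = psf / psf.sum()
    otf = np.fft.fft2(np.fft.ifftshift(psf))  # move PSF center to the origin
    otf_adj = np.conj(otf)                    # correlation with the PSF
    estimate = np.full(observed.shape, observed.mean(), dtype=float)
    prev_c = centroid(estimate)
    stable = 0
    for it in range(1, max_iter + 1):
        blurred = _conv_fft(estimate, otf)
        ratio = observed / np.maximum(blurred, 1e-12)  # guard against /0
        estimate = estimate * _conv_fft(ratio, otf_adj)
        estimate = np.maximum(estimate, 0.0)  # clip FFT round-off negatives
        c = centroid(estimate)
        # Count consecutive iterations whose centroid shift is below `tol`.
        stable = stable + 1 if np.linalg.norm(c - prev_c) < tol else 0
        prev_c = c
        if stable >= patience:
            break
    return estimate, c, it
```

    Handling a space-variant blur would require a different PSF for each region of the focal plane; one simple approach is to run the sketch separately on a small window around each star spot.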

  2. Rare copy number variants in a population-based investigation of hypoplastic right heart syndrome.

    PubMed

    Dimopoulos, Aggeliki; Sicko, Robert J; Kay, Denise M; Rigler, Shannon L; Druschel, Charlotte M; Caggana, Michele; Browne, Marilyn L; Fan, Ruzong; Romitti, Paul A; Brody, Lawrence C; Mills, James L

    2017-01-20

    Hypoplastic right heart syndrome (HRHS) is a rare congenital defect characterized by underdevelopment of the right heart structures commonly accompanied by an atrial septal defect. Familial HRHS reports suggest genetic factor involvement. We examined the role of copy number variants (CNVs) in HRHS. We genotyped 32 HRHS cases identified from all New York State live births (1998-2005) using Illumina HumanOmni2.5 microarrays. CNVs were called with PennCNV and prioritized if they were ≥20 Kb, contained ≥10 SNPs and had minimal overlap with CNVs from in-house controls, the Database of Genomic Variants, HapMap3, and Children's Hospital of Philadelphia database. We identified 28 CNVs in 17 cases; several encompassed genes important for right heart development. One case had a 2p16-2p23 duplication spanning LBH, a limb and heart development transcription factor. Lbh mis-expression results in right ventricular hypoplasia and pulmonary valve defects. This duplication also encompassed SOS1, a factor associated with pulmonary valve stenosis in Noonan syndrome. Sos1 -/- mice display thin and poorly trabeculated ventricles. In another case, we identified a 1.5 Mb deletion associated with Williams-Beuren syndrome, a disorder that includes valvular malformations. A third case had a 24 Kb deletion upstream of the TGFβ ligand ITGB8. Embryos genetically null for Itgb8, and its intracellular interactant Band 4.1B, display lethal cardiac phenotypes. To our knowledge, this is the first study of CNVs in HRHS. We identified several rare CNVs that overlap genes related to right ventricular wall and valve development, suggesting that genetics plays a role in HRHS and providing clues for further investigation. Birth Defects Research 109:16-26, 2017. © 2016 Wiley Periodicals, Inc.
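
    The prioritization criteria stated in this abstract (length ≥20 Kb, ≥10 SNPs, minimal overlap with control and database CNVs) can be sketched as a simple filter. The abstract does not say what counts as "minimal overlap", so the `max_overlap=0.2` threshold below is an assumption, as are the `CNV` record layout, the function names, and the reading of 20 Kb as 20,000 bp.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class CNV:
    chrom: str
    start: int       # 1-based, inclusive
    end: int         # inclusive
    n_snps: int = 0  # number of array SNPs inside the call

def overlap_fraction(cnv, reference):
    """Fraction of `cnv` covered by the union of same-chromosome reference CNVs."""
    segments = sorted(
        (max(r.start, cnv.start), min(r.end, cnv.end))
        for r in reference
        if r.chrom == cnv.chrom and r.end >= cnv.start and r.start <= cnv.end
    )
    covered = 0
    covered_to = cnv.start - 1  # last position already counted
    for seg_start, seg_end in segments:
        if seg_end > covered_to:
            covered += seg_end - max(seg_start, covered_to + 1) + 1
            covered_to = seg_end
    return covered / (cnv.end - cnv.start + 1)

def prioritize(cnvs, reference, min_len=20_000, min_snps=10, max_overlap=0.2):
    """Keep CNVs meeting the size and SNP criteria with little reference overlap.

    `max_overlap` is a hypothetical threshold; the source only says "minimal".
    """
    return [
        c for c in cnvs
        if (c.end - c.start + 1) >= min_len
        and c.n_snps >= min_snps
        and overlap_fraction(c, reference) <= max_overlap
    ]
```

    In practice `reference` would be the pooled CNV calls from in-house controls and the public databases named in the abstract, loaded from whatever format those resources provide.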

  3. Rare Copy Number Variants in a Population Based Investigation of Hypoplastic Right Heart Syndrome

    PubMed Central

    Dimopoulos, Aggeliki; Sicko, Robert J.; Kay, Denise M.; Rigler, Shannon L.; Druschel, Charlotte M.; Caggana, Michele; Browne, Marilyn L.; Fan, Ruzong; Romitti, Paul A.; Brody, Lawrence C.; Mills, James L.

    2016-01-01

    Background: Hypoplastic right heart syndrome (HRHS) is a rare congenital defect characterized by underdevelopment of the right heart structures commonly accompanied by an atrial septal defect. Familial HRHS reports suggest genetic factor involvement. We examined the role of copy number variants (CNVs) in HRHS. Methods: We genotyped 32 HRHS cases identified from all New York State live births (1998–2005) using Illumina HumanOmni2.5 microarrays. CNVs were called with PennCNV and prioritized if they were ≥20Kb, contained ≥10 SNPs and had minimal overlap with CNVs from in-house controls, the Database of Genomic Variants, HapMap3 and CHOP database. Results: We identified 28 CNVs in 17 cases; several encompassed genes important for right heart development. One case had a 2p16–2p23 duplication spanning LBH, a limb and heart development transcription factor. Lbh mis-expression results in right ventricular hypoplasia and pulmonary valve defects. This duplication also encompassed SOS1, a factor associated with pulmonary valve stenosis in Noonan syndrome. Sos1−/− mice display thin and poorly trabeculated ventricles. In another case, we identified a 1.5Mb deletion associated with Williams Beuren syndrome, a disorder that includes valvular malformations. A third case had a 24Kb deletion upstream of the TGFβ ligand ITGB8. Embryos genetically null for Itgb8, and its intracellular interactant Band 4.1B, display lethal cardiac phenotypes. Conclusions: To our knowledge, this is the first study of CNVs in HRHS. We identified several rare CNVs that overlap genes related to right ventricular wall and valve development, suggesting that genetics plays a role in HRHS and providing clues for further investigation. PMID:28009100

  4. Detecting very low allele fraction variants using targeted DNA sequencing and a novel molecular barcode-aware variant caller.

    PubMed

    Xu, Chang; Nezami Ranjbar, Mohammad R; Wu, Zhong; DiCarlo, John; Wang, Yexun

    2017-01-03

    Detection of DNA mutations at very low allele fractions with high accuracy will significantly improve the effectiveness of precision medicine for cancer patients. To achieve this goal through next generation sequencing, researchers need a detection method that 1) captures rare mutation-containing DNA fragments efficiently in the mix of abundant wild-type DNA; 2) sequences the DNA library extensively to deep coverage; and 3) distinguishes low level true variants from amplification and sequencing errors with high accuracy. Targeted enrichment using PCR primers provides researchers with a convenient way to achieve deep sequencing for a small, yet most relevant region using benchtop sequencers. Molecular barcoding (or indexing) provides a unique solution for reducing sequencing artifacts analytically. Although different molecular barcoding schemes have been reported in recent literature, most variant calling has been done on limited targets, using simple custom scripts. The analytical performance of barcode-aware variant calling can be significantly improved by incorporating advanced statistical models. We present here a highly efficient, simple and scalable enrichment protocol that integrates molecular barcodes in multiplex PCR amplification. In addition, we developed smCounter, an open source, generic, barcode-aware variant caller based on a Bayesian probabilistic model. smCounter was optimized and benchmarked on two independent read sets with SNVs and indels at 5 and 1% allele fractions. Variants were called with very good sensitivity and specificity within coding regions. We demonstrated that we can accurately detect somatic mutations with allele fractions as low as 1% in coding regions using our enrichment protocol and variant caller.
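
    The idea behind barcode-aware calling can be illustrated with a deliberately simplified sketch: reads sharing a molecular barcode are collapsed to one consensus base, and the allele fraction is then counted over barcodes rather than raw reads. This is a plain majority vote and not smCounter's Bayesian probabilistic model. The function names and the `min_family_size` and `min_agreement` thresholds are assumptions made for illustration.

```python
from collections import Counter, defaultdict

def barcode_consensus(reads, min_family_size=2, min_agreement=0.8):
    """Collapse (barcode, base) observations at one position to consensus bases.

    A barcode family is kept only if it has at least `min_family_size` reads
    and its most common base reaches `min_agreement` of those reads, so that
    isolated amplification or sequencing errors are voted out.
    Returns a dict mapping barcode -> consensus base.
    """
    families = defaultdict(list)
    for barcode, base in reads:
        families[barcode].append(base)

    consensus = {}
    for barcode, bases in families.items():
        if len(bases) < min_family_size:
            continue  # too few reads to form a reliable consensus
        base, count = Counter(bases).most_common(1)[0]
        if count / len(bases) >= min_agreement:
            consensus[barcode] = base
    return consensus

def allele_fraction(consensus, alt):
    """Fraction of consensus molecules that carry the `alt` base."""
    if not consensus:
        return 0.0
    return sum(1 for base in consensus.values() if base == alt) / len(consensus)
```

    Counting molecules instead of reads is what lets a method of this kind separate true low-fraction variants from errors that appear in only some copies of a single original fragment.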

  5. Guidelines for investigating causality of sequence variants in human disease

    PubMed Central

    MacArthur, D. G.; Manolio, T. A.; Dimmock, D. P.; Rehm, H. L.; Shendure, J.; Abecasis, G. R.; Adams, D. R.; Altman, R. B.; Antonarakis, S. E.; Ashley, E. A.; Barrett, J. C.; Biesecker, L. G.; Conrad, D. F.; Cooper, G. M.; Cox, N. J.; Daly, M. J.; Gerstein, M. B.; Goldstein, D. B.; Hirschhorn, J. N.; Leal, S. M.; Pennacchio, L. A.; Stamatoyannopoulos, J. A.; Sunyaev, S. R.; Valle, D.; Voight, B. F.; Winckler, W.; Gunter, C.

    2014-01-01

    The discovery of rare genetic variants is accelerating, and clear guidelines for distinguishing disease-causing sequence variants from the many potentially functional variants present in any human genome are urgently needed. Without rigorous standards we risk an acceleration of false-positive reports of causality, which would impede the translation of genomic research findings into the clinical diagnostic setting and hinder biological understanding of disease. Here we discuss the key challenges of assessing sequence variants in human disease, integrating both gene-level and variant-level support for causality. We propose guidelines for summarizing confidence in variant pathogenicity and highlight several areas that require further resource development. PMID:24759409

  6. Guidelines for investigating causality of sequence variants in human disease.

    PubMed

    MacArthur, D G; Manolio, T A; Dimmock, D P; Rehm, H L; Shendure, J; Abecasis, G R; Adams, D R; Altman, R B; Antonarakis, S E; Ashley, E A; Barrett, J C; Biesecker, L G; Conrad, D F; Cooper, G M; Cox, N J; Daly, M J; Gerstein, M B; Goldstein, D B; Hirschhorn, J N; Leal, S M; Pennacchio, L A; Stamatoyannopoulos, J A; Sunyaev, S R; Valle, D; Voight, B F; Winckler, W; Gunter, C

    2014-04-24

    The discovery of rare genetic variants is accelerating, and clear guidelines for distinguishing disease-causing sequence variants from the many potentially functional variants present in any human genome are urgently needed. Without rigorous standards we risk an acceleration of false-positive reports of causality, which would impede the translation of genomic research findings into the clinical diagnostic setting and hinder biological understanding of disease. Here we discuss the key challenges of assessing sequence variants in human disease, integrating both gene-level and variant-level support for causality. We propose guidelines for summarizing confidence in variant pathogenicity and highlight several areas that require further resource development.

  7. Hemoglobin Abraham Lincoln, β32 (B14) Leucine → Proline: an unstable variant producing severe hemolytic disease

    PubMed Central

    Honig, George R.; Green, David; Shamsuddin, Mir; Vida, Loyda N.; Mason, R. George; Gnarra, David J.; Maurer, Helen S.

    1973-01-01

    An unstable hemoglobin variant was identified in a Negro woman with hemolytic anemia since infancy. A splenectomy had been performed when the patient was a child. The anemia was accompanied by erythrocyte inclusion bodies and excretion of darkly pigmented urine. Neither parent of the proposita demonstrated any hematologic abnormality, and it appeared that this hemoglobin variant arose as a new mutation. Erythrocyte survival in the patient was greatly reduced: the erythrocyte t½ using radiochromium as a tag was 2.4 days, and a reticulocyte survival study performed after labeling the cells with L-[14C]leucine indicated a t½ of 7.2 days. When stroma-free hemolysates were heated at 50°C, 16-20% of the hemoglobin precipitated. The thermolability was prevented by the addition of hemin, carbon monoxide, or dithionite, suggesting an abnormality of heme binding. An increased rate of methemoglobin formation was also observed after incubation of erythrocytes at 37°C. The abnormal hemoglobin could not be separated from hemoglobin A by electrophoresis or chromatography, but it was possible to isolate the variant β-chain by precipitation with p-hydroxymercuribenzoate. Purification of the β-chain by column chromatography followed by peptide mapping and amino acid analysis demonstrated a substitution of proline for β32 leucine. It appears likely that a major effect of this substitution is a disruption of the normal orientation of the adjacent leucine residue at β31 to impair heme stabilization. PMID:4352462

  8. Federated web-accessible clinical data management within an extensible neuroimaging database.

    PubMed

    Ozyurt, I Burak; Keator, David B; Wei, Dingying; Fennema-Notestine, Christine; Pease, Karen R; Bockholt, Jeremy; Grethe, Jeffrey S

    2010-12-01

    Managing vast datasets collected throughout multiple clinical imaging communities has become critical with the ever increasing and diverse nature of datasets. Development of data management infrastructure is further complicated by technical and experimental advances that drive modifications to existing protocols and acquisition of new types of research data to be incorporated into existing data management systems. In this paper, an extensible data management system for clinical neuroimaging studies is introduced: The Human Clinical Imaging Database (HID) and Toolkit. The database schema is constructed to support the storage of new data types without changes to the underlying schema. The complex infrastructure allows management of experiment data, such as image protocol and behavioral task parameters, as well as subject-specific data, including demographics, clinical assessments, and behavioral task performance metrics. Of significant interest, embedded clinical data entry and management tools enhance both consistency of data reporting and automatic entry of data into the database. The Clinical Assessment Layout Manager (CALM) allows users to create on-line data entry forms for use within and across sites, through which data is pulled into the underlying database via the generic clinical assessment management engine (GAME). Importantly, the system is designed to operate in a distributed environment, serving both human users and client applications in a service-oriented manner. Querying capabilities use a built-in multi-database parallel query builder/result combiner, allowing web-accessible queries within and across multiple federated databases. The system along with its documentation is open-source and available from the Neuroimaging Informatics Tools and Resource Clearinghouse (NITRC) site.

  9. A SImplified method for Segregation Analysis (SISA) to determine penetrance and expression of a genetic variant in a family.

    PubMed

    Møller, Pål; Clark, Neal; Mæhle, Lovise

    2011-05-01

    A method for SImplified rapid Segregation Analysis (SISA) to assess penetrance and expression of genetic variants in pedigrees of any complexity is presented. For this purpose the probability for recombination between the variant and the gene is zero. An assumption is that the variant of undetermined significance (VUS) is introduced into the family once only. If so, all family members in between two members demonstrated to carry a VUS, are obligate carriers. Probabilities for cosegregation of disease and VUS by chance, penetrance, and expression, may be calculated. SISA return values do not include person identifiers and need no explicit informed consent. There will be no ethical complications in submitting SISA return values to central databases. Values for several families may be combined. Values for a family may be updated by the contributor. SISA is used to consider penetrance whenever sequencing demonstrates a VUS in the known cancer-predisposing genes. Any family structure at hand in a genetic clinic may be used. One may include an extended lineage in a family through demonstrating the same VUS in a distant relative, and thereby identifying all obligate carriers in between. Such extension is a way to escape the selection biases through expanding the families outside the clusters used to select the families. © 2011 Wiley-Liss, Inc.
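
    Two of the ideas in this abstract can be sketched in a few lines: relatives lying on the line of descent between two demonstrated carriers are obligate carriers, and a chance-cosegregation probability shrinks with the number of informative meioses. The sketch assumes, as the abstract does, that the variant entered the family once. It represents the pedigree as a hypothetical `parent_of` map holding only the parent on the transmitting lineage. It uses the common (1/2)^n approximation, which may differ from the exact SISA calculation.

```python
def lineage_to_root(person, parent_of):
    """Return [person, parent, grandparent, ...] along the transmitting lineage."""
    path = [person]
    while path[-1] in parent_of:
        path.append(parent_of[path[-1]])
    return path

def obligate_carriers(carrier_a, carrier_b, parent_of):
    """Relatives strictly between two demonstrated carriers of the same VUS.

    `parent_of` maps each person to the parent through whom the variant is
    assumed to be transmitted (a hypothetical, simplified pedigree encoding).
    """
    up_a = lineage_to_root(carrier_a, parent_of)
    up_b = lineage_to_root(carrier_b, parent_of)
    ancestors_b = set(up_b)
    common = next((p for p in up_a if p in ancestors_b), None)
    if common is None:
        raise ValueError("the two carriers share no ancestor in this lineage")
    # Walk up from A to the common ancestor, then down to B.
    path = up_a[: up_a.index(common) + 1] + up_b[: up_b.index(common)][::-1]
    return path[1:-1]

def chance_cosegregation(n_meioses):
    """Common approximation: each informative meiosis transmits with prob. 1/2."""
    return 0.5 ** n_meioses
```

    For example, with two first cousins as demonstrated carriers, `obligate_carriers` returns their two related parents and the shared grandparent node, who can then be included when assessing penetrance.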

  10. [Therapeutic effect and safety of montelukast sodium combined with budesonide in children with cough variant asthma: a Meta analysis].

    PubMed

    Wei, Yan; Li, Dong-Sheng; Liu, Jian-Jun; Zhang, Jing; Zhao, Hai-En

    2016-11-01

    To evaluate the therapeutic effect and safety of montelukast sodium combined with budesonide in children with cough variant asthma. The databases CNKI, Wanfang Data, VIP, PubMed, EMbase, and BioMed Central were searched for randomized controlled trials (RCTs) of montelukast sodium combined with budesonide in the treatment of children with cough variant asthma. Data extraction and quality assessment were performed for RCTs which met the inclusion criteria, and RevMan 5.3 software was used to perform quality assessment of the articles included and Meta analysis. A total of 11 RCTs involving 1 097 patients were included. The results of the Meta analysis showed that compared with the control group (inhalation of budesonide alone), the observation group (inhalation of montelukast sodium combined with budesonide) had significantly higher overall response rate and more improved pulmonary function parameters including forced expiratory volume in the first second, percentage of forced expiratory volume in the first second, and peak expiratory flow, as well as significantly lower recurrence rate (P<0.01). The incidence of adverse events showed no significant difference between the two groups. Inhalation of montelukast sodium combined with budesonide has a significant effect in children with cough variant asthma and does not increase the incidence of adverse events.

  11. Integrated Controlling System and Unified Database for High Throughput Protein Crystallography Experiments

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Gaponov, Yu.A.; Igarashi, N.; Hiraki, M.

    2004-05-12

    An integrated controlling system and a unified database for high throughput protein crystallography experiments have been developed. Main features of protein crystallography experiments (purification, crystallization, crystal harvesting, data collection, data processing) were integrated into the software under development. All information necessary to perform protein crystallography experiments is stored (except raw X-ray data that are stored in a central data server) in a MySQL relational database. The database contains four mutually linked hierarchical trees describing protein crystals, data collection of protein crystal and experimental data processing. A database editor was designed and developed. The editor supports basic database functions to view, create, modify and delete user records in the database. Two search engines were realized: direct search of necessary information in the database and object oriented search. The system is based on TCP/IP secure UNIX sockets with four predefined sending and receiving behaviors, which support communications between all connected servers and clients with remote control functions (creating and modifying data for experimental conditions, data acquisition, viewing experimental data, and performing data processing). Two secure login schemes were designed and developed: a direct method (using the developed Linux clients with secure connection) and an indirect method (using the secure SSL connection using secure X11 support from any operating system with X-terminal and SSH support). A part of the system has been implemented on a new MAD beam line, NW12, at the Photon Factory Advanced Ring for general user experiments.

  12. Pathogenic Anti-Müllerian Hormone Variants in Polycystic Ovary Syndrome.

    PubMed

    Gorsic, Lidija K; Kosova, Gulum; Werstein, Brian; Sisk, Ryan; Legro, Richard S; Hayes, M Geoffrey; Teixeira, Jose M; Dunaif, Andrea; Urbanek, Margrit

    2017-08-01

    Polycystic ovary syndrome (PCOS), a common endocrine condition, is the leading cause of anovulatory infertility. Given that common disease-susceptibility variants account for only a small percentage of the estimated PCOS heritability, we tested the hypothesis that rare variants contribute to this deficit in heritability. Unbiased whole-genome sequencing (WGS) of 80 patients with PCOS and 24 reproductively normal control subjects identified potentially deleterious variants in AMH, the gene encoding anti-Müllerian hormone (AMH). Targeted sequencing of AMH of 643 patients with PCOS and 153 control patients was used to replicate WGS findings. Dual luciferase reporter assays measured the impact of the variants on downstream AMH signaling. We found 24 rare (minor allele frequency < 0.01) AMH variants in patients with PCOS and control subjects; 18 variants were specific to women with PCOS. Seventeen of 18 (94%) PCOS-specific variants had significantly reduced AMH signaling, whereas none of 6 variants observed in control subjects showed significant defects in signaling. Thus, we identified rare AMH coding variants that reduced AMH-mediated signaling in a subset of patients with PCOS. To our knowledge, this study is the first to identify rare genetic variants associated with a common PCOS phenotype. Our findings suggest decreased AMH signaling as a mechanism for the pathogenesis of PCOS. AMH decreases androgen biosynthesis by inhibiting CYP17 activity; a potential mechanism of action for AMH variants in PCOS, therefore, is to increase androgen biosynthesis due to decreased AMH-mediated inhibition of CYP17 activity. Copyright © 2017 Endocrine Society

  13. Attention to baseline: does orienting visuospatial attention really facilitate target detection?

    PubMed

    Albares, Marion; Criaud, Marion; Wardak, Claire; Nguyen, Song Chi Trung; Ben Hamed, Suliann; Boulinguez, Philippe

    2011-08-01

    Standard protocols testing the orientation of visuospatial attention usually present spatial cues before targets and compare valid-cue trials with invalid-cue trials. The valid/invalid contrast results in a relative behavioral or physiological difference that is generally interpreted as a benefit of attention orientation. However, growing evidence suggests that inhibitory control of response is closely involved in this kind of protocol that requires the subjects to withhold automatic responses to cues, probably biasing behavioral and physiological baselines. Here, we used two experiments to disentangle the inhibitory control of automatic responses from orienting of visuospatial attention in a saccadic reaction time task in humans, a variant of the classical cue-target detection task and a sustained visuospatial attentional task. Surprisingly, when referring to a simple target detection task in which there is no need to refrain from reacting to avoid inappropriate responses, we found no consistent evidence of facilitation of target detection at the attended location. Instead, we observed a cost at the unattended location. Departing from the classical view, our results suggest that reaction time measures of visuospatial attention probably rely on the attenuation of elementary processes involved in visual target detection and saccade initiation away from the attended location rather than on facilitation at the attended location. This highlights the need to use proper control conditions in experimental designs to disambiguate relative from absolute cueing benefits on target detection reaction times, both in psychophysical and neurophysiological studies.

  14. Comparison of the BioRad Variant and Primus Ultra2 high-pressure liquid chromatography (HPLC) instruments for the detection of variant hemoglobins.

    PubMed

    Gosselin, R C; Carlin, A C; Dwyre, D M

    2011-04-01

    Hemoglobin variants are a result of genetic changes resulting in abnormal or dys-synchronous hemoglobin chain production (thalassemia) or the generation of hemoglobin chain variants such as hemoglobin S. Automated high-pressure liquid chromatography (HPLC) systems have become the method of choice for the evaluation of patients suspected with hemoglobinopathies. In this study, we evaluated the performance of two HPLC methods used in the detection of common hemoglobin variants: Variant and Ultra2. There were 377 samples tested, 26% (99/377) with HbS, 8.5% (32/377) with HbC, 20.7% (78/377) with other hemoglobin variant or thalassemia, and 2.9% with increased hemoglobin A1c. The interpretations of each chromatograph were compared. There were no differences noted for hemoglobins A0, S, or C. There were significant differences between HPLC methods for hemoglobins F, A2, and A1c. However, there was good concordance between normal and abnormal interpretations (97.9% and 96.2%, respectively). Both Variant and Ultra2 HPLC methods were able to detect most common hemoglobin variants. There was better discrimination for fast hemoglobins, between hemoglobins E and A2, and between hemoglobins S and F using the Ultra2 HPLC method. © 2010 Blackwell Publishing Ltd.

  15. How important are rare variants in common disease?

    PubMed

    Saint Pierre, Aude; Génin, Emmanuelle

    2014-09-01

    Genome-wide association studies have uncovered hundreds of common genetic variants involved in complex diseases. However, for most complex diseases, these common genetic variants only marginally contribute to disease susceptibility. It is now argued that rare variants located in different genes could in fact play a more important role in disease susceptibility than common variants. These rare genetic variants were not captured by genome-wide association studies using single nucleotide polymorphism-chips but with the advent of next-generation sequencing technologies, they have become detectable. It is now possible to study their contribution to common disease by resequencing samples of cases and controls or by using new genotyping exome arrays that cover rare alleles. In this review, we address the question of the contribution of rare variants in common disease by taking the examples of different diseases for which some resequencing studies have already been performed, and by summarizing the results of simulation studies conducted so far to investigate the genetic architecture of complex traits in human. So far, empirical data have not allowed the exclusion of many models except the most extreme ones involving only a small number of rare variants with large effects contributing to complex disease. To unravel the genetic architecture of complex disease, case-control data will not be sufficient, and alternative study designs need to be proposed together with methodological developments. © The Author 2014. Published by Oxford University Press. All rights reserved. For permissions, please email: journals.permissions@oup.com.

  16. Genotype and phenotype spectrum of NRAS germline variants.

    PubMed

    Altmüller, Franziska; Lissewski, Christina; Bertola, Debora; Flex, Elisabetta; Stark, Zornitza; Spranger, Stephanie; Baynam, Gareth; Buscarilli, Michelle; Dyack, Sarah; Gillis, Jane; Yntema, Helger G; Pantaleoni, Francesca; van Loon, Rosa LE; MacKay, Sara; Mina, Kym; Schanze, Ina; Tan, Tiong Yang; Walsh, Maie; White, Susan M; Niewisch, Marena R; García-Miñaúr, Sixto; Plaza, Diego; Ahmadian, Mohammad Reza; Cavé, Hélène; Tartaglia, Marco; Zenker, Martin

    2017-06-01

    RASopathies comprise a group of disorders clinically characterized by short stature, heart defects, facial dysmorphism, and varying degrees of intellectual disability and cancer predisposition. They are caused by germline variants in genes encoding key components or modulators of the highly conserved RAS-MAPK signalling pathway that lead to dysregulation of cell signal transmission. Germline changes in the genes encoding members of the RAS subfamily of GTPases are rare and associated with variable phenotypes of the RASopathy spectrum, ranging from Costello syndrome (HRAS variants) to Noonan and Cardiofaciocutaneous syndromes (KRAS variants). A small number of RASopathy cases with disease-causing germline NRAS alterations have been reported. Affected individuals exhibited features fitting Noonan syndrome, and the observed germline variants differed from the typical oncogenic NRAS changes occurring as somatic events in tumours. Here we describe 19 new cases with RASopathy due to disease-causing variants in NRAS. Importantly, four of them harbored missense changes affecting Gly12, which was previously described to occur exclusively in cancer. The phenotype in our cohort was variable but well within the RASopathy spectrum. Further, one of the patients (c.35G>A; p.(Gly12Asp)) had a myeloproliferative disorder, and one subject (c.34G>C; p.(Gly12Arg)) exhibited an uncharacterized brain tumour. With this report, we expand the genotype and phenotype spectrum of RASopathy-associated germline NRAS variants and provide evidence that NRAS variants do not spare the cancer-associated mutation hotspots.

  17. High-resolution analysis of copy number variants in adults with simple-to-moderate congenital heart disease.

    PubMed

    Zhao, Wei; Niu, Guannan; Shen, Botao; Zheng, Yang; Gong, Fangchao; Wang, Xianfu; Lee, Jiyun; Mulvihill, John J; Chen, Xiaohui; Li, Shibo

    2013-12-01

    As patients with congenital heart disease (CHD) increasingly survive to childbearing age, it becomes important to understand the genetic origins of CHD. In children, CHD is frequently caused by chromosomal imbalances. We searched for submicroscopic imbalances in adults with CHD focusing on simple-to-moderate phenotypes, without associated dysmorphic features, a group not previously examined. A total of 100 Han Chinese adults with a diverse range of isolated CHD and 65 ethnically matched controls were screened using whole-genome array comparative genomic hybridization. Forty-five large (>100 kb) rare copy number variants (CNVs) were identified in 36/100 patients. These variants were not listed in the Database of Genomic Variants nor found in controls. In three of these genomic imbalances (22q11.2, 18q23, 3q21.3), genes that play an important role in cardiac development were implicated, including CRKL, NFATC1, and PLXNA1; the latter has not been associated with human CHD before. This study detected a 0.7 Mb 22q11.2 deletion, which marginally overlapped the common 3 Mb 22q11.2 deletion, in one patient with a perimembranous ventricular septal defect without any extracardiac manifestation. Furthermore, we detected a novel inherited aberration dup (16q23.1). Although a causal relationship with CHD remains to be established, this CNV profile provides a spectrum of genomic imbalances in this condition, and improves the CNV-phenotype correlations. © 2013 Wiley Periodicals, Inc.

  18. Subtle perceptions of male sexual orientation influence occupational opportunities.

    PubMed

    Rule, Nicholas O; Bjornsdottir, R Thora; Tskhay, Konstantin O; Ambady, Nalini

    2016-12-01

    Theories linking the literatures on stereotyping and human resource management have proposed that individuals may enjoy greater success obtaining jobs congruent with stereotypes about their social categories or traits. Here, we explored such effects for a detectable, but not obvious, social group distinction: male sexual orientation. Bridging previous work on prejudice and occupational success with that on social perception, we found that perceivers rated gay and straight men as more suited to professions consistent with stereotypes about their groups (nurses, pediatricians, and English teachers vs. engineers, managers, surgeons, and math teachers) from mere photos of their faces. Notably, distinct evaluations of the gay and straight men emerged based on perceptions of their faces with no explicit indication of sexual orientation. Neither perceivers' expertise with hiring decisions nor diagnostic information about the targets eliminated these biases, but encouraging fair decisions did contribute to partly ameliorating the differences. Mediation analysis further showed that perceptions of the targets' sexual orientations and facial affect accounted for these effects. Individuals may therefore infer characteristics about individuals' group memberships from their faces and use this information in a way that meaningfully influences evaluations of their suitability for particular jobs. (PsycINFO Database Record (c) 2016 APA, all rights reserved).

  19. The analysis of selected orientation methods of architectural objects' scans

    NASA Astrophysics Data System (ADS)

    Markiewicz, Jakub S.; Kajdewicz, Irmina; Zawieska, Dorota

    2015-05-01

    Terrestrial laser scanning (TLS) is commonly used in different areas, inter alia in modelling architectural objects. One of the most important parts of TLS data processing is scan registration. It significantly affects the accuracy of high-resolution photogrammetric documentation. This process is time consuming, especially in the case of a large number of scans. It is mostly based on automatic detection and semi-automatic measurement of control points placed on the object. In the case of complicated historical buildings, it is sometimes forbidden to place survey targets on an object, or it may be difficult to distribute survey targets in the optimal way. Such problems encourage the search for new methods of scan registration that eliminate the step of placing survey targets on the object. In this paper the results of a target-based registration method are presented. The survey targets placed on the walls of historical chambers of the Museum of King Jan III's Palace at Wilanów and on the walls of the ruins of the Bishops Castle in Iłża were used for scan orientation. Several variants of orientation were performed, taking into account different placements and different numbers of survey marks. Afterwards, in subsequent research, raster images were generated from scans and the SIFT and SURF image processing algorithms were used to automatically search for corresponding natural points. The use of automatically identified points for TLS data orientation was analysed. The results of both methods of TLS data registration were summarized and presented in numerical and graphical forms.

  20. FAVR (Filtering and Annotation of Variants that are Rare): methods to facilitate the analysis of rare germline genetic variants from massively parallel sequencing datasets

    PubMed Central

    2013-01-01

    Background: Characterising genetic diversity through the analysis of massively parallel sequencing (MPS) data offers enormous potential to significantly improve our understanding of the genetic basis for observed phenotypes, including predisposition to and progression of complex human disease. Great challenges remain in resolving genetic variants that are genuine from the millions of artefactual signals. Results: FAVR is a suite of new methods designed to work with commonly used MPS analysis pipelines to assist in the resolution of some of the issues related to the analysis of the vast amount of resulting data, with a focus on relatively rare genetic variants. To the best of our knowledge, no equivalent method has previously been described. The most important and novel aspect of FAVR is the use of signatures in comparator sequence alignment files during variant filtering, and annotation of variants potentially shared between individuals. The FAVR methods use these signatures to facilitate filtering of (i) platform and/or mapping-specific artefacts, (ii) common genetic variants, and, where relevant, (iii) artefacts derived from imbalanced paired-end sequencing, as well as annotation of genetic variants based on evidence of co-occurrence in individuals. We applied conventional variant calling to whole-exome sequencing datasets, produced using both SOLiD and TruSeq chemistries, with or without downstream processing by FAVR methods. We demonstrate a 3-fold smaller rare single nucleotide variant shortlist with no detected reduction in sensitivity. This analysis included Sanger sequencing of rare variant signals not evident in dbSNP131, assessment of known variant signal preservation, and comparison of observed and expected rare variant numbers across a range of first cousin pairs. The principles described herein were applied in our recent publication identifying XRCC2 as a new breast cancer risk gene and have been made publicly available as a suite of software

  1. Rare-Variant Association Analysis: Study Designs and Statistical Tests

    PubMed Central

    Lee, Seunggeung; Abecasis, Gonçalo R.; Boehnke, Michael; Lin, Xihong

    2014-01-01

    Despite the extensive discovery of trait- and disease-associated common variants, much of the genetic contribution to complex traits remains unexplained. Rare variants can explain additional disease risk or trait variability. An increasing number of studies are underway to identify trait- and disease-associated rare variants. In this review, we provide an overview of statistical issues in rare-variant association studies with a focus on study designs and statistical tests. We present the design and analysis pipeline of rare-variant studies and review cost-effective sequencing designs and genotyping platforms. We compare various gene- or region-based association tests, including burden tests, variance-component tests, and combined omnibus tests, in terms of their assumptions and performance. Also discussed are the related topics of meta-analysis, population-stratification adjustment, genotype imputation, follow-up studies, and heritability due to rare variants. We provide guidelines for analysis and discuss some of the challenges inherent in these studies and future research directions. PMID:24995866
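
    As a rough illustration of the burden-test family mentioned above, the sketch below collapses all rare variants in a gene into a single carrier indicator per individual and compares carrier counts between cases and controls with a one-sided Fisher's exact test. This is only one simple collapsing variant, not necessarily a method the review recommends; the genotype matrices are made-up toy data and the function names are hypothetical.

```python
from math import comb

def count_carriers(genotypes):
    """Count individuals carrying at least one rare allele in the region."""
    return sum(1 for row in genotypes if sum(row) > 0)

def fisher_one_sided(case_carriers, n_cases, control_carriers, n_controls):
    """P(carriers among cases >= observed) under the hypergeometric null."""
    total_carriers = case_carriers + control_carriers
    total = n_cases + n_controls
    upper = min(total_carriers, n_cases)
    favourable = sum(
        comb(total_carriers, x) * comb(total - total_carriers, n_cases - x)
        for x in range(case_carriers, upper + 1)
    )
    return favourable / comb(total, n_cases)

# Toy data (invented): rows are individuals, columns are rare variants,
# entries are minor-allele counts.
cases = [[1, 0, 0], [0, 1, 0], [0, 0, 1], [1, 0, 0], [0, 2, 0], [0, 0, 1],
         [0, 0, 0], [0, 0, 0], [0, 0, 0], [0, 0, 0]]
controls = [[0, 0, 1]] + [[0, 0, 0] for _ in range(9)]

p_value = fisher_one_sided(count_carriers(cases), len(cases),
                           count_carriers(controls), len(controls))
print(f"one-sided p = {p_value:.4f}")
```

    Collapsing assumes that most rare variants in the region act in the same direction; variance-component tests relax that assumption at the cost of power when it does hold.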

  2. Neonatal nonepileptic myoclonus is a prominent clinical feature of KCNQ2 gain-of-function variants R201C and R201H.

    PubMed

    Mulkey, Sarah B; Ben-Zeev, Bruria; Nicolai, Joost; Carroll, John L; Grønborg, Sabine; Jiang, Yong-Hui; Joshi, Nishtha; Kelly, Megan; Koolen, David A; Mikati, Mohamad A; Park, Kristen; Pearl, Phillip L; Scheffer, Ingrid E; Spillmann, Rebecca C; Taglialatela, Maurizio; Vieker, Silvia; Weckhuysen, Sarah; Cooper, Edward C; Cilio, Maria Roberta

    2017-03-01

    To analyze whether KCNQ2 R201C and R201H variants, which show atypical gain-of-function electrophysiologic properties in vitro, have a distinct clinical presentation and outcome. Ten children with heterozygous, de novo KCNQ2 R201C or R201H variants were identified worldwide, using an institutional review board (IRB)-approved KCNQ2 patient registry and database. We reviewed medical records and, where possible, interviewed parents and treating physicians using a structured, detailed phenotype inventory focusing on the neonatal presentation and subsequent course. Nine patients had encephalopathy from birth and presented with prominent startle-like myoclonus, which could be triggered by sound or touch. In seven patients, electroencephalography (EEG) was performed in the neonatal period and showed a burst-suppression pattern. However, myoclonus did not have an EEG correlate. In many patients the paroxysmal movements were misdiagnosed as seizures. Seven patients developed epileptic spasms in infancy. In all patients, EEG showed a slow background and multifocal epileptiform discharges later in life. Other prominent features included respiratory dysfunction (perinatal respiratory failure and/or chronic hypoventilation), hypomyelination, reduced brain volume, and profound developmental delay. One patient had a later onset, and sequencing indicated that a low abundance (~20%) R201C variant had arisen by postzygotic mosaicism. Heterozygous KCNQ2 R201C and R201H gain-of-function variants present with profound neonatal encephalopathy in the absence of neonatal seizures. Neonates present with nonepileptic myoclonus that is often misdiagnosed and treated as seizures. Prognosis is poor. This clinical presentation is distinct from the phenotype associated with loss-of-function variants, supporting the value of in vitro functional screening. These findings suggest that gain-of-function and loss-of-function variants need different targeted therapeutic approaches. Wiley Periodicals

  3. Impact of rare variants in ARHGAP29 to the etiology of oral clefts: role of loss-of-function vs missense variants.

    PubMed

    Savastano, C P; Brito, L A; Faria, Á C; Setó-Salvia, N; Peskett, E; Musso, C M; Alvizi, L; Ezquina, S A M; James, C; GOSgene; Beales, P; Lees, M; Moore, G E; Stanier, P; Passos-Bueno, M R

    2017-05-01

    Non-syndromic cleft lip with or without cleft palate (NSCL/P) is a prevalent, complex congenital malformation. Genome-wide association studies (GWAS) on NSCL/P have consistently identified association for the 1p22 region, in which ARHGAP29 has emerged as the main candidate gene. ARHGAP29 re-sequencing studies in NSCL/P patients have identified rare variants; however, their clinical impact is still unclear. In this study we identified 10 rare variants in ARHGAP29, including five missense, one in-frame deletion, and four loss-of-function (LoF) variants, in a cohort of 188 familial NSCL/P cases. A significant mutational burden was found for LoF (Sequence Kernel Association Test, p = 0.0005) but not for missense variants in ARHGAP29, suggesting that only LoF variants contribute to the etiology of NSCL/P. Penetrance was estimated as 59%, indicating that heterozygous LoF variants in ARHGAP29 confer a moderate risk to NSCL/P. The GWAS hits in IRF6 (rs642961) and 1p22 (rs560426 and rs4147811) do not seem to contribute to the penetrance of the phenotype, based on co-segregation analysis. Our data show that rare variants leading to haploinsufficiency of ARHGAP29 represent an important etiological clefting mechanism, and genetic testing for this gene might be taken into consideration in genetic counseling of familial cases. © 2016 John Wiley & Sons A/S. Published by John Wiley & Sons Ltd.

  4. Epidemiological dynamics of norovirus GII.4 variant New Orleans 2009.

    PubMed

    Medici, Maria Cristina; Tummolo, Fabio; De Grazia, Simona; Calderaro, Adriana; De Conto, Flora; Terio, Valentina; Chironna, Maria; Bonura, Floriana; Pucci, Marzia; Bányai, Kristián; Martella, Vito; Giammanco, Giovanni Maurizio

    2015-09-01

    Norovirus (NoV) is one of the major causes of diarrhoeal disease with epidemic, outbreak and sporadic patterns in humans of all ages worldwide. NoVs of genotype GII.4 cause nearly 80-90 % of all NoV infections in humans. Periodically, some GII.4 strains become predominant, generating major pandemic variants. Retrospective analysis of the GII.4 NoV strains detected in Italy between 2007 and 2013 indicated that the pandemic variant New Orleans 2009 emerged in Italy in the late 2009, became predominant in 2010-2011 and continued to circulate in a sporadic fashion until April 2013. Upon phylogenetic analysis based on the small diagnostic regions A and C, the late New Orleans 2009 NoVs circulating during 2011-2013 appeared to be genetically different from the early New Orleans 2009 strains that circulated in 2010. For a selection of strains, a 3.2 kb genome portion at the 3' end was sequenced. In the partial ORF1 and in the full-length ORF2 and ORF3, the 2011-2013 New Orleans NoVs comprised at least three distinct genetic subclusters. By comparison with sequences retrieved from the databases, these subclusters were also found to circulate globally, suggesting that the local circulation reflected repeated introductions of different strains, rather than local selection of novel viruses. Phylogenetic subclustering did not correlate with changes in residues located in predicted putative capsid epitopes, although several changes affected the P2 domain in epitopes A, C, D and E.

  5. N-terminal nesprin-2 variants regulate β-catenin signalling

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Zhang, Qiuping; Minaisah, Rose-Marie; Ferraro, Elisa

    2016-07-15

    The spatial compartmentalisation of biochemical signalling pathways is essential for cell function. Nesprins are a multi-isomeric family of proteins that have emerged as signalling scaffolds. Herein, we investigate the localisation and function of novel nesprin-2 N-terminal variants. We show that these nesprin-2 variants display cell specific distribution and reside in both the cytoplasm and nucleus. Immunofluorescence microscopy revealed that nesprin-2 N-terminal variants colocalised with β-catenin at cell-cell junctions in U2OS cells. Calcium switch assays demonstrated that nesprin-2 and β-catenin are lost from cell-cell junctions in low calcium conditions whereas emerin localisation at the NE remained unaltered. Furthermore, an N-terminal fragment of nesprin-2 was sufficient for cell-cell junction localisation and interacted with β-catenin. Disruption of these N-terminal nesprin-2 variants, using siRNA depletion, resulted in loss of β-catenin from cell-cell junctions, nuclear accumulation of active β-catenin and augmented β-catenin transcriptional activity. Importantly, we show that U2OS cells lack nesprin-2 giant, suggesting that the N-terminal nesprin-2 variants regulate β-catenin signalling independently of the NE. Together, these data identify N-terminal nesprin-2 variants as novel regulators of β-catenin signalling that tether β-catenin to cell-cell contacts to inhibit β-catenin transcriptional activity. Highlights: • N-terminal nesprin-2 variants display cell specific expression patterns. • N-terminal spectrin repeats of nesprin-2 interact with β-catenin. • N-terminal nesprin-2 variants scaffold β-catenin at cell-cell junctions. • Nesprin-2 variants play multiple roles in β-catenin signalling.

  6. Investigation of mutations in the HBB gene using the 1,000 genomes database.

    PubMed

    Carlice-Dos-Reis, Tânia; Viana, Jaime; Moreira, Fabiano Cordeiro; Cardoso, Greice de Lemos; Guerreiro, João; Santos, Sidney; Ribeiro-Dos-Santos, Ândrea

    2017-01-01

    Mutations in the HBB gene are responsible for several serious hemoglobinopathies, such as sickle cell anemia and β-thalassemia. Sickle cell anemia is one of the most common monogenic diseases worldwide. Due to its prevalence, diverse strategies have been developed for a better understanding of its molecular mechanisms. In silico analysis has been increasingly used to investigate the genotype-phenotype relationship of many diseases, and the sequences of healthy individuals deposited in the 1,000 Genomes database appear to be an excellent tool for such analysis. The objective of this study is to analyze the variations in the HBB gene in the 1,000 Genomes database, to describe the mutation frequencies in the different population groups, and to investigate the pattern of pathogenicity. The computational tool SNPEFF was used to align the data from 2,504 samples of the 1,000 Genomes database with the HG19 genome reference. The pathogenicity of each amino acid change was investigated using the databases CLINVAR, dbSNP and HbVar and five different predictors. Twenty different mutations were found in 209 healthy individuals. The African group had the highest number of individuals with mutations, and the European group had the lowest number. Thus, it is concluded that approximately 8.3% of phenotypically healthy individuals from the 1,000 Genomes database have some mutation in the HBB gene. The frequency of mutated genes was estimated at 0.042, so that the expected frequency of being homozygous or compound heterozygous for these variants in the next generation is approximately 0.002. In total, 193 subjects had a non-synonymous mutation, of which 186 (7.4%) had a deleterious mutation. Considering that the 1,000 Genomes database is representative of the world's population, it can be estimated that fourteen out of every 10,000 individuals in the world will have a hemoglobinopathy in the next generation.
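
    The frequencies quoted in this abstract can be retraced with a few lines of arithmetic. The sketch below uses only the counts given in the abstract and rests on two assumptions that the abstract does not state explicitly: every carrier is heterozygous, and mating is random (Hardy-Weinberg proportions).

```python
# Counts quoted in the abstract.
n_samples = 2504    # individuals analysed from the 1,000 Genomes data
n_carriers = 209    # healthy individuals with at least one HBB mutation

# Share of individuals carrying a mutation (abstract: about 8.3%).
carrier_fraction = n_carriers / n_samples

# Assumption: each carrier is heterozygous, so each contributes one mutated
# allele out of 2 * n_samples chromosomes (abstract: 0.042).
allele_freq = n_carriers / (2 * n_samples)

# Assumption: random mating, so the chance of inheriting two mutated alleles
# is the square of the allele frequency (abstract: about 0.002).
two_mutated_alleles = allele_freq ** 2

print(f"carriers:            {carrier_fraction:.3f}")
print(f"allele frequency:    {allele_freq:.3f}")
print(f"two mutated alleles: {two_mutated_alleles:.4f}")
```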

  7. Web based aphasia test using service oriented architecture (SOA)

    NASA Astrophysics Data System (ADS)

    Voos, J. A.; Vigliecca, N. S.; Gonzalez, E. A.

    2007-11-01

    Based on an aphasia test for Spanish speakers which analyzes the patient's basic resources of verbal communication, web-enabled software was developed to automate its execution. A clinical database was designed as a complement, in order to evaluate the antecedents (risk factors, pharmacological and medical backgrounds, neurological or psychiatric symptoms, anatomical and physiological characteristics of brain injury, etc.) which are necessary to carry out a multi-factor statistical analysis in different samples of patients. The automated test was developed following service oriented architecture and implemented in a web site which contains a test suite, which would allow both integrating the aphasia test with other neuropsychological instruments and increasing the site information available for scientific research. The test design, the database and the study of its psychometric properties (validity, reliability and objectivity) were made in conjunction with neuropsychological researchers, who participated actively in the software design, based on feedback from patients or other study subjects.

  8. Searching for missing heritability: Designing rare variant association studies

    PubMed Central

    Zuk, Or; Schaffner, Stephen F.; Samocha, Kaitlin; Do, Ron; Hechter, Eliana; Kathiresan, Sekar; Daly, Mark J.; Neale, Benjamin M.; Sunyaev, Shamil R.; Lander, Eric S.

    2014-01-01

    Genetic studies have revealed thousands of loci predisposing to hundreds of human diseases and traits, revealing important biological pathways and defining novel therapeutic hypotheses. However, the genes discovered to date typically explain less than half of the apparent heritability. Because efforts have largely focused on common genetic variants, one hypothesis is that much of the missing heritability is due to rare genetic variants. Studies of common variants are typically referred to as genomewide association studies, whereas studies of rare variants are often simply called sequencing studies. Because they are actually closely related, we use the terms common variant association study (CVAS) and rare variant association study (RVAS). In this paper, we outline the similarities and differences between RVAS and CVAS and describe a conceptual framework for the design of RVAS. We apply the framework to address key questions about the sample sizes needed to detect association, the relative merits of testing disruptive alleles vs. missense alleles, frequency thresholds for filtering alleles, the value of predictors of the functional impact of missense alleles, the potential utility of isolated populations, the value of gene-set analysis, and the utility of de novo mutations. The optimal design depends critically on the selection coefficient against deleterious alleles and thus varies across genes. The analysis shows that common variant and rare variant studies require similarly large sample collections. In particular, a well-powered RVAS should involve discovery sets with at least 25,000 cases, together with a substantial replication set. PMID:24443550

  9. NPS (Naval Postgraduate School) Supply Requisition Database - Interactive Software as an Alternative to Written Instructions.

    DTIC Science & Technology

    1986-03-01

    SRdb ... APPENDIX A: ABBREVIATIONS AND ACRONYMS; APPENDIX B: USER’S MANUAL; APPENDIX C: DATABASE... percentage of situations. The purpose of this paper is to examine and propose a software-oriented alternative to the current manual, instruction-driven... Department Customer Service Manual [Ref. 1] and the applicable NPS Comptroller instruction [Ref. 2]. Several modifications to these written guidelines

  10. Molecular characterization of feline calicivirus variants from multicat household and public animal shelter in Rio de Janeiro, Brazil.

    PubMed

    Pereira, Joylson de Jesus; Baumworcel, Natasha; Fioretti, Júlia Monassa; Domingues, Cinthya Fonseca; Moraes, Laís Fernandes de; Marinho, Robson Dos Santos Souza; Vieira, Maria Clara Rodrigues; Pinto, Ana Maria Viana; de Castro, Tatiana Xavier

    2018-02-28

    The aim of this study was to perform the molecular characterization of conserved and variable regions of the feline calicivirus capsid genome in order to investigate the molecular diversity of variants in the Brazilian cat population. Twenty-six conjunctival samples from cats living in five public short-term animal shelters and three multicat life-long households were analyzed. Fifteen cats had conjunctivitis, three had oral ulceration, eight had respiratory signs (cough, sneeze and nasal discharge) and nine were asymptomatic. Feline caliciviruses were isolated in CRFK cells and characterized by reverse transcription PCR targeting both conserved and variable regions of open reading frame 2. The amplicons obtained were sequenced. A phylogenetic analysis along with most of the prototypes available in the GenBank database and an amino acid analysis were performed. Phylogenetic analysis based on both conserved and variable regions revealed two clusters with aLTR values of 1.00 and 0.98, respectively, and the variants from this study belong to feline calicivirus genogroup I. No association between geographical distribution and/or clinical signs and clustering in the phylogenetic tree was observed. The variants circulating in public short-term animal shelters demonstrated high variability because the relatively rapid turnover of carrier cats constantly introduced multiple viruses into these locations over time. Copyright © 2018 Sociedade Brasileira de Microbiologia. Published by Elsevier Editora Ltda. All rights reserved.

  11. The TREAT-NMD DMD Global Database: Analysis of More than 7,000 Duchenne Muscular Dystrophy Mutations

    PubMed Central

    Bladen, Catherine L; Salgado, David; Monges, Soledad; Foncuberta, Maria E; Kekou, Kyriaki; Kosma, Konstantina; Dawkins, Hugh; Lamont, Leanne; Roy, Anna J; Chamova, Teodora; Guergueltcheva, Velina; Chan, Sophelia; Korngut, Lawrence; Campbell, Craig; Dai, Yi; Wang, Jen; Barišić, Nina; Brabec, Petr; Lahdetie, Jaana; Walter, Maggie C; Schreiber-Katz, Olivia; Karcagi, Veronika; Garami, Marta; Viswanathan, Venkatarman; Bayat, Farhad; Buccella, Filippo; Kimura, En; Koeks, Zaïda; van den Bergen, Janneke C; Rodrigues, Miriam; Roxburgh, Richard; Lusakowska, Anna; Kostera-Pruszczyk, Anna; Zimowski, Janusz; Santos, Rosário; Neagu, Elena; Artemieva, Svetlana; Rasic, Vedrana Milic; Vojinovic, Dina; Posada, Manuel; Bloetzer, Clemens; Jeannet, Pierre-Yves; Joncourt, Franziska; Díaz-Manera, Jordi; Gallardo, Eduard; Karaduman, A Ayşe; Topaloğlu, Haluk; El Sherif, Rasha; Stringer, Angela; Shatillo, Andriy V; Martin, Ann S; Peay, Holly L; Bellgard, Matthew I; Kirschner, Jan; Flanigan, Kevin M; Straub, Volker; Bushby, Kate; Verschuuren, Jan; Aartsma-Rus, Annemieke; Béroud, Christophe; Lochmüller, Hanns

    2015-01-01

    Analyzing the type and frequency of patient-specific mutations that give rise to Duchenne muscular dystrophy (DMD) is an invaluable tool for diagnostics, basic scientific research, trial planning, and improved clinical care. Locus-specific databases allow for the collection, organization, storage, and analysis of genetic variants of disease. Here, we describe the development and analysis of the TREAT-NMD DMD Global database (http://umd.be/TREAT_DMD/). We analyzed genetic data for 7,149 DMD mutations held within the database. A total of 5,682 large mutations were observed (80% of total mutations), of which 4,894 (86%) were deletions (1 exon or larger) and 784 (14%) were duplications (1 exon or larger). There were 1,445 small mutations (smaller than 1 exon, 20% of all mutations), of which 358 (25%) were small deletions and 132 (9%) small insertions and 199 (14%) affected the splice sites. Point mutations totalled 756 (52% of small mutations) with 726 (50%) nonsense mutations and 30 (2%) missense mutations. Finally, 22 (0.3%) mid-intronic mutations were observed. In addition, mutations were identified within the database that would potentially benefit from novel genetic therapies for DMD including stop codon read-through therapies (10% of total mutations) and exon skipping therapy (80% of deletions and 55% of total mutations). PMID:25604253
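
    The category counts in this abstract can be cross-checked against each other. The sketch below re-derives the quoted percentages from the counts given in the abstract; the variable and function names are illustrative only.

```python
# Counts quoted in the abstract.
total = 7149
large = 5682
small = 1445
mid_intronic = 22
large_classes = {"deletions": 4894, "duplications": 784}
small_classes = {"small deletions": 358, "small insertions": 132,
                 "splice site": 199, "point mutations": 756}

def pct(part, whole):
    """Percentage of `whole` made up by `part`, rounded to a whole number."""
    return round(100 * part / whole)

# The three top-level categories add up to the database total.
assert large + small + mid_intronic == total
# The four small-mutation classes add up to the small-mutation total.
assert sum(small_classes.values()) == small
# Deletions plus duplications fall four short of the large-mutation total;
# the abstract does not say how those four are classified.
unclassified_large = large - sum(large_classes.values())

print(f"large: {100 * large / total:.1f}% of all mutations")
for name, count in large_classes.items():
    print(f"  {name}: {pct(count, large)}% of large mutations")
for name, count in small_classes.items():
    print(f"  {name}: {pct(count, small)}% of small mutations")
```

    The share of large mutations comes out just under 79.5%, which the abstract reports as 80%; the remaining percentages match the quoted values after rounding.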

  12. Dual tuning in creative processes: Joint contributions of intrinsic and extrinsic motivational orientations.

    PubMed

    Gong, Yaping; Wu, Junfeng; Song, Lynda Jiwen; Zhang, Zhen

    2017-05-01

    Intrinsic and extrinsic motivational orientations often coexist and can serve important functions. We develop and test a model in which intrinsic and extrinsic motivational orientations interact positively to influence personal creativity goal. Personal creativity goal, in turn, has a positive relationship with incremental creativity and an inverted U-shaped relationship with radical creativity. In a pilot study, we validated the personal creativity goal measure using 180 (Sample 1) and 69 (Sample 2) employees from a consulting firm. In the primary study, we tested the overall model using a sample of 657 research and development employees and their direct supervisors from an automobile firm. The results support the hypothesized model and yield several new insights. Intrinsic and extrinsic motivational orientations synergize with each other to strengthen personal creativity goal. Personal creativity goal in turn benefits incremental and radical creativity, but only up to a certain point for the latter. In addition to its linear indirect relationship with incremental creativity, intrinsic motivational orientation has an inverted U-shaped indirect relationship with radical creativity via personal creativity goal. (PsycINFO Database Record (c) 2017 APA, all rights reserved).

  13. HFE gene variants affect iron in the brain.

    PubMed

    Nandar, Wint; Connor, James R

    2011-04-01

    Iron accumulation in the brain and increased oxidative stress are consistent observations in many neurodegenerative diseases. Thus, we have begun examination into gene mutations or allelic variants that could be associated with loss of iron homeostasis. One of the mechanisms leading to iron overload is a mutation in the HFE gene, which is involved in iron metabolism. The 2 most common HFE gene variants are C282Y (1.9%) and H63D (8.9%). The C282Y HFE variant is more commonly associated with hereditary hemochromatosis, which is an autosomal recessive disorder, characterized by iron overload in a number of systemic organs. The H63D HFE variant appears less frequently associated with hemochromatosis, but its role in the neurodegenerative diseases has received more attention. At the cellular level, the HFE mutant protein resulting from the H63D HFE gene variant is associated with iron dyshomeostasis, increased oxidative stress, glutamate release, tau phosphorylation, and alteration in inflammatory response, each of which is under investigation as a contributing factor to neurodegenerative diseases. Therefore, the HFE gene variants are proposed to be genetic modifiers or a risk factor for neurodegenerative diseases by establishing an enabling milieu for pathogenic agents. This review will discuss the current knowledge of the association of the HFE gene variants with neurodegenerative diseases: amyotrophic lateral sclerosis, Alzheimer's disease, Parkinson's disease, and ischemic stroke. Importantly, the data herein also begin to dispel the long-held view that the brain is protected from iron accumulation associated with the HFE mutations.

  14. Using high-resolution variant frequencies to empower clinical genome interpretation.

    PubMed

    Whiffin, Nicola; Minikel, Eric; Walsh, Roddy; O'Donnell-Luria, Anne H; Karczewski, Konrad; Ing, Alexander Y; Barton, Paul J R; Funke, Birgit; Cook, Stuart A; MacArthur, Daniel; Ware, James S

    2017-10-01

    Purpose: Whole-exome and whole-genome sequencing have transformed the discovery of genetic variants that cause human Mendelian disease, but discriminating pathogenic from benign variants remains a daunting challenge. Rarity is recognized as a necessary, although not sufficient, criterion for pathogenicity, but frequency cutoffs used in Mendelian analysis are often arbitrary and overly lenient. Recent very large reference datasets, such as the Exome Aggregation Consortium (ExAC), provide an unprecedented opportunity to obtain robust frequency estimates even for very rare variants. Methods: We present a statistical framework for the frequency-based filtering of candidate disease-causing variants, accounting for disease prevalence, genetic and allelic heterogeneity, inheritance mode, penetrance, and sampling variance in reference datasets. Results: Using the example of cardiomyopathy, we show that our approach reduces by two-thirds the number of candidate variants under consideration in the average exome, without removing true pathogenic variants (false-positive rate < 0.001). Conclusion: We outline a statistically robust framework for assessing whether a variant is "too common" to be causative for a Mendelian disorder of interest. We present precomputed allele frequency cutoffs for all variants in the ExAC dataset.
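
    The abstract does not give the calculation itself. One way such a cutoff might be sketched for a dominant disorder is shown below: a maximum credible allele frequency is derived from prevalence, the largest share of cases attributable to a single variant, and penetrance, and a Poisson bound then allows for sampling variance in the reference dataset. The formula's exact form, the function names, and all input values (prevalence 1 in 500, 2% allelic contribution, 50% penetrance, 121,412 reference chromosomes) are assumptions chosen for illustration, not figures from the abstract.

```python
import math

def max_credible_af(prevalence, max_allelic_contribution, penetrance):
    """Highest population allele frequency compatible with a dominant disorder.

    Assumed form: the share of the population affected because of this one
    variant, divided by penetrance, then halved to convert individuals into
    chromosomes.
    """
    return prevalence * max_allelic_contribution / penetrance / 2

def max_tolerated_allele_count(af, chromosomes, confidence=0.95):
    """Smallest count k whose Poisson CDF reaches `confidence`.

    The Poisson mean is the number of alleles expected in the reference
    dataset if the true frequency equals `af`.
    """
    mean = af * chromosomes
    term = math.exp(-mean)      # P(X = 0)
    cumulative = term
    k = 0
    while cumulative < confidence:
        k += 1
        term *= mean / k        # P(X = k) from P(X = k - 1)
        cumulative += term
    return k

# Illustrative inputs only (assumptions, see lead-in).
af = max_credible_af(prevalence=1 / 500,
                     max_allelic_contribution=0.02,
                     penetrance=0.5)
ac = max_tolerated_allele_count(af, chromosomes=121_412)
print(f"max credible AF = {af:.1e}, max tolerated allele count = {ac}")
```

    A variant observed more often than the tolerated count in the reference dataset would be treated as too common to cause the disorder under the stated assumptions.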

  15. Improved genetic counseling in Alport syndrome by new variants of COL4A5 gene.

    PubMed

    Fernandez-Rosado, Francisco; Campos, Ana; Alvarez-Cubero, Maria Jesus; Ruiz, Ana; Entrala-Bernal, Carmen

    2015-07-01

    Genetic databases are currently needed to offer better genetic assistance to patients with certain syndromes, especially those with X-linked inheritance patterns (such as Alport syndrome), because of the high probability of having descendants affected by the disease. We describe the first reported case of the COL4A5 gene missense c.1499 G>T mutation in a 16-year-old girl confirmed to be affected by Alport syndrome after genetic counseling. Next-generation sequencing made it possible to discover this mutation and to offer this patient accurate clinical treatment. Current scientific understanding of genetic syndromes points to the importance of up-to-date databases and of including variants of unknown significance related to clinical cases. Such updating could give patients a better chance of diagnosis and of receiving genetic and clinical counseling. This is even more important for women planning to start a family, who need correct genetic counseling regarding the risk posed to offspring and to support the decision to undergo prenatal testing. © 2015 Asian Pacific Society of Nephrology.

  16. sapFinder: an R/Bioconductor package for detection of variant peptides in shotgun proteomics experiments.

    PubMed

    Wen, Bo; Xu, Shaohang; Sheynkman, Gloria M; Feng, Qiang; Lin, Liang; Wang, Quanhui; Xu, Xun; Wang, Jun; Liu, Siqi

    2014-11-01

    Single nucleotide variations (SNVs) located within a reading frame can result in single amino acid polymorphisms (SAPs), leading to alteration of the corresponding amino acid sequence as well as function of a protein. Accurate detection of SAPs is an important issue in proteomic analysis at the experimental and bioinformatic level. Herein, we present sapFinder, an R software package, for detection of the variant peptides based on tandem mass spectrometry (MS/MS)-based proteomics data. This package automates the construction of variation-associated databases from public SNV repositories or sample-specific next-generation sequencing (NGS) data and the identification of SAPs through database searching, post-processing and generation of HTML-based report with visualized interface. sapFinder is implemented as a Bioconductor package in R. The package and the vignette can be downloaded at http://bioconductor.org/packages/devel/bioc/html/sapFinder.html and are provided under a GPL-2 license. © The Author 2014. Published by Oxford University Press.
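The idea behind a variation-associated search database can be sketched in a few lines. This is a conceptual illustration in Python, not sapFinder's R API. It assumes a simplified trypsin rule (cleave after K or R unless the next residue is P) with no missed cleavages, and the helper names and the toy protein sequence are hypothetical.

```python
def apply_sap(protein, position, ref, alt):
    """Return the protein with a single amino acid substitution applied.

    `position` is 1-based; the reference residue is checked first.
    """
    if protein[position - 1] != ref:
        raise ValueError("reference residue does not match the sequence")
    return protein[:position - 1] + alt + protein[position:]


def tryptic_digest(sequence):
    """Split a sequence after K or R, except when the next residue is P."""
    peptides = []
    start = 0
    for i, residue in enumerate(sequence):
        next_residue = sequence[i + 1] if i + 1 < len(sequence) else ""
        if residue in "KR" and next_residue != "P":
            peptides.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        peptides.append(sequence[start:])
    return peptides


def variant_peptides(protein, position, ref, alt):
    """Peptides present only in the digest of the substituted protein."""
    reference = set(tryptic_digest(protein))
    mutated = tryptic_digest(apply_sap(protein, position, ref, alt))
    return [peptide for peptide in mutated if peptide not in reference]


# Toy example: substitute Y with H at position 5 of a made-up sequence.
print(variant_peptides("MKTAYIAKQR", 5, "Y", "H"))
```

Appending such variant-only peptides to the reference protein database lets a conventional MS/MS search engine match spectra that arise from SAP-containing peptides.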

  17. Bulk and Thin Film Synthesis of Compositionally Variant Entropy-stabilized Oxides.

    PubMed

    Sivakumar, Sai; Zwier, Elizabeth; Meisenheimer, Peter Benjamin; Heron, John T

    2018-05-29

    Here, we present a procedure for the synthesis of bulk and thin film multicomponent (Mg0.25(1-x)CoxNi0.25(1-x)Cu0.25(1-x)Zn0.25(1-x))O (Co variant) and (Mg0.25(1-x)Co0.25(1-x)Ni0.25(1-x)CuxZn0.25(1-x))O (Cu variant) entropy-stabilized oxides. Phase pure and chemically homogeneous (Mg0.25(1-x)CoxNi0.25(1-x)Cu0.25(1-x)Zn0.25(1-x))O (x = 0.20, 0.27, 0.33) and (Mg0.25(1-x)Co0.25(1-x)Ni0.25(1-x)CuxZn0.25(1-x))O (x = 0.11, 0.27) ceramic pellets are synthesized and used in the deposition of ultra-high quality, phase pure, single crystalline thin films of the target stoichiometry. A detailed methodology for the deposition of smooth, chemically homogeneous, entropy-stabilized oxide thin films by pulsed laser deposition on (001)-oriented MgO substrates is described. The phase and crystallinity of bulk and thin film materials are confirmed using X-ray diffraction. Composition and chemical homogeneity are confirmed by X-ray photoelectron spectroscopy and energy dispersive X-ray spectroscopy. The surface topography of thin films is measured with scanning probe microscopy. The synthesis of high quality, single crystalline, entropy-stabilized oxide thin films enables the study of interface, size, strain, and disorder effects on the properties in this new class of highly disordered oxide materials.

  18. Differential Expression Profile of ZFX Variants Discriminates Breast Cancer Subtypes

    PubMed

    Pourkeramati, Fatemeh; Asadi, Malek Hossein; Shakeri, Shahryar; Farsinejad, Alireza

    2018-05-13

    ZFX is a transcriptional regulator in embryonic stem cells that plays an important role in pluripotency and self-renewal. ZFX is widely expressed in pluripotent stem cells and is down-regulated during differentiation of embryonic stem cells. ZFX has five different variants that encode three different protein isoforms. While several reports have determined the overexpression of ZFX in a variety of somatic cancers, the expression of ZFX-spliced variants in cancer cells is not well understood. We investigated the expression of ZFX variants in a series of breast cancer tissues and cell lines using quantitative PCR. The expression of ZFX variant 1/3 was higher in tumor tissue compared to marginal tissue. In contrast, ZFX variant 5 was down-regulated in tumor tissues. While ZFX variant 1/3 and ZFX variant 5 expression significantly increased in low-grade tumors, ZFX variant 4 was strongly expressed in high-grade tumors and in tumors demonstrating lymphatic invasion. In addition, our results revealed a significant association between the HER2 status and the expression of ZFX-spliced variants. Our data suggest that the expression of ZFX-spliced transcripts varies between different types of breast cancer and may contribute to their tumorigenesis process. Hence, ZFX-spliced transcripts could be considered as novel tumor markers with a probable value in diagnosis, prognosis, and therapy of breast cancer.

  19. Genotype–phenotype correlations in individuals with pathogenic RERE variants

    PubMed Central

    Jordan, Valerie K.; Fregeau, Brieana; Ge, Xiaoyan; Giordano, Jessica; Wapner, Ronald J.; Balci, Tugce B.; Carter, Melissa T.; Bernat, John A.; Moccia, Amanda N.; Srivastava, Anshika; Martin, Donna M.; Bielas, Stephanie L.; Pappas, John; Svoboda, Melissa D.; Rio, Marlène; Boddaert, Nathalie; Cantagrel, Vincent; Lewis, Andrea M.; Scaglia, Fernando; Kohler, Jennefer N.; Bernstein, Jonathan A.; Dries, Annika M.; Rosenfeld, Jill A.; DeFilippo, Colette; Thorson, Willa; Yang, Yaping; Sherr, Elliott H.; Bi, Weimin; Scott, Daryl A.

    2018-01-01

    Heterozygous variants in the arginine-glutamic acid dipeptide repeats gene (RERE) have been shown to cause neurodevelopmental disorder with or without anomalies of the brain, eye, or heart (NEDBEH). Here, we report nine individuals with NEDBEH who carry partial deletions or deleterious sequence variants in RERE. These variants were found to be de novo in all cases in which parental samples were available. An analysis of data from individuals with NEDBEH suggests that point mutations affecting the Atrophin-1 domain of RERE are associated with an increased risk of structural eye defects, congenital heart defects, renal anomalies, and sensorineural hearing loss when compared with loss-of-function variants that are likely to lead to haploinsufficiency. A high percentage of RERE pathogenic variants affect a histidine-rich region in the Atrophin-1 domain. We have also identified a recurrent two-amino-acid duplication in this region that is associated with the development of a CHARGE syndrome-like phenotype. We conclude that mutations affecting RERE result in a spectrum of clinical phenotypes. Genotype–phenotype correlations exist and can be used to guide medical decision making. Consideration should also be given to screening for RERE variants in individuals who fulfill diagnostic criteria for CHARGE syndrome but do not carry pathogenic variants in CHD7. PMID:29330883

  20. Visual Search for Object Orientation Can Be Modulated by Canonical Orientation

    ERIC Educational Resources Information Center

    Ballaz, Cecile; Boutsen, Luc; Peyrin, Carole; Humphreys, Glyn W.; Marendaz, Christian

    2005-01-01

    The authors studied the influence of canonical orientation on visual search for object orientation. Displays consisted of pictures of animals whose axis of elongation was either vertical or tilted in their canonical orientation. Target orientation could be either congruent or incongruent with the object's canonical orientation. In Experiment 1,…