The annotation-enriched non-redundant patent sequence databases.
Li, Weizhong; Kondratowicz, Bartosz; McWilliam, Hamish; Nauche, Stephane; Lopez, Rodrigo
2013-01-01
The EMBL-European Bioinformatics Institute (EMBL-EBI) offers public access to patent sequence data, providing a valuable service to the intellectual property and scientific communities. The non-redundant (NR) patent sequence databases comprise two-level nucleotide and protein sequence clusters (NRNL1, NRNL2, NRPL1 and NRPL2) based on sequence identity (level-1) and patent family (level-2). Annotation from the source entries in these databases is merged and enhanced with additional information from the patent literature and biological context. Corrections in patent publication numbers, kind-codes and patent equivalents significantly improve the data quality. Data are available through various user interfaces including web browser, downloads via FTP, SRS, Dbfetch and EBI-Search. Sequence similarity/homology searches against the databases are available using BLAST, FASTA and PSI-Search. In this article, we describe the data collection and annotation and also outline major changes and improvements introduced since 2009. Apart from data growth, these changes include additional annotation for singleton clusters, the identifier versioning for tracking entry change and the entry mappings between the two-level databases. Database URL: http://www.ebi.ac.uk/patentdata/nr/
The Annotation-enriched non-redundant patent sequence databases
Li, Weizhong; Kondratowicz, Bartosz; McWilliam, Hamish; Nauche, Stephane; Lopez, Rodrigo
2013-01-01
The EMBL-European Bioinformatics Institute (EMBL-EBI) offers public access to patent sequence data, providing a valuable service to the intellectual property and scientific communities. The non-redundant (NR) patent sequence databases comprise two-level nucleotide and protein sequence clusters (NRNL1, NRNL2, NRPL1 and NRPL2) based on sequence identity (level-1) and patent family (level-2). Annotation from the source entries in these databases is merged and enhanced with additional information from the patent literature and biological context. Corrections in patent publication numbers, kind-codes and patent equivalents significantly improve the data quality. Data are available through various user interfaces including web browser, downloads via FTP, SRS, Dbfetch and EBI-Search. Sequence similarity/homology searches against the databases are available using BLAST, FASTA and PSI-Search. In this article, we describe the data collection and annotation and also outline major changes and improvements introduced since 2009. Apart from data growth, these changes include additional annotation for singleton clusters, the identifier versioning for tracking entry change and the entry mappings between the two-level databases. Database URL: http://www.ebi.ac.uk/patentdata/nr/ PMID:23396323
Non-redundant patent sequence databases with value-added annotations at two levels
Li, Weizhong; McWilliam, Hamish; de la Torre, Ana Richart; Grodowski, Adam; Benediktovich, Irina; Goujon, Mickael; Nauche, Stephane; Lopez, Rodrigo
2010-01-01
The European Bioinformatics Institute (EMBL-EBI) provides public access to patent data, including abstracts, chemical compounds and sequences. Sequences can appear multiple times due to the filing of the same invention with multiple patent offices, or the use of the same sequence by different inventors in different contexts. Information relating to the source invention may be incomplete, and biological information available in patent documents elsewhere may not be reflected in the annotation of the sequence. Search and analysis of these data have become increasingly challenging for both the scientific and intellectual-property communities. Here, we report a collection of non-redundant patent sequence databases, which cover the EMBL-Bank nucleotides patent class and the patent protein databases and contain value-added annotations from patent documents. The databases were created at two levels by the use of sequence MD5 checksums. Sequences within a level-1 cluster are 100% identical over their whole length. Level-2 clusters were defined by sub-grouping level-1 clusters based on patent family information. Value-added annotations, such as publication number corrections, earliest publication dates and feature collations, significantly enhance the quality of the data, allowing for better tracking and cross-referencing. The databases are available at http://www.ebi.ac.uk/patentdata/nr/. PMID:19884134
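The two-level clustering described in this abstract can be illustrated with a minimal Python sketch. It is not the EMBL-EBI pipeline: the entry tuples, the family identifiers and the normalisation step (stripping whitespace and upper-casing before hashing) are assumptions made for illustration only.

```python
import hashlib
from collections import defaultdict


def md5_of(seq):
    # Assumption: whitespace and letter case are normalised before hashing,
    # so that formatting differences do not change the checksum.
    return hashlib.md5("".join(seq.split()).upper().encode("ascii")).hexdigest()


def cluster_two_levels(entries):
    """entries: iterable of (entry_id, sequence, patent_family_id) tuples (hypothetical layout)."""
    # Level 1: entries whose sequences have the same MD5 checksum,
    # i.e. are identical over their whole length.
    level1 = defaultdict(list)
    for entry_id, seq, family in entries:
        level1[md5_of(seq)].append((entry_id, family))

    # Level 2: each level-1 cluster is sub-grouped by patent family.
    level2 = defaultdict(list)
    for checksum, members in level1.items():
        for entry_id, family in members:
            level2[(checksum, family)].append(entry_id)
    return dict(level1), dict(level2)


# Toy data; identifiers and families are invented.
entries = [
    ("E1", "ATGCATGC", "FAM_A"),
    ("E2", "atgc atgc", "FAM_A"),
    ("E3", "ATGCATGC", "FAM_B"),
    ("E4", "ATGCATGG", "FAM_B"),
]
level1, level2 = cluster_two_levels(entries)
print(len(level1), len(level2))
```

In this toy input, E1, E2 and E3 share one level-1 cluster because their normalised sequences are identical, and that cluster splits into two level-2 clusters because the entries belong to two patent families.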
Lee, Byungwook; Kim, Taehyung; Kim, Seon-Kyu; Lee, Kwang H; Lee, Doheon
2007-01-01
With the advent of automated and high-throughput techniques, the number of patent applications containing biological sequences has been increasing rapidly. However, they have attracted relatively little attention compared to other sequence resources. We have built a database server called Patome, which contains biological sequence data disclosed in patents and published applications, as well as their analysis information. The analysis is divided into two steps. The first is an annotation step in which the disclosed sequences were annotated with RefSeq database. The second is an association step where the sequences were linked to Entrez Gene, OMIM and GO databases, and their results were saved as a gene-patent table. From the analysis, we found that 55% of human genes were associated with patenting. The gene-patent table can be used to identify whether a particular gene or disease is related to patenting. Patome is available at http://www.patome.org/; the information is updated bimonthly.
Lee, Byungwook; Kim, Taehyung; Kim, Seon-Kyu; Lee, Kwang H.; Lee, Doheon
2007-01-01
With the advent of automated and high-throughput techniques, the number of patent applications containing biological sequences has been increasing rapidly. However, they have attracted relatively little attention compared to other sequence resources. We have built a database server called Patome, which contains biological sequence data disclosed in patents and published applications, as well as their analysis information. The analysis is divided into two steps. The first is an annotation step in which the disclosed sequences were annotated with RefSeq database. The second is an association step where the sequences were linked to Entrez Gene, OMIM and GO databases, and their results were saved as a gene–patent table. From the analysis, we found that 55% of human genes were associated with patenting. The gene–patent table can be used to identify whether a particular gene or disease is related to patenting. Patome is available at http://www.patome.org/; the information is updated bimonthly. PMID:17085479
Aguilera-Mendoza, Longendri; Marrero-Ponce, Yovani; Tellez-Ibarra, Roberto; Llorente-Quesada, Monica T; Salgado, Jesús; Barigye, Stephen J; Liu, Jun
2015-08-01
The large variety of antimicrobial peptide (AMP) databases developed to date are characterized by a substantial overlap of data and similarity of sequences. Our goals are to analyze the levels of redundancy for all available AMP databases and use this information to build a new non-redundant sequence database. For this purpose, a new software tool is introduced. A comparative study of 25 AMP databases reveals the overlap and diversity among them and the internal diversity within each database. The overlap analysis shows that only one database (Peptaibol) contains exclusive data, not present in any other, whereas all sequences in the LAMP_Patent database are included in CAMP_Patent. However, the majority of databases have their own set of unique sequences, as well as some overlap with other databases. The complete set of non-duplicate sequences comprises 16 990 cases, which is almost half of the total number of reported peptides. On the other hand, the diversity analysis identifies the most and least diverse databases and proves that all databases exhibit some level of redundancy. Finally, we present a new parallel-free software, named Dover Analyzer, developed to compute the overlap and diversity between any number of databases and compile a set of non-redundant sequences. These results are useful for selecting or building a suitable representative set of AMPs, according to specific needs. © The Author 2015. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
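The overlap analysis described above can be approximated with plain set operations. The sketch below is not Dover Analyzer and does not reproduce its diversity measures; it assumes exact string identity (after upper-casing) as the duplicate criterion, and the database names and peptide strings are invented.

```python
def overlap_report(databases):
    """databases: dict mapping a database name to an iterable of peptide sequences."""
    # Assumption: two entries are duplicates when their upper-cased sequences are identical.
    sets = {name: {s.upper() for s in seqs} for name, seqs in databases.items()}

    # Non-redundant collection: the union of all databases.
    non_redundant = set().union(*sets.values())

    # Exclusive content: sequences found in one database and in no other.
    exclusive = {}
    for name, seqs in sets.items():
        others = set().union(*(s for n, s in sets.items() if n != name))
        exclusive[name] = seqs - others
    return non_redundant, exclusive


# Toy input; names and sequences are hypothetical.
dbs = {
    "A": ["KKLL", "GIGK"],
    "B": ["GIGK"],
    "C": ["kkll", "RRWW"],
}
non_redundant, exclusive = overlap_report(dbs)
print(sorted(non_redundant))
print({name: sorted(seqs) for name, seqs in exclusive.items()})
```

Here database B is fully contained in A, which mirrors the kind of inclusion the abstract reports for LAMP_Patent and CAMP_Patent, while only C contributes a sequence found nowhere else.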
PatGen--a consolidated resource for searching genetic patent sequences.
Rouse, Richard J D; Castagnetto, Jesus; Niedner, Roland H
2005-04-15
Compared to the wealth of online resources covering genomic, proteomic and derived data, the bioinformatics community is rather underserved when it comes to patent information related to biological sequences. The current online resources are either incomplete or rather expensive. This paper describes PatGen, an integrated database containing data from bioinformatic and patent resources. This effort addresses the inconsistency of publicly available genetic patent data coverage by providing access to a consolidated dataset. PatGen can be searched at http://www.patgendb.com. Contact: rjdrouse@patentinformatics.com.
Genomics dataset on unclassified published organism (patent US 7547531).
Khan Shawan, Mohammad Mahfuz Ali; Hasan, Md Ashraful; Hossain, Md Mozammel; Hasan, Md Mahmudul; Parvin, Afroza; Akter, Salina; Uddin, Kazi Rasel; Banik, Subrata; Morshed, Mahbubul; Rahman, Md Nazibur; Rahman, S M Badier
2016-12-01
Nucleotide (DNA) sequence analysis provides important clues regarding the characteristics and taxonomic position of an organism, and is therefore crucial for learning about the hierarchical classification of that organism. This dataset (patent US 7547531) was chosen to simplify the complex raw data buried in undisclosed DNA sequences, which may help open doors for new collaborations. A total of 48 unidentified DNA sequences from patent US 7547531 were selected and their complete sequences were retrieved from the NCBI BioSample database. A quick response (QR) code for each DNA sequence was constructed with the DNA BarID tool. The QR code is useful for the identification and comparison of isolates with other organisms. The AT/GC content of the DNA sequences was determined using the ENDMEMO GC Content Calculator, which indicates their stability at different temperatures. The highest GC content was observed in GP445188 (62.5%), followed by GP445198 (61.8%) and GP445189 (59.44%), while the lowest was in GP445178 (24.39%). In addition, the New England BioLabs (NEB) database was used to identify the cleavage code, indicating the 5′, 3′ and blunt ends, and the enzyme code, indicating the methylation site of the DNA sequences. These data will be helpful for the construction of the organisms' hierarchical classification, determination of their phylogenetic and taxonomic position and revelation of their molecular characteristics.
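The GC-content figures quoted above are simple percentages. The Python sketch below shows one way to compute such a value; it is not the ENDMEMO calculator, and the choice to ignore any character other than A, C, G and T is an assumption. The example sequence is invented and is not taken from the patent.

```python
def gc_content(seq):
    """Return the GC percentage of a DNA sequence."""
    # Assumption: only unambiguous bases are counted; N and other symbols are skipped.
    bases = [b for b in seq.upper() if b in "ACGT"]
    if not bases:
        raise ValueError("sequence contains no A, C, G or T bases")
    gc = sum(1 for b in bases if b in "GC")
    return 100.0 * gc / len(bases)


# Hypothetical 8-base example: four of the eight bases are G or C.
print(gc_content("GGCCAATT"))
```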
Mapping the patent landscape of synthetic biology for fine chemical production pathways.
Carbonell, Pablo; Gök, Abdullah; Shapira, Philip; Faulon, Jean-Loup
2016-09-01
A goal of synthetic biology bio-foundries is to innovate through an iterative design/build/test/learn pipeline. In assessing the value of new chemical production routes, the intellectual property (IP) novelty of the pathway is important. Exploratory studies can be carried out using knowledge of the patent/IP landscape for synthetic biology and metabolic engineering. In this paper, we perform an assessment of pathways as potential targets for chemical production across the full catalogue of reachable chemicals in the extended metabolic space of chassis organisms, as computed by the retrosynthesis-based algorithm RetroPath. Our database for reactions processed by sequences in heterologous pathways was screened against the PatSeq database, a comprehensive collection of more than 150M sequences present in patent grants and applications. We also examine related patent families using Derwent Innovations. This large-scale computational study provides useful insights into the IP landscape of synthetic biology for fine and specialty chemicals production. © 2016 The Authors. Microbial Biotechnology published by John Wiley & Sons Ltd and Society for Applied Microbiology.
Literature and patent analysis of the cloning and identification of human functional genes in China.
Xia, Yan; Tang, LiSha; Yao, Lei; Wan, Bo; Yang, XianMei; Yu, Long
2012-03-01
The Human Genome Project was launched at the end of the 1980s. Since then, the cloning and identification of functional genes has been a major focus of research across the world. In China too, the potentially profound impact of such studies on the life sciences and on human health was realized, and relevant studies were initiated in the 1990s. To advance China's involvement in the Human Genome Project, in the mid-1990s, Committee of Experts in Biology from National High Technology Research and Development Program of China (863 Program) proposed the "two 1%" goal. This goal envisaged China contributing 1% of the total sequencing work, and cloning and identifying 1% of the total human functional genes. Over the past 20 years, tremendous achievement has been accomplished by Chinese scientists. It is well known that scientists in China finished the 1% of sequencing work of the Human Genome Project, whereas, there is no comprehensive report about "whether China had finished cloning and identifying 1% of human functional genes". In the present study, the GenBank database at the National Center of Biotechnology Information, the PubMed search tool, and the patent database of the State Intellectual Property Office, China, were used to retrieve entries based on two screening standards: (i) Were the newly cloned and identified genes first reported by Chinese scientists? (ii) Were the Chinese scientists awarded the gene sequence patent? Entries were retrieved from the databases up to the cut-off date of 30 June 2011 and the obtained data were analyzed further. The results showed that 589 new human functional genes were first reported by Chinese scientists and 159 gene sequences were patented (http://gene.fudan.sh.cn/introduction/database/chinagene/chinagene.html). This study systematically summarizes China's contributions to human functional genomics research and answers the question "has China finished cloning and identifying 1% of human functional genes?" in the affirmative.
JRC GMO-Amplicons: a collection of nucleic acid sequences related to genetically modified organisms
Petrillo, Mauro; Angers-Loustau, Alexandre; Henriksson, Peter; Bonfini, Laura; Patak, Alex; Kreysa, Joachim
2015-01-01
The DNA target sequence is the key element in designing detection methods for genetically modified organisms (GMOs). Unfortunately this information is frequently lacking, especially for unauthorized GMOs. In addition, patent sequences are generally poorly annotated, buried in complex and extensive documentation and hard to link to the corresponding GM event. Here, we present the JRC GMO-Amplicons, a database of amplicons collected by screening public nucleotide sequence databanks by in silico determination of PCR amplification with reference methods for GMO analysis. The European Union Reference Laboratory for Genetically Modified Food and Feed (EU-RL GMFF) provides these methods in the GMOMETHODS database to support enforcement of EU legislation and GM food/feed control. The JRC GMO-Amplicons database is composed of more than 240 000 amplicons, which can be easily accessed and screened through a web interface. To our knowledge, this is the first attempt at pooling and collecting publicly available sequences related to GMOs in food and feed. The JRC GMO-Amplicons supports control laboratories in the design and assessment of GMO methods, providing inter-alia in silico prediction of primers specificity and GM targets coverage. The new tool can assist the laboratories in the analysis of complex issues, such as the detection and identification of unauthorized GMOs. Notably, the JRC GMO-Amplicons database allows the retrieval and characterization of GMO-related sequences included in patents documentation. Finally, it can help annotating poorly described GM sequences and identifying new relevant GMO-related sequences in public databases. The JRC GMO-Amplicons is freely accessible through a web-based portal that is hosted on the EU-RL GMFF website. Database URL: http://gmo-crl.jrc.ec.europa.eu/jrcgmoamplicons/ PMID:26424080
JRC GMO-Amplicons: a collection of nucleic acid sequences related to genetically modified organisms.
Petrillo, Mauro; Angers-Loustau, Alexandre; Henriksson, Peter; Bonfini, Laura; Patak, Alex; Kreysa, Joachim
2015-01-01
The DNA target sequence is the key element in designing detection methods for genetically modified organisms (GMOs). Unfortunately this information is frequently lacking, especially for unauthorized GMOs. In addition, patent sequences are generally poorly annotated, buried in complex and extensive documentation and hard to link to the corresponding GM event. Here, we present the JRC GMO-Amplicons, a database of amplicons collected by screening public nucleotide sequence databanks by in silico determination of PCR amplification with reference methods for GMO analysis. The European Union Reference Laboratory for Genetically Modified Food and Feed (EU-RL GMFF) provides these methods in the GMOMETHODS database to support enforcement of EU legislation and GM food/feed control. The JRC GMO-Amplicons database is composed of more than 240 000 amplicons, which can be easily accessed and screened through a web interface. To our knowledge, this is the first attempt at pooling and collecting publicly available sequences related to GMOs in food and feed. The JRC GMO-Amplicons supports control laboratories in the design and assessment of GMO methods, providing inter-alia in silico prediction of primers specificity and GM targets coverage. The new tool can assist the laboratories in the analysis of complex issues, such as the detection and identification of unauthorized GMOs. Notably, the JRC GMO-Amplicons database allows the retrieval and characterization of GMO-related sequences included in patents documentation. Finally, it can help annotating poorly described GM sequences and identifying new relevant GMO-related sequences in public databases. The JRC GMO-Amplicons is freely accessible through a web-based portal that is hosted on the EU-RL GMFF website. Database URL: http://gmo-crl.jrc.ec.europa.eu/jrcgmoamplicons/. © The Author(s) 2015. Published by Oxford University Press.
The EMBL nucleotide sequence database
Stoesser, Guenter; Baker, Wendy; van den Broek, Alexandra; Camon, Evelyn; Garcia-Pastor, Maria; Kanz, Carola; Kulikova, Tamara; Lombard, Vincent; Lopez, Rodrigo; Parkinson, Helen; Redaschi, Nicole; Sterk, Peter; Stoehr, Peter; Tuli, Mary Ann
2001-01-01
The EMBL Nucleotide Sequence Database (http://www.ebi.ac.uk/embl/) is maintained at the European Bioinformatics Institute (EBI) in an international collaboration with the DNA Data Bank of Japan (DDBJ) and GenBank at the NCBI (USA). Data is exchanged amongst the collaborating databases on a daily basis. The major contributors to the EMBL database are individual authors and genome project groups. Webin is the preferred web-based submission system for individual submitters, whilst automatic procedures allow incorporation of sequence data from large-scale genome sequencing centres and from the European Patent Office (EPO). Database releases are produced quarterly. Network services allow free access to the most up-to-date data collection via ftp, email and World Wide Web interfaces. EBI’s Sequence Retrieval System (SRS), a network browser for databanks in molecular biology, integrates and links the main nucleotide and protein databases plus many specialized databases. For sequence similarity searching a variety of tools (e.g. Blitz, Fasta, BLAST) are available which allow external users to compare their own sequences against the latest data in the EMBL Nucleotide Sequence Database and SWISS-PROT. PMID:11125039
Corporate control and global governance of marine genetic resources
Österblom, Henrik
2018-01-01
Who owns ocean biodiversity? This is an increasingly relevant question, given the legal uncertainties associated with the use of genetic resources from areas beyond national jurisdiction, which cover half of the Earth’s surface. We accessed 38 million records of genetic sequences associated with patents and created a database of 12,998 sequences extracted from 862 marine species. We identified >1600 sequences from 91 species associated with deep-sea and hydrothermal vent systems, reflecting commercial interest in organisms from remote ocean areas, as well as a capacity to collect and use the genes of such species. A single corporation registered 47% of all marine sequences included in gene patents, exceeding the combined share of 220 other companies (37%). Universities and their commercialization partners registered 12%. Actors located or headquartered in 10 countries registered 98% of all patent sequences, and 165 countries were unrepresented. Our findings highlight the importance of inclusive participation by all states in international negotiations and the urgency of clarifying the legal regime around access and benefit sharing of marine genetic resources. We identify a need for greater transparency regarding species provenance, transfer of patent ownership, and activities of corporations with a disproportionate influence over the patenting of marine biodiversity. We suggest that identifying these key actors is a critical step toward encouraging innovation, fostering greater equity, and promoting better ocean stewardship. PMID:29881777
Virus-Based RNA Silencing Agents and Virus-Derived Expression Vectors as Gene Therapy Vehicles.
Venkataraman, Srividhya; Ahmad, Tauqeer; AbouHaidar, Mounir G; Hefferon, Kathleen L
2017-01-01
In consideration of recent developments in understanding the genomics and proteomics of viruses, the use of viral DNA / RNA sequences as well as their gene expression schemes, have found new in-roads towards the prognosis and therapy of diseases. Correspondingly, the sphere of the patenting scenario has expanded significantly. The current review addresses patented inventions concerning the use of virus sequences as gene silencing machineries and inventions concerning the generation and application of viral sequences as expression vectors. Furthermore, this review also discusses the employment of these patents for clinical, agricultural and biotechnological applications. Considering these objectives, the Delphion Research Intellectual Property Network database was searched using keywords such as "gene silencing", "engineered viruses" and "expression vectors" and descriptions of recent patents on the said topics were discussed. Despite several recent advances in the use of viruses as disease therapy vehicles and biotechnological vectors, these developments have yet to be proven effective in practice, in clinical and field trials. Copyright© Bentham Science Publishers; For any queries, please email at epub@benthamscience.org.
Patent Databases. . .A Survey of What Is Available from DIALOG, Questel, SDC, Pergamon and INPADOC.
ERIC Educational Resources Information Center
Kulp, Carol S.
1984-01-01
Presents survey of two groups of databases covering patent literature: patent literature only and general literature that includes patents relevant to subject area of database. Description of databases and comparison tables for patent and general databases (cost, country coverage, years covered, update frequency, file size, and searchable data…
Analysis of Patent Databases Using VxInsight
DOE Office of Scientific and Technical Information (OSTI.GOV)
BOYACK,KEVIN W.; WYLIE,BRIAN N.; DAVIDSON,GEORGE S.
2000-12-12
We present the application of a new knowledge visualization tool, VxInsight, to the mapping and analysis of patent databases. Patent data are mined and placed in a database, relationships between the patents are identified, primarily using the citation and classification structures, then the patents are clustered using a proprietary force-directed placement algorithm. Related patents cluster together to produce a 3-D landscape view of the tens of thousands of patents. The user can navigate the landscape by zooming into or out of regions of interest. Querying the underlying database places a colored marker on each patent matching the query. Automatically generated labels, showing landscape content, update continually upon zooming. Optionally, citation links between patents may be shown on the landscape. The combination of these features enables powerful analyses of patent databases.
ERIC Educational Resources Information Center
Simmons, Edlyn S.
1985-01-01
Reports on retrieval of patent information online and includes definition of patent family, basic and equivalent patents, "parents and children" applications, designated states, patent family databases--International Patent Documentation Center, World Patents Index, APIPAT (American Petroleum Institute), CLAIMS (IFI/Plenum). A table…
Bibliometric trend and patent analysis in nano-alloys research for period 2000-2013.
Živković, Dragana; Niculović, Milica; Manasijević, Dragan; Minić, Duško; Ćosović, Vladan; Sibinović, Maja
2015-05-04
This paper presents an overview of the current situation in nano-alloys investigations based on bibliometric and patent analysis. Bibliometric analysis data, for the period from 2000 to September 2013, were obtained using the Scopus database as the selected index database, whereas the analyzed parameters were: number of scientific papers per year, authors, countries, affiliations, subject areas and document types. Analysis of nano-alloys patents was done with a specific database, using the International Patent Classification and Patent Scope, for the period from 2003 to 2013. Information found in this database was the number of patents, patent classification by country, patent applicants, main inventors and publication date.
Bibliometric trend and patent analysis in nano-alloys research for period 2000-2013.
Živković, Dragana; Niculović, Milica; Manasijević, Dragan; Minić, Duško; Ćosović, Vladan; Sibinović, Maja
2015-01-01
This paper presents an overview of current situation in nano-alloys investigations based on bibliometric and patent analysis. Bibliometric analysis data, for the period 2000 to 2013, were obtained using Scopus database as selected index database, whereas analyzed parameters were: number of scientific papers per year, authors, countries, affiliations, subject areas and document types. Analysis of nano-alloys patents was done with specific database, using the International Patent Classification and Patent Scope for the period 2003 to 2013. Information found in this database was the number of patents, patent classification by country, patent applicators, main inventors and publication date.
Internet Patent Databases: Everyone Is a Patent Searcher Now.
ERIC Educational Resources Information Center
Wohrley, Andrew A.; Mitchell, Cindy
1997-01-01
Patent information has never been so available, at such low cost, to so many people. Describes patent databases accessible on the Web (Micropatent, Source Translation and Optimization Questel-Orbit QPAT, Internet Patents/Community of Science, and the U.S. Patent and Trademark Office), lists their strengths and weaknesses, and recommends the best…
Federal Register 2010, 2011, 2012, 2013, 2014
2012-10-29
... DEPARTMENT OF COMMERCE Patent and Trademark Office Requirements for Patent Applications Containing Nucleotide Sequence and/or Amino Acid Sequence Disclosures ACTION: Proposed collection; comment request... Patent applications that contain nucleotide and/or amino acid sequence disclosures must include a copy of...
The risk of paradoxical embolism (RoPE) study: initial description of the completed database.
Thaler, David E; Di Angelantonio, Emanuele; Di Tullio, Marco R; Donovan, Jennifer S; Griffith, John; Homma, Shunichi; Jaigobin, Cheryl; Mas, Jean-Louis; Mattle, Heinrich P; Michel, Patrik; Mono, Marie-Luise; Nedeltchev, Krassen; Papetti, Federica; Ruthazer, Robin; Serena, Joaquín; Weimar, Christian; Elkind, Mitchell S V; Kent, David M
2013-12-01
Detecting a benefit from closure of patent foramen ovale in patients with cryptogenic stroke is hampered by low rates of stroke recurrence and uncertainty about the causal role of patent foramen ovale in the index event. A method to predict patent foramen ovale-attributable recurrence risk is needed. However, individual databases generally have too few stroke recurrences to support risk modeling. Prior studies of this population have been limited by low statistical power for examining factors related to recurrence. The aim of this study was to develop a database to support modeling of patent foramen ovale-attributable recurrence risk by combining extant data sets. We identified investigators with extant databases including subjects with cryptogenic stroke investigated for patent foramen ovale, determined the availability and characteristics of data in each database, collaboratively specified the variables to be included in the Risk of Paradoxical Embolism database, harmonized the variables across databases, and collected new primary data when necessary and feasible. The Risk of Paradoxical Embolism database has individual clinical, radiologic, and echocardiographic data from 12 component databases, including subjects with cryptogenic stroke both with (n = 1925) and without (n = 1749) patent foramen ovale. In the patent foramen ovale subjects, a total of 381 outcomes (stroke, transient ischemic attack, death) occurred (median follow-up 2·2 years). While there were substantial variations in data collection between studies, there was sufficient overlap to define a common set of variables suitable for risk modeling. While individual studies are inadequate for modeling patent foramen ovale-attributable recurrence risk, collaboration between investigators has yielded a database with sufficient power to identify those patients at highest risk for a patent foramen ovale-related stroke recurrence who may have the greatest potential benefit from patent foramen ovale closure. 
© 2012 The Authors. International Journal of Stroke © 2012 World Stroke Organization.
Annual patents review, January-December 2004
Roland Gleisner; Karen Scallon; Michael Fleischmann; Julie Blankenburg; Marguerite Sykes
2005-01-01
This review summarizes patents related to paper recycling that first appeared in patent databases during 2004. Two on-line databases, Claims/U.S. Patents Abstracts and Derwent World Patents Index, were searched for this review. This feature is intended to inform readers about recent developments in equipment design, chemicals, and process technologies for recycling...
Pervasive sequence patents cover the entire human genome.
Rosenfeld, Jeffrey A; Mason, Christopher E
2013-01-01
The scope and eligibility of patents for genetic sequences have been debated for decades, but a critical case regarding gene patents (Association of Molecular Pathologists v. Myriad Genetics) is now reaching the US Supreme Court. Recent court rulings have supported the assertion that such patents can provide intellectual property rights on sequences as small as 15 nucleotides (15mers), but an analysis of all current US patent claims and the human genome presented here shows that 15mer sequences from all human genes match at least one other gene. The average gene matches 364 other genes as 15mers; the breast-cancer-associated gene BRCA1 has 15mers matching at least 689 other genes. Longer sequences (1,000 bp) still showed extensive cross-gene matches. Furthermore, 15mer-length claims from bovine and other animal patents could also claim as much as 84% of the genes in the human genome. In addition, when we expanded our analysis to full-length patent claims on DNA from all US patents to date, we found that 41% of the genes in the human genome have been claimed. Thus, current patents for both short and long nucleotide sequences are extraordinarily non-specific and create an uncertain, problematic liability for genomic medicine, especially in regard to targeted re-sequencing and other sequence diagnostic assays.
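The cross-gene matching described in this abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the function names, the toy sequences and the choice to ignore reverse complements are all assumptions made for illustration, and k is set to 4 only so that the toy data produce matches (the abstract uses 15mers).

```python
from collections import defaultdict


def kmers(seq, k=15):
    """Return the set of all length-k substrings of a sequence."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def cross_gene_matches(genes, k=15):
    """Map each gene name to the set of other genes sharing at least one k-mer.

    Assumption: only the forward strand is compared; reverse complements
    are ignored to keep the sketch short.
    """
    # Index every k-mer by the genes that contain it.
    index = defaultdict(set)
    for name, seq in genes.items():
        for km in kmers(seq, k):
            index[km].add(name)

    # Any k-mer seen in more than one gene links those genes to each other.
    matches = {name: set() for name in genes}
    for names in index.values():
        if len(names) > 1:
            for name in names:
                matches[name] |= names - {name}
    return matches


# Hypothetical toy sequences, not real genes.
genes = {
    "geneA": "ACGTACGTTT",
    "geneB": "GGGACGTAGG",
    "geneC": "CCCCCCCCCC",
}
matches = cross_gene_matches(genes, k=4)
```

With real data, the per-gene match counts reported in the abstract would correspond to `len(matches[name])` for each gene.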
Indexing of Patents of Pharmaceutical Composition in Online Databases
NASA Astrophysics Data System (ADS)
Online searching of patents on pharmaceutical compositions is generally considered very difficult. Patent databases contain extensive technical as well as legal information, so they are unlikely to have an index suited to pharmaceutical compositions; even when they have such an index, its scope and coverage are ambiguous. This paper discusses how patents on pharmaceutical compositions are indexed in online databases such as WPI, CA, CLAIMS, USP and PATOLIS. Online searching of such patents is also discussed in some detail.
SCRIPDB: a portal for easy access to syntheses, chemicals and reactions in patents
Heifets, Abraham; Jurisica, Igor
2012-01-01
The patent literature is a rich catalog of biologically relevant chemicals; many public and commercial molecular databases contain the structures disclosed in patent claims. However, patents are an equally rich source of metadata about bioactive molecules, including mechanism of action, disease class, homologous experimental series, structural alternatives, or the synthetic pathways used to produce molecules of interest. Unfortunately, this metadata is discarded when chemical structures are deposited separately in databases. SCRIPDB is a chemical structure database designed to make this metadata accessible. SCRIPDB provides the full original patent text, reactions and relationships described within any individual patent, in addition to the molecular files common to structural databases. We discuss how such information is valuable in medical text mining, chemical image analysis, reaction extraction and in silico pharmaceutical lead optimization. SCRIPDB may be searched by exact chemical structure, substructure or molecular similarity and the results may be restricted to patents describing synthetic routes. SCRIPDB is available at http://dcv.uhnres.utoronto.ca/SCRIPDB. PMID:22067445
The role of patent and non-patent databases in patent research in universities
NASA Astrophysics Data System (ADS)
Tolstaya, A. M.; Suslina, I. V.; Tolstaya, P. M.
2017-01-01
This study describes and systematizes the popular patent retrieval resources. It highlights the importance of non-patent information when conducting patent research on intellectual property created in the educational and scientific activity of a university, and identifies the differences between patent and non-patent information. Based on an analysis of the databases, the authors conducted patent research on "Wireless endoscopic capsules" (a development of the NRNU MEPhI). This study can be used to facilitate university work on new product development in order to improve the efficiency of commercializing the results of intellectual activity, including entry into the international market.
Online Patent Searching: The Realities.
ERIC Educational Resources Information Center
Kaback, Stuart M.
1983-01-01
Considers patent subject searching capabilities of major online databases, noting patent claims, "deep-indexed" files, test searches, retrieval of related references, multi-database searching, improvements needed in indexing of chemical structures, full text searching, improvements needed in handling numerical data, and augmenting a…
Senger, Stefan; Bartek, Luca; Papadatos, George; Gaulton, Anna
2015-12-01
First public disclosure of new chemical entities often takes place in patents, which makes them an important source of information. However, with an ever increasing number of patent applications, manual processing and curation on such a large scale becomes even more challenging. An alternative approach better suited for this large corpus of documents is the automated extraction of chemical structures. A number of patent chemistry databases generated by using the latter approach are now available but little is known that can help to manage expectations when using them. This study aims to address this by comparing two such freely available sources, SureChEMBL and IBM SIIP (IBM Strategic Intellectual Property Insight Platform), with manually curated commercial databases. When looking at the percentage of chemical structures successfully extracted from a set of patents, using SciFinder as our reference, 59 and 51 % were also found in our comparison in SureChEMBL and IBM SIIP, respectively. When performing this comparison with compounds as starting point, i.e. establishing if for a list of compounds the databases provide the links between chemical structures and patents they appear in, we obtained similar results. SureChEMBL and IBM SIIP found 62 and 59 %, respectively, of the compound-patent pairs obtained from Reaxys. In our comparison of automatically generated vs. manually curated patent chemistry databases, the former successfully provided approximately 60 % of links between chemical structure and patents. It needs to be stressed that only a very limited number of patents and compound-patent pairs were used for our comparison. Nevertheless, our results will hopefully help to manage expectations of users of patent chemistry databases of this type and provide a useful framework for more studies like ours as well as guide future developments of the workflows used for the automated extraction of chemical structures from patents. 
The challenges we have encountered whilst performing this study highlight that more needs to be done to make such assessments easier. Above all, more adequate, preferably open access to relevant 'gold standards' is required.
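The percentages in this abstract are coverage figures: the share of reference compound-patent pairs that an automatically generated database also contains. A minimal sketch of that calculation follows; the function name, identifiers and toy pairs are hypothetical and are not taken from the study.

```python
def coverage(reference_pairs, candidate_pairs):
    """Fraction of reference (compound, patent) pairs also present in the candidate set."""
    reference = set(reference_pairs)
    if not reference:
        raise ValueError("reference set must not be empty")
    return len(reference & set(candidate_pairs)) / len(reference)


# Hypothetical toy data: pairs of (compound identifier, patent number).
reference = [("cmpd1", "US1"), ("cmpd2", "US1"), ("cmpd3", "EP2"), ("cmpd4", "EP2")]
automated = [("cmpd1", "US1"), ("cmpd3", "EP2"), ("cmpd9", "WO3")]

share = coverage(reference, automated)
```

Pairs that appear only in the automated source (such as the `cmpd9` entry here) do not affect the figure, since the comparison is made against the curated reference.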
Tinnemann, Peter; Ozbay, Jonas; Saint, Victoria A; Willich, Stefan N
2010-11-18
Patents are one of the most important forms of intellectual property. They grant a time-limited exclusivity on the use of an invention allowing the recuperation of research costs. The use of patents is fiercely debated for medical innovation and especially controversial for publicly funded research, where the patent holder is an institution accountable to public interest. Despite this controversy, for the situation in Germany almost no empirical information exists. The purpose of this study is to examine the amount, types and trends of patent applications for health products submitted by German public research organisations. We conducted a systematic search for patent documents using the publicly accessible database search interface of the German Patent and Trademark Office. We defined keywords and search criteria and developed search patterns for the database request. We retrieved documents with application date between 1988 and 2006 and processed the collected data stepwise to compile the most relevant documents in patent families for further analysis. We developed a rationale and present individual steps of a systematic method to request and process patent data from a publicly accessible database. We retrieved and processed 10194 patent documents. Out of these, we identified 1772 relevant patent families, applied for by 193 different universities and non-university public research organisations. 827 (47%) of these patent families contained granted patents. The number of patent applications submitted by universities and university-affiliated institutions more than tripled since the introduction of legal reforms in 2002, constituting almost half of all patent applications and accounting for most of the post-reform increase. Patenting of most non-university public research organisations remained stable. We search, process and analyse patent applications from publicly accessible databases. 
Internationally mounting evidence questions the viability of policies to increase commercial exploitation of publicly funded research results. To evaluate the outcome of research policies a transparent evidence base for public debate is needed in Germany.
Tinnemann, Peter; Özbay, Jonas; Saint, Victoria A.; Willich, Stefan N.
2010-01-01
Background Patents are one of the most important forms of intellectual property. They grant a time-limited exclusivity on the use of an invention allowing the recuperation of research costs. The use of patents is fiercely debated for medical innovation and especially controversial for publicly funded research, where the patent holder is an institution accountable to public interest. Despite this controversy, for the situation in Germany almost no empirical information exists. The purpose of this study is to examine the amount, types and trends of patent applications for health products submitted by German public research organisations. Methods/Principal Findings We conducted a systematic search for patent documents using the publicly accessible database search interface of the German Patent and Trademark Office. We defined keywords and search criteria and developed search patterns for the database request. We retrieved documents with application date between 1988 and 2006 and processed the collected data stepwise to compile the most relevant documents in patent families for further analysis. We developed a rationale and present individual steps of a systematic method to request and process patent data from a publicly accessible database. We retrieved and processed 10194 patent documents. Out of these, we identified 1772 relevant patent families, applied for by 193 different universities and non-university public research organisations. 827 (47%) of these patent families contained granted patents. The number of patent applications submitted by universities and university-affiliated institutions more than tripled since the introduction of legal reforms in 2002, constituting almost half of all patent applications and accounting for most of the post-reform increase. Patenting of most non-university public research organisations remained stable. Conclusions We search, process and analyse patent applications from publicly accessible databases. 
Internationally mounting evidence questions the viability of policies to increase commercial exploitation of publicly funded research results. To evaluate the outcome of research policies a transparent evidence base for public debate is needed in Germany. PMID:21124982
Park, Hyun-Seok
2012-12-01
Whereas a vast amount of new information on bioinformatics is made available to the public through patents, only a small set of patents are cited in academic papers. A detailed analysis of registered bioinformatics patents, using the existing patent search system, can provide valuable information links between science and technology. However, it is extremely difficult to select keywords to capture bioinformatics patents, reflecting the convergence of several underlying technologies. No single word or even several words are sufficient to identify such patents. The analysis of patent subclasses can provide valuable information. In this paper, I did a preliminary study of the current status of bioinformatics patents and their International Patent Classification (IPC) groups registered in the Korea Intellectual Property Rights Information Service (KIPRIS) database.
Searching bioremediation patents through Cooperative Patent Classification (CPC).
Prasad, Rajendra
2016-03-01
Patent classification systems have traditionally evolved independently in each patent jurisdiction, so that examiners could classify the patents they handle and search previous patents when dealing with new applications. As the patent databases maintained by these offices went online, both for free public access and for global prior-art searches by examiners, the need arose for a common platform and a uniform structure for patent databases. The diversity of classification systems, however, posed problems for integrating and searching relevant patents across patent jurisdictions. To address this problem of comparability of data from different sources, WIPO in the recent past developed the International Patent Classification (IPC) system, which most countries readily adopted, coding their patents with IPC codes alongside their own codes. The Cooperative Patent Classification (CPC) is the latest patent classification system, based on the IPC/European Classification (ECLA) system and developed by the European Patent Office (EPO) and the United States Patent and Trademark Office (USPTO); it is likely to become a global standard. This paper discusses this new classification system with reference to patents on bioremediation.
37 CFR Appendix A to Subpart G to... - Sample Sequence Listing
Code of Federal Regulations, 2011 CFR
2011-07-01
... 37 Patents, Trademarks, and Copyrights 1 2011-07-01 2011-07-01 false Sample Sequence Listing A Appendix A to Subpart G to Part 1 Patents, Trademarks, and Copyrights UNITED STATES PATENT AND TRADEMARK OFFICE, DEPARTMENT OF COMMERCE GENERAL RULES OF PRACTICE IN PATENT CASES Biotechnology Invention...
37 CFR Appendix A to Subpart G to... - Sample Sequence Listing
Code of Federal Regulations, 2010 CFR
2010-07-01
... 37 Patents, Trademarks, and Copyrights 1 2010-07-01 2010-07-01 false Sample Sequence Listing A Appendix A to Subpart G to Part 1 Patents, Trademarks, and Copyrights UNITED STATES PATENT AND TRADEMARK OFFICE, DEPARTMENT OF COMMERCE GENERAL RULES OF PRACTICE IN PATENT CASES Biotechnology Invention...
Routh, Shreya; Nandagopal, Krishnadas
2017-01-01
Resveratrol, taxol, podophyllotoxin, withanolides and their derivatives find applications in anti-cancer therapy. They are plant-derived compounds whose chemical structures and synthesis limit their natural availability and restrict large-scale industrial production. Hence, their production by various biotechnological approaches may hold promise for a continuous and reliable mode of supply. We review process and product patents in this regard. Accordingly, we provide a general outline to search the freely accessible WIPO, EPO, USPTO and Cambia databases with several keywords and patent codes. We have tabulated both granted and filed patents from the said databases. We retrieved ~40 patents from these databases. Novel biotechnological processes for production of these anticancer compounds include Agrobacterium rhizogenes-mediated hairy root culture, suspension culture, cell culture with elicitors, use of recombinant microorganisms, and bioreactors among others. The results appear to be both database-specific and query-specific. A ten-year search window yielded 33 patents. The utility of the search strategy is discussed in the light of biotechnological developments in the field. Those who examine patent literature using similar search strategies may complement their knowledge obtained from perusal of mainstream journal resources.
37 CFR 1.821 - Nucleotide and/or amino acid sequence disclosures in patent applications.
Code of Federal Regulations, 2010 CFR
2010-07-01
... 37 Patents, Trademarks, and Copyrights 1 2010-07-01 2010-07-01 false Nucleotide and/or amino acid... Biotechnology Invention Disclosures Application Disclosures Containing Nucleotide And/or Amino Acid Sequences § 1.821 Nucleotide and/or amino acid sequence disclosures in patent applications. (a) Nucleotide and...
37 CFR 1.821 - Nucleotide and/or amino acid sequence disclosures in patent applications.
Code of Federal Regulations, 2011 CFR
2011-07-01
... 37 Patents, Trademarks, and Copyrights 1 2011-07-01 2011-07-01 false Nucleotide and/or amino acid... Biotechnology Invention Disclosures Application Disclosures Containing Nucleotide And/or Amino Acid Sequences § 1.821 Nucleotide and/or amino acid sequence disclosures in patent applications. (a) Nucleotide and...
Ma, Li-xin; Wang, Yu-yi; Li, Xin-xue; Liu, Jian-ping
2012-03-01
Randomized controlled trial (RCT) is considered as the gold standard for the efficacy assessment of medicines. With the increasing number of Chinese patent drugs for treatment of type 2 diabetes, the methodology of post-marketing RCTs evaluating the efficacy and specific effect has become more important. To investigate post-marketing Chinese patent drugs for treatment of type 2 diabetes, as well as the methodological quality of post-marketing RCTs. Literature was searched from the books of Newly Compiled Traditional Chinese Patent Medicine and Chinese Pharmacopeia, the websites of the State Food and Drug Administration and the Ministry of Human Resources and Social Security of the People's Republic of China, China National Knowledge Infrastructure Database, Chongqing VIP Chinese Science and Technology Periodical Database, Chinese Biomedical Database (SinoMed) and Wanfang Data. The time period for searching ran from the commencement of each database to August 2011. RCTs of post-marketing Chinese patent drugs for treatment of type 2 diabetes with intervention course no less than 3 months. Two authors independently evaluated the research quality of the RCTs by the checklist of risk bias assessment and the data collection forms based on the CONSORT Statement. Independent double data-extraction was performed. The authors identified a total of 149 Chinese patent drugs for treatment of type 2 diabetes. According to different indicative syndromes, the Chinese patent drugs can be divided into the following types, namely, yin deficiency and interior heat (n=48, 32%), dual deficiency of qi and yin (n=58, 39%) and dual deficiency of qi and yin combined with blood stasis (n=22, 15%). A total of 41 RCTs meeting the inclusion criteria were included. Neither multicenter RCTs nor endpoint outcome reports were found. 
Risk bias analysis showed that 81% of the included studies reported randomization for grouping without sequence generation, 98% of these studies did not report concealment of random numbers, 5% used placebo, 10% reported outcome attrition bias and no study employed the analysis of intention-to-treat and 98% reported the diagnostic criteria for type 2 diabetes. The participants mainly consisted of outpatients without complications (76%). The minimum and maximum sample size was 40 and 300 (106 ± 60), respectively. The inclusion and exclusion criteria and outcome measures did not match the purposes and contents of post-marketing research in the included studies. They also failed to reflect the basic principles of traditional Chinese medicine in the process of diagnosis and treatment. The demographic characteristics of the patients, the indications for medicine and the syndrome differentiation process were not reported sufficiently and transparently. In order to improve the post-marketing research and promote the rational use of Chinese patent drugs, it is recommended that phase IV clinical trials should establish clear research purpose as well as hypothesis first, and choose scientific and evidence-based study design and outcome measures. In addition, guidelines for implementation of post-marketing research should be developed.
Yang, Wei; Xie, Yanming; Zhuang, Yan
2011-10-01
Many kinds of traditional Chinese patent medicine are used in clinical practice, and many adverse events have been reported by clinical professionals. The safety of Chinese patent medicines is the issue of greatest concern to patients and physicians. At present, many researchers inside and outside China have studied methods for re-evaluating the safety of post-marketing Chinese medicines. However, data from hospital information systems (HIS) have rarely been used to re-evaluate the safety of post-marketing traditional Chinese patent medicines. Real-world HIS databases are a rich source of information for research on medicine safety. This study planned to analyze HIS data selected from ten top general hospitals in Beijing, which formed a large real-world HIS database with a capacity of 1 000 000 cases in total after a series of data cleaning and integration procedures. This study could be a new project that uses such information to evaluate the safety of traditional Chinese medicine based on an HIS database. A clear protocol has been completed as the first step of the whole study. The protocol is as follows. First, separate each of the traditional Chinese patent medicines in the total HIS database into a single database. Second, select related laboratory test indexes as the safety evaluation outcomes, such as routine blood, routine urine, routine feces, conventional coagulation, liver function, kidney function and other tests. Third, use data mining methods to analyze those selected safety outcomes that changed abnormally before and after the use of Chinese patent medicines. Finally, judge the relationship between those abnormal changes and the Chinese patent medicine. We hope this method can provide useful information to Chinese medicine researchers interested in safety evaluation of traditional Chinese medicine.
Towards the ophthalmology patentome: a comprehensive patent database of ocular drugs and biomarkers.
Mucke, Hermann A M; Mucke, Eva; Mucke, Peter M
2013-01-01
We are currently building a database of all patent documents that contain substantial information related to pharmacology, drug delivery, tissue technology, and molecular diagnostics in ophthalmology. The goal is to establish a 'patentome', a body of cleaned and annotated data where all text-based, chemistry and pharmacology information can be accessed and mined in its context. We provide metrics on patent convention treaty documents, which demonstrate that ocular-related patenting has shown stronger growth than general patent cooperation treaty patenting during the past 25 years, and, while the majority of applications of this type have always provided substantial biological data, both data support and objections by patent examiners have been increasing since 2006-2007. Separately, we present a case study of chemistry information extraction from patents published during the 1950s and 1970s, which reveal compounds with corneal anesthesia potential that were never published in the peer-reviewed literature.
Developing a Systematic Patent Search Training Program
ERIC Educational Resources Information Center
Zhang, Li
2009-01-01
This study aims to develop a systematic patent training program using patent analysis and citation analysis techniques applied to patents held by the University of Saskatchewan. The results indicate that the target audience will be researchers in life sciences, and aggregated patent database searching and advanced search techniques should be…
DNA patenting: implications for public health research.
Dutfield, Graham
2006-01-01
I weigh the arguments for and against the patenting of functional DNA sequences including genes, and find the objections to be compelling. Is an outright ban on DNA patenting the right policy response? Not necessarily. Governments may wish to consider options ranging from patent law reforms to the creation of new rights. There are alternative ways to protect DNA sequences that industry may choose if DNA patenting is restricted or banned. Some of these alternatives may be more harmful than patents. Such unintended consequences of patent bans mean that we should think hard before concluding that prohibition is the only response to legitimate concerns about the appropriateness of patents in the field of human genomics. PMID:16710549
Sterckx, Sigrid; Cockbain, Julian; Howard, Heidi; Huys, Isabelle; Borry, Pascal
2013-05-01
Recently, 23andMe announced that it had obtained its first patent, related to "polymorphisms associated with Parkinson's disease" (US-B-8187811). This announcement immediately sparked controversy in the community of 23andMe users and research participants, especially with regard to issues of transparency and trust. The purpose of this article was to analyze the patent portfolio of this prominent direct-to-consumer genetic testing company and discuss the potential ethical implications of patenting in this field for public participation in Web-based genetic research. We searched the publicly accessible patent database Espacenet as well as the commercially available database Micropatent for published patents and patent applications of 23andMe. Six patent families were identified for 23andMe. These included patent applications related to: genetic comparisons between grandparents and grandchildren, family inheritance, genome sharing, processing data from genotyping chips, gamete donor selection based on genetic calculations, finding relatives in a database, and polymorphisms associated with Parkinson disease. An important lesson to be drawn from this ongoing controversy seems to be that any (private or public) organization involved in research that relies on human participation, whether by providing information, body material, or both, needs to be transparent, not only about its research goals but also about its strategies and policies regarding commercialization.
Construction of In-house Databases in a Corporation
NASA Astrophysics Data System (ADS)
Dezaki, Kyoko; Saeki, Makoto
Rapid progress in advanced informatization has increased the need to strengthen documentation activities in industry. In response, Tokin Corporation has been constructing databases of patent information, technical reports and other material accumulated inside the company. Two systems have resulted: TOPICS, an in-house patent information management system, and TOMATIS, a management and technical information system that uses personal computers and general-purpose relational database software. These systems aim to compile databases of patent and technological management information generated internally and externally with little labor and at low cost, and to provide comprehensive information company-wide. This paper outlines these systems and how they are actually used.
Intellectual property (IP) analysis of embossed hologram business
NASA Astrophysics Data System (ADS)
Hunt, David; Reingand, Nadya; Cantrell, Robert
2006-02-01
This paper presents an overview of patents and patent applications on security embossed holograms, and highlights the possibilities offered by patent searching and analysis. Thousands of patent documents relevant to embossed holograms were uncovered by the study. The search was performed in the following databases: U.S. Patent Office, European Patent Office, Japanese Patent Office and Korean Patent Office for the time frame from 1971 through November 2005. The patent analysis unveils trends in patent temporal distribution, patent families formation, significant technological coverage within the embossed holography market and other interesting insights.
Intellectual property in holographic interferometry
NASA Astrophysics Data System (ADS)
Reingand, Nadya; Hunt, David
2006-08-01
This paper presents an overview of patents and patent applications on holographic interferometry, and highlights the possibilities offered by patent searching and analysis. Thousands of patent documents relevant to holographic interferometry were uncovered by the study. The search was performed in the following databases: U.S. Patent Office, European Patent Office, Japanese Patent Office and Korean Patent Office for the time frame from 1971 through May 2006. The patent analysis unveils trends in patent temporal distribution, patent families formation, significant technological coverage within the market of systems that employ holographic interferometry and other interesting insights.
Teaching Chemistry Students How To Use Patent Databases and Glean Patent Information
NASA Astrophysics Data System (ADS)
MacMillan, Margy; Shaw, Lawton
2008-07-01
Patent literature is an important source of chemical information that is often neglected by chemical educators. This paper describes an effort to teach chemistry students how to use patent databases to search for information on applied chemical technology related to the manufacture of industrial and specialty chemicals. Students in a second-year-level organic chemistry class were shown how to search patent literature as part of a group research paper assignment that involved determining the feasibility of starting an industrial chemical operation to manufacture a given industrial chemical. Students who were assigned high value or specialty chemicals were most likely to cite patent literature in their final papers. Students who were assigned plastics or bulk commodity chemicals were less likely to cite patents. It is suggested that students made choices about the usefulness of patent literature and that patents were most useful when current patents existed and provided the patent owner a competitive advantage. For plastics or commodity chemicals, manufacturing technologies tend to be mature and are well described by more accessible information sources. Suggestions are made for effective introduction of patent literature instruction into upper-level chemistry courses.
[Application of ultrasound countercurrent extraction in patent of traditional Chinese medicine].
Miao, Yan-ni; Wu, Bin; Yue, Xue-lian
2015-07-01
This paper analyzes the patent information on ultrasound countercurrent extraction used in traditional Chinese medicine, using samples from the Derwent World Patent Database (DWPI) and the Chinese Patent Abstracts Database (CNABS). The application of ultrasound countercurrent extraction is discussed in terms of patent applicants, annual distribution of filings, pharmaceutical raw materials and other aspects. The technical parameters published in the patents, such as material crushing, extraction solvent, extraction time and temperature, extraction equipment and ultrasonic frequency, are analyzed in depth. Through the above research, the various technical parameters of ultrasound countercurrent extraction used in traditional Chinese medicine are summarized. The conclusions of the analysis can be used to discover technical advantages, optimize extraction conditions, and provide a reference for technological innovation in the extraction of traditional Chinese medicine.
International patent analysis of water source heat pump based on orbit database
NASA Astrophysics Data System (ADS)
Li, Na
2018-02-01
Using the Orbit database, this paper analyses international patents in the water source heat pump (WSHP) industry with patent analysis methods such as analysis of publication trends, geographical distribution, technology leaders and top assignees. It finds that the beginning of the 21st century was a period of rapid growth in WSHP patent applications. Germany and the United States carried out research and development of WSHP early on, but Japan and China have now become important countries for patent applications. China has developed increasingly quickly in recent years, but its patents are concentrated in universities and urgently need to be transferred. Through an objective analysis, this paper aims to provide decision references for the development of the domestic WSHP industry.
dos REIS, José Maciel Caldas; PINHEIRO, Maurício Fortuna; OTI, André Takashi; FEITOSA-JUNIOR, Denilson José Silva; PANTOJA, Mauro de Souza; BARROS, Rui Sérgio Monteiro
2016-01-01
ABSTRACT Introduction: Food is a key factor in both preventing disease and promoting human health. Probiotics and prebiotics stand out among functional foods. Patent databases are the main source of technological information about innovation worldwide, providing an extensive library for the research sector. Objective: To map the main patent databases for prebiotics and probiotics, seeking relevant information on the use of biotechnology, nanotechnology and genetic engineering in the production of these foods. Method: An online search was conducted in the main public patent databases of Brazil (INPI), the United States (USPTO) and the European Patent Office (EPO). The search covered the period from January 2014 to July 2015 and used, in the title and abstract fields of patents, the following descriptors: in INPI, "prebiotic", "prebiotic", "probiotics", "probiotic"; in the USPTO and EPO, "prebiotic", "prebiotics", "probiotic", "probiotics". Results: The search found no filings in the Brazilian patent database (INPI) in this period; the US Patent & Trademark Office had registered 60 patent titles, and the European Patent Office (EPO) showed 10 documents on the subject. Conclusion: Technological information from genetic engineering, biotechnology and nanotechnology, filed in the form of patent titles and abstracts relating to early nutritional intervention with functional foods, is increasingly needed to reduce the risks and control the progression of health problems. However, the existing abstracts, although attractive and promising in this respect, are still too incipient to recommend these foods safely as a therapeutic tool. They should therefore be seen more as elements of a diet and a healthy lifestyle. PMID:28076487
SureChEMBL: a large-scale, chemically annotated patent document database.
Papadatos, George; Davies, Mark; Dedman, Nathan; Chambers, Jon; Gaulton, Anna; Siddle, James; Koks, Richard; Irvine, Sean A; Pettersson, Joe; Goncharoff, Nicko; Hersey, Anne; Overington, John P
2016-01-04
SureChEMBL is a publicly available large-scale resource containing compounds extracted from the full text, images and attachments of patent documents. The data are extracted from the patent literature according to an automated text and image-mining pipeline on a daily basis. SureChEMBL provides access to a previously unavailable, open and timely set of annotated compound-patent associations, complemented with sophisticated combined structure and keyword-based search capabilities against the compound repository and patent document corpus; given the wealth of knowledge hidden in patent documents, analysis of SureChEMBL data has immediate applications in drug discovery, medicinal chemistry and other commercial areas of chemical science. Currently, the database contains 17 million compounds extracted from 14 million patent documents. Access is available through a dedicated web-based interface and data downloads at: https://www.surechembl.org/. © The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research. PMID:26582922
ERIC Educational Resources Information Center
Härtinger, Stefan; Clarke, Nigel
2016-01-01
Developing skills for searching the patent literature is an essential element of chemical information literacy programs at the university level. The present article creates awareness of patents as a rich source of chemical information. Patent classification is introduced as a key-component in comprehensive search strategies. The free Espacenet…
Patent Documents as a Resource for Studies and Education in Geophysics - An Approach.
NASA Astrophysics Data System (ADS)
Wollny, K. G.
2016-12-01
Patents are a highly neglected source of information in geophysics, although they supply a wealth of technical and historically relevant data and might be an important asset for researchers and students. The technical drawings and descriptions in patent documents provide insight into the personal work of a researcher or a scientific group and give detailed technical background information, show interdisciplinary solutions for similar problems, help to learn about inventions too advanced for their time but maybe useful now, and to explore the historical background and timelines of inventions and their inventors. It will be shown how to get access to patent documents and how to use them for research and education purposes. Exemplary inventions by well-known geoscientists or scientists in related fields will be presented to illustrate the usefulness of patent documents. The data pool used is the International Patent Classification (IPC) class G01V that the United Nations' World Intellectual Property Organisation (WIPO) has set up mainly for inventions with key aspects in geophysics. This class contains approximately 235,000 patent documents (July 2016) for methods, apparatuses or scientific instruments developed during scientific projects or by geophysical companies. The patent documents can be accessed via patent databases. The most important patent databases are for free, search functionality is self-explanatory and the amount of information to be extracted is enormous. For example, more than 90 million multilingual patent documents are currently available online (July 2016) in DEPATIS database of the German Patent and Trade Mark Office or ESPACENET of the European Patent Office. To summarize, patent documents are a highly useful tool for educational and research purposes to strengthen students' and scientists' knowledge in a practically orientated geophysical field and to widen the horizon to adjacent technical areas. 
Last but not least, they also provide insight into historical aspects of geophysics and the persons working in that area.
Kaplan, Warren A; Beall, Reed F
2017-01-01
Lack of access to insulin and poor health outcomes are issues for both low and high income countries. This has been accompanied by a shift from relatively inexpensive human insulin to its more expensive analogs, marketed by three to four main global players. Nonetheless, patent-based market exclusivities are beginning to expire for the first generation insulin analogs. This paper adds a global dimension to information on the U.S. patent landscape for insulin by reviewing the patent status of insulins with emphasis on the situation outside the US and Europe. Using the term "insulin", we searched for patents listed on the United States Food and Drug Administration's (USFDA) Orange Book and the Canadian Online Drug Product Database Online Query and its Patent Register. With this information, we expanded the search globally using the World Intellectual Property Organization (WIPO) PatentScope database, the European Patent Office's INPADOC database and various country-specific Patent Offices. Patent protected insulins marketed in the U.S. and other countries are facing an imminent patent-expiration "cliff", yet the three companies that dominate the global insulin market are continuing to file for patents in and outside the U.S., but very rarely in Africa. Only a few local producers in the so-called "pharmerging" markets (e.g., Brazil, India, China) are filing for global patent protection on their own insulins. There is a moderate but statistically significant association between patent filings and diabetes disease burden. The global market dominance by a few companies of analog over human insulin will likely continue even though patents on the current portfolio of insulin analogs will expire very soon. Multinationals are continuing to file for more insulin patents in the bigger markets with large disease burdens and a rapidly emerging middle class. Off-patent human insulins can effectively manage diabetes. A practical way forward would be to find (potential) generic manufacturers globally and nudge them towards opportunities to diversify their national insulin markets with acceptable off-patent products for export.
Analysing patent landscapes in plant biotechnology and new plant breeding techniques.
Parisi, Claudia; Rodríguez-Cerezo, Emilio; Thangaraj, Harry
2013-02-01
This article aims to inform the reader of the importance of searching patent landscapes in plant biotechnology and the use of basic tools to perform a patent search. The recommendations for a patent search strategy are illustrated with the specific example of zinc finger nuclease technology for genetic engineering in plants. Within this scope, we provide a general introduction to searching using two online and free-access patent databases, esp@cenet and PatentScope. The essential features of the two databases and their functionality are described, together with short descriptions to enable the reader to understand patents, searching, their content, patent families, and their territorial scope. We mostly stress the value of patent searching for mining scientific, rather than legal, information. Search methods using keywords and patent codes are explained, together with suggestions about how to search with codes or combine them with keywords, and we also comment on the limitations of each method. We stress the importance of patent literature to complement more mainstream scientific literature, and the relative complexities and difficulties in searching patents compared to the latter. A parallel online resource where we describe detailed search exercises is available through reference for those intending further exploration. In essence this is aimed at a novice patent searcher who may want to examine accessory patent literature to complement knowledge gained from mainstream journal resources.
Analysis of Patent Activity in the Field of Quantum Information Processing
NASA Astrophysics Data System (ADS)
Winiarczyk, Ryszard; Gawron, Piotr; Miszczak, Jarosław Adam; Pawela, Łukasz; Puchała, Zbigniew
2013-03-01
This paper provides an analysis of patent activity in the field of quantum information processing. Data from the PatentScope database for the years 1993-2011 were used. Time series models were used to predict future trends in the number of filed patents.
Patent Searching: What, Why, When, Where?
ERIC Educational Resources Information Center
Lambert, Nancy
1996-01-01
Although some patent questions should be left to professionals, others can be answered using Internet-based patent databases. Provides an overview of Internet sites, including addresses, descriptions, costs, and evaluations of free and fee-based sites and guidelines for when do-it-yourself online patent searching is and is not appropriate. (PEN)
Worldwide nanotechnology development: a comparative study of USPTO, EPO, and JPO patents (1976-2004)
NASA Astrophysics Data System (ADS)
Li, Xin; Lin, Yiling; Chen, Hsinchun; Roco, Mihail C.
2007-12-01
To assess worldwide development of nanotechnology, this paper compares the numbers and contents of nanotechnology patents in the United States Patent and Trademark Office (USPTO), European Patent Office (EPO), and Japan Patent Office (JPO). It uses the patent databases as indicators of nanotechnology trends via bibliographic analysis, content map analysis, and citation network analysis on nanotechnology patents per country, institution, and technology field. The numbers of nanotechnology patents published in USPTO and EPO have continued to increase quasi-exponentially since 1980, while those published in JPO stabilized after 1993. Institutions and individuals located in the same region as a repository's patent office have a higher contribution to the nanotechnology patent publication in that repository ("home advantage" effect). The USPTO and EPO databases had similar high-productivity contributing countries and technology fields with large number of patents, but quite different high-impact countries and technology fields after the average number of received cites. Bibliographic analysis on USPTO and EPO patents shows that researchers in the United States and Japan published larger numbers of patents than other countries, and that their patents were more frequently cited by other patents. Nanotechnology patents covered physics research topics in all three repositories. In addition, USPTO showed the broadest representation in coverage in biomedical and electronics areas. The analysis of citations by technology field indicates that USPTO had a clear pattern of knowledge diffusion from highly cited fields to less cited fields, while EPO showed knowledge exchange mainly occurred among highly cited fields.
7 CFR 3430.55 - Technical reporting.
Code of Federal Regulations, 2011 CFR
2011-01-01
... (CRIS). (b) Initial Documentation in the CRIS Database. Information collected in the “Work Unit... elect) to obtain patent(s) on any such invention; and an identification of equipment purchased with any.... The CRIS database is available to the public on the worldwide web. CRIS project information is...
7 CFR 3430.55 - Technical reporting.
Code of Federal Regulations, 2012 CFR
2012-01-01
... (CRIS). (b) Initial Documentation in the CRIS Database. Information collected in the “Work Unit... elect) to obtain patent(s) on any such invention; and an identification of equipment purchased with any.... The CRIS database is available to the public on the worldwide web. CRIS project information is...
7 CFR 3430.55 - Technical reporting.
Code of Federal Regulations, 2013 CFR
2013-01-01
... (CRIS). (b) Initial Documentation in the CRIS Database. Information collected in the “Work Unit... elect) to obtain patent(s) on any such invention; and an identification of equipment purchased with any.... The CRIS database is available to the public on the worldwide web. CRIS project information is...
7 CFR 3430.55 - Technical reporting.
Code of Federal Regulations, 2014 CFR
2014-01-01
... (CRIS). (b) Initial Documentation in the CRIS Database. Information collected in the “Work Unit... elect) to obtain patent(s) on any such invention; and an identification of equipment purchased with any.... The CRIS database is available to the public on the worldwide web. CRIS project information is...
Method of identification of patent trends based on descriptions of technical functions
NASA Astrophysics Data System (ADS)
Korobkin, D. M.; Fomenkov, S. A.; Golovanchikov, A. B.
2018-05-01
The use of the global patent space to determine the scientific and technological priorities for the technical systems development (identifying patent trends) allows one to forecast the direction of the technical systems development and, accordingly, select patents of priority technical subjects as a source for updating the technical functions database and physical effects database. The authors propose an original method that uses as trend terms not individual unigrams or n-gram (usually for existing methods and systems), but structured descriptions of technical functions in the form “Subject-Action-Object” (SAO), which in the authors’ opinion are the basis of the invention.
de Luna, Airton José; Santos, Douglas Alves
2017-03-01
Year by year, Fenton's technologies have gained prominence worldwide in both the academic and patent literature, partly because of their proven efficiency as environment-friendly technologies for the abatement of organic pollutants, and partly because of growing interest in industrial applications. To understand the dynamic between these two worlds, academia versus patents, the present study performs a comparative analysis of publications on Fenton-based Technologies (FbT). Technological foresight techniques were applied to patent and non-patent databases, using the Web of Science (WoS) database as a prospecting tool. The main results for the last decade point to strong growth in Fenton's technologies worldwide, in both R&D and patent applications. Chinese universities and firms lead the field. There is a marked gap between academic output and patent output.
Semiannual patents review, January — June 2001.
Marguerite S. Sykes; Julie Blankenburg
2001-01-01
This review summarizes patents related to paper recycling that were issued during the first 6 months of 2001. Two online databases, Claims/U.S. Patents Abstracts and Derwent World Patents Index, were searched for this review. This semiannual feature is intended to inform readers about recent developments in equipment design, chemicals, and process technology for...
Semiannual patents review, July 2001-December 2001
Roland Gleisner; Marguerite Sykes; Julie Blankenburg
2002-01-01
This review summarizes patents related to paper recycling that were issued during the last six months of 2001. Two on-line databases, Claims/U.S. Patents Abstracts and Derwent World Patents Index, were searched for this review. This semiannual feature is intended to inform readers about recent developments in equipment design, chemicals and process technology for...
Metastasizing patent claims on BRCA1.
Kepler, Thomas B; Crossman, Colin; Cook-Deegan, Robert
2010-05-01
Many patents make claims on DNA sequences; some include claims on oligonucleotides related to the primary patented gene. We used bioinformatics to quantify the reach of one such claim from patent 4,747,282 on BRCA1. We find that human chromosome 1 (which does not contain BRCA1) contains over 300,000 oligonucleotides covered by this claim, and that 80% of cDNA and mRNA sequences contributed to GenBank before the patent application was filed also contain at least one claimed oligonucleotide. Any "isolated" DNA molecules that include such 15 bp nucleotide sequences would fall under the claim as granted by the US Patent and Trademark Office. Anyone making, using, selling, or importing such a molecule for any purpose within the United States would thus be infringing the claim. This claim and others like it turn out, on examination, to be surprisingly broad, and if enforced would have substantial implications for medical practice and scientific research. Copyright 2010 Elsevier Inc. All rights reserved.
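The kind of 15-mer containment check described in this abstract can be sketched in a few lines. This is only a minimal illustration under stated assumptions, not the authors' pipeline: the helper names `kmers` and `count_claimed_kmers` are hypothetical, the sequences are invented toy data rather than real BRCA1 sequence, and reverse complements and ambiguous bases are ignored.

```python
def kmers(seq, k=15):
    """Return the set of all length-k substrings of seq (uppercased)."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def count_claimed_kmers(claimed_seq, target_seq, k=15):
    """Count start positions in target_seq whose k-mer also occurs in claimed_seq."""
    claimed = kmers(claimed_seq, k)
    target = target_seq.upper()
    return sum(
        1
        for i in range(len(target) - k + 1)
        if target[i:i + k] in claimed
    )


# Toy data (hypothetical, not real BRCA1 sequence).
claimed = "ATGGATTTATCTGCTCTTCGCGTTGAAG"
# The target embeds a 17-base stretch of the claimed sequence between flanks.
target = "CCCC" + claimed[5:22] + "GGGG"
hits = count_claimed_kmers(claimed, target)
```

Under this reading, a target sequence "contains a claimed oligonucleotide" whenever `hits` is greater than zero; scanning a whole chromosome would use the same test with a far longer `target_seq`.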
Santos, Daniela Coelho Dos; Schneider, Lara Rodrigues; da Silva Barboza, Andressa; Diniz Campos, Ângela; Lund, Rafael Guerra
2017-08-17
The antimicrobial potential of Tagetes minuta was correlated with its traditional use as antibacterial, insecticidal, biocide, disinfectant, anthelminthic, antifungal, and antiseptic agent as well as its use in urinary tract infections. This study aimed to systematically review articles and patents regarding the antimicrobial activity of T. minuta and give rise to perspectives on this plant as a potential antimicrobial agent. A literature search of studies published between 1997 and 2015 was conducted over five databases: MedLine (PubMed), Web of Science, Scopus, Google Scholar, Portal de Periódicos Capes and SciFinder, grey literature was explored using the System for Information on Dissertations database, and theses were searched using the ProQuest Dissertations and Theses Full text database and the Periódicos Capes Theses database. Additionally, the following databases for patents were analysed: United States Patent and Trademark Office (USPTO), Google Patents, National Institute of Industrial Property (INPI) and Espacenet patent search (EPO). The data were tabulated and analysed using Microsoft Office Excel 2010. After title screening, 51 studies remained and this number decreased to 26 after careful examinations of the abstracts. The full texts of these 26 studies were assessed to check if they were eligible. Among them, 3 were excluded for not having full text access, and 11 were excluded because they did not fit the inclusion criteria, which left 10 articles for this systematic review. The same process was conducted for the patent search, resulting in 4 patents being included in this study. Recent advances highlighted by this review may shed light on future directions of studies concerning T. minuta as a novel antimicrobial agent, which should be repeatedly proven in future animal and clinical studies. Although more evidence on its specificity and clinical efficacy are necessary to support its clinical use, T. minuta is expected to be a highly effective, safe and affordable treatment for infectious diseases. Copyright © 2017 Elsevier Ireland Ltd. All rights reserved.
Southan, Christopher; Várkonyi, Péter; Muresan, Sorel
2009-07-06
Since 2004 public cheminformatic databases and their collective functionality for exploring relationships between compounds, protein sequences, literature and assay data have advanced dramatically. In parallel, commercial sources that extract and curate such relationships from journals and patents have also been expanding. This work updates a previous comparative study of databases chosen because of their bioactive content, availability of downloads and facility to select informative subsets. Where they could be calculated, extracted compounds-per-journal article were in the range of 12 to 19 but compound-per-protein counts increased with document numbers. Chemical structure filtration to facilitate standardised comparisons typically reduced source counts by between 5% and 30%. The pair-wise overlaps between 23 databases and subsets were determined, as well as changes between 2006 and 2008. While all compound sets have increased, PubChem has doubled to 14.2 million. The 2008 comparison matrix shows not only overlap but also unique content across all sources. Many of the detailed differences could be attributed to individual strategies for data selection and extraction. While there was a big increase in patent-derived structures entering PubChem since 2006, GVKBIO contains over 0.8 million unique structures from this source. Venn diagrams showed extensive overlap between compounds extracted by independent expert curation from journals by GVKBIO, WOMBAT (both commercial) and BindingDB (public) but each included unique content. In contrast, the approved drug collections from GVKBIO, MDDR (commercial) and DrugBank (public) showed surprisingly low overlap. Aggregating all commercial sources established that while 1 million compounds overlapped with PubChem 1.2 million did not. On the basis of chemical structure content per se public sources have covered an increasing proportion of commercial databases over the last two years. 
However, commercial products included in this study provide links between compounds and information from patents and journals at a larger scale than current public efforts. They also continue to capture a significant proportion of unique content. Our results thus demonstrate not only an encouraging overall expansion of data-supported bioactive chemical space but also that both commercial and public sources are complementary for its exploration.
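The pairwise overlap and unique-content comparison described in this abstract can be illustrated with plain set operations. This is a minimal sketch, assuming each source has already been reduced to a set of standardised structure keys; the source names, the keys and the helpers `pairwise_overlap` and `unique_content` are hypothetical, and the data are invented for illustration only.

```python
from itertools import combinations


def pairwise_overlap(sources):
    """Map each unordered pair of source names to the size of their intersection."""
    return {
        (a, b): len(sources[a] & sources[b])
        for a, b in combinations(sorted(sources), 2)
    }


def unique_content(sources):
    """Map each source name to the keys found in no other source."""
    result = {}
    for name, keys in sources.items():
        others = set().union(*(v for n, v in sources.items() if n != name))
        result[name] = keys - others
    return result


# Invented toy data: three sources of structure keys.
sources = {
    "public_a": {"k1", "k2", "k3"},
    "commercial_b": {"k2", "k3", "k4"},
    "commercial_c": {"k3", "k5"},
}
overlaps = pairwise_overlap(sources)
unique = unique_content(sources)
```

Sorting the source names before pairing gives each pair a single canonical key, so the result reads like the upper triangle of a comparison matrix.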
Semiannual patents review July 2002–December 2002
Roland Gleisner; Julie Blankenburg
2003-01-01
This review summarizes patents related to paper recycling that were issued during the last six months of 2002. Two on-line databases, Claims/U.S. Patents Abstracts and Derwent World Patents Index, were searched for this review. This semiannual feature is intended to inform readers about recent developments in equipment design, chemicals, and process technology for...
Semiannual patents review, January-June 1999
Marguerite Sykes; Julie Blankenburg
1999-01-01
This review summarizes patents related to paper recycling that were issued during the first 6 months of 1999. The two on-line databases used for this search were Claims/U.S. Patents Abstracts and Derwent World Patents Index. This semiannual feature is intended to inform readers about the latest developments in equipment design, chemicals, and process technology for...
Semiannual patents review, July-December 1998
Matthew Stroika; Marguerite Sykes; Julie Blankenburg
1999-01-01
This review summarizes patents related to paper recycling issued during the last 6 months of 1998. The two online databases used for this search are Claims/U.S. Patents Abstracts and Derwent World Patents Index. This semiannual feature is intended to inform readers about the latest developments in equipment, chemicals, and technology in the field of paper recycling. This...
Intellectual property analysis of holographic materials business
NASA Astrophysics Data System (ADS)
Reingand, Nadya; Hunt, David
2006-02-01
The paper presents an overview of intellectual property in the field of holographic photosensitive materials and highlights the possibilities offered by patent searching and analysis. Thousands of patent documents relevant to holographic materials have been uncovered by the study. The search was performed in the following databases: U.S. Patent Office, European Patent Office, and Japanese Patent Office for the time frame of 1971 through November 2005. The patent analysis has unveiled trends in patent temporal distribution, leading IP portfolios, companies competition within the holographic materials market and other interesting insights.
[Patented technology status quo and development trend for Chinese herbal medicines].
Li, Chang; Huang, Luqi
2009-06-01
Patented technology is regarded as an indicator of technological trends under market economy conditions, and information from the patent literature can be widely used in technological and economic analysis. In this study, we analyzed the status quo and development trends of patented technology for Chinese herbal medicines based on the China patent database. The status quo is broken down into biotechnology, quality control, cultivation and herb processing technologies for Chinese herbal medicines. Furthermore, recommendations on technology development and advice on patent protection for Chinese herbal medicines are offered.
Trends in pharmaceutical taste masking technologies: a patent review.
Ayenew, Zelalem; Puri, Vibha; Kumar, Lokesh; Bansal, Arvind K
2009-01-01
According to the year 2003 survey of pediatricians by the American Association of Pediatrics, unpleasant taste was the biggest barrier for completing treatment in pediatrics. The field of taste masking of active pharmaceutical ingredients (API) has been continuously evolving with varied technologies and new excipients. The article reviews the trends in taste masking technologies by studying the current state of the art patent database for the span of year 1997 to 2007. The worldwide database of European patent office (http://ep.espacenet.com) was employed to collect the patents and patent applications. It also discusses the possible reasons for the change of preferences in the taste masking technologies with time. The prime factors critical to the selection of an optimal taste masking technique such as the extent of drug bitterness, solubility, particle characteristics, dosage form and dose are briefly discussed.
NASA Astrophysics Data System (ADS)
Wollny, K. G.
2013-12-01
Geophysical departments of universities or major geophysical research institutes around the world hardly ever file for a patent, even if pioneering and marketable work is done - this is what research in patent databases shows. Patents for methods, apparatuses or scientific instruments developed during scientific projects are mostly filed by companies, i.e. more than 90% of approximately 185,000 patent documents added by May 2013 to the International Patent Classification (IPC) class G01V, which the United Nations' World Intellectual Property Organisation (WIPO) has set up mainly for inventions with key aspects in geophysics. Even inventions born of cooperations between research institutes or universities and well-known geophysical companies where both act as equal partners almost never make it to the G01V. University departments responsible for intellectual property management explain that geoscientists prefer to publish their results in journals rather than in the form of patent applications even if these departments support them and parallel publication is protected legally. This means geoscientists miss the opportunity to protect their intellectual work and to tap its economic potential. But even if scientists don't want to apply for patents, patent documents constitute a wealth of knowledge that should be used much more frequently in research e.g. to stay on top of developments in one's own scientific field. Most important databases are for free, search functionality is self-explanatory and the amount of information to be extracted is enormous. All in all, about 80 million multilingual patent documents are currently available online e.g. in DEPATIS database from the German Patent and Trade Mark Office (DPMA) or ESPACENET from the European Patent Office (EPO). 
From a researcher's perspective, they might also be interesting for detailed technical background information, interdisciplinary solutions for similar problems, to learn about inventions too advanced for their time, but maybe useful now, and to explore the historical background and/or timelines of inventions. Patent documents can help to avoid pitfalls and mistakes other experts might already have experienced and documented in describing the state of the art or the inspiration for their invention. It will be shown how to get access to these databases, how to use them to solve scientific problems and how to leverage search results to improve expertise, work experience or facilitate personal patent application. Patent documents resemble journal articles a lot - they contain an abstract, a description regarding the state of the art, the applicant's motivation to overcome a deficit, technical figures and claims to protect the invention. This structure is used globally for all patent documents. Besides the technical facts, they include the name of the inventor, the company applying for the patent, patent validity information and potential 'family members', which cover the same invention but often in other languages than the original patent document. To summarize, patent documents are a highly useful tool to strengthen one's knowledge in a practically orientated geophysical field and to widen the horizon to adjacent technical areas.
NASA Astrophysics Data System (ADS)
Leitch, Megan E.; Casman, Elizabeth; Lowry, Gregory V.
2012-12-01
Many international groups study environmental health and safety (EHS) concerns surrounding the use of engineered nanomaterials (ENMs). These researchers frequently use the "Project on Emerging Nanotechnologies" (PEN) inventory of nano-enabled consumer products to prioritize types of ENMs to study because estimates of life-cycle ENM releases to the environment can be extrapolated from the database. An alternative "snapshot" of nanomaterials likely to enter commerce can be determined from the patent literature. The goal of this research was to provide an overview of nanotechnology intellectual property trends, complementary to the PEN consumer product database, to help identify potentially "risky" nanomaterials for study by the nano-EHS community. Ten years of nanotechnology patents were examined to determine the types of nano-functional materials being patented, the chemical compositions of the ENMs, and the products in which they are likely to appear. Patenting trends indicated different distributions of nano-enabled products and materials compared to the PEN database. Recent nanotechnology patenting is dominated by electrical and information technology applications rather than the hygienic and anti-fouling applications shown by PEN. There is an increasing emphasis on patenting of nano-scale layers, coatings, and other surface modifications rather than traditional nanoparticles, and there is widespread use of nano-functional semiconductor, ceramic, magnetic, and biological materials that are currently less studied by EHS professionals. These commonly patented products and the nano-functional materials they contain may warrant life-cycle evaluations to determine the potential for environmental exposure and toxicity. The patent and consumer product lists contribute different and complementary insights into the emerging nanotechnology industry and its potential for introducing nanomaterials into the environment.
Campos, Elena; Campos, Adolfo
2015-07-01
To determine the evolution of patents in immunology, as a result of research and innovation in the years 2004-2011. The search for patents published internationally in immunology was made by using the SCOPUS™ database. SCOPUS gives information about over 23 million patents. The extracted data from patents were: inventors and applicants; their nationalities; sections, classes and subclasses of the International Patent Classification. 89 countries. Data have been obtained from the database SCOPUS. It has been used for the international patent classification. Patents by country, Productive sectors, Productive areas. A total of 17,281 patents were applied for immunology during 2004-2011 of which 16,811 were from 30 Organisation for Economic Cooperation and Development countries, and 5326 from 28 countries in the European Union. These patents were granted in 89 countries and 13,699 of them were submitted by researchers from only one country. Private entities applied for 62.45% of all patents, universities 17.48%, hospitals 3.40% and public research organisations and private applicants applied for the rest. The university that made more applications was the University of California with 315 and the company was Genentech Inc. (US) with 302. The reduction in the number of applications of international patents in all disciplines of science also affected the area of immunology. Collaboration in immunology between universities, companies and hospitals is hard because their interests are different. It is shown in patent applications that the majority of patents in immunology are applied for by only one entity. Patents in immunology are developed, mainly, in aspects such as medical preparations, peptides, mutation or genetic engineering, therapeutic activity of chemical compounds and analysing materials by determining their chemical or physical properties.
Contribution of Latin American Countries to Cancer Research and Patent Generation: Recent Patents.
Perez-Santos, Martin; Anaya-Ruiz, Maricruz; Bandala, Cindy
2017-01-01
Data mining publications and patent data can provide decision support for scientists, inventors and industry in the field of cancer research. The main objective of this article is to identify trends in research and patent generation productivity originating from Latin American countries in the field of cancer. Publications were collected from the Scopus, Web of Science and PubMed databases, and patents were collected from the Latipat Espacenet databases. Data from January 1, 2000 until December 31, 2014 were searched for documents with specific cancer-related words as a "topic" and a list of 20 Latin American countries as affiliation country. A total of 12,989 published items and 244 patent applications including "cancer" were retrieved. Brazil, Mexico, Argentina, Chile and Peru were the highest contributors in cancer research, while Brazil, Mexico, Cuba and Argentina were the highest contributors in cancer patent applications. The analysis of the data from this study provides an overview of research and patent activity in Latin America in the cancer field, which can help health policy makers and people in academia to shape cancer research in the future. Copyright© Bentham Science Publishers; For any queries, please email at epub@benthamscience.org.
78 FR 19243 - Privacy Act of 1974; System of Records
Federal Register 2010, 2011, 2012, 2013, 2014
2013-03-29
... applicant; or stored in searchable database and retrievable by patent number. Safeguards: Buildings employ... DEPARTMENT OF COMMERCE United States Patent and Trademark Office Privacy Act of 1974; System of Records AGENCY: United States Patent and Trademark Office, Commerce. ACTION: Notice of amendment of...
Wang, Hui; Zhang, Xiao-Bo; Huang, Lu-Qi; Guo, Lan-Ping; Wang, Ling; Zhao, Yu-Ping; Yang, Guang
2017-11-01
The supply of Chinese patent medicine is influenced by the price of raw materials (Chinese herbal medicines) and the stock of resources. On the one hand, raw material prices show cyclical volatility or even irreversible surges, which makes the price of Chinese patent medicines unstable and can even push production costs above selling prices. On the other hand, because raw-material resources are scarce or their use is prohibited, production of some proprietary Chinese medicines has been forced to stop. Based on a micro-service architecture and Redis cluster deployment, the supply security monitoring and analysis system for Chinese patent medicines in the national essential medicines list provides dynamic monitoring and intelligent early warning for herbs and Chinese patent medicines by connecting and integrating the database of Chinese medicine resources, the dynamic monitoring system of traditional Chinese medicine resources and the basic medicine database of Chinese patent medicine. Copyright© by the Chinese Pharmaceutical Association.
PATENTS IN GENOMICS AND HUMAN GENETICS
Cook-Deegan, Robert; Heaney, Christopher
2010-01-01
Genomics and human genetics are scientifically fundamental and commercially valuable. These fields grew to prominence in an era of growth in government and nonprofit research funding, and of even greater growth of privately funded research and development in biotechnology and pharmaceuticals. Patents on DNA technologies are a central feature of this story, illustrating how patent law adapts (and sometimes fails to adapt) to emerging genomic technologies. In instrumentation and for therapeutic proteins, patents have largely played their traditional role of inducing investment in engineering and product development, including expensive postdiscovery clinical research to prove safety and efficacy. Patents on methods and DNA sequences relevant to clinical genetic testing show less evidence of benefits and more evidence of problems and impediments, largely attributable to university exclusive licensing practices. Whole-genome sequencing will confront uncertainty about infringing granted patents but jurisprudence trends away from upholding the broadest and potentially most troublesome patent claims. PMID:20590431
Ponnaiah, Paulraj; Vnoothenei, Nagiah; Chandramohan, Muruganandham; Thevarkattil, Mohamed Javad Pazhayakath
2018-01-30
Polyhydroxyalkanoates (PHA) are bio-based, biodegradable, naturally occurring polymers produced by a wide range of organisms, from bacteria to higher mammals. The properties and biocompatibility of PHA make a wide spectrum of applications possible. In this context, we analyze the potential applications of PHA in biomedical science by exploring the global trend through a patent survey. The survey suggests that PHA is an attractive candidate whose applications are widely distributed across the medical industry, drug delivery systems, dental materials, tissue engineering, packaging materials and other useful products. In the present study, we explored patents associated with various biomedical applications of polyhydroxyalkanoates. The patent databases of the European Patent Office, the United States Patent and Trademark Office and the World Intellectual Property Organization were mined. We developed an intensive exploration approach to eliminate overlapping patents and sort out significant patents. We defined the keywords and search criteria and established search patterns for the database requests. We retrieved documents from the most recent 6 years, 2010 to 2016, and sorted the collected data stepwise to gather the most appropriate documents in patent families for further scrutiny. By this approach, we retrieved 23,368 patent documents from the three databases, and the patent titles were further analyzed for the relevance of polyhydroxyalkanoates in biomedical applications. This resulted in the documentation of approximately 226 significant patents associated with biomedical applications of polyhydroxyalkanoates, and the information was classified into six major groups. Polyhydroxyalkanoates have been patented in such a way that their applications are widely distributed across the medical industry, drug delivery systems, dental materials, tissue engineering, packaging materials and other useful products.
There are many avenues through which PHA and PHB could be used. Our analysis shows that patent information can be used to identify various applications of PHA and its representatives in the biomedical field. Future studies can focus on applications of PHA in other fields to discover related topics and link them to this study. We believe that this approach and these findings can encourage new researchers to undertake similar studies in their own fields to fill the gap between patent documents and research publications.
A Review on Recent Patents and Applications of Inorganic Material Binding Peptides.
Thota, Veeranjaneyulu; Perry, Carole C
2017-01-01
Although the popularity of using combinatorial display techniques for recognising unique peptides having high affinity for inorganic (nano) particles has grown rapidly, there are no systematic reviews showcasing current developments or patents on binding peptides specific to these materials. In this review, we summarize and discuss recent progress in patents on material binding peptides, specifically exploring inorganic nano surfaces such as metals, metal oxides, minerals, carbon-based materials, polymer-based materials, magnetic materials and semiconductors. We consider both the peptide display strategies used and the exploitation of the identified peptides in the generation of advanced nanomaterials. In order to get a clear picture of the number of patents and the literature present to date relevant to inorganic material binding biomolecules and their applications, a thorough online search was conducted using national and worldwide databases. The literature search included standard bibliographic databases, while the patent search included EPO Espacenet, WIPO patent scope, USPTO, Google patent search, Patent lens, etc. along with commercial databases such as Derwent and Patbase. Both English and American spellings were included in the searches. The initial number of patents found related to material binders was 981. After reading and excluding irrelevant patents such as organic binding peptides, works published before 2001, repeated patents, documents not in English etc., 51 highly relevant patents published from 2001 onwards were selected and analysed. These patents were further separated into six categories based on their target inorganic material and the combinatorial library used. They include relevant patents on metal, metal oxide or combination binding peptides (19), magnetic and semiconductor binding peptides (8), carbon based (3), mineral (5), polymer (8) and other binders (9).
Further, we concisely describe how these material-specific binders have been used to synthesize simple to complex bio- or nano-materials, mediate the controlled biomineralization process, direct self-assembly and nanofabrication of ordered structures, facilitate the immobilization of functional biomolecules and construct inorganic-inorganic or organic-inorganic nano hybrids. From analysis of recent literature and patents, we clearly show that biomimetic material binders are in the vanguard of new design approaches for novel nanomaterials with improved/controlled physical and chemical properties that have no adverse effect on the structural or functional activities of the nanomaterials themselves.
NASA Astrophysics Data System (ADS)
Strandburg, Katherine; Tobochnik, Jan; Csardi, Gabor
2005-03-01
Patent applications contain citations which are similar to but different from those found in published scientific papers. In particular, patent citations are governed by legal rules. Moreover, a large fraction of citations are made not by the patent inventor, but by a patent examiner during the application procedure. Using a patent database, which contains the patent citations, assignees and inventors, we have applied network analysis and built network models. Our work includes determining the structure of the patent citation network and comparing it to existing results for scientific citation networks; identifying differences between various technological fields and comparing the observed differences to expectations based on anecdotal evidence about patenting practice; and developing models to explain the results.
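As a rough illustration of the kind of network analysis this abstract describes, the sketch below counts how often each patent is cited (its in-degree) and tabulates the resulting degree distribution. The edge list is hypothetical toy data and the function names are made up for illustration; the abstract does not describe the authors' actual data format or methods.

```python
from collections import Counter

# Hypothetical toy data: (citing_patent, cited_patent) pairs.
citations = [
    ("P4", "P1"), ("P4", "P2"), ("P5", "P1"),
    ("P5", "P3"), ("P6", "P1"), ("P6", "P4"),
]

def in_degree(edges):
    """Count how many times each patent is cited."""
    return Counter(cited for _, cited in edges)

def degree_distribution(degrees):
    """Map each in-degree value to the number of patents having it.

    Patents that are never cited do not appear in `degrees`,
    so in-degree 0 is not represented here.
    """
    return Counter(degrees.values())

cited_counts = in_degree(citations)
distribution = degree_distribution(cited_counts)
```

In a real study, the same counts could be split by technological field or by who added the citation (inventor or examiner) before comparing the distributions.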
Digital pathology: A systematic evaluation of the patent landscape.
Cucoranu, Ioan C; Parwani, Anil V; Vepa, Suryanarayana; Weinstein, Ronald S; Pantanowitz, Liron
2014-01-01
Digital pathology is a relatively new field. Inventors of technology in this field typically file for patents to protect their intellectual property. An understanding of the patent landscape is crucial for companies wishing to secure patent protection and market dominance for their products. To our knowledge, there has been no prior systematic review of patents related to digital pathology. Therefore, the aim of this study was to systematically identify and evaluate United States patents and patent applications related to digital pathology. Issued patents and patent applications related to digital pathology published in the United States Patent and Trademark Office (USPTO) database (www.uspto.gov) (through January 2014) were searched using the Google Patents search engine (Google Inc., Mountain View, California, USA). Keywords and phrases related to digital pathology, whole-slide imaging (WSI), image analysis, and telepathology were used to query the USPTO database. Data were downloaded and analyzed using the Papers application (Mekentosj BV, Aalsmeer, Netherlands). A total of 588 United States patents that pertain to digital pathology were identified. In addition, 228 patent applications were identified, including 155 that were pending, 65 abandoned, and eight rejected. Of the 588 patents granted, 348 (59.18%) were specific to pathology, while 240 (40.82%) included more general patents also usable outside of pathology. There were 70 (21.12%) patents specific to pathology and 57 (23.75%) more general patents that had expired. Over 120 unique entities (individual inventors, academic institutions, and private companies) applied for pathology specific patents. Patents dealt largely with telepathology and image analysis. WSI related patents addressed image acquisition (scanning and focus), quality (z-stacks), management (storage, retrieval, and transmission of WSI files), and viewing (graphical user interface (GUI), workflow, slide navigation and remote control). 
An increasing number of recent patents focused on computer-aided diagnosis (CAD) and digital consultation networks. In the last 2 decades, there have been an increasing number of patents granted and patent applications filed related to digital pathology. The number of these patents quadrupled during the last decade, and this trend is predicted to intensify based on the number of patent applications already published by the USPTO.
Digital pathology: A systematic evaluation of the patent landscape
Cucoranu, Ioan C.; Parwani, Anil V.; Vepa, Suryanarayana; Weinstein, Ronald S.; Pantanowitz, Liron
2014-01-01
Introduction: Digital pathology is a relatively new field. Inventors of technology in this field typically file for patents to protect their intellectual property. An understanding of the patent landscape is crucial for companies wishing to secure patent protection and market dominance for their products. To our knowledge, there has been no prior systematic review of patents related to digital pathology. Therefore, the aim of this study was to systematically identify and evaluate United States patents and patent applications related to digital pathology. Materials and Methods: Issued patents and patent applications related to digital pathology published in the United States Patent and Trademark Office (USPTO) database (www.uspto.gov) (through January 2014) were searched using the Google Patents search engine (Google Inc., Mountain View, California, USA). Keywords and phrases related to digital pathology, whole-slide imaging (WSI), image analysis, and telepathology were used to query the USPTO database. Data were downloaded and analyzed using the Papers application (Mekentosj BV, Aalsmeer, Netherlands). Results: A total of 588 United States patents that pertain to digital pathology were identified. In addition, 228 patent applications were identified, including 155 that were pending, 65 abandoned, and eight rejected. Of the 588 patents granted, 348 (59.18%) were specific to pathology, while 240 (40.82%) included more general patents also usable outside of pathology. There were 70 (21.12%) patents specific to pathology and 57 (23.75%) more general patents that had expired. Over 120 unique entities (individual inventors, academic institutions, and private companies) applied for pathology specific patents. Patents dealt largely with telepathology and image analysis. 
WSI related patents addressed image acquisition (scanning and focus), quality (z-stacks), management (storage, retrieval, and transmission of WSI files), and viewing (graphical user interface (GUI), workflow, slide navigation and remote control). An increasing number of recent patents focused on computer-aided diagnosis (CAD) and digital consultation networks. Conclusion: In the last 2 decades, there have been an increasing number of patents granted and patent applications filed related to digital pathology. The number of these patents quadrupled during the last decade, and this trend is predicted to intensify based on the number of patent applications already published by the USPTO. PMID:25057430
Gitter, D M
2001-12-01
The thought of a large biotech company holding an exclusive right to research and manipulate human genetic material provokes many reactions--from moral revulsion to enthusiasm about the possibilities for therapeutic advancement. While most agree that such a right must exist, debate continues over the appropriate extent of its entitlements and preclusive effects. In this Article, Professor Donna Gitter addresses this multidimensional problem of patents on human deoxyribonucleic acid (DNA) sequences in the United States and the European Union. Professor Gitter chronicles not only the development of the law in this area, but also the array of policy and moral arguments that proponents and detractors of such patents raise. She emphasizes the specific issue of patents on DNA sequences whose function has not fully been identified, and the chilling effect these patents may have on beneficial research. From this discussion emerges a troubling realization: While the legal framework governing "life patents" may be similar in the United States and the European Union, the public perceptions and attitudes toward them are not. Professor Gitter thus proposes a dual reform: a compulsory licensing regime requiring holders of DNA sequence patents to license them to commercial researchers, in return for a royalty keyed to the financial success of the product that the licensee develops; and an experimental-use exemption from this regime for government and nonprofit researchers.
ERIC Educational Resources Information Center
Grooms, David W.
1988-01-01
Discusses the quality controls imposed on text and image data that is currently being converted from paper to digital images by the Patent and Trademark Office. The methods of inspection used on text and on images are described, and the quality of the data delivered thus far is discussed. (CLB)
End-User Searching in a Large Library Network: A Case Study of Patent Attorneys.
ERIC Educational Resources Information Center
Vollaro, Alice J.; Hawkins, Donald T.
1986-01-01
Reports results of study of a group of end users (patent attorneys) doing their own online searching at AT&T Bell Laboratories. Highlights include DIALOG databases used by the attorneys, locations and searching modes, characteristics of patent attorney searchers, and problem areas. Questionnaire is appended. (5 references) (EJS)
A database linking Chinese patents to China’s census firms
He, Zi-Lin; Tong, Tony W.; Zhang, Yuchen; He, Wenlong
2018-01-01
To meet researchers’ increasing interest in the fast growing innovation activities taking place in China, we match patents filed with China’s State Intellectual Property Office to firms covered in China’s Census. China has experienced a strong growth in patent filings over the past two decades, and has since 2011 become the world’s top patent filing country. China’s Census database covers about one million unique manufacturing firms from 1998–2009, representing the broad Chinese economy. We design data parsing and pre-processing routines to clean and stem firm and assignee names, create a matching algorithm that fits with our data and maintains a balance between matching accuracy and workload of manual check, and implement a systematic manual check process to filter out false positives generated from computerized matching. Our project generates 1,113,588 matches for the Census firms, among which 849,647 patents are uniquely matched. By creating the patent-firm linked dataset, we hope to reduce duplicative effort and encourage more research to better understand China’s fast changing innovation landscape. PMID:29583142
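The name cleaning and matching step described in this abstract can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the authors' algorithm: the legal-form token list, the function names and the example names are all hypothetical, the example uses English names only, and only exact matches on normalized names are shown (the paper additionally relies on manual checks to filter out false positives).

```python
import re

# Hypothetical list of legal-form tokens to strip before matching.
LEGAL_SUFFIXES = {"co", "ltd", "inc", "corp", "corporation",
                  "company", "limited", "llc"}

def normalize(name):
    """Lowercase, replace punctuation with spaces, drop legal-form tokens."""
    tokens = re.sub(r"[^a-z0-9 ]+", " ", name.lower()).split()
    return " ".join(t for t in tokens if t not in LEGAL_SUFFIXES)

def match(assignees, firms):
    """Return {assignee: firm} for exact matches on normalized names."""
    index = {normalize(f): f for f in firms}
    return {a: index[normalize(a)] for a in assignees
            if normalize(a) in index}

# Hypothetical firm and assignee names.
firms = ["Example Motors Co., Ltd.", "Sample Electronics Corporation"]
assignees = ["EXAMPLE MOTORS CO LTD", "Sample Electronics",
             "Unknown Widgets Inc."]
matches = match(assignees, firms)
```

Loosening the normalization raises recall but also produces more false positives to check by hand, which is the trade-off between matching accuracy and manual workload that the abstract mentions.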
Campos, Elena
2015-01-01
Objectives: To determine the evolution of patents in immunology, as a result of research and innovation in the years 2004–2011. Design: The search for patents published internationally in immunology was made by using the SCOPUS™ database. SCOPUS gives information about over 23 million patents. The extracted data from patents were: inventors and applicants; their nationalities; sections, classes and subclasses of the International Patent Classification. Participants: 89 countries. Setting: Data have been obtained from the database SCOPUS. It has been used for the international patent classification. Main outcome measures: Patents by country, Productive sectors, Productive areas. Results: A total of 17,281 patents were applied for immunology during 2004–2011 of which 16,811 were from 30 Organisation for Economic Cooperation and Development countries, and 5326 from 28 countries in the European Union. These patents were granted in 89 countries and 13,699 of them were submitted by researchers from only one country. Private entities applied for 62.45% of all patents, universities 17.48%, hospitals 3.40% and public research organisations and private applicants applied for the rest. The university that made more applications was the University of California with 315 and the company was Genentech Inc. (US) with 302. The reduction in the number of applications of international patents in all disciplines of science also affected the area of immunology. Conclusions: Collaboration in immunology between universities, companies and hospitals is hard because their interests are different. It is shown in patent applications that the majority of patents in immunology are applied for by only one entity. Patents in immunology are developed, mainly, in aspects such as medical preparations, peptides, mutation or genetic engineering, therapeutic activity of chemical compounds and analysing materials by determining their chemical or physical properties. PMID:28008369
McDonald, Rebecca; Danielsson Glende, Øyvind; Dale, Ola; Strang, John
2018-02-01
Non-injectable naloxone formulations are being developed for opioid overdose reversal, but only limited data have been published in the peer-reviewed domain. Through examination of a hitherto-unsearched database, we expand public knowledge of non-injectable formulations, tracing their development and novelty, with the aim to describe and compare their pharmacokinetic properties. (i) The PatentScope database of the World Intellectual Property Organization was searched for relevant English-language patent applications; (ii) Pharmacokinetic data were extracted, collated and analysed; (iii) PubMed was searched using Boolean search query '(nasal OR intranasal OR nose OR buccal OR sublingual) AND naloxone AND pharmacokinetics'. Five hundred and twenty-two PatentScope and 56 PubMed records were identified: three published international patent applications and five peer-reviewed papers were eligible. Pharmacokinetic data were available for intranasal, sublingual, and reference routes. Highly concentrated formulations (10-40 mg mL⁻¹) had been developed and tested. Sublingual bioavailability was very low (1%; relative to intravenous). Non-concentrated intranasal spray (1 mg mL⁻¹; 1 mL per nostril) had low bioavailability (11%). Concentrated intranasal formulations (≥10 mg mL⁻¹) had bioavailability of 21-42% (relative to intravenous) and 26-57% (relative to intramuscular), with peak concentrations (dose-adjusted Cmax = 0.8-1.7 ng mL⁻¹) reached in 19-30 min (tmax). Exploratory analysis identified intranasal bioavailability as associated positively with dose and negatively with volume. We find consistent direction of development of intranasal sprays to high-concentration, low-volume formulations with bioavailability in the 20-60% range. These have potential to deliver a therapeutic dose in 0.1 mL volume. [McDonald R, Danielsson Glende Ø, Dale O, Strang J. 
International patent applications for non-injectable naloxone for opioid overdose reversal: Exploratory search and retrieve analysis of the PatentScope database. Drug Alcohol Rev 2017;00:000-000]. © 2017 Australasian Professional Society on Alcohol and other Drugs.
Han, Shi-You; Hong, Zhi-You; Xie, Yu-Hua; Zhao, Yong; Xu, Xiao
2017-12-01
Stroke is a condition with high morbidity and mortality, and 75% of stroke survivors lose their ability to work. Stroke is a burden to the family and society. The purpose of this study was to evaluate the effectiveness of Chinese herbal patent medicines in the treatment of patients after the acute phase of a stroke. We searched the following databases through August 2016: PubMed, Embase, Cochrane library, China Knowledge Resource Integrated Database (CNKI), China Science Periodical Database (CSPD), and China Biology Medicine disc (CBMdisc) for studies that evaluated Chinese herbal patent medicines for post stroke recovery. A random-effect model was used to pool therapeutic effects of Chinese herbal patent medicines on stroke recovery. Network meta-analysis was used to rank the treatment for each Chinese herbal patent medicine. In our meta-analysis, we evaluated 28 trials that included 2780 patients. Chinese herbal patent medicines were effective in promoting recovery after stroke (OR, 3.03; 95% CI: 2.53-3.64; P < .001). Chinese herbal patent medicines significantly improved neurological function defect scores when compared with the controls (standard mean difference [SMD], -0.89; 95% CI, -1.44 to -0.35; P = .001). Chinese herbal patent medicines significantly improved the Barthel index (SMD, 0.73; 95% CI, 0.53-0.94; P < .001) and the Fugl-Meyer assessment scores (SMD, 0.60; 95% CI, 0.34-0.86; P < .001). In the network analysis, MLC601, Shuxuetong, and BuchangNaoxintong were most likely to improve stroke recovery in patients without acupuncture. Additionally, Mailuoning, Xuesaitong, BuchangNaoxintong were the patented Chinese herbal medicines most likely to improve stroke recovery when combined with acupuncture. Our research suggests that the Chinese herbal patent medicines were effective for stroke recovery. The most effective treatments for stroke recovery were MLC601, Shuxuetong, and BuchangNaoxintong. 
However, to clarify the specific effective ingredients of Chinese herbal medicines, a well-designed study is warranted.
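To make the pooling step concrete, here is a minimal sketch of random-effects pooling of per-trial effect sizes. It assumes the DerSimonian-Laird estimator of between-study variance, which is one common choice; the abstract does not say which estimator the authors used. The function name and the input numbers are hypothetical and are not taken from the study.

```python
import math

def pool_random_effects(effects, variances):
    """DerSimonian-Laird random-effects pooling (assumed estimator).

    Returns (pooled_effect, standard_error, tau_squared).
    """
    k = len(effects)
    w = [1.0 / v for v in variances]            # inverse-variance weights
    sum_w = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum_w
    # Cochran's Q: weighted squared deviations from the fixed-effect mean.
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    # Between-study variance, truncated at zero.
    tau2 = max(0.0, (q - (k - 1)) / c) if k > 1 and c > 0 else 0.0
    w_star = [1.0 / (v + tau2) for v in variances]
    sum_w_star = sum(w_star)
    pooled = sum(wi * yi for wi, yi in zip(w_star, effects)) / sum_w_star
    return pooled, math.sqrt(1.0 / sum_w_star), tau2

# Hypothetical per-trial standardized mean differences and variances.
smd = [0.9, 0.4, 0.7, 1.1]
var = [0.04, 0.05, 0.03, 0.06]
pooled, se, tau2 = pool_random_effects(smd, var)
ci_low, ci_high = pooled - 1.96 * se, pooled + 1.96 * se  # approx. 95% CI
```

When the trials agree closely, the estimated between-study variance falls to zero and the result coincides with the fixed-effect (inverse-variance) estimate.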
Ozcan, Sercan; Islam, Nazrul
2017-01-01
Many challenges still remain in the processing of explicit technological knowledge documents such as patents. Given the limitations and drawbacks of the existing approaches, this research sets out to develop an improved method for searching patent databases and extracting patent information to increase the efficiency and reliability of nanotechnology patent information retrieval process and to empirically analyse patent collaboration. A tech-mining method was applied and the subsequent analysis was performed using Thomson data analyser software. The findings show that nations such as Korea and Japan are highly collaborative in sharing technological knowledge across academic and corporate organisations within their national boundaries, and China presents, in some cases, a great illustration of effective patent collaboration and co-inventorship. This study also analyses key patent strengths by country, organisation and technology.
75 FR 65611 - Native American Tribal Insignia Database
Federal Register 2010, 2011, 2012, 2013, 2014
2010-10-26
... DEPARTMENT OF COMMERCE Patent and Trademark Office Native American Tribal Insignia Database ACTION... comprehensive database containing the official insignia of all federally- and State- recognized Native American... to create this database. The USPTO database of official tribal insignias assists trademark attorneys...
Patent Administration by Office Computer - A Case at Mazda Motor Corporation
NASA Astrophysics Data System (ADS)
Kimura, Ikuo; Nakamura, Shinji
The needs of patent administration have diversified, reflecting R&D activity under severe competition in technical development, and the volume of work, as seen in the number of patent applications, has grown year after year. Under these circumstances, it is necessary to mechanize operations so as to assist manual work as much as possible and thereby strengthen patent administration. Having introduced an office computer (CPU 512 KB, external memory 128 MB) dedicated to this purpose, the Patent Department of Mazda Motor Corporation has been building a patent administration database centered on the company's own patent applications, and uses it for the automatic preparation of business forms, the preparation of various statistical materials, and real-time reference to application procedures.
Intellectual property issues in holography and high tech
NASA Astrophysics Data System (ADS)
Reingand, Nadya
2004-06-01
The author, who has a technical background (Ph.D. in holography), shares her 3+ years of experience working on intellectual property (IP) issues, including patents, trademarks, and copyrights. Special attention is paid to patent issues: the application procedure, the patent requirements, the databases for prior art search, and how to file cost-efficiently.
Senger, Stefan
2017-04-21
Patents are an important source of information for effective decision making in drug discovery. Encouragingly, freely accessible patent-chemistry databases are now in the public domain. However, at present there is still a wide gap between relatively low coverage-high quality manually-curated data sources and high coverage data sources that use text mining and automated extraction of chemical structures. To secure much needed funding for further research and an improved infrastructure, hard evidence is required to demonstrate the significance of patent-derived information in drug discovery. Surprisingly little such evidence has been reported so far. To address this, the present study attempts to quantify the relevance of patents for formulating and substantiating hypotheses for compound-target interactions. A manually-curated set of 130 compound-target interaction pairs annotated with what are considered to be the earliest patent and publication has been produced. The analysis of this set revealed that in stark contrast to what has been reported for novel chemical structures, only about 10% of the compound-target interaction pairs could be found in publications in the scientific literature within one year of being reported in patents. The average delay across all interaction pairs is close to 4 years. In an attempt to benchmark current capabilities, it was also examined how much of the benefit of using patent-derived information can be retained when a bioannotated version of SureChEMBL is used as secondary source for the patent literature. Encouragingly, this approach found the patents in the annotated set for 72% of the compound-target interaction pairs. Similarly, the effect of using the bioactivity database ChEMBL as secondary source for the scientific literature was studied. Here, the publications from the annotated set were only found for 46% of the compound-target interaction pairs. 
Patent-derived information is a significant enabler for formulating compound-target interaction hypotheses even in cases where the respective interaction is later reported in the scientific literature. The findings of this study clearly highlight the significance of future investments in the development and provision of databases and tools that will allow scientists to search patent information in a comprehensive, reliable, and efficient manner.
Determinants of the Pace of Global Innovation in Energy Technologies
2013-10-14
quality (see Figures S1 and S2 in File S1), a comprehensive patent database is a powerful tool for investigating the determinants of innovative...model in order to avoid overfitting the data and to maximize predictive power . We develop a model that explains the observed trends in energy...patents. (A.) World map of cumulative patents in photovoltaics (solar). Japan is the leading nation in terms of patent numbers, followed by the US and China
Recently Patented Viral Nucleotide Sequences and Generation of Virus-Derived Vaccines.
Venkataraman, Srividhya; Ahmad, Tauqeer; Haidar, Mounir A; Hefferon, Kathleen L
2017-01-01
With an increase in comprehension of the molecular biology of viruses, there has been a recent surge in the application of virus sequences and viral gene expression strategies towards the diagnosis and treatment of diseases. The scope of the patenting landscape has widened as a result and the current review discusses patents pertaining to live / attenuated viral vaccines. The vaccines addressed here have been developed by both conventional means as well as by the state-of-the-art genetic engineering techniques. This review also addresses the applications of these patents for clinical and biotechnological purposes.
He, Yu-su; Sun, Zhi-yi; Zhang, Yan-ling
2014-11-01
Using the pharmacophore model of mineralocorticoid receptor antagonists as a starting point, this study investigates a method of traditional Chinese medicine formula design for anti-hypertensive treatment. Pharmacophore models were generated by the 3D-QSAR pharmacophore (Hypogen) program of DS3.5, based on a training set composed of 33 mineralocorticoid receptor antagonists. The best pharmacophore model consisted of two hydrogen-bond acceptors, three hydrophobic features and four excluded volumes. Its training-set correlation coefficient, test-set correlation coefficient, N, and CAI value were 0.9534, 0.6748, 2.878, and 1.119, respectively. Database screening yielded 1700 active compounds from 86 source plants. Because traditional theory lacks an available anti-hypertensive medication strategy, this article uses patent retrieval in the world traditional medicine patent database to design drug formulae. Finally, two formulae were obtained for anti-hypertensive use.
Two centuries of French patents as documentation of musical instrument construction
NASA Astrophysics Data System (ADS)
Jean, Haury
2005-09-01
The French Patent Office I.N.P.I. has preserved the originals of ca. 12
NASA Astrophysics Data System (ADS)
Nakaike, Shin'ichi; Tanaka, Masao
The authors describe the present status of the patent information service provided by JAPIO, the new on-line system project (PATOLIS-III), the Paperless Project of the Patent Office, and the input of domestic patent gazettes onto optical disks. They also describe the CD-ROM created from the image information of the patent gazettes produced under the Paperless Project, its production method, and the terminals and their functions. Some problems found in JAPIO's CD-ROM, such as the time lag in issuance and the treatment of multiple copies, are mentioned along with countermeasures against them.
78 FR 60861 - Native American Tribal Insignia Database
Federal Register 2010, 2011, 2012, 2013, 2014
2013-10-02
... Database ACTION: Proposed collection; comment request. SUMMARY: The United States Patent and Trademark... the report was that the USPTO create and maintain an accurate and comprehensive database containing... this recommendation, the Senate Committee on Appropriations directed the USPTO to create this database...
US photovoltaic patents: 1991-1993
NASA Astrophysics Data System (ADS)
Pohle, L.
1995-03-01
This document contains US patents on terrestrial photovoltaic (PV) power applications, including systems, components, and materials as well as manufacturing and support functions. The patent entries in this document were issued from 1991 to 1993. The entries were located by searching USPA, the database of the US Patent Office. The final search retrieved all patents under the class 'Batteries, Thermoelectric and Photoelectric' and the subclasses 'Photoelectric,' 'Testing,' and 'Applications.' The search also located patents that contained the words 'photovoltaic(s)' or 'solar cell(s)' and their derivatives. After the initial list was compiled, most of the patents on the following subjects were excluded: space photovoltaic technology, use of the photovoltaic effect for detectors, and subjects only peripherally concerned with photovoltaic. Some patents on these three subjects were included when it appeared that those inventions might be of use in terrestrial PV power technologies.
Therapeutic effect of Chinese herbal medicines for post stroke recovery
Han, Shi-You; Hong, Zhi-You; Xie, Yu-Hua; Zhao, Yong; Xu, Xiao
2017-01-01
Abstract Background: Stroke is a condition with high morbidity and mortality, and 75% of stroke survivors lose their ability to work. Stroke is a burden to the family and society. The purpose of this study was to evaluate the effectiveness of Chinese herbal patent medicines in the treatment of patients after the acute phase of a stroke. Methods: We searched the following databases through August 2016: PubMed, Embase, Cochrane library, China Knowledge Resource Integrated Database (CNKI), China Science Periodical Database (CSPD), and China Biology Medicine disc (CBMdisc) for studies that evaluated Chinese herbal patent medicines for post stroke recovery. A random-effect model was used to pool therapeutic effects of Chinese herbal patent medicines on stroke recovery. Network meta-analysis was used to rank the treatment for each Chinese herbal patent medicine. Results: In our meta-analysis, we evaluated 28 trials that included 2780 patients. Chinese herbal patent medicines were effective in promoting recovery after stroke (OR, 3.03; 95% CI: 2.53–3.64; P < .001). Chinese herbal patent medicines significantly improved neurological function defect scores when compared with the controls (standard mean difference [SMD], −0.89; 95% CI, −1.44 to −0.35; P = .001). Chinese herbal patent medicines significantly improved the Barthel index (SMD, 0.73; 95% CI, 0.53–0.94; P < .001) and the Fugl–Meyer assessment scores (SMD, 0.60; 95% CI, 0.34–0.86; P < .001). In the network analysis, MLC601, Shuxuetong, and BuchangNaoxintong were most likely to improve stroke recovery in patients without acupuncture. Additionally, Mailuoning, Xuesaitong, BuchangNaoxintong were the patented Chinese herbal medicines most likely to improve stroke recovery when combined with acupuncture. Conclusions: Our research suggests that the Chinese herbal patent medicines were effective for stroke recovery. The most effective treatments for stroke recovery were MLC601, Shuxuetong, and BuchangNaoxintong. 
However, to clarify the specific effective ingredients of Chinese herbal medicines, a well-designed study is warranted. PMID:29245245
37 CFR 1.821 - Nucleotide and/or amino acid sequence disclosures in patent applications.
Code of Federal Regulations, 2014 CFR
2014-07-01
...” means those amino acids other than “Xaa” and those nucleotide bases other than “n”defined in accordance... 37 Patents, Trademarks, and Copyrights 1 2014-07-01 2014-07-01 false Nucleotide and/or amino acid... Biotechnology Invention Disclosures Application Disclosures Containing Nucleotide And/or Amino Acid Sequences...
37 CFR 1.821 - Nucleotide and/or amino acid sequence disclosures in patent applications.
Code of Federal Regulations, 2013 CFR
2013-07-01
...” means those amino acids other than “Xaa” and those nucleotide bases other than “n”defined in accordance... 37 Patents, Trademarks, and Copyrights 1 2013-07-01 2013-07-01 false Nucleotide and/or amino acid... Biotechnology Invention Disclosures Application Disclosures Containing Nucleotide And/or Amino Acid Sequences...
37 CFR 1.821 - Nucleotide and/or amino acid sequence disclosures in patent applications.
Code of Federal Regulations, 2012 CFR
2012-07-01
...” means those amino acids other than “Xaa” and those nucleotide bases other than “n”defined in accordance... 37 Patents, Trademarks, and Copyrights 1 2012-07-01 2012-07-01 false Nucleotide and/or amino acid... Biotechnology Invention Disclosures Application Disclosures Containing Nucleotide And/or Amino Acid Sequences...
Complement in action: an analysis of patent trends from 1976 through 2011.
Yang, Kun; DeAngelis, Robert A; Reed, Janet E; Ricklin, Daniel; Lambris, John D
2013-01-01
Complement is an essential part of the innate immune response. It interacts with diverse endogenous pathways and contributes to the maintenance of homeostasis, the modulation of adaptive immune responses, and the development of various pathologies. The potential usefulness, in both research and clinical settings, of compounds that detect or modulate complement activity has resulted in thousands of publications on complement-related innovations in fields such as drug discovery, disease diagnosis and treatment, and immunoassays, among others. This study highlights the distribution and publication trends of patents related to the complement system that were granted by the United States Patent and Trademark Office from 1976 to the present day. A comparison to complement-related documents published by the World Intellectual Property Organization is also included. Statistical analyses revealed increasing diversity in complement-related research interests over time. More than half of the patents were found to focus on the discovery of inhibitors; interest in various inhibitor classes exhibited a remarkable transformation from chemical compounds early on to proteins and antibodies in more recent years. Among clinical applications, complement proteins and their modulators have been extensively patented for the diagnosis and treatment of eye diseases (especially age-related macular degeneration), graft rejection, cancer, sepsis, and a variety of other inflammatory and immune diseases. All of the patents discussed in this chapter, as well as those from other databases, are available from our newly constructed complement patent database: www.innateimmunity.us/patent.
Nano/micro-electro mechanical systems: a patent view
NASA Astrophysics Data System (ADS)
Hu, Guangyuan; Liu, Weishu
2015-12-01
Combining both bibliometrics and citation network analysis, this research evaluates the global development of micro-electro mechanical systems (MEMS) research based on the Derwent Innovations Index database. We found that worldwide, the growth trajectory of MEMS patents demonstrates an approximate S shape, with United States, Japan, China, and Korea leading the global MEMS race. Evidenced by Derwent class codes, the technology structure of global MEMS patents remains steady over time. Yet there does exist a national competitiveness component among the top country players. The latecomer China has become the second most prolific country filing MEMS patents, but its patent quality still lags behind the global average.
THPdb: Database of FDA-approved peptide and protein therapeutics.
Usmani, Salman Sadullah; Bedi, Gursimran; Samuel, Jesse S; Singh, Sandeep; Kalra, Sourav; Kumar, Pawan; Ahuja, Anjuman Arora; Sharma, Meenu; Gautam, Ankur; Raghava, Gajendra P S
2017-01-01
THPdb (http://crdd.osdd.net/raghava/thpdb/) is a manually curated repository of Food and Drug Administration (FDA) approved therapeutic peptides and proteins. The information in THPdb has been compiled from 985 research publications, 70 patents and other resources like DrugBank. The current version of the database holds a total of 852 entries, providing comprehensive information on 239 US-FDA approved therapeutic peptides and proteins and their 380 drug variants. The information on each peptide and protein includes their sequences, chemical properties, composition, disease area, mode of activity, physical appearance, category or pharmacological class, pharmacodynamics, route of administration, toxicity, target of activity, etc. In addition, we have annotated the structure of most of the proteins and peptides. A number of user-friendly tools have been integrated to facilitate easy browsing and data analysis. To assist the scientific community, a web interface and mobile app have also been developed.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Not Available
1992-12-01
The bibliography contains citations of selected patents concerning fuel control devices and methods for use in internal combustion engines. Patents describe air-fuel ratio control, fuel injection systems, evaporative fuel control, and surge-corrected fuel control. Citations also discuss electronic and feedback control, methods for engine protection, and fuel conservation. (Contains a minimum of 232 citations and includes a subject term index and title list.)
US photovoltaic patents: 1991--1993
DOE Office of Scientific and Technical Information (OSTI.GOV)
Pohle, L
1995-03-01
This document contains US patents on terrestrial photovoltaic (PV) power applications, including systems, components, and materials as well as manufacturing and support functions. The patent entries in this document were issued from 1991 to 1993. The entries were located by searching USPA, the database of the US Patent Office. The final search retrieved all patents under the class 'Batteries, Thermoelectric and Photoelectric' and the subclasses 'Photoelectric,' 'Testing,' and 'Applications.' The search also located patents that contained the words 'photovoltaic(s)' or 'solar cell(s)' and their derivatives. After the initial list was compiled, most of the patents on the following subjects were excluded: space photovoltaic technology, use of the photovoltaic effect for detectors, and subjects only peripherally concerned with photovoltaic. Some patents on these three subjects were included when it appeared that those inventions might be of use in terrestrial PV power technologies.
Wiechers, Ilse R; Perin, Noah C; Cook-Deegan, Robert
2013-01-01
Development of the commercial genomics sector within the biotechnology industry relied heavily on the scientific commons, public funding, and technology transfer between academic and industrial research. This study tracks financial and intellectual property data on genomics firms from 1990 through 2004, thus following these firms as they emerged in the era of the Human Genome Project and through the 2000 to 2001 market bubble. A database was created based on an early survey of genomics firms, which was expanded using three web-based biotechnology services, scientific journals, and biotechnology trade and technical publications. Financial data for publicly traded firms was collected through the use of four databases specializing in firm financials. Patent searches were conducted using firm names in the US Patent and Trademark Office website search engine and the DNA Patent Database. A biotechnology subsector of genomics firms emerged in parallel to the publicly funded Human Genome Project. Trends among top firms show that hiring, capital improvement, and research and development expenditures continued to grow after a 2000 to 2001 bubble. The majority of firms are small businesses with great diversity in type of research and development, products, and services provided. Over half the public firms holding patents have the majority of their intellectual property portfolio in DNA-based patents. These data allow estimates of investment, research and development expenditures, and jobs that paralleled the rise of genomics as a sector within biotechnology between 1990 and 2004.
ROSA, Wellington Luiz de Oliveira; SILVA, Tiago Machado; LIMA, Giana da Silveira; SILVA, Adriana Fernandes; PIVA, Evandro
2016-01-01
ABSTRACT Objective A systematic review was conducted to analyze Brazilian scientific and technological production related to the dental materials field over the past 50 years. Material and Methods This study followed the Preferred Reporting Items for Systematic Reviews and Meta-Analysis (Prisma) statement. Searches were performed until December 2014 in six databases: MedLine (PubMed), Scopus, LILACS, IBECS, BBO, and the Cochrane Library. Additionally, the Brazilian patent database (INPI - Instituto Nacional de Propriedade Industrial) was screened in order to get an overview of Brazilian technological development in the dental materials field. Two reviewers independently analyzed the documents. Only studies and patents related to dental materials were included in this review. Data regarding the material category, dental specialty, number of documents and patents, filiation countries, and the number of citations were tabulated and analyzed in Microsoft Office Excel (Microsoft Corporation, Redmond, Washington, United States). Results A total of 115,806 studies and 53 patents were related to dental materials and were included in this review. Brazil had 8% affiliation in studies related to dental materials, and the majority of the papers published were related to dental implants (1,137 papers), synthetic resins (681 papers), dental cements (440 papers), dental alloys (392 papers) and dental adhesives (361 papers). The Brazilian technological development with patented dental materials was smaller than the scientific production. The most patented type of material was dental alloys (11 patents), followed by dental implants (8 patents) and composite resins (7 patents). Conclusions Dental materials science has had a substantial number of records, demonstrating an important presence in scientific and technological development of dentistry. 
In addition, it is important to approximate the relationship between academia and industry to expand the technological development in countries such as Brazil. PMID:27383712
Products with Natural Components to Heal Dermal Burns: A Patent Review.
de Melo Costa, Aida Carla Santana; Pereira Ramos, Karen Perez; Serafini, Mairim Russo; de Carvalho, Fernanda Oliveira; Teixeira, Luciana Garcez Barretto; Garcao, Diogo Costa; Shanmugam, Saravanan; de Souza Araujo, Adriano Antunes; Nunes, Paula Santos
2015-01-01
Burns are a global public health problem, and non-fatal burn injuries are a leading cause of morbidity. The scale of the problem has led researchers to seek to develop new products (both synthetic and natural) for use in the treatment of burn lesions. The aim of this study was to examine all patents in databases between 2010 and 2015 related to natural products for the treatment of burn-related wounds that targeted tissue repair and healing. The search term "burn" and the code A61K36/00 (plant and other natural derivatives used in medicinal preparations) from the international classification of patents were used to identify treatments. The search was performed in the WIPO, ESPACENET and USPTO databases. The highest number of patent applications was found in the WIPO database (617), followed by ESPACENET (23) and USPTO (6). The USA and China were the countries with the most patent applications, and 2008 was the year that had the highest number of applications. Patent applications written in Spanish, English and Portuguese that were published between 2010 and 2015 were selected. A total of 559 patent applications in other languages and 63 that did not result in the creation of new products between 2010 and 2015 were excluded, and the remaining 13 patent applications were selected for full reading of the text. Through this study we were able to identify and summarize the new active natural compounds that can be used in the treatment of burns, both in terms of tissue recovery and analgesia.
Biotechnological Patents Applications of the Deuterium Oxide in Human Health.
da S Mariano, Reysla M; Bila, Wendell C; Trindade, Maria Jaciara F; Lamounier, Joel A; Galdino, Alexsandro S
2017-01-01
Deuterium oxide is a molecule that has been used for decades in several studies related to human health. Currently, studies on D2O have mobilized a "Race for Patenting" worldwide. Several patents have been registered from biomedical and technological studies of D2O showing the potential of this stable isotope in industry and health care ecosystems. Most of the patents related to the applications of deuterium oxide in human health have been summarized in this review. The following patent databases were consulted: European Patent Office (Espacenet), the United States Patent and Trademark Office (USPTO), the United States Latin America Patents (LATIPAT), Patentscope - Search International and National Patent Collections (WIPO), Google Patents and Free Patents Online. This review collected information on recent publications, including 22 patents related to deuterium oxide and its applications in different areas. This review showed that deuterium oxide is a promising component in different areas, including biotechnology, chemistry and medicine. In addition, the knowledge of this compound was covered, reinforcing its importance in the field of biotechnology and human health.
Response to 'pervasive sequence patents cover the entire human genome' - authors' reply.
Rosenfeld, Jeffrey; Mason, Christopher
2014-01-01
An author reply to the Letter to the Editor from Tu et al. regarding Pervasive sequence patents cover the entire human genome by J Rosenfeld and C Mason. Genome Med 2013, 5:27. See related Correspondence by Rosenfeld and Mason, http://genomemedicine.com/content/5/3/27, and related letter by Tu et al., http://genomemedicine.com/content/6/2/14.
Quantifying innovation in surgery.
Hughes-Hallett, Archie; Mayer, Erik K; Marcus, Hani J; Cundy, Thomas P; Pratt, Philip J; Parston, Greg; Vale, Justin A; Darzi, Ara W
2014-08-01
The objectives of this study were to assess the applicability of patents and publications as metrics of surgical technology and innovation; evaluate the historical relationship between patents and publications; and develop a methodology that can be used to determine the rate of innovation growth in any given health care technology. The study of health care innovation represents an emerging academic field, yet it is limited by a lack of valid scientific methods for quantitative analysis. This article explores and cross-validates 2 innovation metrics using surgical technology as an exemplar. Electronic patenting databases and the MEDLINE database were searched between 1980 and 2010 for "surgeon" OR "surgical" OR "surgery." Resulting patent codes were grouped into technology clusters. Growth curves were plotted for these technology clusters to establish the rate and characteristics of growth. The initial search retrieved 52,046 patents and 1,801,075 publications. The top performing technology cluster of the last 30 years was minimally invasive surgery. Robotic surgery, surgical staplers, and image guidance were the most emergent technology clusters. When examining the growth curves for these clusters they were found to follow an S-shaped pattern of growth, with the emergent technologies lying on the exponential phases of their respective growth curves. In addition, publication and patent counts were closely correlated in areas of technology expansion. This article demonstrates the utility of publicly available patent and publication data to quantify innovations within surgical technology and proposes a novel methodology for assessing and forecasting areas of technological innovation.
37 CFR 202.12 - Restored copyrights.
Code of Federal Regulations, 2012 CFR
2012-07-01
... 37 Patents, Trademarks, and Copyrights 1 2012-07-01 2012-07-01 false Restored copyrights. 202.12 Section 202.12 Patents, Trademarks, and Copyrights COPYRIGHT OFFICE, LIBRARY OF CONGRESS COPYRIGHT OFFICE... giving the title of the work, nature of the work (for example: computer program, database, videogame, etc...
37 CFR 202.12 - Restored copyrights.
Code of Federal Regulations, 2013 CFR
2013-07-01
... 37 Patents, Trademarks, and Copyrights 1 2013-07-01 2013-07-01 false Restored copyrights. 202.12 Section 202.12 Patents, Trademarks, and Copyrights COPYRIGHT OFFICE, LIBRARY OF CONGRESS COPYRIGHT OFFICE... giving the title of the work, nature of the work (for example: computer program, database, videogame, etc...
37 CFR 202.12 - Restored copyrights.
Code of Federal Regulations, 2014 CFR
2014-07-01
... 37 Patents, Trademarks, and Copyrights 1 2014-07-01 2014-07-01 false Restored copyrights. 202.12 Section 202.12 Patents, Trademarks, and Copyrights U.S. COPYRIGHT OFFICE, LIBRARY OF CONGRESS COPYRIGHT..., database, videogame, etc.), plus a brief description of the contents or subject matter of the work. (c...
Patents of drugs extracted from Brazilian medicinal plants.
Balbani, Aracy P S; Silva, Dulce H S; Montovani, Jair C
2009-04-01
Plants synthesise a vast repertoire of chemicals with various biological activities. Brazil's enormous botanical diversity facilitates the development of novel ethical drugs for the treatment of diseases in humans. This review presents therapeutic patent applications comprising Brazilian native plants published in the 2003-2008 period in light of legal aspects of patentability of biodiversity and public health concerns. Therapeutic patent applications related to Brazilian medicinal plants available at both the European Patent Office and the Brazilian National Institute of Industrial Property databases were reviewed. Twenty-five patents are presented, most of which concern inflammatory, allergic, parasitic, infectious or digestive diseases, including extracts from Carapa guianensis, Copaifera genus, Cordia verbenacea, Erythrina mulungu, Physalis angulata and other pharmaceutical compositions with antileishmanial, antimalarial or trypanocidal activity. Brazilian research centres and universities are responsible for most of these inventions.
NASA Astrophysics Data System (ADS)
Huang, Zan; Chen, Hsinchun; Chen, Zhi-kai; Roco, Mihail C.
2004-08-01
Nanoscale science and engineering (NSE) have seen rapid growth and expansion in new areas in recent years. This paper provides an international patent analysis using the U.S. Patent and Trademark Office (USPTO) data searched by keywords of the entire text: title, abstract, claims, and specifications. A fraction of these patents fully satisfy the National Nanotechnology Initiative definition of nanotechnology (which requires exploiting specific phenomena and direct manipulation at the nanoscale), while others only make use of NSE tools and methods of investigation. In previous work we proposed an integrated patent analysis and visualization framework of patent content mapping for the NSE field and of knowledge flow pattern identification until 2002. In this paper, the results are updated for 2003, and the new trends are presented.
First Look: TRADEMARKSCAN Database.
ERIC Educational Resources Information Center
Fernald, Anne Conway; Davidson, Alan B.
1984-01-01
Describes database produced by Thomson and Thomson and available on Dialog which contains over 700,000 records representing all active federal trademark registrations and applications for registrations filed in United States Patent and Trademark Office. A typical record, special features, database applications, learning to use TRADEMARKSCAN, and…
Response to ‘pervasive sequence patents cover the entire human genome’ - authors’ reply
2014-01-01
An author reply to the Letter to the Editor from Tu et al. regarding Pervasive sequence patents cover the entire human genome by J Rosenfeld and C Mason. Genome Med 2013, 5:27. See related Correspondence by Rosenfeld and Mason, http://genomemedicine.com/content/5/3/27, and related letter by Tu et al., http://genomemedicine.com/content/6/2/14 PMID:24764495
Trends in genetic patent applications: the commercialization of academic intellectual property
Kers, Jannigje G; Van Burg, Elco; Stoop, Tom; Cornel, Martina C
2014-01-01
We studied trends in genetic patent applications in order to identify the trends in the commercialization of research findings in genetics. To define genetic patent applications, the European version (ECLA) of the International Patent Classification (IPC) codes was used. Genetic patent applications data from the PATSTAT database from 1990 until 2009 were analyzed for time trends and regional distribution. Overall, the number of patent applications has been growing. In 2009, 152 000 patent applications were submitted under the Patent Cooperation Treaty (PCT) and within the EP (European Patent) system of the European Patent Office (EPO). The number of genetic patent applications increased until a peak was reached in the year 2000, with >8000 applications, after which it declined by almost 50%. Continents show different patterns over time, with the global peak in 2000 mainly explained by the USA and Europe, while Asia shows a stable number of >1000 per year. Nine countries together account for 98.9% of the total number of genetic patent applications. In The Netherlands, 26.7% of the genetic patent applications originate from public research institutions. After the year 2000, the number of genetic patent applications dropped significantly. Academic leadership and policy as well as patent regulations seem to have an important role in the trend differences. The ongoing investment in genetic research in the past decade is not reflected by an increase of patent applications. PMID:24448546
Trends in genetic patent applications: the commercialization of academic intellectual property.
Kers, Jannigje G; Van Burg, Elco; Stoop, Tom; Cornel, Martina C
2014-10-01
We studied trends in genetic patent applications in order to identify the trends in the commercialization of research findings in genetics. To define genetic patent applications, the European version (ECLA) of the International Patent Classification (IPC) codes was used. Genetic patent applications data from the PATSTAT database from 1990 until 2009 were analyzed for time trends and regional distribution. Overall, the number of patent applications has been growing. In 2009, 152 000 patent applications were submitted under the Patent Cooperation Treaty (PCT) and within the EP (European Patent) system of the European Patent Office (EPO). The number of genetic patent applications increased until a peak was reached in the year 2000, with >8000 applications, after which it declined by almost 50%. Continents show different patterns over time, with the global peak in 2000 mainly explained by the USA and Europe, while Asia shows a stable number of >1000 per year. Nine countries together account for 98.9% of the total number of genetic patent applications. In The Netherlands, 26.7% of the genetic patent applications originate from public research institutions. After the year 2000, the number of genetic patent applications dropped significantly. Academic leadership and policy as well as patent regulations seem to have an important role in the trend differences. The ongoing investment in genetic research in the past decade is not reflected by an increase of patent applications.
MYRIAD AFTER MYRIAD: THE PROPRIETARY DATA DILEMMA
Conley, John M.; Cook-Deegan, Robert; Lázaro-Muñoz, Gabriel
2014-01-01
Myriad Genetics’ long-time monopoly on BRCA gene testing was significantly narrowed by the Supreme Court’s decision in AMP v. Myriad Genetics, Inc., and will be further narrowed in the next few years as many of its still-valid patents expire. But these developments have not caused the company to acquiesce in competition. Instead, it has launched a litigation offensive against a number of actual and potential competitors, suing them for infringement of numerous unexpired patents that survived the Supreme Court case. A parallel strategy may have even greater long-term significance, however. In announcing expanded operations in Europe, Myriad has emphasized that it will rely less on patents and more on its huge proprietary database of genetic mutations and associated health outcomes—a strategy that could be used in the United States as well. Myriad has built that database over its many years as a patent-based monopolist in the BRCA testing field, and has not shared it with the medical community for more than a decade. Consequently, Myriad has a unique ability to interpret the health significance of patients’ genetic mutations, particularly in the case of rare “variants of unknown significance.” This article reviews the current state of Myriad’s patent portfolio, describes its ongoing litigation offensive, and then analyzes its proprietary database strategy. The article argues that Myriad’s strategy, while legally feasible, undercuts important values and objectives in medical research and health policy. The article identifies several ways in which the research and health care communities might fight back, but acknowledges that it will be a difficult uphill fight. PMID:25544836
MYRIAD AFTER MYRIAD: THE PROPRIETARY DATA DILEMMA.
Conley, John M; Cook-Deegan, Robert; Lázaro-Muñoz, Gabriel
2014-06-01
Myriad Genetics' long-time monopoly on BRCA gene testing was significantly narrowed by the Supreme Court's decision in AMP v. Myriad Genetics, Inc. , and will be further narrowed in the next few years as many of its still-valid patents expire. But these developments have not caused the company to acquiesce in competition. Instead, it has launched a litigation offensive against a number of actual and potential competitors, suing them for infringement of numerous unexpired patents that survived the Supreme Court case. A parallel strategy may have even greater long-term significance, however. In announcing expanded operations in Europe, Myriad has emphasized that it will rely less on patents and more on its huge proprietary database of genetic mutations and associated health outcomes-a strategy that could be used in the United States as well. Myriad has built that database over its many years as a patent-based monopolist in the BRCA testing field, and has not shared it with the medical community for more than a decade. Consequently, Myriad has a unique ability to interpret the health significance of patients' genetic mutations, particularly in the case of rare "variants of unknown significance." This article reviews the current state of Myriad's patent portfolio, describes its ongoing litigation offensive, and then analyzes its proprietary database strategy. The article argues that Myriad's strategy, while legally feasible, undercuts important values and objectives in medical research and health policy. The article identifies several ways in which the research and health care communities might fight back, but acknowledges that it will be a difficult uphill fight.
2013-01-01
Background Development of the commercial genomics sector within the biotechnology industry relied heavily on the scientific commons, public funding, and technology transfer between academic and industrial research. This study tracks financial and intellectual property data on genomics firms from 1990 through 2004, thus following these firms as they emerged in the era of the Human Genome Project and through the 2000 to 2001 market bubble. Methods A database was created based on an early survey of genomics firms, which was expanded using three web-based biotechnology services, scientific journals, and biotechnology trade and technical publications. Financial data for publicly traded firms was collected through the use of four databases specializing in firm financials. Patent searches were conducted using firm names in the US Patent and Trademark Office website search engine and the DNA Patent Database. Results A biotechnology subsector of genomics firms emerged in parallel to the publicly funded Human Genome Project. Trends among top firms show that hiring, capital improvement, and research and development expenditures continued to grow after a 2000 to 2001 bubble. The majority of firms are small businesses with great diversity in type of research and development, products, and services provided. Over half the public firms holding patents have the majority of their intellectual property portfolio in DNA-based patents. Conclusions These data allow estimates of investment, research and development expenditures, and jobs that paralleled the rise of genomics as a sector within biotechnology between 1990 and 2004. PMID:24050173
Enhancing the DNA Patent Database
DOE Office of Scientific and Technical Information (OSTI.GOV)
Walters, LeRoy B.
Final Report on Award No. DE-FG0201ER63171 Principal Investigator: LeRoy B. Walters February 18, 2008 This project successfully completed its goal of surveying and reporting on the DNA patenting and licensing policies at 30 major U.S. academic institutions. The report of survey results was published in the January 2006 issue of Nature Biotechnology under the title “The Licensing of DNA Patents by US Academic Institutions: An Empirical Survey.” Lori Pressman was the lead author on this feature article. A PDF reprint of the article will be submitted to our Program Officer under separate cover. The project team has continued to update the DNA Patent Database on a weekly basis since the conclusion of the project. The database can be accessed at dnapatents.georgetown.edu. This database provides a valuable research tool for academic researchers, policymakers, and citizens. A report entitled Reaping the Benefits of Genomic and Proteomic Research: Intellectual Property Rights, Innovation, and Public Health was published in 2006 by the Committee on Intellectual Property Rights in Genomic and Protein Research and Innovation, Board on Science, Technology, and Economic Policy at the National Academies. The report was edited by Stephen A. Merrill and Anne-Marie Mazza. This report employed and then adapted the methodology developed by our research project and quoted our findings at several points. (The full report can be viewed online at the following URL: http://www.nap.edu/openbook.php?record_id=11487&page=R1). My colleagues and I are grateful for the research support of the ELSI program at the U.S. Department of Energy.
Trends in worldwide nanotechnology patent applications: 1991 to 2008
NASA Astrophysics Data System (ADS)
Dang, Yan; Zhang, Yulei; Fan, Li; Chen, Hsinchun; Roco, Mihail C.
2010-03-01
Nanotechnology patent applications published during 1991-2008 have been examined using the "title-abstract" keyword search on esp@cenet "worldwide" database. The longitudinal evolution of the number of patent applications, their topics, and their respective patent families have been evaluated for 15 national patent offices covering 98% of the total global activity. The patent offices of the United States (USA), People's Republic of China (PRC), Japan, and South Korea have published the largest number of nanotechnology patent applications, and experienced significant but different growth rates after 2000. In most repositories, the largest numbers of nanotechnology patent applications originated from their own countries/regions, indicating a significant "home advantage." The top applicant institutions are from different sectors in different countries (e.g., from industry in the US and Canada patent offices, and from academe or government agencies at the PRC office). As compared to 2000, the year before the establishment of the US National Nanotechnology Initiative (NNI), numerous new invention topics appeared in 2008, in all 15 patent repositories. This is more pronounced in the USA and PRC. Patent families have increased among the 15 patent offices, particularly after 2005. Overlapping patent applications increased from none in 1991 to about 4% in 2000 and to about 27% in 2008. The largest share of equivalent nanotechnology patent applications (1,258) between two repositories was identified between the US and Japan patent offices.
Trends in worldwide nanotechnology patent applications: 1991 to 2008.
Dang, Yan; Zhang, Yulei; Fan, Li; Chen, Hsinchun; Roco, Mihail C
2010-03-01
Nanotechnology patent applications published during 1991-2008 have been examined using the "title-abstract" keyword search on esp@cenet "worldwide" database. The longitudinal evolution of the number of patent applications, their topics, and their respective patent families have been evaluated for 15 national patent offices covering 98% of the total global activity. The patent offices of the United States (USA), People's Republic of China (PRC), Japan, and South Korea have published the largest number of nanotechnology patent applications, and experienced significant but different growth rates after 2000. In most repositories, the largest numbers of nanotechnology patent applications originated from their own countries/regions, indicating a significant "home advantage." The top applicant institutions are from different sectors in different countries (e.g., from industry in the US and Canada patent offices, and from academe or government agencies at the PRC office). As compared to 2000, the year before the establishment of the US National Nanotechnology Initiative (NNI), numerous new invention topics appeared in 2008, in all 15 patent repositories. This is more pronounced in the USA and PRC. Patent families have increased among the 15 patent offices, particularly after 2005. Overlapping patent applications increased from none in 1991 to about 4% in 2000 and to about 27% in 2008. The largest share of equivalent nanotechnology patent applications (1,258) between two repositories was identified between the US and Japan patent offices.
Trends in worldwide nanotechnology patent applications: 1991 to 2008
Zhang, Yulei; Fan, Li; Chen, Hsinchun; Roco, Mihail C.
2009-01-01
Nanotechnology patent applications published during 1991–2008 have been examined using the “title–abstract” keyword search on esp@cenet “worldwide” database. The longitudinal evolution of the number of patent applications, their topics, and their respective patent families have been evaluated for 15 national patent offices covering 98% of the total global activity. The patent offices of the United States (USA), People’s Republic of China (PRC), Japan, and South Korea have published the largest number of nanotechnology patent applications, and experienced significant but different growth rates after 2000. In most repositories, the largest numbers of nanotechnology patent applications originated from their own countries/regions, indicating a significant “home advantage.” The top applicant institutions are from different sectors in different countries (e.g., from industry in the US and Canada patent offices, and from academe or government agencies at the PRC office). As compared to 2000, the year before the establishment of the US National Nanotechnology Initiative (NNI), numerous new invention topics appeared in 2008, in all 15 patent repositories. This is more pronounced in the USA and PRC. Patent families have increased among the 15 patent offices, particularly after 2005. Overlapping patent applications increased from none in 1991 to about 4% in 2000 and to about 27% in 2008. The largest share of equivalent nanotechnology patent applications (1,258) between two repositories was identified between the US and Japan patent offices. PMID:21170123
48 CFR 52.212-4 - Contract Terms and Conditions-Commercial Items.
Code of Federal Regulations, 2014 CFR
2014-10-01
... (OMB) prompt payment regulations at 5 CFR part 1315. (h) Patent indemnity. The Contractor shall... foreign patent, trademark or copyright, arising out of the performance of this contract, provided the... payment of any contract for the accuracy and completeness of the data within the SAM database, and for any...
48 CFR 52.212-4 - Contract Terms and Conditions-Commercial Items.
Code of Federal Regulations, 2013 CFR
2013-10-01
... Office of Management and Budget (OMB) prompt payment regulations at 5 CFR part 1315. (h) Patent indemnity..., any United States or foreign patent, trademark or copyright, arising out of the performance of this... the accuracy and completeness of the data within the SAM database, and for any liability resulting...
48 CFR 52.212-4 - Contract Terms and Conditions-Commercial Items.
Code of Federal Regulations, 2012 CFR
2012-10-01
...) and Office of Management and Budget (OMB) prompt payment regulations at 5 CFR part 1315. (h) Patent... infringe, any United States or foreign patent, trademark or copyright, arising out of the performance of... completeness of the data within the CCR database, and for any liability resulting from the Government's...
ERIC Educational Resources Information Center
Sillince, J. A. A.; Sillince, M.
1993-01-01
Discusses molecular databases and the role that government and private companies play in their administration and development. Highlights include copyright and patent issues relating to public databases and the information contained in them; data quality; data structures and technological questions; the international organization of molecular…
Meirelles, Lyghia Maria Araújo; Raffin, Fernanda Nervo
2017-01-01
There has been a growing trend in recent years for the development of hybrid materials, called composites, based on clay and polymers, whose innovative properties render them attractive for drug release. The objective of this manuscript was to conduct a review of original articles on this topic published over the last decade and of the body of patents related to these carriers. A scientific prospection was carried out spanning the period from 2005 to 2015 on the Web of Science database. The technological prospection encompassed the United States Patent and Trademark Office, the European Patent Office, the World International Patent Office and the National Institute of Industrial Property databases, filtering patents with the code A61K. The survey revealed a rise in the number of publications over the past decade, confirming the potential of these hybrids for use in pharmaceutical technology. Through interaction between polymer and clay, the mechanical and thermal properties of composites are enhanced, promoting stable, controlled drug release in biological media. The clay most cited in the articles was montmorillonite, owing to its high surface area and capacity for ion exchange. The polymeric part is commonly obtained by copolymerization, particularly using acrylate derivatives. The hybrid materials are obtained mainly in particulate form on a nanometric scale, attaining a modified release profile often sensitive to stimuli in the media. A low number of patents related to the topic were found. The World International Patent Office had the highest number of lodged patents, while Japan was the country which published the most patents. A need to broaden the application of this technology to include more therapeutic classes was identified. Moreover, the absence of regulation of nanomaterials might explain the disparity between scientific and technological output.
Vasconcellos, Alexandre Guimarães; Morel, Carlos Medicis
2012-01-01
New tools and approaches are necessary to facilitate public policy planning and foster the management of innovation in countries' public health systems. To this end, an understanding of the integrated way in which the various actors who produce scientific knowledge and inventions in technological areas of interest operate, where they are located and how they relate to one another is of great relevance. Tuberculosis has been chosen as a model for the present study as it is a current challenge for Brazilian research and innovation. Publications about tuberculosis written by Brazilian authors were accessed from international databases, analyzed, processed with text searching tools and networks of coauthors were constructed and visualized. Patent applications about tuberculosis in Brazil were retrieved from the Brazilian National Institute of Industrial Property (INPI) and the European Patent Office databases, through the use of International Patent Classification and keywords and then categorized and analyzed. Brazilian authorship of articles about tuberculosis jumped from 1% in 1995 to 5% in 2010. Article production and patent filings of national origin have been concentrated in public universities and research institutions while the participation of private industry in the filing of Brazilian patents has remained limited. The goals of national patenting efforts have still not been reached, as up to the present none of the applications filed have been granted a patent. The analysis of all this data about TB publishing and patents clearly demonstrates the importance of maintaining the continuity of Brazil's production development policies as well as government support for infrastructure projects to be employed in transforming the potential of research. 
This policy, which already exists for the promotion of new products and processes that, in addition to bringing diverse economic benefits to the country, will also contribute to effective dealing with public health problems affecting Brazil and the World.
Vasconcellos, Alexandre Guimarães; Morel, Carlos Medicis
2012-01-01
Introduction New tools and approaches are necessary to facilitate public policy planning and foster the management of innovation in countries' public health systems. To this end, an understanding of the integrated way in which the various actors who produce scientific knowledge and inventions in technological areas of interest operate, where they are located and how they relate to one another is of great relevance. Tuberculosis has been chosen as a model for the present study as it is a current challenge for Brazilian research and innovation. Methodology Publications about tuberculosis written by Brazilian authors were accessed from international databases, analyzed, processed with text searching tools and networks of coauthors were constructed and visualized. Patent applications about tuberculosis in Brazil were retrieved from the Brazilian National Institute of Industrial Property (INPI) and the European Patent Office databases, through the use of International Patent Classification and keywords and then categorized and analyzed. Results/Conclusions Brazilian authorship of articles about tuberculosis jumped from 1% in 1995 to 5% in 2010. Article production and patent filings of national origin have been concentrated in public universities and research institutions while the participation of private industry in the filing of Brazilian patents has remained limited. The goals of national patenting efforts have still not been reached, as up to the present none of the applications filed have been granted a patent. The analysis of all this data about TB publishing and patents clearly demonstrates the importance of maintaining the continuity of Brazil's production development policies as well as government support for infrastructure projects to be employed in transforming the potential of research. 
This policy, which already exists for the promotion of new products and processes that, in addition to bringing diverse economic benefits to the country, will also contribute to effective dealing with public health problems affecting Brazil and the World. PMID:23056208
Toyoda, Tetsuro
2011-01-01
Synthetic biology requires both engineering efficiency and compliance with safety guidelines and ethics. Focusing on the rational construction of biological systems based on engineering principles, synthetic biology depends on a genome-design platform to explore the combinations of multiple biological components or BIO bricks for quickly producing innovative devices. This chapter explains the differences among various platform models and details a methodology for promoting open innovation within the scope of the statutory exemption of patent laws. The detailed platform adopts a centralized evaluation model (CEM), computer-aided design (CAD) bricks, and a freemium model. It is also important for the platform to support the legal aspects of copyrights as well as patent and safety guidelines because intellectual work including DNA sequences designed rationally by human intelligence is basically copyrightable. An informational platform with high traceability, transparency, auditability, and security is required for copyright proof, safety compliance, and incentive management for open innovation in synthetic biology. GenoCon, which we have organized and explained here, is a competition-styled, open-innovation method involving worldwide participants from scientific, commercial, and educational communities that aims to improve the designs of genomic sequences that confer a desired function on an organism. Using only a Web browser, a participating contributor proposes a design expressed with CAD bricks that generate a relevant DNA sequence, which is then experimentally and intensively evaluated by the GenoCon organizers. The CAD bricks that comprise programs and databases as a Semantic Web are developed, executed, shared, reused, and well stocked on the secure Semantic Web platform called the Scientists' Networking System or SciNetS/SciNeS, based on which a CEM research center for synthetic biology and open innovation should be established. Copyright © 2011 Elsevier Inc. 
A Quantitative Analysis of Undisclosed Conflicts of Interest in Pharmacology Textbooks.
Piper, Brian J; Telku, Hassenet M; Lambert, Drew A
2015-01-01
Disclosure of potential conflicts of interest (CoI) is a standard practice for many biomedical journals but not for educational materials. The goal of this investigation was to determine whether the authors of pharmacology textbooks have undisclosed financial CoIs and to identify author characteristics associated with CoIs. The presence of potential CoIs was evaluated by submitting author names (N = 403; 36.3% female) to a patent database (Google Scholar) as well as a database that reports on the compensation ($USD) received from 15 pharmaceutical companies (ProPublica's Dollars for Docs). All publications (N = 410) of the ten highest compensated authors from 2009 to 2013 and indexed in Pubmed were also examined for disclosure of additional companies that the authors received research support, consulted, or served on speaker's bureaus. A total of 134 patents had been awarded (Maximum = 18/author) to textbook authors. Relative to DiPiro's Pharmacotherapy: A Pathophysiologic Approach, contributors to Goodman and Gilman's Pharmacological Basis of Therapeutics and Katzung's Basic and Clinical Pharmacology were more frequently patent holders (OR = 6.45, P < .0005). Female authors were less likely than males to have > 1 patent (OR = 0.15, P < .0005). A total of $2,411,080 USD (28.3% for speaking, 27.0% for consulting, and 23.9% for research), was received by 53 authors (Range = $299 to $310,000/author). Highly compensated authors were from multiple fields including oncology, psychiatry, neurology, and urology. The maximum number of additional companies, not currently indexed in the Dollars for Docs database, for which an author had potential CoIs was 73. Financial CoIs are common among the authors of pharmacology and pharmacotherapy textbooks. Full transparency of potential CoIs, particularly patents, should become standard procedure for future editions of educational materials in pharmacology.
Database tomography for commercial application
NASA Technical Reports Server (NTRS)
Kostoff, Ronald N.; Eberhart, Henry J.
1994-01-01
Database tomography is a method for extracting themes and their relationships from text. The algorithms employed begin with word frequency and word proximity analysis and build upon these results. When the word 'database' is used, think of medical or police records, patents, journals, or papers, etc. (any text information that can be computer stored). Database tomography features a full-text, user-interactive technique enabling the user to identify areas of interest, establish relationships, and map trends for a deeper understanding of an area of interest. Database tomography concepts and applications have been reported in journals and presented at conferences. One important feature of the database tomography algorithm is that it can be used on a database of any size, and will facilitate the user's ability to understand the volume of content therein. While employing the process to identify research opportunities, it became obvious that this promising technology has potential applications for business, science, engineering, law, and academe. Examples include evaluating marketing trends, strategies, relationships and associations. Also, the database tomography process would be a powerful component in the areas of competitive intelligence, national security intelligence and patent analysis. User interests and involvement cannot be overemphasized.
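The two starting steps named in this abstract, word frequency and word proximity analysis, can be illustrated with a minimal Python sketch. This is not the authors' algorithm: the function names, the tokenization rule, the window size and the sample sentence are all assumptions chosen only for illustration.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens (assumed rule)."""
    return re.findall(r"[a-z]+", text.lower())


def word_frequencies(tokens):
    """Count how often each word occurs in the token list."""
    return Counter(tokens)


def proximity_counts(tokens, theme, window=3):
    """Count words found within `window` tokens of each occurrence of `theme`."""
    neighbours = Counter()
    for i, tok in enumerate(tokens):
        if tok != theme:
            continue
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                neighbours[tokens[j]] += 1
    return neighbours


# Hypothetical sample text, not taken from any real database.
sample = "Patent analysis maps trends. Patent databases support patent analysis of trends."
tokens = tokenize(sample)
freq = word_frequencies(tokens)
near = proximity_counts(tokens, "patent", window=2)
```

Here `freq` ranks candidate theme words by raw count, and `near` lists the words that cluster around a chosen theme word. A real analysis would presumably also filter stop words and work with multi-word phrases, which this sketch omits.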
Recent patents of nanopore DNA sequencing technology: progress and challenges.
Zhou, Jianfeng; Xu, Bingqian
2010-11-01
DNA sequencing techniques have developed rapidly in recent decades, primarily driven by the Human Genome Project. Among the proposed new techniques, the nanopore is considered a suitable candidate for single-molecule DNA sequencing with ultrahigh speed and very low cost. Several fabrication and modification techniques have been developed to produce robust and well-defined nanopore devices. Many efforts have also been made to apply nanopores to analyze the properties of DNA molecules. Compared with traditional sequencing techniques, nanopores have demonstrated distinctive advantages in the main practical issues, such as sample preparation, sequencing speed, cost-effectiveness and read length. Although challenges remain, recent research on improving the capabilities of nanopores has shed light on how to achieve the ultimate goal: sequencing an individual DNA strand at the single-nucleotide level. This patent review briefly highlights recent developments and technological achievements for DNA analysis and sequencing at the single-molecule level, focusing on nanopore-based methods.
Malavera, Alejandra; Vasquez, Alejandra; Fregni, Felipe
2015-01-01
Transcranial direct current stimulation (tDCS) is a neuromodulatory technique that has been extensively studied. While there have been initial positive results in some clinical trials, there is still variability in tDCS results. The aim of this article is to review and discuss patents assessing novel methods to optimize the use of tDCS. A systematic review was performed using the Google Patents database with tDCS as the main technique, restricted to patents with filing dates between 2010 and 2015. Twenty-two patents met our inclusion criteria. These patents attempt to address current tDCS limitations. Only a few of them have been investigated in clinical trials (e.g., high-definition tDCS), and most have not yet been tested in human trials. Further clinical testing is required to assess which patents are more likely to optimize the effects of tDCS. We discuss the potential optimization of tDCS based on these patents and the current experience with standard tDCS.
Patenting of nanopharmaceuticals in drug delivery: no small issue.
du Toit, Lisa Claire; Pillay, Viness; Choonara, Yahya E; Pillay, Samantha; Harilall, Sheri-lee
2007-01-01
Nanotechnology is a rapidly evolving interdisciplinary field based on the manipulation of matter on a submicron scale, encompassing matter between 1 and 100 nanometers (nm). Thirty-five countries are involved in the global distribution of currently registered nanotechnology patents. Close to 3000 patents containing the term 'nano' have been issued in the USA since 1996, a considerable number of which have applications in nanomedicine. The large majority of therapeutic patents are focused on drug delivery systems, highlighting an important application globally. Nanopharmaceutical patents are centered mainly on non-communicable diseases, with cancer receiving the greatest focus, followed by hepatitis. Drug delivery systems employing nanotechnology can allow superior drug absorption, controlled drug release and reduced side-effects, enhancing the effectiveness of existing drug delivery systems. Nanoparticle-based drug delivery systems may be among the first types of products to generate serious nanotechnology patent disputes as the multi-billion dollar pharmaceutical industry begins to adopt them. This review article aimed to locate patented nanopharmaceuticals in drug delivery online, employing pertinent key terms while searching the patent databases. Awarded and pending patents from the past 20 years pertaining to nanopharmaceutical or nano-enabled systems such as micelles, nanoemulsions, nanogels, liposomes, nanofibres, dendrimer technology and polymer therapeutics are presented, providing an overview of the diversity of the patent applications.
The History of Patenting Genetic Material.
Sherkow, Jacob S; Greely, Henry T
2015-01-01
The US Supreme Court's recent decision in Association for Molecular Pathology v. Myriad Genetics, Inc. declared, for the first time, that isolated human genes cannot be patented. Many have wondered how genes were ever the subjects of patents. The answer lies in a nuanced understanding of both legal and scientific history. Since the early twentieth century, "products of nature" were not eligible to be patented unless they were "isolated and purified" from their surrounding environment. As molecular biology advanced, and the capability to isolate genes both physically and by sequence came to fruition, researchers (and patent offices) began to apply patent-law logic to genes themselves. These patents, along with other biological patents, generated substantial social and political criticism. Myriad Genetics, a company with patents on BRCA1 and BRCA2, two genes critical to assessing early-onset breast and ovarian cancer risk, and with a particularly controversial business approach, became the antagonist in an ultimately successful campaign to overturn gene patents in court. Despite Myriad's defeat, some questions concerning the rights to monopolize genetic information remain. The history leading to that defeat may be relevant to these future issues.
R & D on carbon nanostructures in Russia: scientometric analysis, 1990-2011
NASA Astrophysics Data System (ADS)
Terekhov, Alexander I.
2015-02-01
The analysis, based on scientific publications and patents, was conducted to form an understanding of the overall scientific and technology landscape in the field of carbon nanostructures and determine Russia's place in it. The scientific publications came from the Science Citation Index Expanded database (DB SCIE); the patent information was extracted from databases of the United States Patent and Trademark Office (USPTO), the World Intellectual Property Organization (WIPO), and the Russian Federal Service for Intellectual Property (Rospatent). We also used data about research projects, obtained via information systems of the U.S. National Science Foundation (NSF) and the Russian Foundation for Basic Research (RFBR). Bibliometric methods are used to rank the countries, institutions, and scientists contributing to carbon nanostructures research. We analyze the current state and trends of the research in Russia as compared to other countries, and the contribution and impact of its institutions, especially research of the "highest quality." Considerable focus is on research collaboration and its relationship with citation impact. Patent datasets are used to determine the composition of participants in innovative processes and the international patent activity of Russian inventors in the field, and to identify the most active representatives of small and medium business and some technological developments ripe for commercialization. The article contains a critical analysis of the findings, including a policy discussion of the country's scientific authorities.
Chandrasekharan, Subhashini; McGuire, Amy L.; Van den Veyver, Ignatia B.
2015-01-01
Thousands of patents have been awarded that claim human gene sequences and their uses, and some have been challenged in court. In a recent high-profile case, Association for Molecular Pathology, et al. vs. Myriad Genetics, Inc., et al., the United States Supreme Court ruled that genes are naturally occurring substances and therefore not patentable through “composition of matter” claims. The consequences of this ruling will extend well beyond ending Myriad's monopoly over BRCA testing, and may affect similar monopolies of other commercial laboratories for tests involving other genes. It could also simplify intellectual property issues surrounding genome-wide clinical sequencing, which can generate results for genes covered by intellectual property. Non-invasive prenatal testing (NIPT) for common aneuploidies using cell-free fetal (cff) DNA in maternal blood is currently offered through commercial laboratories and is also the subject of ongoing patent litigation. The recent Supreme Court decision in the Myriad case has already been invoked by a lower district court in NIPT litigation and resulted in invalidation of primary claims in a patent on currently marketed cffDNA-based testing for chromosomal aneuploidies. PMID:24989832
Reis, José Maciel Caldas Dos; Pinheiro, Maurício Fortuna; Oti, André Takashi; Feitosa-Junior, Denilson José Silva; Pantoja, Mauro de Souza; Barros, Rui Sérgio Monteiro
2016-01-01
Food is a key factor in both the prevention of disease and the promotion of human health. Probiotics and prebiotics stand out among functional foods. Patent databases are the main source of technological information about innovation worldwide, providing an extensive library for the research sector. The aim was to map the main patent databases for prebiotics and probiotics, seeking relevant information on the use of biotechnology, nanotechnology and genetic engineering in the production of these foods. An electronic (online) search was conducted in the main public patent databases of Brazil (INPI), the United States (USPTO) and Europe (EPO). The search covered the period from January 2014 to July 2015 and used the following descriptors in the title and abstract fields of patents: in INPI, "prebiótico", "prebióticos", "probiótico", "probióticos"; in the USPTO and EPO, "prebiotic", "prebiotics", "probiotic", "probiotics". No deposits by residents (companies or universities) were found at INPI in this period; the USPTO had registered 60 patent titles and the EPO showed 10 documents on the subject. The technological information offered by genetic engineering, biotechnology and nanotechnology, deposited in the form of patent titles and abstracts and relating to early nutritional intervention with functional foods, has increasingly aimed to reduce risks and control the progression of health problems. However, the existing abstracts, although attractive and promising in this respect, are still too preliminary to recommend these products safely as therapeutic tools. They should therefore be seen more as components of a healthy diet and lifestyle.
ERIC Educational Resources Information Center
Bharti, Neelam; Leonard, Michelle; Singh, Shailendra
2016-01-01
Online chemical databases are the largest source of chemical information and, therefore, the main resource for retrieving results from published journals, books, patents, conference abstracts, and other relevant sources. Various commercial, as well as free, chemical databases are available. SciFinder, Reaxys, and Web of Science are three major…
Hill, Andrew; Hill, Teresa; Jose, Sophie; Pozniak, Anton
2014-01-01
In other disease areas, generic drugs are normally used after patent expiry. Patents on zidovudine, lamivudine, nevirapine and efavirenz have already expired. Patents will expire for abacavir in late 2014, lopinavir/r in 2016, and tenofovir, darunavir and atazanavir in 2017. However, patents on single-tablet regimens do not expire until after 2026. The number of people taking each antiretroviral in the UK was estimated from 23,655 individuals in the UK CHIC cohort (2012 database). Costs of patented drugs were taken from the British National Formulary database, assuming a 30% discount. Costs of generic antiretrovirals were estimated using an 80% discount from patented prices, or actual costs where available. Two options were analysed: (1) all patients use single-tablet regimens and patented versions of drugs, and prices remain stable over time; (2) all people switch from patented to generic drugs when available, after patent expiry (dates shown above). There were an estimated 67,000 people taking antiretrovirals in the UK in 2014, estimated to rise by 8% per year until 2018 (in line with previous rises). The most widely used antiretrovirals in the CHIC cohort were tenofovir (TDF) (75%), emtricitabine (FTC) (69%), efavirenz (EFV) (39%), lamivudine (3TC) (23%), abacavir (ABC) (18%), darunavir (DRV) (21%) and atazanavir (ATV) (16%). The predicted annual UK cost of generic ABC/3TC/EFV (three generic tablets once daily) was £1018 per person-year. Costs of patented single-tablet regimens ranged from £5000 to £7500 per person-year. Assuming continued use of patented antiretrovirals in the UK, the total national costs of antiretroviral treatment were predicted to rise from £425 million in 2014 to £459 m in 2015, £495 m in 2016, £536 m in 2017 and £578 m in 2018. With a 100% switch to generics, total predicted costs were £337 m in 2014, £364 m in 2015, £382 m in 2016, £144 m in 2017 and £169 m in 2018. The total predicted saving over five years from a switch to generics was £1.1 billion. Systematic switching from patented to generic antiretrovirals could potentially save approximately £1.1 billion in the UK over the next five years, compared with continued use of patented versions: this money could be spent on urgently needed HIV prevention programmes. Similar savings are feasible for other European countries, given parallel patent expiry dates. More detailed economic evaluation is required to show when patented single-tablet regimens provide value for money, compared to bioequivalent generic versions of 3-4 pills once daily.
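The £1.1 billion figure can be checked directly from the annual totals quoted in the abstract. The short Python sketch below only re-adds those published numbers; the variable and function names are illustrative, and it does not reproduce the authors' underlying cost model.

```python
# Annual national cost totals in £ million, as quoted in the abstract.
patented = {2014: 425, 2015: 459, 2016: 495, 2017: 536, 2018: 578}
generic = {2014: 337, 2015: 364, 2016: 382, 2017: 144, 2018: 169}


def five_year_saving(patented_costs, generic_costs):
    """Sum the year-by-year difference between the two scenarios."""
    return sum(patented_costs[year] - generic_costs[year] for year in patented_costs)


saving_m = five_year_saving(patented, generic)
print(f"Five-year saving: £{saving_m} million (about £{saving_m / 1000:.1f} billion)")
```

The yearly differences sum to £1097 million, which rounds to the £1.1 billion reported. Most of the saving falls in 2017 and 2018, the years after the tenofovir, darunavir and atazanavir patents are stated to expire.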
Unveiling the geography of historical patents in the United States from 1836 to 1975
Petralia, Sergio; Balland, Pierre-Alexandre; Rigby, David L.
2016-01-01
It is clear that technology is a key driver of economic growth. Much less clear is where new technologies are produced and how the geography of U.S. invention has changed over the last two hundred years. Patent data report the geography, history, and technological characteristics of invention. However, those data have only recently become available in digital form, and at present there exists no comprehensive dataset on the geography of knowledge production in the United States prior to 1975. The database presented in this paper unveils the geography of historical patents granted by the United States Patent and Trademark Office (USPTO) from 1836 to 1975. This historical dataset, HistPat, is constructed using digitized records of original patent documents that are publicly available. We describe a methodological procedure that allows recovery of geographical information on patents from the digital records. HistPat can be used in disciplines including geography, economics, history, network science, and science and technology studies. Additionally, it is easily merged with post-1975 USPTO digital patent data to extend it to the present day. PMID:27576103
NASA Astrophysics Data System (ADS)
Sorce, Salvatore; Malizia, Alessio; Jiang, Pingfei; Atherton, Mark; Harrison, David
2018-04-01
One of the most time-consuming and costly tasks in the design of industrial devices and parts is checking for possible patent infringements. The great number of documents to be mined and the wide variety of technical language used to describe inventions are reasons why considerable amounts of time may be needed. On the other hand, the early detection of a possible patent conflict, in addition to reducing the risk of legal disputes, could stimulate a designer's creativity to overcome similarities in overlapping patents. For this reason, there are many existing patent analysis systems, each with its own features and access modes. We have designed a visual interface providing intuitive access to such systems, freeing designers from the need for specific knowledge of query languages and providing them with visual clues. We tested the interface on a framework aimed at representing mechanical engineering patents; the framework is based on a semantic database and provides patent conflict analysis for early-stage designs. The interface supports visual query composition to obtain a list of potentially overlapping designs.
Zhao, Fei-Ya; Tao, Ai-En; Xia, Cong-Long
2018-01-01
Paris is a commonly used traditional Chinese medicine (TCM) with antitumor, antibacterial, sedative, analgesic and hemostatic effects. It is an ingredient of 81 Chinese patent medicines, with wide application and large market demand. Based on data retrieved from the State Intellectual Property Office patent database, a comprehensive analysis of Paris patents was made to explore their current features in terms of domestic patent output, development trend, technology field distribution, time dimension, technology growth rate and patent applicant, and to reveal the development trend of China's Paris industry. In addition, based on current Paris resource application and development, a sustainable, multi-channel and multi-level industrial development approach was built. According to the results, studies of Paris in China are in a period of rapid development, with a good development trend. However, because wild Paris resources are approaching exhaustion, studies of artificial cultivation technology should be strengthened to promote industrial development. Copyright © by the Chinese Pharmaceutical Association.
Pharmaceutical patent applications in freeze-drying.
Ekenlebie, Edmond; Einfalt, Tomaž; Karytinos, Arianna Irò; Ingham, Andrew
2016-09-01
Injectable products are often the formulation of choice for new therapeutics; however, formulation in liquids often enhances degradation through hydrolysis. Thus, freeze-drying (lyophilization) is regularly used in pharmaceutical manufacture to reduce water activity. Here we examine its contribution to the 'state of the art' and look at its future potential uses. A comprehensive search of patent databases was conducted to characterize the international patent landscape and trends in the use of freeze-drying. A total of 914 disclosures related to freeze-drying, lyophilization or drying of solid systems at pressures and temperatures equivalent to those of freeze-drying were considered over the period 1992-2014. Current applications of sublimation technology were contrasted across two periods: those with patents due to expire (1992-1993) and those currently filed. The number of freeze-drying technology patents has stabilized after initial activity across the biotechnology sector in 2011 and 2012. Against an increasing trend in patent submissions overall, freeze-drying submissions have slowed since 2002, which is indicative of a level of maturity.
Bai, Lin; Ren, Yulan; Guo, Taipin; Chen, Lin; Zhou, Yumei; Feng, Shuwei; Li, Ji; Liang, Fanrong
2016-11-12
A bibliometric analysis was performed on the patent literature regarding acupuncture diagnosis and treatment devices in China, with the aim of providing references for the development of such devices. The patent literature was collected from SooPAT, a patent database. Bibliometric methods were used to analyze the annual distribution of the type, quantity, classification and content of acupuncture diagnosis and treatment devices. The number of acupuncture diagnosis and treatment device patents in China peaked in 2012 and 2013. Among invention patents and utility model patents, class A61N was the most common, relating mainly to electrotherapy, magnetic therapy, radioactive therapy, ultrasound therapy, etc. The main content was acupuncture treatment devices and meridian treatment devices. Among design patents, class 24-01 was the most common, covering fixation devices used by doctors, hospitals and laboratories, etc. Currently, the majority of acupuncture diagnosis and treatment devices are therapeutic apparatus, while acupuncture diagnosis devices are still needed.
Patent information - towards simplicity or complexity?
NASA Astrophysics Data System (ADS)
Shenton, Kathleen; Norton, Peter; Onodera, Natsuo (translator)
Since the advent of online services, the ability to search and find chemical patent information has improved immeasurably. Recently, the integration of a multitude of files (through file merging as well as cross-file/simultaneous searches), 'intelligent' interfaces and optical technology for large amounts of data seem to have brought greater simplicity and convenience to the retrieval of patent information. In spite of this progress, a more fundamental problem is increasing complexity: the tendency to expand indefinitely the range of claims for chemical substances through an ultra-generic description of structure (overuse of optional substituents, variable divalent groups, repeating groups, etc.) and long lists of prophetic examples. This tendency not only worries producers and searchers of patent databases but may also obstruct truly worthy inventions in the future.
NASA Astrophysics Data System (ADS)
Wang, Gangbo; Guan, Jiancheng
2011-12-01
This article contributes to the growing body of work on the interactions between science and technology with evidence from China in the field of nanotechnology, based on the database of the United States Patent and Trademark Office. The analysis focuses on the period 1991-2008, a period of rapid growth in nanotechnology. Using the non-patent references cited by patents, we first investigate the science-technology connections in the context of Chinese nanotechnology, especially in institutional sectors and application fields. Patents produced by academic researchers and directed towards basic scientific knowledge generally cite more scientific references, with a higher proportion of self-citations. Interestingly, patents resulting from collaborations between public organizations and corporations seldom contain scientific references. By matching publication and patent data, we establish the author-inventor links in this emerging field. Author-inventors, who are active in both publishing and patenting, are among the most prolific and highly cited researchers. Finally, we employ social network analysis to explore the characteristics of the scientific and technological networks generated by co-authorship and co-invention data, and to investigate the position and role of patenting-publishing scientists in these research networks.
Classifying patents based on their semantic content.
Bergeaud, Antonin; Potiron, Yoann; Raimbault, Juste
2017-01-01
In this paper, we extend some usual techniques of classification resulting from a large-scale data-mining and network approach. This new technology, which in particular is designed to be suitable to big data, is used to construct an open consolidated database from raw data on 4 million patents taken from the US patent office from 1976 onward. To build the pattern network, not only do we look at each patent title, but we also examine their full abstract and extract the relevant keywords accordingly. We refer to this classification as semantic approach in contrast with the more common technological approach which consists in taking the topology when considering US Patent office technological classes. Moreover, we document that both approaches have highly different topological measures and strong statistical evidence that they feature a different model. This suggests that our method is a useful tool to extract endogenous information. PMID:28445550
ERIC Educational Resources Information Center
Provasi, Giancarlo; Squazzoni, Flaminio; Tosio, Beatrice
2012-01-01
This paper looks at eight comparative case-studies on academic entrepreneurs in life sciences conducted in Europe in 2008. The interviewees were selected from the KEINS database that lists all academic inventors from Italy, France, Sweden and the Netherlands who have one or more patent applications registered at the European Patent Office,…
Simon, James H. M.; Claassen, Eric; Correa, Carmen E.; Osterhaus, Albert D. M. E.
2005-01-01
Patent applications that incorporate the genomic sequence of the severe acute respiratory syndrome (SARS) coronavirus have been filed by a number of organizations. This is likely to result in a fragmentation of intellectual property (IP) rights, which in turn may adversely affect the development of products, such as vaccines, to combat SARS. Placing these patent rights into a patent pool to be licensed on a non-exclusive basis may circumvent these difficulties and set a key precedent for the use of this form of mechanism in other areas of health care, leading to benefits to public health. PMID:16211163
TECHNOLOGICAL INNOVATION IN NEUROSURGERY: A QUANTITATIVE STUDY
Marcus, Hani J; Hughes-Hallett, Archie; Kwasnicki, Richard M; Darzi, Ara; Yang, Guang-Zhong; Nandi, Dipankar
2015-01-01
Object: Technological innovation within healthcare may be defined as the introduction of a new technology that initiates a change in clinical practice. Neurosurgery is a particularly technologically intensive surgical discipline, and new technologies have preceded many of the major advances in operative neurosurgical technique. The aim of the present study was to quantitatively evaluate technological innovation in neurosurgery using patents and peer-reviewed publications as metrics of technology development and clinical translation respectively. Methods: A patent database was searched between 1960 and 2010 using the search terms “neurosurgeon” OR “neurosurgical” OR “neurosurgery”. The top 50 performing patent codes were then grouped into technology clusters. Patent and publication growth curves were then generated for these technology clusters. A top performing technology cluster was then selected as an exemplar for more detailed analysis of individual patents. Results: In all, 11,672 patents and 208,203 publications relating to neurosurgery were identified. The top performing technology clusters over the 50 years were: image guidance devices, clinical neurophysiology devices, neuromodulation devices, operating microscopes and endoscopes. Image guidance and neuromodulation devices demonstrated a highly correlated rapid rise in patents and publications, suggesting they are areas of technology expansion. In-depth analysis of neuromodulation patents revealed that the majority of high performing patents were related to Deep Brain Stimulation (DBS). Conclusions: Patent and publication data may be used to quantitatively evaluate technological innovation in neurosurgery. PMID:25699414
Extracellular vesicles: the growth as diagnostics and therapeutics; a survey
Roy, Sabrina; Hochberg, Fred H.; Jones, Pamela S.
2018-01-01
This article aims to document the growth in extracellular vesicle (EV) research. Here, we report the growth in EV-related studies, patents, and grants as well as emerging companies with a major focus on exosomes. Four different databases were utilized for electronic searches of published literature: two general databases – Scopus/Elsevier and Web of Science (WoS), as well as two specialized US government databases – the US Patent and Trademark Office and the National Institutes of Health (NIH) of the Department of Health and Human Services. The applied combination of key words was carefully chosen to cover the most commonly used terms in titles of publications, patents and grants dealing with conceptual areas of EVs. Within the time frame from 1 January 2000 to 31 December 2016, limited to articles published in English, we identified output using search strategies based upon Scopus/Elsevier and WoS, patent filings and NIH Federal Reports of funded grants. Consistently, USA and UK universities are the most frequent among the top 15 affiliations/organizations of the authors of the identified records. There is clear evidence of an upward trend in EV-related publications. By documenting the growth of the EV field, we hope to encourage a roster of independent authorities skilled to provide peer review of manuscripts, evaluation of grant applications, support of foundation initiatives and corporate long-term planning. It is important to encourage EV research to further identify biomarkers in diseases and allow for the development of adequate diagnostic tools that could distinguish disease subpopulations and enable personalized treatment of patients. PMID:29511461
A New Methodology for Systematic Exploitation of Technology Databases.
ERIC Educational Resources Information Center
Bedecarrax, Chantal; Huot, Charles
1994-01-01
Presents the theoretical aspects of a data analysis methodology that can help transform sequential raw data from a database into useful information, using the statistical analysis of patents as an example. Topics discussed include relational analysis and a technology watch approach. (Contains 17 references.) (LRW)
Yamabhai, Inthira; Smith, Richard D
2012-08-01
Although it has been two decades since the Thai Patent Act was amended to comply with the Agreement on Trade-Related Aspects of Intellectual Property Rights (TRIPS), there has been little emphasis given to assessing the implications of this amendment. The purpose of this review is to summarize the health and economic impact of patent protection, with a focus on the experience of Thailand. A review of national and international empirical evidence on the health and economic implications of patents from 1980 to 2009 was undertaken. The findings illustrate the role of patent protection in four areas: price, present access, future access, and international trade and investment. Forty-three empirical studies were found, three of which were from Thai databases. Patenting does increase price, although the size of effect differs according to the methodology and country. Although weakening patent rights could increase present access, evidence suggests that strengthening patenting may benefit future access; although this is based on complex assumptions and estimations. Moreover, while patent protection appears to have a positive impact on trade flow, the implication for foreign direct investment (FDI) is equivocal. Empirical studies in Thailand, and other similar countries, are rare, compromising the robustness and generalizability of conclusions. However, evidence does suggest that patenting presents a significant inter-temporal challenge in balancing aspects of current versus future access to technologies. This underlines the urgent need to prioritize health research resources to assess the wider implications of patent protection.
The emergence of factor Xa inhibitors for the treatment of cardiovascular diseases: a patent review.
Pinto, Donald J P; Qiao, Jennifer X; Knabb, Robert M
2012-06-01
Factor Xa (FXa) is a critical enzyme in the coagulation cascade responsible for thrombin generation, the final enzyme that leads to fibrin clot formation. Significant success has recently been reported with compounds such as rivaroxaban, apixaban and edoxaban in the treatment and prevention of venous thromboembolism (VTE) and more recently in the prevention of stroke in atrial fibrillation (AF). The success these agents have demonstrated is now being reflected by a narrowing of new FXa patents over the past few years. The new patents appear to be structural modifications of previously published, small molecule inhibitors and bind in a similar manner to the FXa enzyme. SciFinder®, PubMed and Google websites were used as the main source of literature retrieval. Patent searches were conducted in the patent databases: HCAPlus, WPIX and the full text databases (USPAT2, USPATFULL, EPFULL, PCTFULL) using the following keywords: ((FXa) OR (F OR factor) (W) (Xa)) (S) (inhibit? or block? or modulat? or antagonist? or regulat?). The search was restricted to patent documents with the entry date on or after 1 January 2009. Literature and information related to clinical development was retrieved from Thomson Reuters Pharma. A large body of Phase II and Phase III data is now available for FXa inhibitors such as rivaroxaban, apixaban, edoxaban and betrixaban. The clinical data demonstrate favorable benefit-risk profiles compared with the standards of care for short- and long-term anticoagulation (i.e., low molecular weight heparins (LMWHs) and warfarin). The potential exists that these agents will eventually be the agents of choice for the treatment of a host of cardiovascular disease states, offering improved efficacy, safety, and ease of use compared with existing anticoagulants.
Undisclosed conflicts of interest among biomedical textbook authors.
Piper, Brian J; Lambert, Drew A; Keefe, Ryan C; Smukler, Phoebe U; Selemon, Nicolas A; Duperry, Zachary R
2018-02-05
Textbooks are a formative resource for health care providers during their education and are also an enduring reference for pathophysiology and treatment. Unlike the primary literature and clinical guidelines, biomedical textbook authors do not typically disclose potential financial conflicts of interest (pCoIs). The objective of this study was to evaluate whether the authors of textbooks used in the training of physicians, pharmacists, and dentists had appreciable undisclosed pCoIs in the form of patents or compensation received from pharmaceutical or biotechnology companies. The most recent editions of six medical textbooks, Harrison's Principles of Internal Medicine (Har PIM), Katzung and Trevor's Basic and Clinical Pharmacology (Kat BCP), the American Osteopathic Association's Foundations of Osteopathic Medicine (AOA FOM), Remington: The Science and Practice of Pharmacy (Rem SPP), Koda-Kimble and Young's Applied Therapeutics (KKY AT), and Yagiela's Pharmacology and Therapeutics for Dentistry (Yag PTD), were selected after consulting biomedical educators for evaluation. Author names (N = 1,152, 29.2% female) were submitted to databases to examine patents (Google Scholar) and compensation (ProPublica's Dollars for Docs [PDD]). Authors were listed as inventors on 677 patents (maximum/author = 23), with three-quarters (74.9%) to Har PIM authors. Females were significantly underrepresented among patent holders. The PDD 2009-2013 database revealed receipt of US$13.2 million, the majority (83.9%) to Har PIM. The maximum compensation per author was $869,413. The PDD 2014 database identified receipt of $6.8 million, with 50.4% of eligible authors receiving compensation. The maximum compensation received by a single author was $560,021. Cardiovascular authors were most likely to have a PDD entry and neurologic disorders authors were least likely. An appreciable subset of biomedical authors have patents and have received remuneration from medical product companies and this information is not disclosed to readers. These findings indicate that full transparency of financial pCoI should become a standard practice among the authors of biomedical educational materials.
Prins, Theo W; Scholtens, Ingrid M J; Bak, Arno W; van Dijk, Jeroen P; Voorhuijzen, Marleen M; Laurensse, Emile J; Kok, Esther J
2016-12-15
During routine monitoring for GMOs in food in the Netherlands, papaya-containing food supplements were found positive for the genetically modified (GM) elements P-35S and T-nos. The goal of this study was to identify the unknown and EU unauthorised GM papaya event(s). A screening strategy was applied using additional GM screening elements including a newly developed PRSV coat protein PCR. The detected PRSV coat protein PCR product was sequenced and the nucleotide sequence showed identity to PRSV YK strains indigenous to China and Taiwan. The GM events 16-0-1 and 18-2-4 could be identified by amplifying and sequencing event-specific sequences. Further analyses showed that both papaya event 16-0-1 and event 18-2-4 were transformed with the same construct. For use in routine analysis, derived TaqMan qPCR methods for events 16-0-1 and 18-2-4 were developed. Event 16-0-1 was detected in all samples tested whereas event 18-2-4 was detected in one sample. This study presents a strategy for combining information from different sources (literature, patent databases) and novel sequence data to identify unknown GM papaya events. Copyright © 2016 Elsevier Ltd. All rights reserved.
Gene Patents and Personalized Cancer Care: Impact of the Myriad Case on Clinical Oncology
Offit, Kenneth; Bradbury, Angela; Storm, Courtney; Merz, Jon F.; Noonan, Kevin E.; Spence, Rebecca
2013-01-01
Genomic discoveries have transformed the practice of oncology and cancer prevention. Diagnostic and therapeutic advances based on cancer genomics developed during a time when it was possible to patent genes. A case before the Supreme Court, Association for Molecular Pathology v Myriad Genetics, Inc seeks to overturn patents on isolated genes. Although the outcomes are uncertain, it is suggested here that the Supreme Court decision will have few immediate effects on oncology practice or research but may have more significant long-term impact. The Federal Circuit court has already rejected Myriad's broad diagnostic methods claims, and this is not affected by the Supreme Court decision. Isolated DNA patents were already becoming obsolete on scientific grounds, in an era when human DNA sequence is public knowledge and because modern methods of next-generation sequencing need not involve isolated DNA. The Association for Molecular Pathology v Myriad Supreme Court decision will have limited impact on new drug development, as new drug patents usually involve cellular methods. A nuanced Supreme Court decision acknowledging the scientific distinction between synthetic cDNA and genomic DNA will further mitigate any adverse impact. A Supreme Court decision to include or exclude all types of DNA from patent eligibility could impact future incentives for genomic discovery as well as the future delivery of medical care. Whatever the outcome of this important case, it is important that judicial and legislative actions in this area maximize genomic discovery while also ensuring patients' access to personalized cancer care. PMID:23766521
Automated Patent Categorization and Guided Patent Search using IPC as Inspired by MeSH and PubMed.
Eisinger, Daniel; Tsatsaronis, George; Bundschus, Markus; Wieneke, Ulrich; Schroeder, Michael
2013-04-15
Document search on PubMed, the pre-eminent database for biomedical literature, relies on the annotation of its documents with relevant terms from the Medical Subject Headings ontology (MeSH) for improving recall through query expansion. Patent documents are another important information source, though they are considerably less accessible. One option to expand patent search beyond pure keywords is the inclusion of classification information: Since every patent is assigned at least one class code, it should be possible for these assignments to be automatically used in a similar way as the MeSH annotations in PubMed. In order to develop a system for this task, it is necessary to have a good understanding of the properties of both classification systems. This report describes our comparative analysis of MeSH and the main patent classification system, the International Patent Classification (IPC). We investigate the hierarchical structures as well as the properties of the terms/classes respectively, and we compare the assignment of IPC codes to patents with the annotation of PubMed documents with MeSH terms. Our analysis shows a strong structural similarity of the hierarchies, but significant differences of terms and annotations. The low number of IPC class assignments and the lack of occurrences of class labels in patent texts imply that current patent search is severely limited. To overcome these limits, we evaluate a method for the automated assignment of additional classes to patent documents, and we propose a system for guided patent search based on the use of class co-occurrence information and external resources.
Automated Patent Categorization and Guided Patent Search using IPC as Inspired by MeSH and PubMed
2013-01-01
Document search on PubMed, the pre-eminent database for biomedical literature, relies on the annotation of its documents with relevant terms from the Medical Subject Headings ontology (MeSH) for improving recall through query expansion. Patent documents are another important information source, though they are considerably less accessible. One option to expand patent search beyond pure keywords is the inclusion of classification information: Since every patent is assigned at least one class code, it should be possible for these assignments to be automatically used in a similar way as the MeSH annotations in PubMed. In order to develop a system for this task, it is necessary to have a good understanding of the properties of both classification systems. This report describes our comparative analysis of MeSH and the main patent classification system, the International Patent Classification (IPC). We investigate the hierarchical structures as well as the properties of the terms/classes respectively, and we compare the assignment of IPC codes to patents with the annotation of PubMed documents with MeSH terms. Our analysis shows a strong structural similarity of the hierarchies, but significant differences of terms and annotations. The low number of IPC class assignments and the lack of occurrences of class labels in patent texts imply that current patent search is severely limited. To overcome these limits, we evaluate a method for the automated assignment of additional classes to patent documents, and we propose a system for guided patent search based on the use of class co-occurrence information and external resources. PMID:23734562
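The idea of guiding a patent search with class co-occurrence information can be illustrated with a minimal sketch. This is not the authors' system: the function names, the toy class assignments and the ranking rule (raw co-occurrence count) are all assumptions made for illustration. It counts how often two class codes are assigned to the same patent and then suggests the classes that most often accompany a query class.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_counts(patent_classes):
    """Count how often each unordered pair of class codes shares a patent."""
    counts = Counter()
    for codes in patent_classes:
        # Deduplicate and sort so each pair is stored once as (smaller, larger).
        for a, b in combinations(sorted(set(codes)), 2):
            counts[(a, b)] += 1
    return counts


def suggest_classes(query_code, counts, top_n=3):
    """Return up to top_n codes that co-occur with query_code, most frequent first."""
    related = Counter()
    for (a, b), n in counts.items():
        if a == query_code:
            related[b] += n
        elif b == query_code:
            related[a] += n
    # Sort by descending count, then alphabetically so ties are deterministic.
    ranked = sorted(related.items(), key=lambda kv: (-kv[1], kv[0]))
    return [code for code, _ in ranked[:top_n]]


# Hypothetical assignments of IPC subclass codes to four patents.
patents = [
    ["C12N", "A61K"],
    ["C12N", "A61K", "C07K"],
    ["C12N", "C07K"],
    ["A61K", "A61P"],
]
counts = cooccurrence_counts(patents)
print(suggest_classes("C12N", counts))
```

A real system would likely normalise these counts (for example by class frequency) so that very common classes do not dominate every suggestion list; the raw count is used here only to keep the sketch short.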
Benza, Raymond L; Farber, Harrison W; Frost, Adaani; Ghofrani, Hossein-Ardeschir; Gómez-Sánchez, Miguel A; Langleben, David; Rosenkranz, Stephan; Busse, Dennis; Meier, Christian; Nikkho, Sylvia; Hoeper, Marius M
2018-04-01
The Registry to Evaluate Early and Long-term PAH Disease Management (REVEAL) risk score (RRS) calculator was developed using data derived from the REVEAL registry, and predicts survival in patients with pulmonary arterial hypertension (PAH) based on multiple patient characteristics. Herein we applied the RRS to a pivotal PAH trial database, the 12-week PATENT-1 and open-label PATENT-2 extension studies of riociguat. We examined the effect of riociguat vs placebo on RRS in PATENT-1, and investigated the prognostic implications of change in RRS during PATENT-1 on long-term outcomes in PATENT-2. RRS was calculated post hoc for baseline and Week 12 of PATENT-1, and Week 12 of PATENT-2. Patients were grouped into risk strata by RRS. Kaplan-Meier estimates were made for survival and clinical worsening-free survival in PATENT-2 to evaluate the relationship between RRS in PATENT-1 and long-term outcomes in PATENT-2. A total of 396 patients completed PATENT-1 and participated in PATENT-2. In PATENT-1, riociguat significantly improved RRS (p = 0.031) and risk stratum (p = 0.018) between baseline and Week 12 compared with placebo. RRS at baseline, and at PATENT-1 Week 12, and change in RRS during PATENT-1 were significantly associated with survival (hazard ratios for a 1-point reduction in RRS: 0.675, 0.705 and 0.804, respectively) and clinical worsening-free survival (hazard ratios of 0.736, 0.716 and 0.753, respectively) over 2 years in PATENT-2. RRS at baseline and Week 12, and change in RRS, were significant predictors of both survival and clinical worsening-free survival. These data support the long-term predictive value of the RRS in a controlled study population. Copyright © 2018 The Authors. Published by Elsevier Inc. All rights reserved.
Quantitative Analysis of Technological Innovation in Urology.
Bhatt, Nikita R; Davis, Niall F; Dalton, David M; McDermott, Ted; Flynn, Robert J; Thomas, Arun Z; Manecksha, Rustom P
2018-01-01
To assess major areas of technological innovation in urology in the last 20 years using patent and publication data. Patent and MEDLINE databases were searched between 1980 and 2012 electronically using the terms urology OR urological OR urologist AND "surgeon" OR "surgical" OR "surgery". The patent codes obtained were grouped in technology clusters, further analyzed with individual searches, and growth curves were plotted. Growth rates and patterns were analyzed, and patents were correlated with publications as a measure of scientific support and of clinical adoption. The initial search revealed 417 patents and 20,314 publications. The top 5 technology clusters in descending order were surgical instruments including urinary catheters, minimally invasive surgery (MIS), lasers, robotic surgery, and image guidance. MIS and robotic surgery were the most emergent clusters in the last 5 years. Publication and patent growth rates were closely correlated (Pearson coefficient 0.78, P <.01), but publication growth rate remained constantly higher than patent growth, suggesting validated scientific support for urologic innovation and adoption into clinical practice. Patent metrics identify emergent technological innovations and such trends are valuable to understand progress in the field of urology. New surgical technologies like robotic surgery and MIS showed exponential growth in the last decade with good scientific vigilance. Copyright © 2017 Elsevier Inc. All rights reserved.
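The comparison of patent and publication growth reported above rests on a Pearson correlation between two growth-rate series. The following is a minimal sketch of that calculation, assuming year-on-year relative growth as the rate; the abstract does not state how the authors defined growth, and the yearly counts below are invented placeholders rather than data from the study.

```python
from math import sqrt


def growth_rates(counts):
    """Year-on-year relative growth: (this year - last year) / last year."""
    return [(b - a) / a for a, b in zip(counts, counts[1:])]


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


# Hypothetical yearly counts for one technology cluster (not from the study).
patents_per_year = [10, 12, 15, 21, 28]
publications_per_year = [100, 130, 150, 200, 290]

r = pearson(growth_rates(patents_per_year), growth_rates(publications_per_year))
print(f"Pearson r between growth rates: {r:.2f}")
```

A coefficient near 1 would indicate that years of fast patent growth also tend to be years of fast publication growth; it says nothing about which series is larger in absolute terms.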
Innovation status of gene therapy for breast cancer.
Anaya-Ruiz, Maricruz; Perez-Santos, Martin
2015-01-01
To analyze multi-source data including publications and patents, and try to draw the whole landscape of the research and development community in the field of gene therapy for breast cancer. Publications and patents were collected from the Web of Science and databases of the five major patent offices of the world, respectively. Bibliometric methodologies and technology are used to investigate publications/patents, their contents and relationships. A total of 2,043 items published and 947 patents from 1994 to 2013 including "gene therapy for breast cancer" were retrieved. The top five countries in global publication share were USA, China, Germany, Japan and England. On the other hand, USA, Australia, England, South Korea and Japan were the main producers of patents. The universities and enterprises of USA had the highest amount of publication and patents. Adenovirus- and retrovirus-based gene therapies and small interfering RNA (siRNA) interference therapies were the main topics both in publications and patents. The above results show that global research in the field of gene therapy for breast cancer is increasing and the main participants in this field are USA and Canada in North America, China, Japan and South Korea in Asia, and England, Germany, and Italy in Europe. Also, this article demonstrates the usefulness of bibliometrics to address key evaluation questions and define future areas of research.
37 CFR 1.822 - Symbols and format to be used for nucleotide and/or amino acid sequence data.
Code of Federal Regulations, 2011 CFR
2011-07-01
... for nucleotide and/or amino acid sequence data. 1.822 Section 1.822 Patents, Trademarks, and... Amino Acid Sequences § 1.822 Symbols and format to be used for nucleotide and/or amino acid sequence data. (a) The symbols and format to be used for nucleotide and/or amino acid sequence data shall...
Technological innovation in neurosurgery: a quantitative study.
Marcus, Hani J; Hughes-Hallett, Archie; Kwasnicki, Richard M; Darzi, Ara; Yang, Guang-Zhong; Nandi, Dipankar
2015-07-01
Technological innovation within health care may be defined as the introduction of a new technology that initiates a change in clinical practice. Neurosurgery is a particularly technology-intensive surgical discipline, and new technologies have preceded many of the major advances in operative neurosurgical techniques. The aim of the present study was to quantitatively evaluate technological innovation in neurosurgery using patents and peer-reviewed publications as metrics of technology development and clinical translation, respectively. The authors searched a patent database for articles published between 1960 and 2010 using the Boolean search term "neurosurgeon OR neurosurgical OR neurosurgery." The top 50 performing patent codes were then grouped into technology clusters. Patent and publication growth curves were then generated for these technology clusters. A top-performing technology cluster was then selected as an exemplar for a more detailed analysis of individual patents. In all, 11,672 patents and 208,203 publications related to neurosurgery were identified. The top-performing technology clusters during these 50 years were image-guidance devices, clinical neurophysiology devices, neuromodulation devices, operating microscopes, and endoscopes. In relation to image-guidance and neuromodulation devices, the authors found a highly correlated rapid rise in the numbers of patents and publications, which suggests that these are areas of technology expansion. An in-depth analysis of neuromodulation-device patents revealed that the majority of well-performing patents were related to deep brain stimulation. Patent and publication data may be used to quantitatively evaluate technological innovation in neurosurgery.
2013-11-21
Fanconi Anemia; Autosomal or Sex Linked Recessive Genetic Disease; Bone Marrow Hematopoiesis Failure, Multiple Congenital Abnormalities, and Susceptibility to Neoplastic Diseases; Hematopoiesis Maintenance.
2012-01-01
Background: Although it has been two decades since the Thai Patent Act was amended to comply with the Agreement on Trade-Related Aspects of Intellectual Property Rights (TRIPS), there has been little emphasis given to assessing the implications of this amendment. The purpose of this review is to summarize the health and economic impact of patent protection, with a focus on the experience of Thailand. Methods: A review of national and international empirical evidence on the health and economic implications of patents from 1980 to 2009 was undertaken. Results: The findings illustrate the role of patent protection in four areas: price, present access, future access, and international trade and investment. Forty-three empirical studies were found, three of which were from Thai databases. Patenting does increase price, although the size of effect differs according to the methodology and country. Although weakening patent rights could increase present access, evidence suggests that strengthening patenting may benefit future access; although this is based on complex assumptions and estimations. Moreover, while patent protection appears to have a positive impact on trade flow, the implication for foreign direct investment (FDI) is equivocal. Conclusions: Empirical studies in Thailand, and other similar countries, are rare, compromising the robustness and generalizability of conclusions. However, evidence does suggest that patenting presents a significant inter-temporal challenge in balancing aspects of current versus future access to technologies. This underlines the urgent need to prioritize health research resources to assess the wider implications of patent protection. PMID:22849392
37 CFR 1.825 - Amendments to or replacement of sequence listing and computer readable copy thereof.
Code of Federal Regulations, 2014 CFR
2014-07-01
... of sequence listing and computer readable copy thereof. 1.825 Section 1.825 Patents, Trademarks, and... Amino Acid Sequences § 1.825 Amendments to or replacement of sequence listing and computer readable copy... copy of the computer readable form (§ 1.821(e)) including all previously submitted data with the...
37 CFR 1.825 - Amendments to or replacement of sequence listing and computer readable copy thereof.
Code of Federal Regulations, 2013 CFR
2013-07-01
... of sequence listing and computer readable copy thereof. 1.825 Section 1.825 Patents, Trademarks, and... Amino Acid Sequences § 1.825 Amendments to or replacement of sequence listing and computer readable copy... copy of the computer readable form (§ 1.821(e)) including all previously submitted data with the...
37 CFR 1.825 - Amendments to or replacement of sequence listing and computer readable copy thereof.
Code of Federal Regulations, 2012 CFR
2012-07-01
... of sequence listing and computer readable copy thereof. 1.825 Section 1.825 Patents, Trademarks, and... Amino Acid Sequences § 1.825 Amendments to or replacement of sequence listing and computer readable copy... copy of the computer readable form (§ 1.821(e)) including all previously submitted data with the...
37 CFR 1.825 - Amendments to or replacement of sequence listing and computer readable copy thereof.
Code of Federal Regulations, 2010 CFR
2010-07-01
... of sequence listing and computer readable copy thereof. 1.825 Section 1.825 Patents, Trademarks, and... Amino Acid Sequences § 1.825 Amendments to or replacement of sequence listing and computer readable copy... copy of the computer readable form (§ 1.821(e)) including all previously submitted data with the...
37 CFR 1.825 - Amendments to or replacement of sequence listing and computer readable copy thereof.
Code of Federal Regulations, 2011 CFR
2011-07-01
... of sequence listing and computer readable copy thereof. 1.825 Section 1.825 Patents, Trademarks, and... Amino Acid Sequences § 1.825 Amendments to or replacement of sequence listing and computer readable copy... copy of the computer readable form (§ 1.821(e)) including all previously submitted data with the...
Evaluation of Brazilian biotechnology patent activity from 1975 to 2010.
Dias, F; Delfim, F; Drummond, I; Carmo, A O; Barroca, T M; Horta, C C; Kalapothakis, E
2012-08-01
The analysis of patent activity is one methodology used for technological monitoring. In this paper, the activity of biotechnology-related patents in Brazil was analyzed through 30 International Patent Classification (IPC) codes published by the Organization for Economic Cooperation and Development (OECD). We developed a program to analyse the dynamics of the major patent applicants, countries and IPC codes extracted from the Brazilian Patent Office (INPI) database. We also identified Brazilian patent applicants who tried to expand protection abroad via the Patent Cooperation Treaty (PCT). We had access to all patents published online at the INPI from 1975 to July 2010, including 9,791 biotechnology patent applications in Brazil, and 163 PCTs published online at World Intellectual Property Organization (WIPO) from 1997 to December 2010. To our knowledge, there are no other online reports of biotechnology patents previous to the years analyzed here. The largest share of the biotechnology patents filed in the INPI (10.9%) concerned measuring or testing processes involving nucleic acids. The second and third places belonged to patents involving agro-technologies (recombinant DNA technology for plant cells and new flowering plants, i.e. angiosperms, or processes for obtaining them, and reproduction of flowering plants by tissue culture techniques). The majority of patents (87.2%) were filed by nonresidents, with USA being responsible for 51.7% of all biotechnology patents deposited in Brazil. Analyzing the resident applicants per region, we found a hub in the southeast region of Brazil. Among the resident applicants for biotechnology patents filed in the INPI, 43.5% were from São Paulo, 18.3% were from Rio de Janeiro, and 9.7% were from Minas Gerais. Pfizer, Novartis, and Sanofi were the largest applicants in Brazil, with 339, 288, and 245 biotechnology patents filed, respectively. For residents, the largest applicant was the governmental institution FIOCRUZ (Oswaldo Cruz Foundation), which filed 69 biotechnology patents within the period analyzed. The first biotechnology patent applications via PCT were submitted by Brazilians in 1997, with 3 from UFMG (university), 2 from individuals, and 1 from EMBRAPA (research institute).
UKPMC: a full text article resource for the life sciences.
McEntyre, Johanna R; Ananiadou, Sophia; Andrews, Stephen; Black, William J; Boulderstone, Richard; Buttery, Paula; Chaplin, David; Chevuru, Sandeepreddy; Cobley, Norman; Coleman, Lee-Ann; Davey, Paul; Gupta, Bharti; Haji-Gholam, Lesley; Hawkins, Craig; Horne, Alan; Hubbard, Simon J; Kim, Jee-Hyub; Lewin, Ian; Lyte, Vic; MacIntyre, Ross; Mansoor, Sami; Mason, Linda; McNaught, John; Newbold, Elizabeth; Nobata, Chikashi; Ong, Ernest; Pillai, Sharmila; Rebholz-Schuhmann, Dietrich; Rosie, Heather; Rowbotham, Rob; Rupp, C J; Stoehr, Peter; Vaughan, Philip
2011-01-01
UK PubMed Central (UKPMC) is a full-text article database that extends the functionality of the original PubMed Central (PMC) repository. The UKPMC project was launched as the first 'mirror' site to PMC, which in analogy to the International Nucleotide Sequence Database Collaboration, aims to provide international preservation of the open and free-access biomedical literature. UKPMC (http://ukpmc.ac.uk) has undergone considerable development since its inception in 2007 and now includes both a UKPMC and PubMed search, as well as access to other records such as Agricola, Patents and recent biomedical theses. UKPMC also differs from PubMed/PMC in that the full text and abstract information can be searched in an integrated manner from one input box. Furthermore, UKPMC contains 'Cited By' information as an alternative way to navigate the literature and has incorporated text-mining approaches to semantically enrich content and integrate it with related database resources. Finally, UKPMC also offers added-value services (UKPMC+) that enable grantees to deposit manuscripts, link papers to grants, publish online portfolios and view citation information on their papers. Here we describe UKPMC and clarify the relationship between PMC and UKPMC, providing historical context and future directions, 10 years on from when PMC was first launched.
UKPMC: a full text article resource for the life sciences
McEntyre, Johanna R.; Ananiadou, Sophia; Andrews, Stephen; Black, William J.; Boulderstone, Richard; Buttery, Paula; Chaplin, David; Chevuru, Sandeepreddy; Cobley, Norman; Coleman, Lee-Ann; Davey, Paul; Gupta, Bharti; Haji-Gholam, Lesley; Hawkins, Craig; Horne, Alan; Hubbard, Simon J.; Kim, Jee-Hyub; Lewin, Ian; Lyte, Vic; MacIntyre, Ross; Mansoor, Sami; Mason, Linda; McNaught, John; Newbold, Elizabeth; Nobata, Chikashi; Ong, Ernest; Pillai, Sharmila; Rebholz-Schuhmann, Dietrich; Rosie, Heather; Rowbotham, Rob; Rupp, C. J.; Stoehr, Peter; Vaughan, Philip
2011-01-01
UK PubMed Central (UKPMC) is a full-text article database that extends the functionality of the original PubMed Central (PMC) repository. The UKPMC project was launched as the first ‘mirror’ site to PMC, which in analogy to the International Nucleotide Sequence Database Collaboration, aims to provide international preservation of the open and free-access biomedical literature. UKPMC (http://ukpmc.ac.uk) has undergone considerable development since its inception in 2007 and now includes both a UKPMC and PubMed search, as well as access to other records such as Agricola, Patents and recent biomedical theses. UKPMC also differs from PubMed/PMC in that the full text and abstract information can be searched in an integrated manner from one input box. Furthermore, UKPMC contains ‘Cited By’ information as an alternative way to navigate the literature and has incorporated text-mining approaches to semantically enrich content and integrate it with related database resources. Finally, UKPMC also offers added-value services (UKPMC+) that enable grantees to deposit manuscripts, link papers to grants, publish online portfolios and view citation information on their papers. Here we describe UKPMC and clarify the relationship between PMC and UKPMC, providing historical context and future directions, 10 years on from when PMC was first launched. PMID:21062818
A survey of enabling technologies in synthetic biology
2013-01-01
Background: Realizing constructive applications of synthetic biology requires continued development of enabling technologies as well as policies and practices to ensure these technologies remain accessible for research. Broadly defined, enabling technologies for synthetic biology include any reagent or method that, alone or in combination with associated technologies, provides the means to generate any new research tool or application. Because applications of synthetic biology likely will embody multiple patented inventions, it will be important to create structures for managing intellectual property rights that best promote continued innovation. Monitoring the enabling technologies of synthetic biology will facilitate the systematic investigation of property rights coupled to these technologies and help shape policies and practices that impact the use, regulation, patenting, and licensing of these technologies. Results: We conducted a survey among a self-identifying community of practitioners engaged in synthetic biology research to obtain their opinions and experiences with technologies that support the engineering of biological systems. Technologies widely used and considered enabling by survey participants included public and private registries of biological parts, standard methods for physical assembly of DNA constructs, genomic databases, software tools for search, alignment, analysis, and editing of DNA sequences, and commercial services for DNA synthesis and sequencing. Standards and methods supporting measurement, functional composition, and data exchange were less widely used though still considered enabling by a subset of survey participants. Conclusions: The set of enabling technologies compiled from this survey provides insight into the many and varied technologies that support innovation in synthetic biology. Many of these technologies are widely accessible for use, either by virtue of being in the public domain or through legal tools such as non-exclusive licensing. Access to some patent protected technologies is less clear and use of these technologies may be subject to restrictions imposed by material transfer agreements or other contract terms. We expect the technologies considered enabling for synthetic biology to change as the field advances. By monitoring the enabling technologies of synthetic biology and addressing the policies and practices that impact their development and use, our hope is that the field will be better able to realize its full potential. PMID:23663447
37 CFR 1.823 - Requirements for nucleotide and/or amino acid sequences as part of the application.
Code of Federal Regulations, 2011 CFR
2011-07-01
... and/or amino acid sequences as part of the application. 1.823 Section 1.823 Patents, Trademarks, and... Amino Acid Sequences § 1.823 Requirements for nucleotide and/or amino acid sequences as part of the... incorporation-by-reference of the Sequence Listing as required by § 1.52(e)(5). The presentation of the...
37 CFR 1.823 - Requirements for nucleotide and/or amino acid sequences as part of the application.
Code of Federal Regulations, 2013 CFR
2013-07-01
... and/or amino acid sequences as part of the application. 1.823 Section 1.823 Patents, Trademarks, and... Amino Acid Sequences § 1.823 Requirements for nucleotide and/or amino acid sequences as part of the... incorporation-by-reference of the Sequence Listing as required by § 1.52(e)(5). The presentation of the...
37 CFR 1.823 - Requirements for nucleotide and/or amino acid sequences as part of the application.
Code of Federal Regulations, 2012 CFR
2012-07-01
... and/or amino acid sequences as part of the application. 1.823 Section 1.823 Patents, Trademarks, and... Amino Acid Sequences § 1.823 Requirements for nucleotide and/or amino acid sequences as part of the... incorporation-by-reference of the Sequence Listing as required by § 1.52(e)(5). The presentation of the...
37 CFR 1.823 - Requirements for nucleotide and/or amino acid sequences as part of the application.
Code of Federal Regulations, 2010 CFR
2010-07-01
... and/or amino acid sequences as part of the application. 1.823 Section 1.823 Patents, Trademarks, and... Amino Acid Sequences § 1.823 Requirements for nucleotide and/or amino acid sequences as part of the... incorporation-by-reference of the Sequence Listing as required by § 1.52(e)(5). The presentation of the...
37 CFR 1.823 - Requirements for nucleotide and/or amino acid sequences as part of the application.
Code of Federal Regulations, 2014 CFR
2014-07-01
... and/or amino acid sequences as part of the application. 1.823 Section 1.823 Patents, Trademarks, and... Amino Acid Sequences § 1.823 Requirements for nucleotide and/or amino acid sequences as part of the... incorporation-by-reference of the Sequence Listing as required by § 1.52(e)(5). The presentation of the...
An overview of a recent court challenge to the protection of biomarkers as intellectual property.
Hall, Stephen C; Tromp, Justin M; Jortani, Saeed A
2011-05-12
We present an intellectual property case in the United States to demonstrate the recent developments concerning patenting novel biomarker discoveries. A court struck down several patents owned by Myriad Genetics, which were related to breast cancer (BRCA1 and BRCA2). This decision can affect patent eligibility for inventions related to biomarkers, particularly genetic biomarkers. The court proceedings for the Myriad Genetics case were reviewed by two patent attorneys (SCH and JMT). Relevant discussions applicable to the scientist involved with biomarker discovery were also prepared. In this case, the Plaintiff had argued that the analysis and comparison of various gene mutations merely involved natural phenomena, and, therefore, could not be eligible for patent protection. The patent holder (Myriad) argued that the claimed gene compositions did not exist in nature, and that the claimed methods provided practical utility for science and medicine. The Court held that the patent claims did not meet patent eligibility requirements under United States patent law. It held that the patent claims at issue were merely abstract mental processes of analyzing and comparing gene sequences, and that such abstract mental processes are not patentable. On June 22, 2010, Myriad appealed the ruling. This case provides guidance to inventors in the biomarker field who may be interested in obtaining intellectual property protection for their inventive work, as well as their patent counsel. However, the case also presented unique factors that may not be present in all situations involving biomarker patents. Copyright © 2011 Elsevier B.V. All rights reserved.
Dalton, David M; Burke, Thomas P; Kelly, Enda G; Curtin, Paul D
2016-06-01
Surgery is in a constant continuum of innovation with refinement of technique and instrumentation. Arthroplasty surgery potentially represents an area with a highly innovative process. This study highlights key areas of innovation in knee arthroplasty over the past 35 years using patent and publication metrics. Growth rates and patterns are analyzed. Patents are correlated to publications as a measure of scientific support. Electronic patent and publication databases were searched over the interval 1980-2014 for "knee arthroplasty" OR "knee replacement." The resulting patent codes were allocated into technology clusters. Citation analysis was performed to identify any important developments missed on initial analysis. The technology clusters identified were further analyzed, individual repeat searches performed, and growth curves plotted. The initial search revealed 3574 patents and 16,552 publications. The largest technology clusters identified were Unicompartmental, Patient-Specific Instrumentation (PSI), Navigation, and Robotic knee arthroplasties. The growth in patent activity correlated strongly with publication activity (Pearson correlation value 0.892, P < .01), but was growing at a faster rate suggesting a decline in vigilance. PSI, objectively the fastest growing technology in the last 5 years, is currently in a period of exponential growth that began a decade ago. Established technologies in the study have double s-shaped patent curves. Identifying trends in emerging technologies is possible using patent metrics and is useful information for training and regulatory bodies. The decline in ratio of publications to patents and the uninterrupted growth of PSI are developments that may warrant further investigation. Copyright © 2015 Elsevier Inc. All rights reserved.
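The correlation of patent activity with publication activity reported above can be illustrated with a minimal sketch. The yearly counts below are hypothetical placeholders, not the study's data, and the function name `pearson_r` is chosen here for illustration only; the abstract does not say which software the authors used.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / math.sqrt(var_x * var_y)

# Hypothetical yearly counts (assumption, NOT taken from the study).
patents = [10, 14, 21, 30, 46]
publications = [100, 130, 170, 220, 260]

r = pearson_r(patents, publications)

# Publications per patent, year by year; a falling ratio is the kind of
# signal the abstract describes as a decline in publications to patents.
ratio = [pub / pat for pub, pat in zip(publications, patents)]
```

In this toy series both counts grow and correlate strongly, while the publications-per-patent ratio falls each year because patents grow faster.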
Code of Federal Regulations, 2011 CFR
2011-07-01
... from abandonment 1.135 Amino Acid Sequences. (See Nucleotide and/or Amino Acid Sequences) Appeal to... Appeals and Interference 41.47 Of rejection of an application 1.104(a) Nucleotide and/or Amino Acid...) Symbols for nucleotide and/or amino acid sequence data 1.822 T Tables in patent applications 1.58 Terminal...
Dawalbhakta, Mitali; Telang, Manasi
2017-01-01
Saffron (Crocus sativus L.) has a long history of use as a food additive and a traditional medicine for treating a number of disorders. Prominent bioactives of saffron are crocin, crocetin and safranal. The aim of this study was to carry out an extensive patent search to collect information on saffron bioactives and their derivatives as therapeutic and cosmeceutical agents. All patents related to the area of interest published globally to date have been reviewed. Moreover, a recent synthetic biology approach to cost-effective and consistent production of saffron bioactives has been highlighted. A patent search strategy was designed based on keywords and concepts related to Crocus sativus L. and its bioactives (safranal, crocin and crocetin) in combination with different patent classification codes relevant to the technology areas. This search strategy was employed to retrieve patents from various patent databases. The patents which focused on therapeutic or cosmetic applications and claimed compositions comprising crocin, crocetin or safranal as the main active component were selected and analysed. Maximum patenting activity was noticed towards the use of these bioactives in the treatment of neurological disorders, followed by multiple uses of the same compound, use in treatment of metabolic disorders and use as cosmeceuticals. Interestingly, there were no patent records related to use of these bioactives in treating infectious disorders. Our patent analysis points out the populous and less explored uses of saffron bioactives and areas where there is further scope for research and growth. The recently developed synthetic biology approach contributes to improving the availability, consistency and cost-effectiveness of saffron bioactives. Copyright© Bentham Science Publishers.
Nano-enabled drug delivery: a research profile.
Zhou, Xiao; Porter, Alan L; Robinson, Douglas K R; Shim, Min Suk; Guo, Ying
2014-07-01
Nano-enabled drug delivery (NEDD) systems are rapidly emerging as a key area for nanotechnology application. Understanding the status and developmental prospects of this area around the world is important to determine research priorities, and to evaluate and direct progress. Global research publication and patent databases provide a reservoir of information that can be tapped to provide intelligence for such needs. Here, we present a process to allow for extraction of NEDD-related information from these databases by involving topical experts. This process incorporates in-depth analysis of NEDD literature review papers to identify key subsystems and major topics. We then use these to structure global analysis of NEDD research topical trends and collaborative patterns, and to inform future innovation directions. This paper describes the process of how to derive nano-enabled drug delivery-related information from global research and patent databases in an effort to perform comprehensive global analysis of research trends and directions, along with collaborative patterns. Copyright © 2014 Elsevier Inc. All rights reserved.
Diniz, Tâmara Coimbra; Pinto, Tiago Coimbra Costa; Menezes, Paula Dos Passos; Silva, Juliane Cabral; Teles, Roxana Braga de Andrade; Ximenes, Rosana Christine Cavalcanti; Guimarães, Adriana Gibara; Serafini, Mairim Russo; Araújo, Adriano Antunes de Souza; Quintans Júnior, Lucindo José; Almeida, Jackson Roberto Guedes da Silva
2018-01-01
Depression is a serious mood disorder and is one of the most common mental illnesses. Despite the availability of several classes of antidepressants, a substantial percentage of patients are unresponsive to these drugs, which have a slow onset of action in addition to producing undesirable side effects. Some scientific evidence suggests that cyclodextrins (CDs) can improve the physicochemical and pharmacological profile of antidepressant drugs (ADDs). The purpose of this paper is to disclose current data technology prospects involving antidepressant drugs and cyclodextrins. Areas covered: We conducted a patent review to evaluate the antidepressive activity of the compounds complexed in CDs, and we analyzed whether these complexes improved their physicochemical properties and pharmacological action. The present review used 8 specialized patent databases for patent research, using the term 'cyclodextrin' combined with 'antidepressive agents' and its related terms. We found 608 patents. In the end, considering the inclusion criteria, 27 patents reporting the benefits of complexation of ADDs with CDs were included. Expert opinion: The use of CDs can be considered an important tool for the optimization of physicochemical and pharmacological properties of ADDs, such as stability, solubility and bioavailability.
Cyclodextrins: improving the therapeutic response of analgesic drugs: a patent review.
de Oliveira, Makson G B; Guimarães, Adriana G; Araújo, Adriano A S; Quintans, Jullyana S S; Santos, Márcio R V; Quintans-Júnior, Lucindo J
2015-01-01
Cyclodextrins (CDs) are cyclic oligosaccharides that have recently been recognized as useful tools for optimizing the delivery of problematic drugs. CDs can be found in at least 35 pharmaceutical products, such as anticancer agents, analgesic and anti-inflammatory drugs. Besides, several studies have demonstrated that CD-complexed drugs could provide benefits in solubility and stability and also improve pharmacological response when compared with the drug alone. The patent search was conducted in the databases WIPO, Espacenet, USPTO, Derwent and INPI, using the keywords cyclodextrin, pain and its related terms (analgesia, hyperalgesia, hypernociception, nociception, antinociception, antinociceptive). We found 442 patents. Criteria such as the complexation of analgesic agents and evidence of improvement of the therapeutic effect were indispensable for the inclusion of the patent. So, 18 patents were selected. We noticed that some patents are related to the complexation of opioids, NSAIDs, as well as natural products, in different types of CDs. The use of CDs creates the prospect of developing new therapeutic options for the most effective treatment of painful conditions, allowing a reduction of dosage of analgesic drugs and the occurrence of side effects. Thus, CDs can be an important tool to improve the efficacy and pharmacological profile of analgesic drugs.
Co-processed excipients: a patent review.
Garg, Nidhi; Dureja, Harish; Kaushik, Deepak
2013-04-01
The introduction of high-speed tableting machines and the preference for direct compression as a method of tableting have increased the demands on the functionality of excipients, mainly in terms of flowability and compressibility. Co-processed excipients, wherein excipients are combined by virtue of sub-particle level interaction, have provided an attractive tool for developing high-functionality excipients. The multifold advantages offered by co-processed excipients, such as synergism in the functionality of individual components, reduction of a company's regulatory concerns because of the absence of chemical change during co-processing, and improvement in physico-chemical properties, have expanded their use in the pharmaceutical industry. In recent years, there has been a spurt in the number of patents filed on co-processed excipients. Hence, the present review focuses on co-processed excipients and their application in the pharmaceutical industry. The worldwide databases of the European patent office (http://ep.espacenet.com) and the United States patent office (www.uspto.gov) were employed to collect the patents and patent applications. The advantages, limitations, basis for the selection of excipients to be co-processed, methods of co-processing and regulatory perspective of co-processed excipients are also briefly discussed.
Dencic, Ivana; Hessel, Volker; de Croon, Mart H J M; Meuldijk, Jan; van der Doelen, Christianus W J; Koch, Kasper
2012-02-13
The miniaturization of continuous processes has been of increasing interest in the past decade, and microreaction technology and flow chemistry have moved from academic and industrial research to commercial applications. With industry taking up such innovations, this trend is also reflected in the patenting behavior of companies active in this area. This review is a continuation of the review paper on microreactor patents published by Hessel et al. and indicates major changes in patenting trends since 2006. Moreover, a different patent database search algorithm is presented, which complements the algorithm explained in the previous review. In addition, the preservation of intellectual property is analyzed for multiphase reactions and particularly solid-catalyzed gas-liquid reactions in microreactors, which play an important role in the chemical and pharmaceutical industries and are reactions that benefit largely from microprocessing. Among other results, we show that the number of patents has increased in this field, with solid-catalyst design and deposition, control of the flow pattern, and ensured stable flow as the main challenges. Copyright © 2012 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Patent landscape of neglected tropical diseases: an analysis of worldwide patent families.
Akinsolu, Folahanmi Tomiwa; de Paiva, Vitor Nobre; Souza, Samuel Santos; Varga, Orsolya
2017-11-14
"Neglected Tropical Diseases" (NTDs) affect millions of people in Africa, Asia and South America. The two primary ways of strategic interventions are "preventive chemotherapy and transmission control" (PCT), and "innovative and intensified disease management" (IDM). In the last 5 years, phenomenal progress has been achieved. However, it is crucial to intensify research effort into NTDs, because of the emerging drug resistance. According to the World Health Organization (WHO), the term NTDs covers 17 diseases, namely buruli ulcer, Chagas disease, dengue, dracunculiasis, echinococcosis, trematodiasis, human African trypanosomiasis, leishmaniasis, leprosy, lymphatic filariasis, onchocerciasis, rabies, schistosomiasis, soil-transmitted helminthes, taeniasis, trachoma, and yaws. The aim of this study is to map out research and development (R&D) landscape through patent analysis of these identified NTDs. To achieve this, analysis and evaluation have been conducted on patenting trends, current legal status of patent families, priority countries by earliest priority years and their assignee types, technological fields of patent families over time, and original and current patent assignees. Patent families were extracted from Patseer, an international database of patents from over 100 patent issuing authorities worldwide. Evaluation of the patents was carried out using the combination of different search terms related to each identified NTD. In this paper, a total number of 12,350 patent families were analyzed. The main countries with sources of inventions were identified to be the United States (US) and China. The main technological fields covered by NTDs patent landscape are pharmaceuticals, biotechnology, organic fine chemistry, analysis of biological materials, basic materials chemistry, and medical technology. Governmental institutions and universities are the primary original assignees. 
Among the NTDs, leishmaniasis, dengue, and rabies received the highest number of patent families, while human African trypanosomiasis (sleeping sickness), taeniasis, and dracunculiasis received the least. The overall trend of patent families shows an increase between 1985 and 2008, followed by at least 6 years of stagnation. The filing pattern of patent families analyzed undoubtedly reveals slow progress on research and development of NTDs. Involving new players, such as non-governmental organizations, may help to mitigate and reduce the burden of NTDs.
Identification of the Key Fields and Their Key Technical Points of Oncology by Patent Analysis
Zhang, Ting; Chen, Juan; Jia, Xiaofeng
2015-01-01
Background This paper aims to identify the key fields and their key technical points of oncology by patent analysis. Methodology/Principal Findings Patents of oncology applied from 2006 to 2012 were searched in the Thomson Innovation database. The key fields and their key technical points were determined by analyzing the Derwent Classification (DC) and the International Patent Classification (IPC), respectively. Patent applications in the top ten DC occupied 80% of all the patent applications of oncology, which were the ten fields of oncology to be analyzed. The number of patent applications in these ten fields of oncology was standardized based on patent applications of oncology from 2006 to 2012. For each field, standardization was conducted separately for each of the seven years (2006–2012) and the mean of the seven standardized values was calculated to reflect the relative amount of patent applications in that field; meanwhile, regression analysis using time (year) and the standardized values of patent applications in seven years (2006–2012) was conducted so as to evaluate the trend of patent applications in each field. Two-dimensional quadrant analysis, together with the professional knowledge of oncology, was taken into consideration in determining the key fields of oncology. The fields located in the quadrant with high relative amount or increasing trend of patent applications are identified as key ones. By using the same method, the key technical points in each key field were identified. Altogether 116,820 patents of oncology applied from 2006 to 2012 were retrieved, and four key fields with twenty-nine key technical points were identified, including “natural products and polymers” with nine key technical points, “fermentation industry” with twelve ones, “electrical medical equipment” with four ones, and “diagnosis, surgery” with four ones. 
Conclusions/Significance The results of this study could provide guidance on the development direction of oncology, and also help researchers broaden innovative ideas and discover new technological opportunities. PMID:26599967
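The standardization, trend regression and quadrant step described in the abstract above can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not give the exact standardization formula, so a per-year z-score across fields is assumed here, the quadrant boundaries are assumed to sit at zero, and the field names and counts are hypothetical.

```python
def zscores(values):
    """Z-scores of a list (population standard deviation); zeros if constant."""
    n = len(values)
    mean = sum(values) / n
    sd = (sum((v - mean) ** 2 for v in values) / n) ** 0.5
    if sd == 0:
        return [0.0] * n
    return [(v - mean) / sd for v in values]

def slope(xs, ys):
    """Ordinary least-squares slope of ys regressed on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx

def classify_fields(counts, years):
    """counts maps a field name to its application counts, one per year."""
    fields = list(counts)
    standardized = {f: [] for f in fields}
    # Assumption: standardize across fields separately within each year.
    for i in range(len(years)):
        for f, z in zip(fields, zscores([counts[f][i] for f in fields])):
            standardized[f].append(z)
    result = {}
    for f in fields:
        mean_z = sum(standardized[f]) / len(years)   # relative amount
        trend = slope(years, standardized[f])        # direction over time
        # Key field: high relative amount OR increasing trend (cut-offs at 0
        # are an assumption made for this sketch).
        result[f] = {"mean_z": mean_z, "trend": trend,
                     "key": mean_z > 0 or trend > 0}
    return result

# Hypothetical counts for three made-up fields over three years.
demo = classify_fields(
    {"A": [100, 100, 100], "B": [10, 20, 30], "C": [50, 40, 30]},
    [2006, 2007, 2008],
)
```

In the toy data, field A qualifies through its high relative amount, field B through its rising trend despite a low share, and field C through neither. The study additionally weighed professional knowledge of oncology, which no such rule captures.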
Identification of the Key Fields and Their Key Technical Points of Oncology by Patent Analysis.
Zhang, Ting; Chen, Juan; Jia, Xiaofeng
2015-01-01
This paper aims to identify the key fields and their key technical points of oncology by patent analysis. Patents of oncology applied from 2006 to 2012 were searched in the Thomson Innovation database. The key fields and their key technical points were determined by analyzing the Derwent Classification (DC) and the International Patent Classification (IPC), respectively. Patent applications in the top ten DC occupied 80% of all the patent applications of oncology, which were the ten fields of oncology to be analyzed. The number of patent applications in these ten fields of oncology was standardized based on patent applications of oncology from 2006 to 2012. For each field, standardization was conducted separately for each of the seven years (2006-2012) and the mean of the seven standardized values was calculated to reflect the relative amount of patent applications in that field; meanwhile, regression analysis using time (year) and the standardized values of patent applications in seven years (2006-2012) was conducted so as to evaluate the trend of patent applications in each field. Two-dimensional quadrant analysis, together with the professional knowledge of oncology, was taken into consideration in determining the key fields of oncology. The fields located in the quadrant with high relative amount or increasing trend of patent applications are identified as key ones. By using the same method, the key technical points in each key field were identified. Altogether 116,820 patents of oncology applied from 2006 to 2012 were retrieved, and four key fields with twenty-nine key technical points were identified, including "natural products and polymers" with nine key technical points, "fermentation industry" with twelve ones, "electrical medical equipment" with four ones, and "diagnosis, surgery" with four ones. 
The results of this study could provide guidance on the development direction of oncology, and also help researchers broaden innovative ideas and discover new technological opportunities.
Wang, Yu-Guang; Jin, Rui; Qiang, Si-Si; Lin, Zhi-Jian; Li, Hong-Yan; Lu, Shu; Kong, Xiang-Wen
2016-01-01
Chinese patent medicines for orthopedics are among the hotspots and difficulties in the rational medication of traditional Chinese medicine (TCM), because they mostly contain toxic medicinal herbs and are oriented to special patients. According to the hospital pharmacy practices and the therapeutic theories of TCM, this paper focused on a novel model of rational drug use of Chinese patent medicine for orthopedics based on the principles of "syndrome-dosage-toxicity differentiation". We also proposed relevant specifications for guiding their clinical use. Firstly, we proposed a list of the primary clinical application characteristics for rational drug use of orthopedic TCMs, including the syndromes of the patient, the dosage of the medicine and the toxic ingredients in the medicine. Secondly, a database was established for recording the package inserts of all of the 81 orthopedic patent medicines in our hospital, and 2 000 retrospective recipes were analyzed to look for the high-frequency medicines and common irrational factors. Then clinical case reports involving the adverse reactions and side effects of related drugs were searched from CNKI, VIP and WanFang databases. Then the key information for rational application of each medicine was extracted from these resources and some survey questionnaires. Finally, we established a guide named instructions for clinical use of orthopedic Chinese patent medicines (ICUOCPM) after discussion with experts. According to the effect after the practice in hospital for 2 months, the proposed principles of "syndrome-dosage-toxicity differentiation" in this paper were believed to be the core elements and the most important clinical monitoring points in TCM for orthopedic patents. It would provide innovative ideas, theoretical guarantee and data support for the development of TCM clinical pharmacy. Copyright© by the Chinese Pharmaceutical Association.
Intellectual Property: a powerful tool to develop biotech research
Giugni, Diego; Giugni, Valter
2010-01-01
Summary Today biotechnology is perhaps the most important technology field because of its strong health and food implications. However, due to the nature of this technology, a huge amount of investment is needed to sustain the experimentation costs. Consequently, investors aim to safeguard their investments as much as possible. Intellectual Property, and in particular patents, has been demonstrated to constitute a powerful tool to help them. Moreover, patents represent an extremely important means to disclose biotechnology inventions. Patentable biotechnology inventions involve products such as nucleotide and amino acid sequences, microorganisms, processes or methods for modifying said products, uses for the manufacture of medicaments, etc. There are several ways to protect inventions, but all follow the three main patentability requirements: novelty, inventive step and industrial application. PMID:21255349
Value chain of nanotechnology: a comparative study of some major players
NASA Astrophysics Data System (ADS)
Wang, Gangbo; Guan, Jiancheng
2012-02-01
The article provides a general overview for the landscapes of national nanotechnology development from 1991 to 2010. More than 230,000 unique patents are identified based on a composite search strategy in the Derwent innovation index database. According to the concordance between patent classification and industry technology, some main application areas are identified to compare the positions and specializations among the leading countries. By extracting the content of the "use" subfield in the abstracts and harvesting the keywords representing characteristics of life cycle, nanotechnology patents are grouped into four categories: nanomaterials, nanointermediates, nano-enabled products, and nanotools, which can be seen as four stages of nanotechnology's value chain. These analyses enable us to identify the distributions of value chain and prolific research institutions among the leading countries. It is found that China is productive in nanomaterials and nanointermediates, rather than nano-enabled products and nanotools, which could be mainly explained by the fact that Chinese academia makes a main contribution to nanotechnology patenting. However, there is a big gap between university patenting and market demands, leading to a low rate of technology transfer or licensing.
Text mining factor analysis (TFA) in green tea patent data
NASA Astrophysics Data System (ADS)
Rahmawati, Sela; Suprijadi, Jadi; Zulhanif
2017-03-01
Factor analysis has become one of the most widely used multivariate statistical procedures in applied research endeavors across a multitude of domains. There are two main types of analyses based on factor analysis: Exploratory Factor Analysis (EFA) and Confirmatory Factor Analysis (CFA). Both EFA and CFA aim to model the observed relationships among a group of indicators with a latent variable, but they differ fundamentally in the a priori assumptions and restrictions placed on the factor model. This method is applied to patent data from the green tea technology sector to determine the development of green tea technology in the world. Patent analysis is useful in identifying future technological trends in a specific field of technology. The patent data are obtained from the European Patent Organization (EPO). In this paper, the CFA model is applied to nominal data obtained from the presence-absence matrix; the CFA for nominal data is based on the tetrachoric matrix. Meanwhile, the EFA model is applied to the titles from the dominant technology sector. Titles are first pre-processed using text mining analysis.
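To make the link between a presence-absence matrix and a tetrachoric matrix concrete, the sketch below tabulates two binary columns into a 2x2 table and applies the classical cosine-pi approximation to the tetrachoric correlation. This is a swapped-in simplification chosen for brevity: the abstract does not state how the authors estimated the coefficient, and dedicated statistical packages normally use maximum-likelihood estimation instead. The function names are hypothetical.

```python
import math

def contingency(col_a, col_b):
    """2x2 cell counts (a, b, c, d) for two 0/1 presence-absence columns.

    a = both present, b = only first present,
    c = only second present, d = both absent.
    """
    a = b = c = d = 0
    for x, y in zip(col_a, col_b):
        if x and y:
            a += 1
        elif x:
            b += 1
        elif y:
            c += 1
        else:
            d += 1
    return a, b, c, d

def tetrachoric_approx(a, b, c, d):
    """Cosine-pi approximation: r = cos(pi / (1 + sqrt(ad / bc)))."""
    if b * c == 0:
        # Odds ratio is unbounded; use the limiting value when it exists.
        return 1.0 if a * d > 0 else float("nan")
    return math.cos(math.pi / (1 + math.sqrt((a * d) / (b * c))))

# Hypothetical term columns: 1 if the term occurs in a patent, else 0.
term_1 = [1, 1, 1, 0, 0, 0, 1, 0]
term_2 = [1, 1, 0, 0, 0, 1, 1, 0]
r_approx = tetrachoric_approx(*contingency(term_1, term_2))
```

Repeating this for every pair of columns would yield an approximate tetrachoric matrix that a factor model could take as input.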
Review of Maxillary Expansion Appliance Activation Methods: Engineering and Clinical Perspectives
Romanyk, D. L.; Lagravere, M. O.; Toogood, R. W.; Major, P. W.; Carey, J. P.
2010-01-01
Objective. Review the reported activation methods of maxillary expansion devices for midpalatal suture separation from an engineering perspective and suggest areas of improvement. Materials and Methods. A literature search of Scopus and PubMed was used to determine current expansion methods. A U.S. and Canadian patent database search was also conducted using patent classification and keywords. Any paper presenting a new method of expansion was included. Results. Expansion methods in use, or patented, can be classified as either a screw- or spring-type, magnetic, or shape memory alloy expansion appliance. Conclusions. Each activation method presented unique advantages and disadvantages from both clinical and engineering perspectives. Areas for improvement still remain and are identified in the paper. PMID:20948570
Massarotti, Alberto; Brunco, Angelo; Sorba, Giovanni; Tron, Gian Cesare
2014-02-24
Since Professors Sharpless, Finn, and Kolb first introduced the concept of "click reactions" in 2001 as powerful tools in drug discovery, 1,4-disubstituted-1,2,3-triazoles have become important in medicinal chemistry due to the simultaneous discovery by Sharpless, Fokin, and Meldal of a perfect click 1,3-dipolar cycloaddition reaction between azides and alkynes catalyzed by copper salts. Because of their chemical features, these triazoles are proposed to be aggressive pharmacophores that participate in drug-receptor interactions while maintaining an excellent chemical and metabolic profile. Surprisingly, no virtual libraries of 1,4-disubstituted-1,2,3-triazoles have been generated for the systematic investigation of the click-chemical space. In this manuscript, a database of triazoles called ZINClick is generated from literature-reported alkynes and azides that can be synthesized within three steps from commercially available products. This combinatorial database contains over 16 million 1,4-disubstituted-1,2,3-triazoles that are easily synthesizable, new, and patentable! The structural diversity of ZINClick ( http://www.symech.it/ZINClick ) will be explored. ZINClick will also be compared to other available databases, and its application during the design of novel bioactive molecules containing triazole nuclei will be discussed.
Wang, Sai-Jun; Wu, Zhen-Feng; Yang, Ming; Wang, Ya-Qi; Hu, Peng-Yi; Jie, Xiao-Lu; Han, Fei; Wang, Fang
2014-09-01
Aromatic traditional Chinese medicines have a long history in China, with wide varieties. Volatile oils are active ingredients extracted from aromatic herbal medicines, which usually contain tens or hundreds of ingredients, with many biological activities. Therefore, volatile oils are often used in combined prescriptions and made into various efficient preparations for oral administration or external use. Based on the sources from the database of Newly Edited National Chinese Traditional Patent Medicines (the second edition), the author selected 266 Chinese patent medicines containing volatile oils in this paper, and then established an information sheet covering such items as name, dosage, dosage form, specification and usage, and main functions. Subsequently, on the basis of the multidisciplinary knowledge of pharmaceutics, traditional Chinese pharmacology and basic theory of traditional Chinese medicine, efforts were also made in the statistics of the dosage form and usage, variety of volatile oils and main functions, as well as the status analysis on volatile oils in terms of the dosage form development, prescription development, drug instruction and quality control, in order to lay a foundation for the further exploration of the market development situations of volatile oils and the future development orientation.
Surgical ligation of patent ductus arteriosus in premature infants: trends and practice variation.
Weinberg, Jacqueline G; Evans, Frank J; Burns, Kristin M; Pearson, Gail D; Kaltman, Jonathan R
2016-08-01
We sought to analyse the variation in the incidence of patent ductus arteriosus over three recent time points and characterise ductal ligation practices in preterm infants in the United States, adjusting for demographic and morbidity factors. Using the Kids' Inpatient Database from 2003, 2006, and 2009, we identified infants born at ⩽32 weeks of gestation with International Classification of Diseases, Ninth Revision diagnosis of patent ductus arteriosus and ligation code. We examined patient and hospital characteristics and identified patient and hospital variables associated with ligation. Of 182,610 preterm births, 30,714 discharges included a patent ductus arteriosus diagnosis. The rate of patent ductus arteriosus diagnosis increased from 14% in 2003 to 21% in 2009 (p<0.001). A total of 4181 ligations were performed, with an overall ligation rate of 14%. Ligation rate in infants born at ⩽28 weeks of gestation was 20% overall, increasing from 18% in 2003 to 21% in 2009 (p<0.001). The ligation rate varied by state (4-28%), and ligation was associated with earlier gestational age, associated diagnoses, hospital type, teaching hospital status, and region (p<0.001). The rates of patent ductus arteriosus diagnosis and ligation have increased in the recent years. Variation exists in the practice of patent ductus arteriosus ligation and is influenced by patient and non-patient factors.
Nucleotide sequence composition and method for detection of Neisseria gonorrhoeae
DOE Office of Scientific and Technical Information (OSTI.GOV)
Lo, A.; Yang, H.L.
1990-02-13
This patent describes a composition of matter that is specific for Neisseria gonorrhoeae. It comprises at least one nucleotide sequence for which the ratio of the amount of the sequence that hybridizes to chromosomal DNA of Neisseria gonorrhoeae to the amount of the sequence that hybridizes to chromosomal DNA of Neisseria meningitidis is greater than about five, the ratio being obtained by a method described in the patent.
Code of Federal Regulations, 2010 CFR
2010-07-01
... 37 Patents, Trademarks, and Copyrights 1 2010-07-01 2010-07-01 false Form and format for... And/or Amino Acid Sequences § 1.824 Form and format for nucleotide and/or amino acid sequence... Code for Information Interchange (ASCII) text. No other formats shall be allowed. (3) The computer...
Are sigma modulators an effective opportunity for cancer treatment? A patent overview (1996-2016).
Collina, Simona; Bignardi, Emanuele; Rui, Marta; Rossi, Daniela; Gaggeri, Raffaella; Zamagni, Alice; Cortesi, Michela; Tesei, Anna
2017-05-01
Although several molecular targets against cancer have been identified, there is a continuous need for new therapeutic strategies. Overexpression of Sigma Receptors (SRs) has recently been associated with different cancer conditions. Therefore, novel anticancer agents targeting SRs may increase the specificity of therapies, overcoming some of the common drawbacks of conventional chemotherapy. Areas covered: The present review focuses on patent documents disclosing SR modulators with possible application in cancer therapy and diagnosis. The analysis reviews patents of the last two decades (1996-2016); patents were grouped according to target subtypes (S1R, S2R, pan-SRs) and relevant Applicants. The literature was searched through Espacenet, ISI Web, PatentScope and PubMed databases. Expert opinion: The number of patents related to SRs and cancer has increased in the last twenty years, confirming the importance of this receptor family as a valuable target against neoplasias. Despite their short history in the cancer scenario, many SR modulators are at pre-clinical stage and one is undergoing a phase II clinical trial. SR ligands may represent a powerful source of innovative antitumor therapeutics. Further investigation is needed for validating SR modulators as anti-cancer drugs. We strongly hope that this review could stimulate the interest of both Academia and pharmaceutical companies.
MIPS: a database for genomes and protein sequences.
Mewes, H W; Heumann, K; Kaps, A; Mayer, K; Pfeiffer, F; Stocker, S; Frishman, D
1999-01-01
The Munich Information Center for Protein Sequences (MIPS-GSF), Martinsried near Munich, Germany, develops and maintains genome oriented databases. It is commonplace that the amount of sequence data available increases rapidly, but not the capacity of qualified manual annotation at the sequence databases. Therefore, our strategy aims to cope with the data stream by the comprehensive application of analysis tools to sequences of complete genomes, the systematic classification of protein sequences and the active support of sequence analysis and functional genomics projects. This report describes the systematic and up-to-date analysis of genomes (PEDANT), a comprehensive database of the yeast genome (MYGD), a database reflecting the progress in sequencing the Arabidopsis thaliana genome (MATD), the database of assembled, annotated human EST clusters (MEST), and the collection of protein sequence data within the framework of the PIR-International Protein Sequence Database (described elsewhere in this volume). MIPS provides access through its WWW server (http://www.mips.biochem.mpg.de) to a spectrum of generic databases, including the above mentioned as well as a database of protein families (PROTFAM), the MITOP database, and the all-against-all FASTA database. PMID:9847138
ERIC Educational Resources Information Center
Tauchert, Wolfgang; And Others
1991-01-01
Describes the PADOK-II project in Germany, which was designed to give information on the effects of linguistic algorithms on retrieval in a full-text database, the German Patent Information System (GPI). Relevance assessments are discussed, statistical evaluations are described, and searches are compared for the full-text section versus the…
Implementation of medical monitor system based on networks
NASA Astrophysics Data System (ADS)
Yu, Hui; Cao, Yuzhen; Zhang, Lixin; Ding, Mingshi
2006-11-01
In this paper, the development trend of medical monitor systems is analyzed; portability and network functions are becoming increasingly common across all kinds of medical monitor devices. The architecture of a networked medical monitor system is presented, and the design and implementation details of the medical monitor terminal, the monitor center software, the distributed medical database and two kinds of medical information terminal are discussed. A Rabbit3000 system is used in the medical monitor terminal to implement security administration of data transfer on the network, the human-machine interface, power management and the DSP interface, while the DSP chip TMS5402 is used for signal analysis and data compression. The distributed medical database is designed for the hospital center according to the DICOM information model and the HL7 standard. A pocket medical information terminal based on an ARM9 embedded platform is also developed to interact with the center database over the network. Two kernels based on WINCE are customized, and the corresponding terminal software is developed for nurses' routine care and doctors' auxiliary diagnosis. An invention patent for the monitor terminal has been approved, and manufacturing and clinical test plans are scheduled. Invention patent applications are also being arranged for the two medical information terminals.
The Regional Structure of Technical Innovation
NASA Astrophysics Data System (ADS)
O'Neale, Dion
2014-03-01
There is strong evidence that the productivity per capita of cities and regions increases with population. One likely explanation for this phenomenon is that densely populated regions bring together otherwise unlikely combinations of individuals and organisations with diverse, specialised capabilities, leading to increased innovation and productivity. We have used the REGPAT patent database to construct a bipartite network of geographic regions and the patent classes for which those regions display a revealed comparative advantage. By analysing this network, we can infer relationships between different types of patent classes - and hence the structure of (patentable) technology. The network also provides a novel perspective for studying the combinations of technical capabilities in different geographic regions. We investigate measures such as the diversity and ubiquity of innovations within regions and find that diversity (resp. ubiquity) is positively (resp. negatively) correlated with population. We also find evidence of a nested structure for technical innovation. That is, specialised innovations tend to occur only when other more general innovations are already present.
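The region-by-class network described above can be illustrated with a small Python sketch. It assumes the standard Balassa form of revealed comparative advantage (a region's share of patents in a class divided by that class's share of all patents) and a link threshold of RCA >= 1; the abstract does not state either choice, and the function names and toy counts are hypothetical.

```python
from collections import defaultdict

def rca_matrix(counts):
    """Balassa-style RCA for each (region, patent_class) pair.

    counts maps (region, patent_class) -> number of patents.
    The formula is an assumption; the abstract only names RCA.
    """
    region_total = defaultdict(float)
    class_total = defaultdict(float)
    grand_total = 0.0
    for (region, pclass), n in counts.items():
        region_total[region] += n
        class_total[pclass] += n
        grand_total += n
    rca = {}
    for (region, pclass), n in counts.items():
        share_in_region = n / region_total[region]
        share_overall = class_total[pclass] / grand_total
        rca[(region, pclass)] = share_in_region / share_overall
    return rca

def diversity_and_ubiquity(rca, threshold=1.0):
    """Count bipartite links: classes per region and regions per class."""
    diversity = defaultdict(int)   # region -> number of advantaged classes
    ubiquity = defaultdict(int)    # class  -> number of advantaged regions
    for (region, pclass), value in rca.items():
        if value >= threshold:     # assumed link rule
            diversity[region] += 1
            ubiquity[pclass] += 1
    return dict(diversity), dict(ubiquity)

# Toy data, not taken from REGPAT.
counts = {
    ("A", "chem"): 8, ("A", "elec"): 2,
    ("B", "chem"): 2, ("B", "elec"): 8,
}
rca = rca_matrix(counts)
diversity, ubiquity = diversity_and_ubiquity(rca)
```

In this toy case each region is advantaged in exactly one class, so every diversity and ubiquity value is 1. Correlating diversity with population would need regional population data, which is not part of this sketch.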
Herbal drug patenting in India: IP potential.
Sahoo, Niharika; Manchikanti, Padmavati; Dey, Satya Hari
2011-09-01
Herbal drugs are gaining worldwide prominence due to their distinct advantages. Developing countries have started exploring the ethnopharmacological approach of drug discovery and have begun to file patents on herbal drugs. The expansion of R&D in Indian herbal research organizations and presence of manufacturing units at non-Indian sites is an indication of the capability to develop new products and processes. The present study attempts to identify innovations in the Indian herbal drug sector by analyzing the patenting trends in India, US and EU. Based on keyword and IPC-based searches of the IPO, USPTO, Esp@cenet and WIPO databases, patent applications and grants in herbal drugs by Indian applicants/assignees were collected for the last ten years (from 1st January 2001 to 31st October 2010). From this collection, only patents related to human therapeutic use were selected. Analysis was performed to identify filing trends, major applicants/assignees, disease areas and major plant species used for various treatments. There is a gradual increase in patent filing through the years. In India, individual inventors have the most applications and grants. CSIR, among research organizations and Hindustan Unilever, Avesthagen, Piramal Life Science, Sahajanand Biotech and Indus Biotech among the companies have the maximum granted patents in India, US and EU respectively. Diabetes, cancer and inflammatory disorders are the major areas for patenting in India and abroad. Recent patents are on new herbal formulations for treatment of AIDS, hepatitis, skin disorders and gastrointestinal disorders. A majority of the herbal patent applications and grants in India are with individual inventors. Claim analysis indicates that these patents include novel multi-herb compositions with synergistic action. Indian research organizations are more active than companies in filing for patents. CSIR has the maximum number of applications not only in India but also in the US and EU.
Patents by research organizations and herbal companies are on development of new processes for active compound isolation and standardization of such components in addition to new compositions for therapeutic use. Pharmaceutical companies such as Ranbaxy, Lupin and Panacea Biotec are increasingly patenting on herbal drugs. There is increased patenting activity related to diabetes, cancer, cardiovascular diseases, asthma and arthritis in India and abroad. Copyright © 2011 Elsevier Ireland Ltd. All rights reserved.
Marx, U; Bushnaq, H; Yalcin, E
1998-02-01
Tissue engineering is seen as an interesting field of technology which could improve medical therapy and could also be considered as a commercial opportunity for the European biotechnological industry. The state of the art in science (using the MedLine and the Science Citation Index databases), the patent situation and the industry dealing with tissue engineering were surveyed. A special method, based on the Science Citation Index Journal Citation Report 1993, for evaluating scientific work was defined. The main countries working in the field of tissue engineering were evaluated in regard to their scientific performance and their patents. The R&D of German industry was investigated, with Germany taken as an exemplary European country. Out of all activities, different tissues were rated with respect to the attention received from research and industry and with regard to the frequency with which patents were applied for. USA, Germany and Japan rank first in most tissues, especially liver. After comparing German patents with the German scientific and industrial work, it seems that the potential in German patents and research is underestimated by German industry and inefficiently exploited.
A Case Study of Pharmaceutical Pricing in China: Setting the Price for Off-Patent Originators.
Hu, Shanlian; Zhang, Yabing; He, Jiangjiang; Du, Lixia; Xu, Mingfei; Xie, Chunyan; Peng, Ying; Wang, Linan
2015-08-01
This article aims to define a value-based approach to pricing and reimbursement for off-patent originators using a multiple criteria decision analysis (MCDA) approach centered on a systematic analysis of current pricing and reimbursement policies in China. A drug price policy review was combined with a quantitative analysis of China's drug purchasing database. Policy preferences were identified through a MCDA performed by interviewing well-known academic experts and industry stakeholders. The study findings indicate that the current Chinese price policy includes cost-based pricing and the establishment of maximum retail prices and premiums for off-patent originators, whereas reference pricing may be adopted in the future. The literature review revealed significant differences in the dissolution profiles between originators and generics; therefore, dissolution profiles need to be improved. Market data analysis showed that the overall price ratio of generics and off-patent originators was around 0.54-0.59 in 2002-2011, with a 40% price difference, on average. Ten differentiating value attributes were identified and MCDA was applied to test the impact of three pricing policy scenarios. With the condition of implementing quality consistency regulations and controls, a reduction in the price gap between high-quality off-patent products (including originator and generics) seemed to be the preferred policy. Patents of many drugs will expire within the next 10 years; thus, pricing will be an issue of importance for off-patent originators and generic alternatives.
Origins of medical innovation: the case of coronary artery stents.
Xu, Shuai; Avorn, Jerry; Kesselheim, Aaron S
2012-11-01
Innovative medical devices make major contributions to patient welfare, and coronary stents have been among the most important device developments of recent decades. However, the origins of such breakthrough medical technologies remain poorly understood. Using a comprehensive database of patents, we identified all individuals and institutions that developed intellectual property related to stent technology early in its development process. The patents were categorized and described using a predetermined qualitative coding strategy. We found 245 granted patents related to bare metal coronary artery stents from 1984 (when the first patent issued in this field) to 1994 (after the first stents were approved). Each year showed an increase in the number of patent filings: from 1 in 1984 to 97 in 1994. The largest fraction of patents was issued to private entities (44.9% of the total). Public companies, individual inventors, and nonprofit institutions represented 31.4%, 18.0%, and 5.7%, respectively. The top 10 most-cited patents in the field were dominated by 2 private entities, Expandable Grafts Partnership and Cook Inc, organizations created by or dependent on the work of independent academic physician-inventors. Coronary artery stent technology first arose from individual physician-inventors within academic medical centers and their associated private companies. After these initial innovations were in place, the field became dominated by large public companies. This history suggests that policies aimed at encouraging transformative medical device development would have their greatest effect if focused on individual inventors and scientists performing the early stages of technology development.
Intellectual Property: a powerful tool to develop biotech research.
Giugni, Diego; Giugni, Valter
2010-09-01
Today biotechnology is perhaps the most important technology field because of its strong health and food implications. However, due to the nature of this technology, a huge amount of investment is needed to sustain the experimentation costs. Consequently, investors aim to safeguard their investments as much as possible. Intellectual Property, and in particular patents, has been demonstrated to constitute a powerful tool to help them. Moreover, patents represent an extremely important means to disclose biotechnology inventions. Patentable biotechnology inventions involve products such as nucleotide and amino acid sequences, microorganisms, processes or methods for modifying said products, uses for the manufacture of medicaments, etc. There are several ways to protect inventions, but all follow the three main patentability requirements: novelty, inventive step and industrial application. © 2010 The Authors; Journal compilation © 2010 Society for Applied Microbiology and Blackwell Publishing Ltd.
Intellectual Property Materials Online/CD-ROM: What and Where.
ERIC Educational Resources Information Center
Thompson, N. J.
1992-01-01
This comprehensive review of databanks and CD-ROMs worldwide dealing with patents, trademarks, trade names, copyrights, and related legal opinions includes comments on database coverage and search features. Comparison tables of vendors' products are provided. (22 references) (EA)
Determinants of the pace of global innovation in energy technologies.
Bettencourt, Luís M A; Trancik, Jessika E; Kaur, Jasleen
2013-01-01
Understanding the factors driving innovation in energy technologies is of critical importance to mitigating climate change and addressing other energy-related global challenges. Low levels of innovation, measured in terms of energy patent filings, were noted in the 1980s and 90s as an issue of concern and were attributed to limited investment in public and private research and development (R&D). Here we build a comprehensive global database of energy patents covering the period 1970-2009, which is unique in its temporal and geographical scope. Analysis of the data reveals a recent, marked departure from historical trends. A sharp increase in rates of patenting has occurred over the last decade, particularly in renewable technologies, despite continued low levels of R&D funding. To solve the puzzle of fast innovation despite modest R&D increases, we develop a model that explains the nonlinear response observed in the empirical data of technological innovation to various types of investment. The model reveals a regular relationship between patents, R&D funding, and growing markets across technologies, and accurately predicts patenting rates at different stages of technological maturity and market development. We show quantitatively how growing markets have formed a vital complement to public R&D in driving innovative activity. These two forms of investment have each leveraged the effect of the other in driving patenting trends over long periods of time.
Determinants of the Pace of Global Innovation in Energy Technologies
Kaur, Jasleen
2013-01-01
Understanding the factors driving innovation in energy technologies is of critical importance to mitigating climate change and addressing other energy-related global challenges. Low levels of innovation, measured in terms of energy patent filings, were noted in the 1980s and 90s as an issue of concern and were attributed to limited investment in public and private research and development (R&D). Here we build a comprehensive global database of energy patents covering the period 1970–2009, which is unique in its temporal and geographical scope. Analysis of the data reveals a recent, marked departure from historical trends. A sharp increase in rates of patenting has occurred over the last decade, particularly in renewable technologies, despite continued low levels of R&D funding. To solve the puzzle of fast innovation despite modest R&D increases, we develop a model that explains the nonlinear response observed in the empirical data of technological innovation to various types of investment. The model reveals a regular relationship between patents, R&D funding, and growing markets across technologies, and accurately predicts patenting rates at different stages of technological maturity and market development. We show quantitatively how growing markets have formed a vital complement to public R&D in driving innovative activity. These two forms of investment have each leveraged the effect of the other in driving patenting trends over long periods of time. PMID:24155867
In which developing countries are patents on essential medicines being filed?
Beall, Reed F; Blanchet, Rosanne; Attaran, Amir
2017-06-26
This article is based upon data gathered during a study conducted in partnership with the World Intellectual Property Organization on the patent status of products appearing on the World Health Organization's 2013 Model List of Essential Medicines (MLEM). It is a statistical analysis aimed at answering: in which developing countries are patents on essential medicines being filed? Patent data were collected by linking those listed in the United States and Canada's medicine patent registers to corresponding patents in developing countries using two international patent databases (INPADOC and Derwent) via a commercial-grade patent search platform (Thomson Innovation). The respective supplier companies were then contacted to correct and verify our data. We next tallied the number of MLEM patents per developing country. Spearman correlations were done to assess bivariate relationships between variables, and a multivariate regression model was developed to explain the number of MLEM patents in each country using SPSS 23.0. A subset of 20 of the 375 (5%) products on the 2013 MLEM fit our inclusion criteria. The patent estate reports (i.e., the global list of patents for a given drug) varied greatly in their number with a median of 48 patents (interquartile range [IQR]: 26-76). Their geographic reach had a median of 15% of the developing countries sampled (IQR: 8-28%). The number of developing countries covered appeared to increase with the age of the patent estate (r = .433, p = 0.028). The number of MLEM patents per country was significantly positively associated with human development index (HDI), gross domestic income (GDI) per capita, total healthcare expenditure per capita, population size, the Rule of Law Index, and average education level. Population size, GDI per capita, and healthcare expenditure (in % of national expenditure) were predictors of the number of MLEM patents in countries (p = 0.001, p = 0.001, p = 0.009, respectively).
Population size was the most important predictor (β = 0.59), followed by income (GDI per capita) (β = 0.32), and healthcare expenditure (β = 0.15). Holding the other factors constant, (i) 14.3 million more people, (ii) $833.33 more per capita (GDI), or (iii) 0.88% more of national spending on healthcare resulted in 1 additional essential medicine patent. Population was a powerful predictor of the number of patent filings in developing countries along with GDI and healthcare expenditure. The age and historical context of the patent estate may make a difference in the number of patents and countries covered. Broad surveillance and benchmarking of the global medicine patent landscape is valuable for detecting significant shifts that may occur over time. With improved international medicine patent transparency by companies and data available through third parties, such studies will be increasingly feasible.
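The "one additional patent" figures reported above can be read as reciprocals of regression slopes. The Python sketch below makes that reading explicit under the assumption of a simple linear, additive model, which the abstract does not confirm (it reports only a multivariate regression run in SPSS); the constant and function names are hypothetical.

```python
# Changes reported in the abstract as corresponding to one additional
# MLEM patent, holding the other factors constant.
PEOPLE_PER_PATENT = 14.3e6            # additional people
GDI_USD_PER_PATENT = 833.33           # additional GDI per capita (USD)
HEALTH_POINTS_PER_PATENT = 0.88       # additional % of national spending

def implied_slope(change_per_patent):
    """Patents per unit of predictor, assuming a linear effect."""
    return 1.0 / change_per_patent

def predicted_change(delta_population=0.0, delta_gdi=0.0, delta_health=0.0):
    """Expected change in MLEM patent count under the assumed additive model."""
    return (
        delta_population * implied_slope(PEOPLE_PER_PATENT)
        + delta_gdi * implied_slope(GDI_USD_PER_PATENT)
        + delta_health * implied_slope(HEALTH_POINTS_PER_PATENT)
    )

# Example: a country with 28.6 million more people, other factors equal.
extra_patents = predicted_change(delta_population=28.6e6)
```

Under these assumptions, 28.6 million additional people correspond to about two additional patents. These implied slopes are unstandardized, so they are not comparable with the standardized β values quoted in the abstract.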
Functional genomics of bio-energy plants and related patent activities.
Jiang, Shu-Ye; Ramachandran, Srinivasan
2013-04-01
With dwindling fossil oil resources and increased economic growth of many developing countries due to globalization, energy derived from an alternative source such as bio-energy in a sustainable fashion is the need of the hour. However, production of energy from biological sources is relatively expensive due to the low starch and sugar contents of bioenergy plants, leading to lower oil yield and reduced quality along with lower conversion efficiency of feedstock. In this context, genetic improvement of bio-energy plants offers a viable solution. In this manuscript, we reviewed the current status of functional genomics studies and related patent activities in bio-energy plants. Currently, the genomes of a considerable number of bio-energy plants have been sequenced or are in progress, and a large number of expressed sequence tags (ESTs) or cDNA sequences are available from them. These studies provide fundamental data for more reliable genome annotation and, as a result, several genomes have been annotated at a genome-wide level. In addition to this effort, various mutagenesis tools have also been employed to develop mutant populations for characterization of genes that are involved in bioenergy quantitative traits. With the progress made on functional genomics of important bio-energy plants, more patents were filed, with a significant number of them focusing on genes and DNA sequences which may be involved in improving bio-energy traits, including higher yield and quality of starch, sugar and oil. We also believe that these studies will lead to the generation of genetically altered plants with improved tolerance to various abiotic and biotic stresses.
Tracking 20 Years of Compound-to-Target Output from Literature and Patents
Southan, Christopher; Varkonyi, Peter; Boppana, Kiran; Jagarlapudi, Sarma A.R.P.; Muresan, Sorel
2013-01-01
The statistics of drug development output and declining yield of approved medicines has been the subject of many recent reviews. However, assessing research productivity that feeds development is more difficult. Here we utilise an extensive database of structure-activity relationships extracted from papers and patents. We have used this database to analyse published compounds cumulatively linked to nearly 4000 protein target identifiers from multiple species over the last 20 years. The compound output increases up to 2005 followed by a decline that parallels a fall in pharmaceutical patenting. Counts of protein targets have plateaued but not fallen. We extended these results by exploring compounds and targets for one large pharmaceutical company. In addition, we examined collective time course data for six individual protease targets, including average molecular weight of the compounds. We also tracked the PubMed profile of these targets to detect signals related to changes in compound output. Our results show that research compound output had decreased 35% by 2012. The major causative factor is likely to be a contraction in the global research base due to mergers and acquisitions across the pharmaceutical industry. However, this does not rule out an increasing stringency of compound quality filtration and/or patenting cost control. The number of proteins mapped to compounds on a yearly basis shows less decline, indicating the cumulative published target capacity of global research is being sustained in the region of 300 proteins for large companies. The tracking of six individual targets shows uniquely detailed patterns not discernible from cumulative snapshots. These are interpretable in terms of events related to validation and de-risking of targets that produce detectable follow-on surges in patenting. Further analysis of the type we present here can provide unique insights into the process of drug discovery based on the data it actually generates. PMID:24204758
Patent and exclusivity status of essential medicines for non-communicable disease.
Mackey, Tim K; Liang, Bryan A
2012-01-01
The threat of non-communicable diseases ("NCDs") is increasingly becoming a global health crisis; NCDs are pervasive in high-, middle-, and low-income populations, resulting in an estimated 36 million deaths per year. There is a need to assess intellectual property rights ("IPRs") that may impede generic production and the availability and affordability of essential NCD medicines. Using the data sources listed below, the study design systematically eliminated NCD drugs that had no patent/exclusivity provisions on API, dosage, or administration route. The first step identified essential medicines that treat certain high disease burden NCDs. A second step examined the patent and exclusivity status of active ingredient, dosage and listed route of administration using exclusion criteria outlined in this study. We examined the patent and exclusivity status of medicines listed in the World Health Organization's ("WHO") Model List of Essential Drugs (Medicines) ("MLEM") and other WHO sources for drugs treating certain NCDs, i.e., cardiovascular and respiratory disease, cancers, and diabetes. We utilized the USA Food and Drug Administration Orange Book and the USA Patent and Trademark Office databases as references given the predominant number of medicines registered in the USA. Of the 359 MLEM medicines identified, 22% (79/359) address targeted NCDs. Of these 79, only eight required in-depth patent or exclusivity assessment. Upon further review, no NCD MLEM medicines had study patent or exclusivity protection for reviewed criteria. We find that ensuring availability and affordability of potential generic formulations of NCD MLEM medicines appears to be more complex than the presence of IPRs with API, dosage, or administration patent or exclusivity protection. Hence, more sophisticated analysis of NCD barriers to generic availability and affordability should be conducted in order to ensure equitable access to global populations for these essential medicines.
Emerging Drugs for the Treatment of Breast Cancer Brain Metastasis: A Review of Patent Literature.
Anaya-Ruiz, Maricruz; Bandala, Cindy; Martinez-Morales, Patricia; Landeta, Gerardo; Martinez-Contreras, Rebeca D; Martinez-Montiel, Nancy; Perez-Santos, Martin
2018-04-29
Despite dramatic advances in cancer treatment that lead to long-term survival, there is an increasing number of patients presenting with clinical manifestations of cerebral metastasis in breast cancer, for whom only palliative treatment options exist. The present review aims to identify recent patents on breast cancer brain metastasis that may have application in improving cancer treatment. Recent patents regarding breast cancer brain metastasis were obtained from USPTO patent databases, Esp@cenet, Patentscope and Patent Inspiration®. A total of 55 patent documents and 35 drug targets were recovered. Of these, 45 patents concerned biotech drugs and 10 concerned chemical drugs. Among the drug targets analyzed were neurotrophin-3, protocadherin 7, CXCR4, PTEN, GABA receptor 3, L1CAM, PI3K-Akt / mTOR, VEGFR2, Claudin-5, Occludin, and NKG2A, among others. In this study we found 35 drug targets for metastasis to the brain in breast cancer, with 60% of them including only one patent, which establishes that this area of research is very recent, and that these targets have recently been linked to metastasis to the brain. On the other hand, 19 drug targets, among them VEGF, VEGFR2, CXCL12, and CXCR4, have been addressed for the first time until 6 years ago, confirming that the development of drugs for brain metastasis in breast cancer is an incipient area, but with interesting potential. Interestingly, the stage inside the brain was the stage with the lowest number of drug targets, which places it as a priority for research and drug development. Copyright © Bentham Science Publishers.
Extension of the COG and arCOG databases by amino acid and nucleotide sequences
Meereis, Florian; Kaufmann, Michael
2008-01-01
Background The current versions of the COG and arCOG databases, both excellent frameworks for studies in comparative and functional genomics, do not contain the nucleotide sequences corresponding to their protein or protein domain entries. Results Using sequence information obtained from GenBank flat files covering the completely sequenced genomes of the COG and arCOG databases, we constructed NUCOCOG (nucleotide sequences containing COG databases) as an extended version including all nucleotide sequences and in addition the amino acid sequences originally utilized to construct the current COG and arCOG databases. We make available three comprehensive single XML files containing the complete databases including all sequence information. In addition, we provide a web interface as a utility suitable to browse the NUCOCOG database for sequence retrieval. The database is accessible at . Conclusion NUCOCOG offers the possibility to analyze any sequence related property in the context of the COG and arCOG framework simply by using script languages such as PERL applied to a large but single XML document. PMID:19014535
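As a rough illustration of the script-based analysis of a single large XML document mentioned above, the following sketch streams through such a file using Python's standard library (rather than Perl) and computes the mean GC content per COG. The element and attribute names (`entry`, `cog`, `nucleotide`) are hypothetical placeholders; the actual NUCOCOG schema is not described here and may differ.

```python
import io
import xml.etree.ElementTree as ET
from collections import defaultdict


def gc_content_by_cog(xml_source):
    """Stream an XML file and return the GC fraction per COG identifier.

    Assumes hypothetical records of the form
    <entry cog="..."><nucleotide>ACGT...</nucleotide></entry>.
    """
    totals = defaultdict(lambda: [0, 0])  # cog -> [GC count, total length]
    for _event, elem in ET.iterparse(xml_source, events=("end",)):
        if elem.tag == "entry":
            seq = (elem.findtext("nucleotide") or "").upper()
            counts = totals[elem.get("cog")]
            counts[0] += seq.count("G") + seq.count("C")
            counts[1] += len(seq)
            elem.clear()  # release the parsed entry to keep memory use low
    return {cog: gc / n for cog, (gc, n) in totals.items() if n}


# Tiny in-memory stand-in for a large XML file (invented data).
sample = io.BytesIO(b"""<db>
  <entry cog="COG0001"><nucleotide>ATGC</nucleotide></entry>
  <entry cog="COG0001"><nucleotide>GGCC</nucleotide></entry>
  <entry cog="COG0002"><nucleotide>ATAT</nucleotide></entry>
</db>""")
print(gc_content_by_cog(sample))
```

Incremental parsing is used here because a single comprehensive XML file may be too large to load into memory at once.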
Privacy Perspectives for Online Searchers: Confidentiality with Confidence?
ERIC Educational Resources Information Center
Duberman, Josh; Beaudet, Michael
2000-01-01
Presents issues and questions involved in online privacy from the information professional's perspective. Topics include consumer concerns; query confidentiality; securing computers from intrusion; electronic mail; search engines; patents and intellectual property searches; government's role; Internet service providers; database mining; user…
Therapeutic compositions and uses of alpha1-antitrypsin: a patent review (2012 - 2015).
Lior, Yotam; Geyra, Assaf; Lewis, Eli C
2016-05-01
Identified as a circulating serine-protease inhibitor, the genetic deficiency of which predisposes to the development of lung emphysema, alpha1-antitrypsin (AAT) has recently been found to possess various anti-inflammatory and immunomodulatory activities outside the biochemical inhibition of serine-proteases. AAT is presently extracted from human plasma to supply life-long infusions to patients with genetic AAT deficiency. However, its newly appreciated functions point to extended therapeutic uses; these, alongside modified production attempts, represent a novel and dynamic niche of drug repurposing, set apart from addressing lung emphysema in AAT-deficient individuals. The review provides a comprehensive summary of patent-protected inventions in the field of novel clinical indications for AAT and innovations in AAT production. A molecule no longer patentable per se, presents with novel clinical applications; its mechanism still unfolding. While modified protein sequences are patentable and potentially superior, they are burdened by regulatory setbacks. Thus, recent approaches in the context of AAT appear in patents that describe combinations with other drugs, redefined clinical subclasses, and unique recombinant entities, carefully skirting saturated areas of AAT patentology. It will be fascinating to follow technologies and creative patenting as AAT navigates the trying trades of pharmaceutical industry towards an increasing lineup of unmet clinical needs.
Berberine and its derivatives: a patent review (2009 - 2012).
Singh, Inder Pal; Mahajan, Shivani
2013-02-01
Berberine, a protoberberine alkaloid, and its derivatives exhibit a wide spectrum of pharmacological activities. It has been used in traditional Chinese medicine and Ayurvedic medicine, and current research evidence supports its use in various therapeutic areas. This review covers the patents on therapeutic activities of berberine and its derivatives in the years between 2009 and 2012. An extensive search was done to collect the patent information using the European Patent Office database and SciFinder. The therapeutic areas covered include cancer, inflammation, infectious diseases, cardiovascular, metabolic disorders, and miscellaneous areas such as polycystic ovary syndrome, allergic diseases, and so on. Berberine along with its derivatives or in combination with other pharmaceutically active compounds or in the form of formulations has applications in various therapeutic areas such as cancer, inflammation, diabetes, depression, hypertension, and various infectious areas. Berberine has demonstrated wide physiological functions and has great potential to yield a multipotent drug if some inherent problems of poor bioavailability and solubility are addressed. Additionally, polyherbal formulations with berberine-containing plants as major ingredients can be successfully developed.
Tiered Human Integrated Sequence Search Databases for Shotgun Proteomics.
Deutsch, Eric W; Sun, Zhi; Campbell, David S; Binz, Pierre-Alain; Farrah, Terry; Shteynberg, David; Mendoza, Luis; Omenn, Gilbert S; Moritz, Robert L
2016-11-04
The results of analysis of shotgun proteomics mass spectrometry data can be greatly affected by the selection of the reference protein sequence database against which the spectra are matched. For many species there are multiple sources from which somewhat different sequence sets can be obtained. This can lead to confusion about which database is best in which circumstances, a problem especially acute in human sample analysis. All sequence databases are genome-based, with sequences for the predicted gene and their protein translation products compiled. Our goal is to create a set of primary sequence databases that comprise the union of sequences from many of the different available sources and make the result easily available to the community. We have compiled a set of four sequence databases of varying sizes, from a small database consisting of only the ~20,000 primary isoforms plus contaminants to a very large database that includes almost all nonredundant protein sequences from several sources. This set of tiered, increasingly complete human protein sequence databases suitable for mass spectrometry proteomics sequence database searching is called the Tiered Human Integrated Search Proteome set. In order to evaluate the utility of these databases, we have analyzed two different data sets, one from the HeLa cell line and the other from normal human liver tissue, with each of the four tiers of database complexity. The result is that approximately 0.8%, 1.1%, and 1.5% additional peptides can be identified for Tiers 2, 3, and 4, respectively, as compared with the Tier 1 database, at substantially increasing computational cost. This increase in computational cost may be worth bearing if the identification of sequence variants or the discovery of sequences that are not present in the reviewed knowledge base entries is an important goal of the study.
We find that it is useful to search a data set against a simpler database, and then check the uniqueness of the discovered peptides against a more complex database. We have set up an automated system that downloads all the source databases on the first of each month and automatically generates a new set of search databases and makes them available for download at http://www.peptideatlas.org/thisp/. PMID:27577934
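The two-step strategy described above (search against a simpler database, then check peptide uniqueness against a more complex one) can be sketched as follows. This is a minimal illustration using plain substring matching on invented toy sequences; the function names and data are hypothetical and do not reflect the authors' actual pipeline.

```python
def proteins_containing(peptide, database):
    """Return sorted IDs of all proteins whose sequence contains the peptide."""
    return sorted(pid for pid, seq in database.items() if peptide in seq)


def unique_peptides(peptides, complex_db):
    """Map each peptide found in exactly one protein to that protein's ID."""
    unique = {}
    for peptide in peptides:
        hits = proteins_containing(peptide, complex_db)
        if len(hits) == 1:
            unique[peptide] = hits[0]
    return unique


# Toy data (invented for illustration): peptides identified in a search
# against the small database, then checked against a larger one.
identified = ["PEPTIDEK", "SAMPLER"]
larger_db = {
    "P1": "MKPEPTIDEKAA",
    "P2": "GGSAMPLERTT",
    "P2_variant": "QQSAMPLERLL",
}
print(unique_peptides(identified, larger_db))
```

In this toy case only the first peptide maps to a single protein; the second also occurs in a variant entry of the larger database and is therefore not unique.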
DOE technology information management system database study report
DOE Office of Scientific and Technical Information (OSTI.GOV)
Widing, M.A.; Blodgett, D.W.; Braun, M.D.
1994-11-01
To support the missions of the US Department of Energy (DOE) Special Technologies Program, Argonne National Laboratory is defining the requirements for an automated software system that will search electronic databases on technology. This report examines the work done and results to date. Argonne studied existing commercial and government sources of technology databases in five general areas: on-line services, patent database sources, government sources, aerospace technology sources, and general technology sources. First, it conducted a preliminary investigation of these sources to obtain information on the content, cost, frequency of updates, and other aspects of their databases. The Laboratory then performed detailed examinations of at least one source in each area. On this basis, Argonne recommended which databases should be incorporated in DOE's Technology Information Management System.
Natural compounds for solar photoprotection: a patent review.
Serafini, Mairim R; Guimarães, Adriana G; Quintans, Jullyana S S; Araújo, Adriano A S; Nunes, Paula S; Quintans-Júnior, Lucindo J
2015-04-01
Ultraviolet irradiation has deleterious effects on human skin, including tanning, sunburn, cancer and connective tissue degradation (photoaging). Botanical antioxidants have been shown to be associated with reduced incidence of photocarcinogenesis and photoaging through their photoprotective profile. Here, the authors summarized therapeutic patent applications concerning the employment of medicinal plants on the technological development of a formulation with photoprotective or photoaging application. So, the patent search was conducted in the databases WIPO, Espacenet, USPTO and Derwent, using the keywords - photoaging, photoprotection and the IPC A61K 8/97 (cosmetics or similar cleaning supplies obtained from vegetable origin, for example, plant extracts) and A61K 36/00 (medicinal preparations of undetermined constitution containing material from algae, lichens, fungi or plants, or derivatives thereof, for example, traditional herbal medicines). We found 180 patents, out of which 25 were evaluated using inclusion criteria as application of natural products with photoprotective or photoaging application. We found that some patents related to the cosmetic compositions for improving skin wrinkle and either preventing or reducing the signs of photoaging and sunburn. The cosmetic compositions are manufactured in the form of a lotion, gel, soluble liquid, cream, essence, oil-in-water-type or water-in-oil-type formulation, containing the vegetal extracts as an active ingredient.
Nguyen, Kim; Kempfle, Judith S; Jung, David H; McKenna, Charles E
2017-02-01
Inner ear disorders such as hearing loss, tinnitus, and Ménière's disease significantly impact the quality of life of affected individuals. Treatment of such disorders is an ongoing challenge. Current clinical approaches relieve symptoms but do not fully restore hearing, and the search for more effective therapeutic methods represents an area of urgent current interest. Areas covered: Thirty four patents and patent applications published from 2011 to 2015 were selected from the database of the U.S. Patent and Trademark Office (USPTO) and World Intellectual Property Organization (WIPO), covering new approaches for the treatment of inner ear disorders described in the patent literature: 1) identification of new therapeutic agents, 2) development of sustained release formulations, and 3) medical devices that facilitate delivery of such agents to the inner ear. Expert opinion: The search for effective treatments of inner ear disorders is ongoing. Increased understanding of the molecular mechanisms of hearing loss, Ménière's disease, and tinnitus is driving development of new therapeutic agents. However, delivery of these agents to the inner ear is a continuing challenge. At present, combination of a suitable drug with an appropriate mode of drug delivery is the key focus of innovative research to cure inner ear disorders.
'Click chemistry' for diagnosis: a patent review on exploitation of its emerging trends.
Mandhare, Anita; Banerjee, Paromita; Bhutkar, Smita; Hirwani, Rajkumar
2014-12-01
Click chemistry is the novel synthetic approach towards developing reactions with large thermodynamic driving forces to give almost complete conversion of new molecular reagents to a single product. Thus, click chemistry describes the chemistry for making carbon-heteroatom-carbon bonds in benign solvents, especially in water, and having a plethora of chemical and biological applications. This has played an important role in early detection of diseases, real-time monitoring of drug delivery and investigating the biomolecular functions in vivo. This review aims at highlighting the research advancements in click chemistry published in the patent literature and categorizing the patents according to the technological progress. An extensive search was carried out to collect and analyze the patent information claiming the use of click chemistry in biotechnology, especially for diagnosis. The study further concentrates on licensing of the click chemistry patents and defining the recent breakthroughs. Different databases like Espacenet, ISI Web of Science, Patbase and Thomson Innovation are used to compile the relevant literature. In recent years, considerable development in the click concept has encouraged researchers in using click reactions in almost every branch of industry that uses chemistry. Click chemistry for chemical ligation has been immensely explored in the field of biotechnology especially for detection, diagnosis and therapeutics.
Chemical entity recognition in patents by combining dictionary-based and statistical approaches
Akhondi, Saber A.; Pons, Ewoud; Afzal, Zubair; van Haagen, Herman; Becker, Benedikt F.H.; Hettne, Kristina M.; van Mulligen, Erik M.; Kors, Jan A.
2016-01-01
We describe the development of a chemical entity recognition system and its application in the CHEMDNER-patent track of BioCreative 2015. This community challenge includes a Chemical Entity Mention in Patents (CEMP) recognition task and a Chemical Passage Detection (CPD) classification task. We addressed both tasks by an ensemble system that combines a dictionary-based approach with a statistical one. For this purpose the performance of several lexical resources was assessed using Peregrine, our open-source indexing engine. We combined our dictionary-based results on the patent corpus with the results of tmChem, a chemical recognizer using a conditional random field classifier. To improve the performance of tmChem, we utilized three additional features, viz. part-of-speech tags, lemmas and word-vector clusters. When evaluated on the training data, our final system obtained an F-score of 85.21% for the CEMP task, and an accuracy of 91.53% for the CPD task. On the test set, the best system ranked sixth among 21 teams for CEMP with an F-score of 86.82%, and second among nine teams for CPD with an accuracy of 94.23%. The differences in performance between the best ensemble system and the statistical system separately were small. Database URL: http://biosemantics.org/chemdner-patents PMID:27141091
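To make the evaluation metric concrete, the sketch below computes an F-score over chemical mention spans and compares a dictionary-only output with a simple ensemble. Combining the two systems by taking the union of their mentions is an assumption made for illustration, since the abstract does not state the combination rule; the spans are invented toy data.

```python
def f_score(predicted, gold):
    """Balanced F-score (harmonic mean of precision and recall) over mention spans."""
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(gold)
    return 2 * precision * recall / (precision + recall)


# Mentions as (start, end) character offsets; all values are invented.
dictionary_hits = {(0, 7), (20, 27)}
statistical_hits = {(0, 7), (40, 52)}
gold = {(0, 7), (20, 27), (40, 52), (60, 66)}

# Assumed combination rule: accept a mention if either system proposes it.
ensemble = dictionary_hits | statistical_hits

print(round(f_score(dictionary_hits, gold), 4))
print(round(f_score(ensemble, gold), 4))
```

A union tends to raise recall at some risk to precision; other rules, such as intersection or voting, trade these off differently.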
Konnichi Wa, Nihon (Hello, Japan!): Best Databases for Business, Technology and News.
ERIC Educational Resources Information Center
Hoetker, Glenn
1994-01-01
Describes online information sources for Japanese business, scientific, and technical developments. Highlights include English language materials versus the need for translation from Japanese; government research; scientific and technical information; patent information; corporate financial information; business information from newswires and…
Patent databases and analytical tools for space technology commercialization (Part 2)
NASA Astrophysics Data System (ADS)
Hulsey, William N., III
2002-07-01
A shift in the space industry has occurred that requires technology developers to understand the basics of the intellectual property laws. Global harmonization facilitates this understanding, and Internet-based tools enable knowledge of these rights and the facts affecting them.
System, method and apparatus for generating phrases from a database
NASA Technical Reports Server (NTRS)
McGreevy, Michael W. (Inventor)
2004-01-01
Phrase generation is a method of generating sequences of terms, such as phrases, that may occur within a database of subsets containing sequences of terms, such as text. A database is provided and a relational model of the database is created. A query is then input. The query includes a term, a sequence of terms, multiple individual terms, multiple sequences of terms, or combinations thereof. Next, several sequences of terms that are contextually related to the query are assembled from contextual relations in the model of the database. The sequences of terms are then sorted and output. Phrase generation can also be an iterative process used to produce sequences of terms from a relational model of a database.
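The steps above (build a relational model, accept a query, assemble related term sequences, sort and output) can be sketched in a highly simplified form. Here the "relational model" is assumed to be nothing more than counts of adjacent term pairs, and phrases are scored by summing those counts; both choices are illustrative assumptions and not the patented method.

```python
from collections import Counter, defaultdict


def build_model(texts):
    """Relational model (assumed form): counts of adjacent term pairs."""
    model = defaultdict(Counter)
    for text in texts:
        terms = text.lower().split()
        for left, right in zip(terms, terms[1:]):
            model[left][right] += 1
    return model


def generate_phrases(model, query, max_len=3):
    """Assemble term sequences that start at the query term, best score first."""
    results = []
    frontier = [([query], 0)]
    while frontier:
        phrase, score = frontier.pop()
        if len(phrase) > 1:
            results.append((score, " ".join(phrase)))
        if len(phrase) < max_len:
            for nxt, count in model.get(phrase[-1], {}).items():
                if nxt not in phrase:  # avoid cycling back to earlier terms
                    frontier.append((phrase + [nxt], score + count))
    # Sort by descending score, then alphabetically for a stable order.
    return [p for _score, p in sorted(results, key=lambda r: (-r[0], r[1]))]


# Invented toy database of short texts.
texts = [
    "engine fire warning light",
    "engine fire during climb",
    "engine failure during climb",
]
for phrase in generate_phrases(build_model(texts), "engine"):
    print(phrase)
```

Feeding the top-ranked phrases back in as new queries would give a rough analogue of the iterative variant mentioned in the text.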
Rodríguez-Salvador, Marisela; Rio-Belver, Rosa María; Garechana-Anacabe, Gaizka
2017-01-01
This research proposes an innovative data model to determine the landscape of emerging technologies. It is based on a competitive technology intelligence methodology that incorporates the assessment of scientific publications and patent analysis production, and is further supported by experts' feedback. It enables the definition of the growth rate of scientific and technological output in terms of the top countries, institutions and journals producing knowledge within the field as well as the identification of main areas of research and development by analyzing the International Patent Classification codes including keyword clusterization and co-occurrence of patent assignees and patent codes. This model was applied to the evolving domain of 3D bioprinting. Scientific documents from the Scopus and Web of Science databases, along with patents from 27 authorities and 140 countries, were retrieved. In total, 4782 scientific publications and 706 patents were identified from 2000 to mid-2016. The number of scientific documents published and patents in the last five years showed an annual average growth of 20% and 40%, respectively. Results indicate that the most prolific nations and institutions publishing on 3D bioprinting are the USA and China, including the Massachusetts Institute of Technology (USA), Nanyang Technological University (Singapore) and Tsinghua University (China), respectively. Biomaterials and Biofabrication are the predominant journals. The most prolific patenting countries are China and the USA; while Organovo Holdings Inc. (USA) and Tsinghua University (China) are the institutions leading. International Patent Classification codes reveal that most 3D bioprinting inventions intended for medical purposes apply porous or cellular materials or biologically active materials. Knowledge clusters and expert drivers indicate that there is a research focus on tissue engineering including the fabrication of organs, bioinks and new 3D bioprinting systems. 
Our model offers a guide to researchers to understand the knowledge production of pioneering technologies, in this case 3D bioprinting. PMID:28662187
Current and future developments in patents for quantitative trait loci in dairy cattle.
Weller, Joel I
2007-01-01
Many studies have proposed that rates of genetic gain in dairy cattle can be increased by direct selection on the individual quantitative loci responsible for the genetic variation in these traits, or selection on linked genetic markers. The development of DNA-level genetic markers has made detection of QTL nearly routine in all major livestock species. The studies that attempted to detect genes affecting quantitative traits can be divided into two categories: analysis of candidate genes, and genome scans based on within-family genetic linkage. To date, 12 Patent Cooperation Treaty (PCT) and US patents have been registered for DNA sequences claimed to be associated with effects on economic traits in dairy cattle. All claim effects on milk production, but other traits are also included in some of the claims. Most of the sequences found by the candidate gene approach are of dubious validity, and have been repeated in only very few independent studies. The two missense mutations on chromosomes 6 and 14 affecting milk concentration derived from genome scans are more solidly based, but the claims are also disputed. A few of the PCT patents in dairy cattle are commercialized as genetic tests where commercial dairy farmers are the target market.
Parallel Worlds of Public and Commercial Bioactive Chemistry Data
2014-01-01
The availability of structures and linked bioactivity data in databases is powerfully enabling for drug discovery and chemical biology. However, we now review some confounding issues with the divergent expansions of public and commercial sources of chemical structures. These are associated with not only expanding patent extraction but also increasingly large vendor collections amassed via different selection criteria between SciFinder from Chemical Abstracts Service (CAS) and major public sources such as PubChem, ChemSpider, UniChem, and others. These increasingly massive collections may include both real and virtual compounds, as well as so-called prophetic compounds from patents. We address a range of issues raised by the challenges faced resolving the NIH probe compounds. In addition we highlight the confounding of prior-art searching by virtual compounds that could impact the composition of matter patentability of a new medicinal chemistry lead. Finally, we propose some potential solutions. PMID:25415348
Semantic encoding of relational databases in wireless networks
NASA Astrophysics Data System (ADS)
Benjamin, David P.; Walker, Adrian
2005-03-01
Semantic Encoding is a new, patented technology that greatly increases the speed of transmission of distributed databases over networks, especially over ad hoc wireless networks, while providing a novel method of data security. It reduces bandwidth consumption and storage requirements, while speeding up query processing, encryption and computation of digital signatures. We describe the application of Semantic Encoding in a wireless setting and provide an example of its operation in which a compression of 290:1 would be achieved.
Scali, Marta; Pusch, Tim P; Breedveld, Paul; Dodou, Dimitra
2017-03-01
High accuracy and precision in reaching target locations inside the human body is necessary for the success of percutaneous procedures, such as tissue sample removal (biopsy), brachytherapy, and localized drug delivery. Flexible steerable needles may allow the surgeon to reach targets deep inside solid organs while avoiding sensitive structures (e.g. blood vessels). This article provides a systematic classification of possible mechanical solutions for three-dimensional steering through solid organs. A scientific and patent literature search of steerable instrument designs was conducted using Scopus and Web of Science Derwent Innovations Index patent database, respectively. First, we distinguished between mechanisms in which deflection is induced by the pre-defined shape of the instrument versus mechanisms in which an actuator changes the deflection angle of the instrument on demand. Second, we distinguished between mechanisms deflecting in one versus two planes. The combination of deflection method and number of deflection planes led to eight logically derived mechanical solutions for three-dimensional steering, of which one was dismissed because it was considered meaningless. Next, we classified the instrument designs retrieved from the scientific and patent literature into the identified solutions. We found papers and patents describing instrument designs for six of the seven solutions. We did not find papers or patents describing instruments that steer in one-plane on-demand via an actuator and in a perpendicular plane with a pre-defined deflection angle via a bevel tip or a pre-curved configuration.
Bousfield, David; McEntyre, Johanna; Velankar, Sameer; Papadatos, George; Bateman, Alex; Cochrane, Guy; Kim, Jee-Hyub; Graef, Florian; Vartak, Vid; Alako, Blaise; Blomberg, Niklas
2016-01-01
Data from open access biomolecular data resources, such as the European Nucleotide Archive and the Protein Data Bank are extensively reused within life science research for comparative studies, method development and to derive new scientific insights. Indicators that estimate the extent and utility of such secondary use of research data need to reflect this complex and highly variable data usage. By linking open access scientific literature, via Europe PubMedCentral, to the metadata in biological data resources we separate data citations associated with a deposition statement from citations that capture the subsequent, long-term, reuse of data in academia and industry. We extend this analysis to begin to investigate citations of biomolecular resources in patent documents. We find citations in more than 8,000 patents from 2014, demonstrating substantial use and an important role for data resources in defining biological concepts in granted patents to both academic and industrial innovators. Combined together our results indicate that the citation patterns in biomedical literature and patents vary, not only due to citation practice but also according to the data resource cited. The results guard against the use of simple metrics such as citation counts and show that indicators of data use must not only take into account citations within the biomedical literature but also include reuse of data in industry and other parts of society by including patents and other scientific and technical documents such as guidelines, reports and grant applications. PMID:27092246
USDA-ARS's Scientific Manuscript database
The ARS Microbial Genome Sequence Database (http://199.133.98.43), a web-based database server, was established utilizing the BIGSdb (Bacterial Isolate Genomics Sequence Database) software package, developed at Oxford University, as a tool to manage multi-locus sequence data for the family Streptomy...
Tips, Techniques, and Words of Wisdom.
ERIC Educational Resources Information Center
Garman, Nancy, Comp.
1990-01-01
Presents suggestions from online searchers for using online services. Topics discussed include decreasing costs by using less expensive files; modifying searches on Dialog; use of controlled vocabularies and free text; using a variety of databases; the importance of search intermediaries understanding the topic; and patent searching. (LRW)
Ivanenkov, Yan A; Aladinskiy, Vladimir A; Bushkov, Nikolay A; Ayginin, Andrey A; Majouga, Alexander G; Ivachtchenko, Alexandre V
2017-04-01
Non-structural 5A (NS5A) protein has attracted considerable attention as an attractive target for the treatment of hepatitis C (HCV). A number of novel NS5A inhibitors have been reported to date. Several drugs with favorable ADME properties and mild side effects have been launched on the pharmaceutical market. For instance, daclatasvir was launched in 2014, elbasvir is currently undergoing registration, and ledipasvir was launched in 2014 as a fixed-dose combination with sofosbuvir (an NS5B inhibitor). Areas covered: Thomson integrity database and SciFinder database were used as sources to collect the patents on small-molecule NS5A inhibitors. All the structures were ranked by the date of priority. Patent holder and antiviral activity for each scaffold claimed were summarized. A particular focus was placed on the best-in-class bis-pyrrolidine-containing NS5A inhibitors. Expert opinion: Several first-generation NS5A inhibitors have recently progressed into advanced clinical trials and showed superior efficacy in reducing viral load in infected subjects. Therapy schemes that use these agents in combination with other established antiviral drugs with complementary mechanisms of action can address the emergence of resistance and the poor therapeutic outcomes frequently attributed to antiviral drugs.
Federal Register 2010, 2011, 2012, 2013, 2014
2012-05-15
... (EPO) as the lead, to propose a revised standard for the filing of nucleotide and/or amino acid.... ST.25 uses a controlled vocabulary of feature keys to describe nucleic acid and amino acid sequences... patent data purposes. The XML standard also includes four qualifiers for amino acids. These feature keys...
ESTuber db: an online database for Tuber borchii EST sequences.
Lazzari, Barbara; Caprera, Andrea; Cosentino, Cristian; Stella, Alessandra; Milanesi, Luciano; Viotti, Angelo
2007-03-08
The ESTuber database (http://www.itb.cnr.it/estuber) includes 3,271 Tuber borchii expressed sequence tags (EST). The dataset consists of 2,389 sequences from an in-house prepared cDNA library from truffle vegetative hyphae, and 882 sequences downloaded from GenBank and representing four libraries from white truffle mycelia and ascocarps at different developmental stages. An automated pipeline was prepared to process EST sequences using public software integrated by in-house developed Perl scripts. Data were collected in a MySQL database, which can be queried via a php-based web interface. Sequences included in the ESTuber db were clustered and annotated against three databases: the GenBank nr database, the UniProtKB database and a third in-house prepared database of fungi genomic sequences. An algorithm was implemented to infer statistical classification among Gene Ontology categories from the ontology occurrences deduced from the annotation procedure against the UniProtKB database. Ontologies were also deduced from the annotation of more than 130,000 EST sequences from five filamentous fungi, for intra-species comparison purposes. Further analyses were performed on the ESTuber db dataset, including tandem repeats search and comparison of the putative protein dataset inferred from the EST sequences to the PROSITE database for protein patterns identification. All the analyses were performed both on the complete sequence dataset and on the contig consensus sequences generated by the EST assembly procedure. The resulting web site is a resource of data and links related to truffle expressed genes. The Sequence Report and Contig Report pages are the web interface core structures which, together with the Text search utility and the Blast utility, allow easy access to the data stored in the database.
Fister, Karin; Fister, Iztok; Murovec, Jana; Bohanec, Borut
2017-02-01
Plant breeders' rights are undergoing dramatic changes due to changes in patent rights in terms of plant variety rights protection. Although differences in the interpretation of »breeder's exemption«, termed research exemption in the 1991 UPOV, did exist in the past in some countries, allowing breeders to use protected varieties as parents in the creation of new varieties of plants, current developments brought about by patenting conventionally bred varieties with the European Patent Office (such as EP2140023B1) have opened new challenges. Legal restrictions on germplasm availability are therefore imposed on breeders while, at the same time, no practical information on how to distinguish protected from non-protected varieties is given. We propose here a novel approach that would solve this problem by the insertion of short DNA stretches (labels) into protected plant varieties by genetic transformation. This information will then be available to breeders by a simple and standardized procedure. We propose that such a procedure should consist of using a pair of universal primers that will generate a sequence in a PCR reaction, which can be read and translated into ordinary text by a computer application. To demonstrate the feasibility of such an approach, we conducted a case study. Using the Agrobacterium tumefaciens transformation protocol, we inserted a stretch of DNA code into Nicotiana benthamiana. We also developed an on-line application that enables coding of any text message into DNA nucleotide code and, on sequencing, decoding it back into text. In the presented case study, a short command line coding the phrase »Hello world« was transformed into a DNA sequence that was inserted in the plant genome. The encoded message was reconstructed from the resulting T1 seedlings with 100% accuracy. The feasibility and possible other applications of this approach are discussed.
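The text-to-DNA coding step described in this abstract can be illustrated with a minimal sketch. The abstract does not state which coding scheme the authors' on-line application uses, so the mapping below (two bits per nucleotide, in the order A, C, G, T) and the function names `text_to_dna` and `dna_to_text` are assumptions made purely for illustration.

```python
# Hypothetical coding scheme: each byte of UTF-8 text becomes four bases,
# two bits per base. This is NOT the scheme used in the cited study,
# which the abstract does not specify.
BASES = "ACGT"  # A=00, C=01, G=10, T=11 (assumed mapping)


def text_to_dna(text: str) -> str:
    """Encode text as a DNA string, four nucleotides per byte."""
    out = []
    for byte in text.encode("utf-8"):
        for shift in (6, 4, 2, 0):  # most significant bit pair first
            out.append(BASES[(byte >> shift) & 0b11])
    return "".join(out)


def dna_to_text(dna: str) -> str:
    """Decode a DNA string produced by text_to_dna back into text."""
    if len(dna) % 4 != 0:
        raise ValueError("sequence length must be a multiple of 4")
    data = bytearray()
    for i in range(0, len(dna), 4):
        byte = 0
        for base in dna[i:i + 4]:
            # str.index raises ValueError on any character outside ACGT
            byte = (byte << 2) | BASES.index(base)
        data.append(byte)
    return data.decode("utf-8")


if __name__ == "__main__":
    encoded = text_to_dna("Hello world")
    print(encoded)
    print(dna_to_text(encoded))
```

A real labelling scheme would probably also need to avoid problematic motifs (long homopolymer runs, restriction sites, start codons) and tolerate sequencing errors, none of which this sketch attempts.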
Compressing DNA sequence databases with coil.
White, W Timothy J; Hendy, Michael D
2008-05-20
Publicly available DNA sequence databases such as GenBank are large, and are growing at an exponential rate. The sheer volume of data being dealt with presents serious storage and data communications problems. Currently, sequence data is usually kept in large "flat files," which are then compressed using standard Lempel-Ziv (gzip) compression - an approach which rarely achieves good compression ratios. While much research has been done on compressing individual DNA sequences, surprisingly little has focused on the compression of entire databases of such sequences. In this study we introduce the sequence database compression software coil. We have designed and implemented a portable software package, coil, for compressing and decompressing DNA sequence databases based on the idea of edit-tree coding. coil is geared towards achieving high compression ratios at the expense of execution time and memory usage during compression - the compression time represents a "one-off investment" whose cost is quickly amortised if the resulting compressed file is transmitted many times. Decompression requires little memory and is extremely fast. We demonstrate a 5% improvement in compression ratio over state-of-the-art general-purpose compression tools for a large GenBank database file containing Expressed Sequence Tag (EST) data. Finally, coil can efficiently encode incremental additions to a sequence database. coil presents a compelling alternative to conventional compression of flat files for the storage and distribution of DNA sequence databases having a narrow distribution of sequence lengths, such as EST data. Increasing compression levels for databases having a wide distribution of sequence lengths is a direction for future work.
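The observation that general-purpose Lempel-Ziv compression does poorly on nucleotide flat files can be checked with a small measurement. The sketch below does not implement coil's edit-tree coding. It only compares gzip with a naive two-bits-per-base packing on a synthetic random sequence, and the helper names `pack_2bit` and `bits_per_base` are hypothetical. Random data lacks the redundancy between entries that a real database has, so the printed figures say nothing about coil itself.

```python
import gzip
import random

# Assumed 2-bit code for the four unambiguous nucleotides.
CODE = {"A": 0, "C": 1, "G": 2, "T": 3}


def pack_2bit(seq: str) -> bytes:
    """Pack an ACGT string into bytes, four bases per byte."""
    out = bytearray()
    for i in range(0, len(seq), 4):
        chunk = seq[i:i + 4]
        byte = 0
        for base in chunk:
            byte = (byte << 2) | CODE[base]
        # Left-align a final chunk shorter than four bases.
        byte <<= 2 * (4 - len(chunk))
        out.append(byte)
    return bytes(out)


def bits_per_base(n_bytes: int, n_bases: int) -> float:
    """Storage cost expressed in bits per nucleotide."""
    return 8 * n_bytes / n_bases


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the run is repeatable
    seq = "".join(rng.choice("ACGT") for _ in range(100_000))
    gz = gzip.compress(seq.encode("ascii"))
    packed = pack_2bit(seq)
    print("gzip  :", round(bits_per_base(len(gz), len(seq)), 3), "bits/base")
    print("2-bit :", round(bits_per_base(len(packed), len(seq)), 3), "bits/base")
```

Real flat files also contain headers, ambiguity codes and line breaks, so a fair comparison would need to run on actual database files.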
Compressing DNA sequence databases with coil
White, W Timothy J; Hendy, Michael D
2008-01-01
Background: Publicly available DNA sequence databases such as GenBank are large, and are growing at an exponential rate. The sheer volume of data being dealt with presents serious storage and data communications problems. Currently, sequence data is usually kept in large "flat files," which are then compressed using standard Lempel-Ziv (gzip) compression – an approach which rarely achieves good compression ratios. While much research has been done on compressing individual DNA sequences, surprisingly little has focused on the compression of entire databases of such sequences. In this study we introduce the sequence database compression software coil. Results: We have designed and implemented a portable software package, coil, for compressing and decompressing DNA sequence databases based on the idea of edit-tree coding. coil is geared towards achieving high compression ratios at the expense of execution time and memory usage during compression – the compression time represents a "one-off investment" whose cost is quickly amortised if the resulting compressed file is transmitted many times. Decompression requires little memory and is extremely fast. We demonstrate a 5% improvement in compression ratio over state-of-the-art general-purpose compression tools for a large GenBank database file containing Expressed Sequence Tag (EST) data. Finally, coil can efficiently encode incremental additions to a sequence database. Conclusion: coil presents a compelling alternative to conventional compression of flat files for the storage and distribution of DNA sequence databases having a narrow distribution of sequence lengths, such as EST data. Increasing compression levels for databases having a wide distribution of sequence lengths is a direction for future work. PMID:18489794
MIPS: a database for protein sequences, homology data and yeast genome information.
Mewes, H W; Albermann, K; Heumann, K; Liebl, S; Pfeiffer, F
1997-01-01
The MIPS group (Martinsried Institute for Protein Sequences) at the Max-Planck-Institute for Biochemistry, Martinsried near Munich, Germany, collects, processes and distributes protein sequence data within the framework of the tripartite association of the PIR-International Protein Sequence Database. MIPS contributes nearly 50% of the data input to the PIR-International Protein Sequence Database. The database is distributed on CD-ROM together with PATCHX, an exhaustive supplement of unique, unverified protein sequences from external sources compiled by MIPS. Through its WWW server (http://www.mips.biochem.mpg.de/) MIPS permits internet access to sequence databases, homology data and to yeast genome information. (i) Sequence similarity results from the FASTA program are stored in the FASTA database for all proteins from PIR-International and PATCHX. The database is dynamically maintained and permits instant access to FASTA results. (ii) Starting with FASTA database queries, proteins have been classified into families and superfamilies (PROT-FAM). (iii) The HPT (hashed position tree) data structure developed at MIPS is a new approach for rapid sequence and pattern searching. (iv) MIPS provides access to the sequence and annotation of the complete yeast genome, the functional classification of yeast genes (FunCat) and its graphical display, the 'Genome Browser'. A CD-ROM based on the JAVA programming language providing dynamic interactive access to the yeast genome and the related protein sequences has been compiled and is available on request. PMID:9016498
DOE Office of Scientific and Technical Information (OSTI.GOV)
Not Available
The bibliography contains citations of selected patents concerning activated charcoal filters and their applications in water treatment, pollution control, and industrial processes. Filtering methods and equipment for air and water purification, industrial distillation and extraction, industrial leaching, and filtration of toxic materials and contaminants are described. Applications include drinking water purification, filtering beverages, production of polymer materials, solvent and metal recovery, waste conversion, automotive fuel and exhaust systems, swimming pool filtration, tobacco smoke filters, kitchen ventilators, medical filtration treatment, and odor absorbing materials. (Contains 250 citations and includes a subject term index and title list.)
A survey of chemical information systems
NASA Technical Reports Server (NTRS)
Dominick, Wayne D. (Editor); Shaikh, Aneesa Bashir
1985-01-01
A survey of the features, functions, and characteristics of a fairly wide variety of chemical information storage and retrieval systems currently in operation is given. The types of systems (together with an identification of the specific systems) addressed within this survey are as follows: patents and bibliographies (Derwent's Patent System; IFI Comprehensive Database; PULSAR); pharmacology and toxicology (Chemfile; PAGODE; CBF; HEEDA; NAPRALERT; MAACS); the chemical information system (CAS Chemical Registry System; SANSS; MSSS; CSEARCH; GINA; NMRLIT; CRYST; XTAL; PDSM; CAISF; RTECS Search System; AQUATOX; WDROP; OHMTADS; MLAB; Chemlab); spectra (OCETH; ASTM); crystals (CRYSRC); and physical properties (DETHERM). Summary characteristics and current trends in chemical information systems development are also examined.
Converting Enzymes into Tools of Industrial Importance.
Prasad, Shivcharan; Roy, Ipsita
2018-01-01
Enzymes have applications in numerous biotechnological products and processes that are commonly used in the production of food and beverages, cleaning supplies, clothing, paper products, transportation fuels, pharmaceuticals, and monitoring devices. Enzymes, however, are optimized to function under physiological conditions. Any change in reaction conditions compromises both their activity and their stability. Hence, most natural biomolecules are not suitable for industrial applications. Modifications are required to develop efficient and successful reagents as per demand. Protein engineering can be applied to cope with these situations. This review describes some of the novel uses and unusual properties of enzymes as biological catalysts. It explains the different ways in which enzymes can be and have been used under non-native conditions. Different strategies are discussed for the stabilization of enzymes as well as the optimum conditions for their use in different industries. The following patent databases were consulted: European Patent Office (EPO), the United States Patent and Trademark Office (USPTO), Patent scope Search International and National Patent Collections (WIPO) and Google Patents. The review illustrates the breadth of applications covered by biocatalysts. Employing the tools of solvent and protein engineering (non-aqueous media, additives, immobilization and mutagenesis, to name a few), biotechnology has been able to make enzyme-catalyzed processes an essential component of the industrialist's armoury. The article lists a number of successful examples, both of patented technology and of biocatalysts currently being used in industry, to highlight the accomplishments of the technologies adopted so far for making enzyme technology industrially viable. Copyright © Bentham Science Publishers.
Pruitt, Kim D.; Tatusova, Tatiana; Maglott, Donna R.
2005-01-01
The National Center for Biotechnology Information (NCBI) Reference Sequence (RefSeq) database (http://www.ncbi.nlm.nih.gov/RefSeq/) provides a non-redundant collection of sequences representing genomic data, transcripts and proteins. Although the goal is to provide a comprehensive dataset representing the complete sequence information for any given species, the database pragmatically includes sequence data that are currently publicly available in the archival databases. The database incorporates data from over 2400 organisms and includes over one million proteins representing significant taxonomic diversity spanning prokaryotes, eukaryotes and viruses. Nucleotide and protein sequences are explicitly linked, and the sequences are linked to other resources including the NCBI Map Viewer and Gene. Sequences are annotated to include coding regions, conserved domains, variation, references, names, database cross-references, and other features using a combined approach of collaboration and other input from the scientific community, automated annotation, propagation from GenBank and curation by NCBI staff. PMID:15608248
FARME DB: a functional antibiotic resistance element database
Wallace, James C.; Port, Jesse A.; Smith, Marissa N.; Faustman, Elaine M.
2017-01-01
Antibiotic resistance (AR) is a major global public health threat but few resources exist that catalog AR genes outside of a clinical context. Current AR sequence databases are assembled almost exclusively from genomic sequences derived from clinical bacterial isolates and thus do not include many microbial sequences derived from environmental samples that confer resistance in functional metagenomic studies. These environmental metagenomic sequences often show little or no similarity to AR sequences from clinical isolates using standard classification criteria. In addition, existing AR databases provide no information about flanking sequences containing regulatory or mobile genetic elements. To help address this issue, we created an annotated database of DNA and protein sequences derived exclusively from environmental metagenomic sequences showing AR in laboratory experiments. Our Functional Antibiotic Resistant Metagenomic Element (FARME) database is a compilation of publicly available DNA sequences and predicted protein sequences conferring AR as well as regulatory elements, mobile genetic elements and predicted proteins flanking antibiotic resistant genes. FARME is the first database to focus on functional metagenomic AR gene elements and provides a resource to better understand AR in the 99% of bacteria which cannot be cultured and the relationship between environmental AR sequences and antibiotic resistant genes derived from cultured isolates. Database URL: http://staff.washington.edu/jwallace/farme PMID:28077567
Impact of Gene Patents and Licensing Practices on Access to Genetic Testing for Cystic Fibrosis
Chandrasekharan, Subhashini; Heaney, Christopher; James, Tamara; Conover, Chris; Cook-Deegan, Robert
2010-01-01
Cystic fibrosis (CF) is one of the most commonly tested autosomal recessive disorders in the US. Clinical CF is associated with mutations in the CFTR gene, of which the most common mutation among Caucasians, ΔF508, was identified in 1989. The University of Michigan, Johns Hopkins University, and the Hospital for Sick Children, where much of the initial research occurred, hold key patents for CF genetic sequences, mutations and methods for detecting them. Several patents including the one that covers detection of the ΔF508 mutation are jointly held by the University of Michigan and the Hospital for Sick Children in Toronto, with Michigan administering patent licensing in the US. The University of Michigan broadly licenses the ΔF508 patent for genetic testing with over 60 providers of genetic testing to date. Genetic testing is now used in newborn screening, diagnosis, and reproductive decisions. Interviews with key researchers and intellectual property managers, a survey of laboratories’ prices for CF genetic testing, a review of literature on CF tests’ cost effectiveness, and a review of the developing market for CF testing provide no evidence that patents have significantly hindered access to genetic tests for CF or prevented financially cost-effective screening. Current licensing practices for cystic fibrosis (CF) genetic testing appear to facilitate both academic research and commercial testing. More than one thousand different CFTR mutations have been identified, and research continues to determine their clinical significance. Patents have been nonexclusively licensed for diagnostic use, and have been variably licensed for gene transfer and other therapeutic applications. The Cystic Fibrosis Foundation has been engaged in licensing decisions, making CF a model of collaborative and cooperative patenting and licensing practice. PMID:20393308
Brassica ASTRA: an integrated database for Brassica genomic research.
Love, Christopher G; Robinson, Andrew J; Lim, Geraldine A C; Hopkins, Clare J; Batley, Jacqueline; Barker, Gary; Spangenberg, German C; Edwards, David
2005-01-01
Brassica ASTRA is a public database for genomic information on Brassica species. The database incorporates expressed sequences with Swiss-Prot and GenBank comparative sequence annotation as well as secondary Gene Ontology (GO) annotation derived from the comparison with Arabidopsis TAIR GO annotations. Simple sequence repeat molecular markers are identified within resident sequences and mapped onto the closely related Arabidopsis genome sequence. Bacterial artificial chromosome (BAC) end sequences derived from the Multinational Brassica Genome Project are also mapped onto the Arabidopsis genome sequence enabling users to identify candidate Brassica BACs corresponding to syntenic regions of Arabidopsis. This information is maintained in a MySQL database with a web interface providing the primary means of interrogation. The database is accessible at http://hornbill.cspp.latrobe.edu.au.
Protein Information Resource: a community resource for expert annotation of protein data
Barker, Winona C.; Garavelli, John S.; Hou, Zhenglin; Huang, Hongzhan; Ledley, Robert S.; McGarvey, Peter B.; Mewes, Hans-Werner; Orcutt, Bruce C.; Pfeiffer, Friedhelm; Tsugita, Akira; Vinayaka, C. R.; Xiao, Chunlin; Yeh, Lai-Su L.; Wu, Cathy
2001-01-01
The Protein Information Resource, in collaboration with the Munich Information Center for Protein Sequences (MIPS) and the Japan International Protein Information Database (JIPID), produces the most comprehensive and expertly annotated protein sequence database in the public domain, the PIR-International Protein Sequence Database. To provide timely and high quality annotation and promote database interoperability, the PIR-International employs rule-based and classification-driven procedures based on controlled vocabulary and standard nomenclature and includes status tags to distinguish experimentally determined from predicted protein features. The database contains about 200 000 non-redundant protein sequences, which are classified into families and superfamilies and their domains and motifs identified. Entries are extensively cross-referenced to other sequence, classification, genome, structure and activity databases. The PIR web site features search engines that use sequence similarity and database annotation to facilitate the analysis and functional identification of proteins. The PIR-International databases and search tools are accessible on the PIR web site at http://pir.georgetown.edu/ and at the MIPS web site at http://www.mips.biochem.mpg.de. The PIR-International Protein Sequence Database and other files are also available by FTP. PMID:11125041
Savidor, Alon; Barzilay, Rotem; Elinger, Dalia; Yarden, Yosef; Lindzen, Moshit; Gabashvili, Alexandra; Adiv Tal, Ophir; Levin, Yishai
2017-06-01
Traditional "bottom-up" proteomic approaches use proteolytic digestion, LC-MS/MS, and database searching to elucidate peptide identities and their parent proteins. Protein sequences absent from the database cannot be identified, and even if present in the database, complete sequence coverage is rarely achieved even for the most abundant proteins in the sample. Thus, sequencing of unknown proteins such as antibodies or constituents of metaproteomes remains a challenging problem. To date, there is no available method for full-length protein sequencing, independent of a reference database, in high throughput. Here, we present Database-independent Protein Sequencing, a method for unambiguous, rapid, database-independent, full-length protein sequencing. The method is a novel combination of non-enzymatic, semi-random cleavage of the protein, LC-MS/MS analysis, peptide de novo sequencing, extraction of peptide tags, and their assembly into a consensus sequence using an algorithm named "Peptide Tag Assembler." As proof-of-concept, the method was applied to samples of three known proteins representing three size classes and to a previously un-sequenced, clinically relevant monoclonal antibody. Excluding leucine/isoleucine and glutamic acid/deamidated glutamine ambiguities, end-to-end full-length de novo sequencing was achieved with 99-100% accuracy for all benchmarking proteins and the antibody light chain. Accuracy of the sequenced antibody heavy chain, including the entire variable region, was also 100%, but there was a 23-residue gap in the constant region sequence. © 2017 by The American Society for Biochemistry and Molecular Biology, Inc.
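The assembly of overlapping peptide tags into a consensus sequence can be sketched with a generic greedy overlap-merge procedure. This is not the authors' "Peptide Tag Assembler", whose algorithm the abstract does not describe. The function names `overlap` and `assemble`, the minimum overlap of three residues and the example tags are all assumptions. Isoleucine is mapped to leucine before assembly because the abstract notes that the two cannot be distinguished.

```python
def overlap(a: str, b: str, min_len: int) -> int:
    """Length of the longest suffix of a that is a prefix of b, or 0 if
    that length is below min_len."""
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def assemble(tags, min_len: int = 3):
    """Greedily merge peptide tags by their largest suffix/prefix overlap.

    Returns a list of contigs; a single element means every tag was
    joined into one sequence.
    """
    # I and L are isobaric, so treat them as one residue (assumed convention).
    contigs = list(dict.fromkeys(t.replace("I", "L") for t in tags))
    # Drop tags fully contained in another tag.
    contigs = [t for t in contigs
               if not any(t != u and t in u for u in contigs)]
    while True:
        best_k, best_i, best_j = 0, None, None
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best_k:
                        best_k, best_i, best_j = k, i, j
        if best_k == 0:
            break  # no remaining pair overlaps by at least min_len
        merged = contigs[best_i] + contigs[best_j][best_k:]
        contigs = [c for idx, c in enumerate(contigs)
                   if idx not in (best_i, best_j)]
        contigs.append(merged)
    return contigs


if __name__ == "__main__":
    # Made-up tags for illustration only.
    print(assemble(["MKTAY", "TAYIA", "YIAKQ", "AKQRQ"]))
```

Greedy merging can mis-assemble sequences that contain repeats longer than the minimum overlap, and it ignores tag confidence scores, so a production assembler would need more than this.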
THGS: a web-based database of Transmembrane Helices in Genome Sequences
Fernando, S. A.; Selvarani, P.; Das, Soma; Kumar, Ch. Kiran; Mondal, Sukanta; Ramakumar, S.; Sekar, K.
2004-01-01
Transmembrane Helices in Genome Sequences (THGS) is an interactive web-based database, developed to search the transmembrane helices in the user-interested gene sequences available in the Genome Database (GDB). The proposed database has provision to search sequence motifs in transmembrane and globular proteins. In addition, the motif can be searched in the other sequence databases (Swiss-Prot and PIR) or in the macromolecular structure database, Protein Data Bank (PDB). Further, the 3D structure of the corresponding queried motif, if it is available in the solved protein structures deposited in the Protein Data Bank, can also be visualized using the widely used graphics package RASMOL. All the sequence databases used in the present work are updated frequently and hence the results produced are up to date. The database THGS is freely available via the world wide web and can be accessed at http://pranag.physics.iisc.ernet.in/thgs/ or http://144.16.71.10/thgs/. PMID:14681375
MIPS: a database for protein sequences and complete genomes.
Mewes, H W; Hani, J; Pfeiffer, F; Frishman, D
1998-01-01
The MIPS group [Munich Information Center for Protein Sequences of the German National Center for Environment and Health (GSF)] at the Max-Planck-Institute for Biochemistry, Martinsried near Munich, Germany, is involved in a number of data collection activities, including a comprehensive database of the yeast genome, a database reflecting the progress in sequencing the Arabidopsis thaliana genome, the systematic analysis of other small genomes and the collection of protein sequence data within the framework of the PIR-International Protein Sequence Database (described elsewhere in this volume). Through its WWW server (http://www.mips.biochem.mpg.de) MIPS provides access to a variety of generic databases, including a database of protein families as well as automatically generated data by the systematic application of sequence analysis algorithms. The yeast genome sequence and its related information was also compiled on CD-ROM to provide dynamic interactive access to the 16 chromosomes of the first eukaryotic genome unraveled. PMID:9399795
Patent and Exclusivity Status of Essential Medicines for Non-Communicable Disease
Mackey, Tim K.; Liang, Bryan A.
2012-01-01
Objective: Non-communicable diseases (“NCDs”) are increasingly becoming a global health crisis and are pervasive in high-, middle-, and low-income populations, resulting in an estimated 36 million deaths per year. There is a need to assess intellectual property rights (“IPRs”) that may impede generic production and the availability and affordability of essential NCD medicines. Methods: Using the data sources listed below, the study design systematically eliminated NCD drugs that had no patent/exclusivity provisions on API, dosage, or administration route. The first step identified essential medicines that treat certain high disease burden NCDs. A second step examined the patent and exclusivity status of active ingredient, dosage and listed route of administration using exclusion criteria outlined in this study. Materials: We examined the patent and exclusivity status of medicines listed in the World Health Organization’s (“WHO”) Model List of Essential Drugs (Medicines) (“MLEM”) and other WHO sources for drugs treating certain NCDs, i.e., cardiovascular and respiratory disease, cancers, and diabetes. We utilized the USA Food and Drug Administration Orange Book and the USA Patent and Trademark Office databases as references given the predominant number of medicines registered in the USA. Results: Of the 359 MLEM medicines identified, 22% (79/359) address targeted NCDs. Of these 79, only eight required in-depth patent or exclusivity assessment. Upon further review, no NCD MLEM medicines had study patent or exclusivity protection for reviewed criteria. Conclusions: We find that ensuring availability and affordability of potential generic formulations of NCD MLEM medicines appears to be more complex than the presence of IPRs with API, dosage, or administration patent or exclusivity protection. Hence, more sophisticated analysis of NCD barriers to generic availability and affordability should be conducted in order to ensure equitable access to global populations for these essential medicines. PMID:23226453
Overview of Flaxseed Patent Applications for the Reduction of Cholesterol Levels.
Ribas, Simone A; Grando, Rafaela L; Zago, Lilia; Carvajal, Elvira; Fierro, Iolanda M
2016-01-01
Flaxseed is becoming an increasingly widely used food ingredient. The rising interest of the food industry in this nutraceutical is primarily because of functional nutrients, such as alpha-linolenic acid and lignans, which have health benefits due to their lipid-lowering properties. The objective of this study was to provide an overview of the patenting of flaxseed products with cholesterol-lowering effects. Patent applications filed by country of origin were retrieved from the Derwent Innovations Index® database. A total of 307 patent documents were identified, of which 184 claim the use of flaxseed or parts of the flax plant in the product formulation, for their lipid-lowering effect when consumed by humans. A few of the patent applications contain claims for new products based on flaxseed in isolation, including the preparation of foods designed to inhibit the production of cholesterol. Most of the claims were for flaxseed in the form of oil and in association with other lipid-lowering compounds, mainly for the food industry, in the form of dietary supplements or baked products designed to raise their high-density lipoprotein content, and for treating heart problems. China and the United States are the leading countries for flax-related applications. These results may have important implications for the production of functional food products that meet specific societal demands. Copyright © Bentham Science Publishers.
Chemical entity recognition in patents by combining dictionary-based and statistical approaches.
Akhondi, Saber A; Pons, Ewoud; Afzal, Zubair; van Haagen, Herman; Becker, Benedikt F H; Hettne, Kristina M; van Mulligen, Erik M; Kors, Jan A
2016-01-01
We describe the development of a chemical entity recognition system and its application in the CHEMDNER-patent track of BioCreative 2015. This community challenge includes a Chemical Entity Mention in Patents (CEMP) recognition task and a Chemical Passage Detection (CPD) classification task. We addressed both tasks by an ensemble system that combines a dictionary-based approach with a statistical one. For this purpose the performance of several lexical resources was assessed using Peregrine, our open-source indexing engine. We combined our dictionary-based results on the patent corpus with the results of tmChem, a chemical recognizer using a conditional random field classifier. To improve the performance of tmChem, we utilized three additional features, viz. part-of-speech tags, lemmas and word-vector clusters. When evaluated on the training data, our final system obtained an F-score of 85.21% for the CEMP task, and an accuracy of 91.53% for the CPD task. On the test set, the best system ranked sixth among 21 teams for CEMP with an F-score of 86.82%, and second among nine teams for CPD with an accuracy of 94.23%. The differences in performance between the best ensemble system and the statistical system separately were small. Database URL: http://biosemantics.org/chemdner-patents. © The Author(s) 2016. Published by Oxford University Press.
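The F-score reported for the mention recognition task is the harmonic mean of precision and recall. The sketch below shows that calculation over sets of predicted and gold-standard mention spans. The function name `precision_recall_f1`, the span representation (start and end character offsets) and the example spans are assumptions for illustration and are not taken from the BioCreative evaluation code.

```python
def precision_recall_f1(gold: set, predicted: set):
    """Return (precision, recall, F1) for exact-match mention spans."""
    true_positives = len(gold & predicted)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


if __name__ == "__main__":
    # Made-up (start, end) character offsets of chemical mentions.
    gold = {(0, 7), (12, 19), (25, 31), (40, 44)}
    predicted = {(0, 7), (12, 19), (50, 55)}
    p, r, f = precision_recall_f1(gold, predicted)
    print(f"precision={p:.3f} recall={r:.3f} F1={f:.3f}")
```

Accuracy, the measure quoted for the passage classification task, is instead the fraction of all passages labelled correctly, so the two figures are not directly comparable.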
Scientometrics of drug discovery efforts: pain-related molecular targets
Kissin, Igor
2015-01-01
The aim of this study was to make a scientometric assessment of drug discovery efforts centered on pain-related molecular targets. The following scientometric indices were used: the popularity index, representing the share of articles (or patents) on a specific topic among all articles (or patents) on pain over the same 5-year period; the index of change, representing the change in the number of articles (or patents) on a topic from one 5-year period to the next; the index of expectations, representing the ratio of the number of all types of articles on a topic in the top 20 journals relative to the number of articles in all (>5,000) biomedical journals covered by PubMed over a 5-year period; the total number of articles representing Phase I–III trials of investigational drugs over a 5-year period; and the trial balance index, a ratio of Phase I–II publications to Phase III publications. Articles (PubMed database) and patents (US Patent and Trademark Office database) on 17 topics related to pain mechanisms were assessed during six 5-year periods from 1984 to 2013. During the most recent 5-year period (2009–2013), seven of 17 topics have demonstrated high research activity (purinergic receptors, serotonin, transient receptor potential channels, cytokines, gamma aminobutyric acid, glutamate, and protein kinases). However, even with these seven topics, the index of expectations decreased or did not change compared with the 2004–2008 period. In addition, publications representing Phase I–III trials of investigational drugs (2009–2013) did not indicate great enthusiasm on the part of the pharmaceutical industry regarding drugs specifically designed for treatment of pain. A promising development related to the new tool of molecular targeting, ie, monoclonal antibodies, for pain treatment has not yet resulted in real success. 
This approach has not yet demonstrated clinical effectiveness (at least with nerve growth factor) much beyond conventional analgesics, when its potential cost is more than an order of magnitude higher than that of conventional treatments. This scientometric assessment demonstrated a lack of real breakthrough developments. PMID:26170624
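The indices defined in this abstract reduce to simple ratios of publication counts. A minimal Python sketch follows; the function names, the percentage scaling, and the counts in the usage note are illustrative assumptions, not formulas or figures taken from the study.

```python
def popularity_index(topic_count, all_pain_count):
    """Share (%) of items on one topic among all pain items in the same 5-year period."""
    return 100.0 * topic_count / all_pain_count


def index_of_change(current_count, previous_count):
    """Change (%) in the number of items on a topic from one 5-year period to the next."""
    return 100.0 * (current_count - previous_count) / previous_count


def index_of_expectations(top20_count, all_journals_count):
    """Articles in the top 20 journals relative to articles in all journals (%, assumed scaling)."""
    return 100.0 * top20_count / all_journals_count


def trial_balance_index(phase_1_2_count, phase_3_count):
    """Ratio of Phase I-II publications to Phase III publications."""
    return phase_1_2_count / phase_3_count
```

With hypothetical counts, `popularity_index(50, 1000)` gives 5.0 and `trial_balance_index(12, 4)` gives 3.0; a topic whose count rises from 100 to 150 between periods has `index_of_change(150, 100)` equal to 50.0.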
The LANL hemorrhagic fever virus database, a new platform for analyzing biothreat viruses.
Kuiken, Carla; Thurmond, Jim; Dimitrijevic, Mira; Yoon, Hyejin
2012-01-01
Hemorrhagic fever viruses (HFVs) are a diverse set of over 80 viral species, found in 10 different genera comprising five different families: arena-, bunya-, flavi-, filo- and togaviridae. All these viruses are highly variable and evolve rapidly, making them elusive targets for the immune system and for vaccine and drug design. About 55,000 HFV sequences exist in the public domain today. A central website that provides annotated sequences and analysis tools will be helpful to HFV researchers worldwide. The HFV sequence database collects and stores sequence data and provides a user-friendly search interface and a large number of sequence analysis tools, following the model of the highly regarded and widely used Los Alamos HIV database [Kuiken, C., B. Korber, and R.W. Shafer, HIV sequence databases. AIDS Rev, 2003. 5: p. 52-61]. The database uses an algorithm that aligns each sequence to a species-wide reference sequence. The NCBI RefSeq database [Sayers et al. (2011) Database resources of the National Center for Biotechnology Information. Nucleic Acids Res., 39, D38-D51.] is used for this; if a reference sequence is not available, a Blast search finds the best candidate. Using this method, sequences in each genus can be retrieved pre-aligned. The HFV website can be accessed via http://hfv.lanl.gov.
Does patent foramen ovale closure have an anti-arrhythmic effect? A meta-analysis.
Jarral, Omar A; Saso, Srdjan; Vecht, Joshua A; Harling, Leanne; Rao, Christopher; Ahmed, Kamran; Gatzoulis, Michael A; Malik, Iqbal S; Athanasiou, Thanos
2011-11-17
Atrial tachyarrhythmias are associated with patent foramen ovale. The objective was to determine the anti-arrhythmic effect of patent foramen ovale closure on pre-existing atrial tachyarrhythmias. Medline, EMBASE, Cochrane Library, and Google Scholar databases were searched between 1967 and 2010. The search was expanded using the 'related articles' function and reference lists of key studies. All studies reporting pre- and post-closure incidence (or prevalence) of atrial tachyarrhythmia in the same patient population were included. Random and fixed effect meta-analyses were used to aggregate the data. Six studies were identified including 2570 patients who underwent percutaneous closure. Atrial fibrillation was in fact the only AT reported in all studies. Meta-analysis using a fixed effects model demonstrated a significant reduction in the prevalence of atrial fibrillation with an OR of 0.43 (95% CI 0.26-0.71). When using the random-effects model, OR was 0.44 (95% CI 0.18-1.04) with a statistically significant trend demonstrated (test for overall effect: Z=1.87, p=0.06). Closure of a patent foramen ovale may be associated with reduction in the prevalence of atrial fibrillation. Copyright © 2011 Elsevier Ireland Ltd. All rights reserved.
Revealing biological information using data structuring and automated learning.
Mohorianu, Irina; Moulton, Vincent
2010-11-01
The intermediary steps between a biological hypothesis, concretized in the input data, and meaningful results, validated using biological experiments, commonly employ bioinformatics tools. Starting with storage of the data and ending with a statistical analysis of the significance of the results, every step in a bioinformatics analysis has been intensively studied and the resulting methods and models patented. This review summarizes the bioinformatics patents that have been developed mainly for the study of genes, and points out the universal applicability of bioinformatics methods to other related studies such as RNA interference. More specifically, we overview the steps undertaken in the majority of bioinformatics analyses, highlighting, for each, various approaches that have been developed to reveal details from different perspectives. First we consider data warehousing, the first task that has to be performed efficiently, optimizing the structure of the database, in order to facilitate both the subsequent steps and the retrieval of information. Next, we review data mining, which occupies the central part of most bioinformatics analyses, presenting patents concerning differential expression, unsupervised and supervised learning. Last, we discuss how networks of interactions of genes or other players in the cell may be created, which help draw biological conclusions and have been described in several patents.
The Video PATSEARCH System: An Interview with Peter Urbach.
ERIC Educational Resources Information Center
Videodisc/Videotext, 1982
1982-01-01
The Video PATSEARCH system consists of a microcomputer with a special keyboard and two display screens which accesses the PATSEARCH database of United States government patents on the Bibliographic Retrieval Services (BRS) search system. The microcomputer retrieves text from BRS and matching graphics from an analog optical videodisc. (Author/JJD)
[The Glivec® case: the first example of a global debate on the drug patent system].
Moital, Inês; Bosch, Fèlix; Farré, Magí; Maddaleno, Mariano; Baños, Josep-E
2014-01-01
To describe the sequence of events involving the Glivec® case in India and to analyze the opinions generated in distinct settings. We performed a systematic search for articles concerning the imatinib (Glivec®) patent in India. We selected those sources that described the events, decisions of the authorities involved, and press and scientific opinions. Dates and arguments presented by the involved parties were clearly identified. Of 886 documents initially obtained, we selected 40 documents published between 2003 and 2013. Most of them were press news and commentaries. The process lasted 7 years, starting in 2006 when the Indian Patent Office rejected the patent application filed by Novartis. It ended in 2013 when the Indian Supreme Court upheld this decision. It was argued that the Indian Patent Law would facilitate access to medicines in the Third World and the final decision has received support by the general population. Although the court's final decision has been supported by several institutions, an objective analysis should also take into account the arguments of the pharmaceutical companies and other entities. The Glivec® case gave rise to an intense debate on the appropriateness of international standards on patents, their applicability and how they should be adopted in each country. This case, as well as other cases, should serve to stimulate reflection on the international patent system and to achieve scenarios in which the health of the poorest populations is protected but also balanced against intellectual property protection and innovation. Copyright © 2014 SESPAS. Published by Elsevier Espana. All rights reserved.
Research-tool patents: issues for health in the developing world.
Barton, John H.
2002-01-01
The patent system is now reaching into the tools of medical research, including gene sequences themselves. Many of the new patents can potentially preempt large areas of medical research and lay down legal barriers to the development of a broad category of products. Researchers must therefore consider redesigning their research to avoid use of patented techniques, or expending the effort to obtain licences from those who hold the patents. Even if total licence fees can be kept low, there are enormous negotiation costs, and one "hold-out" may be enough to lead to project cancellation. This is making it more difficult to conduct research within the developed world, and poses important questions for the future of medical research for the benefit of the developing world. Probably the most important implication for health in the developing world is the possible general slowing down and complication of medical research. To the extent that these patents do slow down research, they weaken the contribution of the global research community to the creation and application of medical technology for the benefit of developing nations. The patents may also complicate the granting of concessional prices to developing nations - for pharmaceutical firms that seek to offer a concessional price may have to negotiate arrangements with research-tool firms, which may lose royalties as a result. Three kinds of response are plausible. One is to develop a broad or global licence to permit the patented technologies to be used for important applications in the developing world. The second is to change technical patent law doctrines. Such changes could be implemented in developed and developing nations and could be quite helpful while remaining consistent with TRIPS. The third is to negotiate specific licence arrangements, under which specific research tools are used on an agreed basis for specific applications. 
These negotiations are difficult and expensive, requiring both scientific and legal skills. But they will be an unavoidable part of international medical research. PMID:11953790
Xiao, Y; Yuan, L; Liu, Y; Sun, X; Cheng, J; Wang, T; Li, F; Luo, R; Zhao, X
2015-02-01
A large number of traditional Chinese patent medicines (TCPMs) are widely used to treat migraine in China. However, it is uncertain whether there is robust evidence on the effects of TCPMs for migraine. A meta-analysis of randomized, double-blind, placebo-controlled trials was performed to evaluate the efficacy and safety of TCPMs in patients with migraine. Comprehensive searches were conducted on the Medline database, Cochrane Library, the China National Knowledge Infrastructure database, the Chinese Biomedical Literature database and the Wanfang database up to December 2013. Summary estimates, including 95% confidence intervals (CIs), were calculated for frequency of migraine attacks, response rate and headache intensity. A total of seven trials including 582 participants with migraine met the selection criteria. TCPM was significantly more likely to reduce the frequency of migraine attacks compared with placebo (standardized mean difference -0.54; 95% CI -0.72, -0.36; P < 0.001). TCPM was associated with an improvement of response rate compared with placebo (summary relative risk 4.63, 95% CI 2.74, 7.80, P < 0.001; therapeutic gain 24.1%; number needed to treat 4.1). Headache intensity was attenuated by TCPM compared with placebo (standardized mean difference -1.33; 95% CI -1.79, -0.87; P < 0.001). The adverse events of TCPM were no different from those of placebo. TCPMs are effective and well tolerated in the prophylactic treatment of migraine. © 2014 EAN.
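The reported number needed to treat is consistent with the reciprocal of the therapeutic gain. A short check in Python, assuming the conventional definition NNT = 1 / absolute difference in response rate (the abstract does not state how the value was computed; the function name is illustrative):

```python
def number_needed_to_treat(therapeutic_gain_percent):
    """NNT as the reciprocal of the absolute gain, with the gain given in percent."""
    return 100.0 / therapeutic_gain_percent


# Therapeutic gain of 24.1% is the figure reported in the abstract.
nnt = number_needed_to_treat(24.1)
print(f"{nnt:.2f}")  # about 4.15, in line with the reported 4.1
```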
Sequencing artifacts in the type A influenza database and attempts to correct them
USDA-ARS's Scientific Manuscript database
Currently over 300,000 Type A influenza gene sequences representing over 50,000 strains are available in publicly available databases. However, the quality of the sequences submitted is determined by the contributor and many sequence errors are present in the databases, which can affect the result...
Goodacre, Norman; Aljanahi, Aisha; Nandakumar, Subhiksha; Mikailov, Mike; Khan, Arifa S
2018-01-01
Detection of distantly related viruses by high-throughput sequencing (HTS) is bioinformatically challenging because of the lack of a public database containing all viral sequences, without abundant nonviral sequences, which can extend runtime and obscure viral hits. Our reference viral database (RVDB) includes all viral, virus-related, and virus-like nucleotide sequences (excluding bacterial viruses), regardless of length, and with overall reduced cellular sequences. Semantic selection criteria (SEM-I) were used to select viral sequences from GenBank, resulting in a first-generation viral database (VDB). This database was manually and computationally reviewed, resulting in refined, semantic selection criteria (SEM-R), which were applied to a new download of updated GenBank sequences to create a second-generation VDB. Viral entries in the latter were clustered at 98% by CD-HIT-EST to reduce redundancy while retaining high viral sequence diversity. The viral identity of the clustered representative sequences (creps) was confirmed by BLAST searches in NCBI databases and HMMER searches in PFAM and DFAM databases. The resulting RVDB contained a broad representation of viral families, sequence diversity, and a reduced cellular content; it includes full-length and partial sequences and endogenous nonretroviral elements, endogenous retroviruses, and retrotransposons. Testing of RVDBv10.2 with an in-house HTS transcriptomic data set indicated a significantly faster run for virus detection than interrogating the entirety of the NCBI nonredundant nucleotide database, which contains all viral sequences but also nonviral sequences. RVDB is publicly available for facilitating HTS analysis, particularly for novel virus detection. It is meant to be updated on a regular basis to include new viral sequences added to GenBank.
IMPORTANCE To facilitate bioinformatics analysis of high-throughput sequencing (HTS) data for the detection of both known and novel viruses, we have developed a new reference viral database (RVDB) that provides a broad representation of different virus species from eukaryotes by including all viral, virus-like, and virus-related sequences (excluding bacteriophages), regardless of their size. In particular, RVDB contains endogenous nonretroviral elements, endogenous retroviruses, and retrotransposons. Sequences were clustered to reduce redundancy while retaining high viral sequence diversity. A particularly useful feature of RVDB is the reduction of cellular sequences, which can enhance the run efficiency of large transcriptomic and genomic data analysis and increase the specificity of virus detection. PMID:29564396
The Histone Database: an integrated resource for histones and histone fold-containing proteins
Mariño-Ramírez, Leonardo; Levine, Kevin M.; Morales, Mario; Zhang, Suiyuan; Moreland, R. Travis; Baxevanis, Andreas D.; Landsman, David
2011-01-01
Eukaryotic chromatin is composed of DNA and protein components—core histones—that act to compactly pack the DNA into nucleosomes, the fundamental building blocks of chromatin. These nucleosomes are connected to adjacent nucleosomes by linker histones. Nucleosomes are highly dynamic and, through various core histone post-translational modifications and incorporation of diverse histone variants, can serve as epigenetic marks to control processes such as gene expression and recombination. The Histone Sequence Database is a curated collection of sequences and structures of histones and non-histone proteins containing histone folds, assembled from major public databases. Here, we report a substantial increase in the number of sequences and taxonomic coverage for histone and histone fold-containing proteins available in the database. Additionally, the database now contains an expanded dataset that includes archaeal histone sequences. The database also provides comprehensive multiple sequence alignments for each of the four core histones (H2A, H2B, H3 and H4), the linker histones (H1/H5) and the archaeal histones. The database also includes current information on solved histone fold-containing structures. The Histone Sequence Database is an inclusive resource for the analysis of chromatin structure and function focused on histones and histone fold-containing proteins. Database URL: The Histone Sequence Database is freely available and can be accessed at http://research.nhgri.nih.gov/histones/. PMID:22025671
A Knowledge Database on Thermal Control in Manufacturing Processes
NASA Astrophysics Data System (ADS)
Hirasawa, Shigeki; Satoh, Isao
A prototype version of a knowledge database on thermal control in manufacturing processes, specifically molding, semiconductor manufacturing, and micro-scale manufacturing, has been developed. The knowledge database has search functions for technical data, evaluated benchmark data, academic papers, and patents. The database also displays trends and future roadmaps for research topics. It has quick-calculation functions for basic design. This paper summarizes present research topics and future research on thermal control in manufacturing engineering in order to collate the information into the knowledge database. In the molding process, the initial mold and melt temperatures are very important parameters. In addition, thermal control is related to many semiconductor processes, and the main parameter is temperature variation in wafers. Accurate in-situ temperature measurement of wafers is important. Many technologies are also being developed to manufacture micro-structures. Accordingly, the knowledge database will help further advance these technologies.
Fore, Joe; Wiechers, Ilse R; Cook-Deegan, Robert
2006-01-01
Introduction Polymerase chain reaction (PCR) was a seminal genomic technology discovered, developed, and patented in an industry setting. Since the first of its core patents expired in March, 2005, we are in a position to view the entire lifespan of the patent, examining how the intellectual property rights have impacted its use in the biomedical community. Given its essential role in the world of molecular biology and its commercial success, the technology can serve as a case study for evaluating the effects of patenting biological research tools on biomedical research. Case description Following its discovery, the technique was subjected to two years of in-house development, during which issues of inventorship and publishing/patenting strategies caused friction between members of the development team. Some have feared that this delay impeded subsequent research and may have been due to trade secrecy or the desire for obtaining lucrative intellectual property rights. However, our analysis of the history indicates that the main reasons for the delay were benign and were primarily due to difficulties in perfecting the PCR technique. Following this initial development period, the technology was made widely available, but was subject to strict licensing terms and patent protection, leading to an extensive litigation history. Discussion and evaluation PCR has earned approximately $2 billion in royalties for the various rights-holders while also becoming an essential research tool. However, using citation trend analysis, we are able to see that PCR's patented status did not preclude it from being adopted in a similar manner as other non-patented genomic research tools (specifically, pBR322 cloning vector and Maxam-Gilbert sequencing). 
Conclusion Despite the heavy patent protection and rigid licensing schemes, PCR seems to have disseminated so widely because of the practices of the corporate entities which have controlled these patents, namely through the use of business partnerships and broad corporate licensing, adaptive licensing strategies, and a "rational forbearance" from suing researchers for patent infringement. While far from definitive, our analysis seems to suggest that, at least in the case of PCR, patenting of genomic research tools need not impede their dissemination, if the technology is made available through appropriate business practices. PMID:16817955
The MAR databases: development and implementation of databases specific for marine metagenomics
Klemetsen, Terje; Raknes, Inge A; Fu, Juan; Agafonov, Alexander; Balasundaram, Sudhagar V; Tartari, Giacomo; Robertsen, Espen
2018-01-01
We introduce the marine databases MarRef, MarDB and MarCat (https://mmp.sfb.uit.no/databases/), which are publicly available resources that promote marine research and innovation. These data resources, which have been implemented in the Marine Metagenomics Portal (MMP) (https://mmp.sfb.uit.no/), are collections of richly annotated and manually curated contextual (metadata) and sequence databases representing three tiers of accuracy. While MarRef is a database for completely sequenced marine prokaryotic genomes, which represent a marine prokaryote reference genome database, MarDB includes all incompletely sequenced prokaryotic genomes regardless of level of completeness. The last database, MarCat, represents a gene (protein) catalog of uncultivable (and cultivable) marine genes and proteins derived from marine metagenomics samples. The first versions of MarRef and MarDB contain 612 and 3726 records, respectively. Each record is built up of 106 metadata fields including attributes for sampling, sequencing, assembly and annotation in addition to the organism and taxonomic information. Currently, MarCat contains 1227 records with 55 metadata fields. Ontologies and controlled vocabularies are used in the contextual databases to enhance consistency. The user-friendly web interface lets visitors browse, filter and search in the contextual databases and perform BLAST searches against the corresponding sequence databases. All contextual and sequence databases are freely accessible and downloadable from https://s1.sfb.uit.no/public/mar/. PMID:29106641
Using SQL Databases for Sequence Similarity Searching and Analysis.
Pearson, William R; Mackey, Aaron J
2017-09-13
Relational databases can integrate diverse types of information and manage large sets of similarity search results, greatly simplifying genome-scale analyses. By focusing on taxonomic subsets of sequences, relational databases can reduce the size and redundancy of sequence libraries and improve the statistical significance of homologs. In addition, by loading similarity search results into a relational database, it becomes possible to explore and summarize the relationships between all of the proteins in an organism and those in other biological kingdoms. This unit describes how to use relational databases to improve the efficiency of sequence similarity searching and demonstrates various large-scale genomic analyses of homology-related data. It also describes the installation and use of a simple protein sequence database, seqdb_demo, which is used as a basis for the other protocols. The unit also introduces search_demo, a database that stores sequence similarity search results. The search_demo database is then used to explore the evolutionary relationships between E. coli proteins and proteins in other organisms in a large-scale comparative genomic analysis. © 2017 by John Wiley & Sons, Inc. Copyright © 2017 John Wiley & Sons, Inc.
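The idea of loading similarity search results into a relational database and summarizing them by taxon can be sketched with Python's built-in sqlite3 module. The schema, table and column names, accessions, and E-values below are toy assumptions for illustration only; they are not the seqdb_demo or search_demo schemas described in the unit.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with a hypothetical two-table schema and toy rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE protein (acc TEXT PRIMARY KEY, taxon TEXT NOT NULL);
        CREATE TABLE hit (
            query_acc   TEXT NOT NULL REFERENCES protein(acc),
            subject_acc TEXT NOT NULL REFERENCES protein(acc),
            evalue      REAL NOT NULL
        );
    """)
    conn.executemany("INSERT INTO protein VALUES (?, ?)", [
        ("ecoli_1", "E. coli"), ("ecoli_2", "E. coli"),
        ("yeast_1", "S. cerevisiae"), ("human_1", "H. sapiens"),
    ])
    conn.executemany("INSERT INTO hit VALUES (?, ?, ?)", [
        ("ecoli_1", "yeast_1", 1e-30),
        ("ecoli_1", "human_1", 1e-12),
        ("ecoli_2", "human_1", 0.5),  # fails the default E-value threshold
    ])
    return conn


def homolog_counts(conn, query_taxon, max_evalue=1e-6):
    """Count query-taxon proteins with at least one significant hit in each other taxon."""
    rows = conn.execute("""
        SELECT s.taxon, COUNT(DISTINCT h.query_acc)
        FROM hit AS h
        JOIN protein AS q ON q.acc = h.query_acc
        JOIN protein AS s ON s.acc = h.subject_acc
        WHERE q.taxon = ? AND s.taxon <> ? AND h.evalue <= ?
        GROUP BY s.taxon
        ORDER BY s.taxon
    """, (query_taxon, query_taxon, max_evalue)).fetchall()
    return dict(rows)
```

Calling `homolog_counts(build_demo_db(), "E. coli")` on the toy rows returns one E. coli protein with a significant hit in each of the two other taxa; loosening `max_evalue` to 1.0 also admits the weak ecoli_2 hit. Keeping the threshold as a query parameter lets the same stored results be re-summarized without rerunning the search.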
The US EPA has patented a mold ID technology (#6,387,652) licensed by 15 companies in the US and EU. This technology is based upon DNA sequences. In conjunction with HUD, this technology will be used in a National Survey of Homes.
Wang, Penghao; Wilson, Susan R
2013-01-01
Mass spectrometry-based protein identification is a very challenging task. The main identification approaches include de novo sequencing and database searching. Both approaches have shortcomings, so an integrative approach has been developed. The integrative approach firstly infers partial peptide sequences, known as tags, directly from tandem spectra through de novo sequencing, and then puts these sequences into a database search to see if a close peptide match can be found. However the current implementation of this integrative approach has several limitations. Firstly, simplistic de novo sequencing is applied and only very short sequence tags are used. Secondly, most integrative methods apply an algorithm similar to BLAST to search for exact sequence matches and do not accommodate sequence errors well. Thirdly, by applying these methods the integrated de novo sequencing makes a limited contribution to the scoring model which is still largely based on database searching. We have developed a new integrative protein identification method which can integrate de novo sequencing more efficiently into database searching. Evaluated on large real datasets, our method outperforms popular identification methods.
Mackey, Aaron J; Pearson, William R
2004-10-01
Relational databases are designed to integrate diverse types of information and manage large sets of search results, greatly simplifying genome-scale analyses. Relational databases are essential for management and analysis of large-scale sequence analyses, and can also be used to improve the statistical significance of similarity searches by focusing on subsets of sequence libraries most likely to contain homologs. This unit describes using relational databases to improve the efficiency of sequence similarity searching and to demonstrate various large-scale genomic analyses of homology-related data. This unit describes the installation and use of a simple protein sequence database, seqdb_demo, which is used as a basis for the other protocols. These include basic use of the database to generate a novel sequence library subset, how to extend and use seqdb_demo for the storage of sequence similarity search results and making use of various kinds of stored search results to address aspects of comparative genomic analysis.
Owens, John
2009-01-01
Technological advances in the acquisition of DNA and protein sequence information and the resulting onrush of data can quickly overwhelm the scientist unprepared for the volume of information that must be evaluated and carefully dissected to discover its significance. Few laboratories have the luxury of dedicated personnel to organize, analyze, or consistently record a mix of arriving sequence data. A methodology based on a modern relational-database manager is presented that is both a natural storage vessel for antibody sequence information and a conduit for organizing and exploring sequence data and accompanying annotation text. The expertise necessary to implement such a plan is equal to that required by electronic word processors or spreadsheet applications. Antibody sequence projects maintained as independent databases are selectively unified by the relational-database manager into larger database families that contribute to local analyses, reports, interactive HTML pages, or exported to facilities dedicated to sophisticated sequence analysis techniques. Database files are transposable among current versions of Microsoft, Macintosh, and UNIX operating systems.
Canada's contribution to global research in cardiovascular diseases.
Nguyen, Hai V; de Oliveira, Claire; Wijeysundera, Harindra C; Wong, William W L; Woo, Gloria; Grootendorst, Paul; Liu, Peter P; Krahn, Murray D
2013-06-01
The burden of cardiovascular disease (CVD) in Canada and other developed countries is growing, in part because of the aging of the population and the alarming rise of obesity. Studying Canada's contribution to the global body of CVD research output will shed light on the effectiveness of investments in Canadian CVD research and inform if Canada has been responding to its CVD burden. Search was conducted using the Web-of-Science database for publications during 1981 through 2010 on major areas and specific interventions in CVD. Search was also conducted using Canadian and US online databases for patents issued between 1981 and 2010. Search data were used to estimate the proportions of the world's pool of research publications and of patents conducted by researchers based in Canada. The results indicate that Canada contributed 6% of global research in CVD during 1981 through 2010. Further, Canada's contribution shows a strong upward trend during the period. Based on patent data, Canada's contribution level was similar (5%-7%). Canada's contribution to the global pool of CVD research is on par with France and close to the UK, Japan, and Germany. Canada's contribution in global CVD research is higher than its average contribution in all fields of research (6% vs 3%). As the burden of chronic diseases including CVD rises with Canada's aging population, the increase in Canadian research into CVD is encouraging. Copyright © 2013 Canadian Cardiovascular Society. Published by Elsevier Inc. All rights reserved.
The LANL hemorrhagic fever virus database, a new platform for analyzing biothreat viruses
Kuiken, Carla; Thurmond, Jim; Dimitrijevic, Mira; Yoon, Hyejin
2012-01-01
Hemorrhagic fever viruses (HFVs) are a diverse set of over 80 viral species, found in 10 different genera comprising five different families: arena-, bunya-, flavi-, filo- and togaviridae. All these viruses are highly variable and evolve rapidly, making them elusive targets for the immune system and for vaccine and drug design. About 55 000 HFV sequences exist in the public domain today. A central website that provides annotated sequences and analysis tools will be helpful to HFV researchers worldwide. The HFV sequence database collects and stores sequence data and provides a user-friendly search interface and a large number of sequence analysis tools, following the model of the highly regarded and widely used Los Alamos HIV database [Kuiken, C., B. Korber, and R.W. Shafer, HIV sequence databases. AIDS Rev, 2003. 5: p. 52–61]. The database uses an algorithm that aligns each sequence to a species-wide reference sequence. The NCBI RefSeq database [Sayers et al. (2011) Database resources of the National Center for Biotechnology Information. Nucleic Acids Res., 39, D38–D51.] is used for this; if a reference sequence is not available, a Blast search finds the best candidate. Using this method, sequences in each genus can be retrieved pre-aligned. The HFV website can be accessed via http://hfv.lanl.gov. PMID:22064861
Sequencing artifacts in the type A influenza databases and attempts to correct them.
Suarez, David L; Chester, Nikki; Hatfield, Jason
2014-07-01
There are over 276 000 influenza gene sequences in public databases, with the quality of the sequences determined by the contributor. As part of a high school class project, influenza sequences with possible errors were identified in the public databases based on the size of the gene being longer than expected, with the hypothesis that these sequences would have an error. Students contacted sequence submitters, alerting them to the possible sequence issue(s), and requested that the suspect sequence(s) be corrected as appropriate. Type A influenza viruses were screened, and gene segments longer than the accepted size were identified for further analysis. Attention was placed on sequences with additional nucleotides upstream or downstream of the highly conserved non-coding ends of the viral segments. A total of 1081 sequences were identified that met this criterion. Three types of errors were commonly observed: non-influenza primer sequence was not removed from the sequence; PCR product was cloned and plasmid sequence was included in the sequence; and Taq polymerase added an adenine at the end of the PCR product. Internal insertions of nucleotide sequence were also commonly observed, but in many cases it was unclear if the sequence was correct or actually contained an error. A total of 215 sequences, or 22.8% of the suspect sequences, were corrected in the public databases in the first year of the student project. Unfortunately, 138 additional sequences with possible errors were added to the databases in the second year. Additional awareness of the need for data integrity of sequences submitted to public databases is needed to fully reap the benefits of these large data sets. © 2014 The Authors. Influenza and Other Respiratory Viruses Published by John Wiley & Sons Ltd.
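The length-based screen described above can be sketched in a few lines. This is only an illustration of the idea, not the procedure used in the project: the record format is hypothetical, and the expected segment lengths in the table are illustrative values that should be checked against a trusted reference before any real use.

```python
# Expected full-length segment sizes in nucleotides. Illustrative values only;
# verify against a trusted reference before relying on them.
EXPECTED_MAX_LENGTH = {"M": 1027, "NS": 890}


def flag_overlong(records, expected=EXPECTED_MAX_LENGTH):
    """Return (accession, segment, excess_nt) for sequences longer than expected.

    records: iterable of (accession, segment_name, sequence) tuples.
    Segments without an entry in `expected` are skipped, not flagged.
    """
    flagged = []
    for acc, segment, seq in records:
        limit = expected.get(segment)
        if limit is not None and len(seq) > limit:
            flagged.append((acc, segment, len(seq) - limit))
    return flagged


# Hypothetical records: placeholder sequences of a given length.
records = [
    ("X1", "NS", "A" * 890),
    ("X2", "NS", "A" * 905),
    ("X3", "M", "A" * 1027),
    ("X4", "HA", "A" * 3000),
]
print(flag_overlong(records))
```

A flag is only a prompt for manual review. As the abstract notes, extra length may come from primer, plasmid or polymerase artifacts, but some insertions may be genuine.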
Normand, A C; Packeu, A; Cassagne, C; Hendrickx, M; Ranque, S; Piarroux, R
2018-05-01
Conventional dermatophyte identification is based on morphological features. However, recent studies have proposed to use the nucleotide sequences of the rRNA internal transcribed spacer (ITS) region as an identification barcode of all fungi, including dermatophytes. Several nucleotide databases are available to compare sequences and thus identify isolates; however, these databases often contain mislabeled sequences that impair sequence-based identification. We evaluated five of these databases on a clinical isolate panel. We selected 292 clinical dermatophyte strains that were prospectively subjected to an ITS2 nucleotide sequence analysis. Sequences were analyzed against the databases, and the results were compared to clusters obtained via DNA alignment of sequence segments. The DNA tree served as the identification standard throughout the study. According to the ITS2 sequence identification, the majority of strains (255/292) belonged to the genus Trichophyton, mainly T. rubrum complex (n = 184), T. interdigitale (n = 40), T. tonsurans (n = 26), and T. benhamiae (n = 5). Other genera included Microsporum (e.g., M. canis [n = 21], M. audouinii [n = 10], Nannizzia gypsea [n = 3], and Epidermophyton [n = 3]). Species-level identification of T. rubrum complex isolates was an issue. Overall, ITS DNA sequencing is a reliable tool to identify dermatophyte species given that a comprehensive and correctly labeled database is consulted. Since many inaccurate identification results exist in the DNA databases used for this study, reference databases must be verified frequently and amended in line with the current revisions of fungal taxonomy. Before describing a new species or adding a new DNA reference to the available databases, its position in the phylogenetic tree must be verified. Copyright © 2018 American Society for Microbiology.
MIPS: a database for genomes and protein sequences
Mewes, H. W.; Frishman, D.; Güldener, U.; Mannhaupt, G.; Mayer, K.; Mokrejs, M.; Morgenstern, B.; Münsterkötter, M.; Rudd, S.; Weil, B.
2002-01-01
The Munich Information Center for Protein Sequences (MIPS-GSF, Neuherberg, Germany) continues to provide genome-related information in a systematic way. MIPS supports both national and European sequencing and functional analysis projects, develops and maintains automatically generated and manually annotated genome-specific databases, develops systematic classification schemes for the functional annotation of protein sequences, and provides tools for the comprehensive analysis of protein sequences. This report updates the information on the yeast genome (CYGD), the Neurospora crassa genome (MNCDB), the databases for the comprehensive set of genomes (PEDANT genomes), the database of annotated human EST clusters (HIB), the database of complete cDNAs from the DHGP (German Human Genome Project), as well as the project specific databases for the GABI (Genome Analysis in Plants) and HNB (Helmholtz–Netzwerk Bioinformatik) networks. The Arabidopsis thaliana database (MATDB), the database of mitochondrial proteins (MITOP) and our contribution to the PIR International Protein Sequence Database have been described elsewhere [Schoof et al. (2002) Nucleic Acids Res., 30, 91–93; Scharfe et al. (2000) Nucleic Acids Res., 28, 155–158; Barker et al. (2001) Nucleic Acids Res., 29, 29–32]. All databases described, the protein analysis tools provided and the detailed descriptions of our projects can be accessed through the MIPS World Wide Web server (http://mips.gsf.de). PMID:11752246
Fourment, Mathieu; Gibbs, Mark J
2008-02-05
Viruses of the Bunyaviridae have segmented negative-stranded RNA genomes and several of them cause significant disease. Many partial sequences have been obtained from the segments so that GenBank searches give complex results. Sequence databases usually use HTML pages to mediate remote sorting, but this approach can be limiting and may discourage a user from exploring a database. The VirusBanker database contains Bunyaviridae sequences and alignments and is presented as two spreadsheets generated by a Java program that interacts with a MySQL database on a server. Sequences are displayed in rows and may be sorted using information that is displayed in columns and includes data relating to the segment, gene, protein, species, strain, sequence length, terminal sequence and date and country of isolation. Bunyaviridae sequences and alignments may be downloaded from the second spreadsheet with titles defined by the user from the columns, or viewed when passed directly to the sequence editor, Jalview. VirusBanker allows large datasets of aligned nucleotide and protein sequences from the Bunyaviridae to be compiled and winnowed rapidly using criteria that are formulated heuristically.
Problem? "No Problem!" Solving Technical Contradictions
ERIC Educational Resources Information Center
Kutz, K. Scott; Stefan, Victor
2007-01-01
TRIZ (pronounced TREES), the Russian acronym for the theory of inventive problem solving, enables a person to focus his attention on finding genuine, potential solutions in contrast to searching for ideas that "may" work through a happenstance way. It is a patent database-backed methodology that helps to reduce time spent on the problem,…
ERIC Educational Resources Information Center
Roth, Dana Lincoln
1985-01-01
This article expresses concerns about online searches being run by inexperienced searchers or nonchemists for other nonchemists or students. Studies concerning problems in the use of Chemical Condensates Database, Chemical Abstracts, Medline, and patent information are highlighted. Examples of searches yielding unsatisfactory results are noted.…
A public HTLV-1 molecular epidemiology database for sequence management and data mining.
Araujo, Thessika Hialla Almeida; Souza-Brito, Leandro Inacio; Libin, Pieter; Deforche, Koen; Edwards, Dustin; de Albuquerque-Junior, Antonio Eduardo; Vandamme, Anne-Mieke; Galvao-Castro, Bernardo; Alcantara, Luiz Carlos Junior
2012-01-01
It is estimated that 15 to 20 million people are infected with the human T-cell lymphotropic virus type 1 (HTLV-1). At present, there are more than 2,000 unique HTLV-1 isolate sequences published. A central database to aggregate sequence information from a range of epidemiological aspects including HTLV-1 infections, pathogenesis, origins, and evolutionary dynamics would be useful to scientists and physicians worldwide. We have developed a database that collects and annotates sequence data and can be accessed through a user-friendly search interface. The HTLV-1 Molecular Epidemiology Database website is available at http://htlv1db.bahia.fiocruz.br/. All data was obtained from publications available at GenBank or through contact with the authors. The database was developed using Apache Webserver 2.1.6 and SGBD MySQL. The webpage interfaces were developed in HTML and server-side scripting was written in PHP. The HTLV-1 Molecular Epidemiology Database is hosted on the Gonçalo Moniz/FIOCRUZ Research Center server. There are currently 2,457 registered sequences with 2,024 (82.37%) of those sequences representing unique isolates. Of these sequences, 803 (39.67%) contain information about clinical status (TSP/HAM, 17.19%; ATL, 7.41%; asymptomatic, 12.89%; other diseases, 2.17%; and no information, 60.32%). Further, 7.26% of sequences contain information on patient gender while 5.23% of sequences provide the age of the patient. The HTLV-1 Molecular Epidemiology Database retrieves and stores annotated HTLV-1 proviral sequences from clinical, epidemiological, and geographical studies. The collected sequences and related information are now accessible on a publicly available and user-friendly website. This open-access database will support clinical research and vaccine development related to viral genotype.
Corruption of genomic databases with anomalous sequence.
Lamperti, E D; Kittelberger, J M; Smith, T F; Villa-Komaroff, L
1992-06-11
We describe evidence that DNA sequences from vectors used for cloning and sequencing have been incorporated accidentally into eukaryotic entries in the GenBank database. These incorporations were not restricted to one type of vector or to a single mechanism. Many minor instances may have been the result of simple editing errors, but some entries contained large blocks of vector sequence that had been incorporated by contamination or other accidents during cloning. Some cases involved unusual rearrangements and areas of vector distant from the normal insertion sites. Matches to vector were found in 0.23% of 20,000 sequences analyzed in GenBank Release 63. Although the possibility of anomalous sequence incorporation has been recognized since the inception of GenBank and should be easy to avoid, recent evidence suggests that this problem is increasing more quickly than the database itself. The presence of anomalous sequence may have serious consequences for the interpretation and use of database entries, and will have an impact on issues of database management. The incorporated vector fragments described here may also be useful for a crude estimate of the fidelity of sequence information in the database. In alignments with well-defined ends, the matching sequences showed 96.8% identity to vector; when poorer matches with arbitrary limits were included, the aggregate identity to vector sequence was 94.8%.
The MAR databases: development and implementation of databases specific for marine metagenomics.
Klemetsen, Terje; Raknes, Inge A; Fu, Juan; Agafonov, Alexander; Balasundaram, Sudhagar V; Tartari, Giacomo; Robertsen, Espen; Willassen, Nils P
2018-01-04
We introduce the marine databases MarRef, MarDB and MarCat (https://mmp.sfb.uit.no/databases/), which are publicly available resources that promote marine research and innovation. These data resources, which have been implemented in the Marine Metagenomics Portal (MMP) (https://mmp.sfb.uit.no/), are collections of richly annotated and manually curated contextual (metadata) and sequence databases representing three tiers of accuracy. While MarRef is a database for completely sequenced marine prokaryotic genomes, and thus serves as a marine prokaryote reference genome database, MarDB includes all incompletely sequenced prokaryotic genomes regardless of level of completeness. The last database, MarCat, represents a gene (protein) catalog of uncultivable (and cultivable) marine genes and proteins derived from marine metagenomics samples. The first versions of MarRef and MarDB contain 612 and 3726 records, respectively. Each record is built up of 106 metadata fields including attributes for sampling, sequencing, assembly and annotation in addition to the organism and taxonomic information. Currently, MarCat contains 1227 records with 55 metadata fields. Ontologies and controlled vocabularies are used in the contextual databases to enhance consistency. The user-friendly web interface lets visitors browse, filter and search in the contextual databases and perform BLAST searches against the corresponding sequence databases. All contextual and sequence databases are freely accessible and downloadable from https://s1.sfb.uit.no/public/mar/. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.
[Prescription rules of preparations containing Crataegi Fructus in Chinese patent drug].
Geng, Ya; Ma, Yue-Xiang; Xu, Hai-Yu; Li, Jun-Fang; Tang, Shi-Huan; Yang, Hong-Jun
2016-08-01
To analyze the prescription rules of preparations containing Crataegi Fructus in the drug standards of the People's Republic of China Ministry of Public Health-Chinese Patent Drug(hereinafter referred to as Chinese patent drug), and provide some references for clinical application and the research and development of new medicines. Based on TCMISS(V2.5), the prescriptions containing Crataegi Fructus in Chinese patent drug were collected to build the database; association rules, frequency statistics and other data mining methods were used to analyze the disease syndrome, common drug compatibility and prescription rules. There were a total of 308 prescriptions containing Crataegi Fructus, involving 499 kinds of Chinese medicines, 34 commonly used drug combinations, and mainly for 18 kinds of diseases. Drug combination analysis was done with "Crataegi Fructus-Citri Reticulatae Pericarpium" and "Crataegi Fructus-Poria" as the high-frequency herb pairs and with "stagnation" and "diarrhea" as the high-frequency diseases. The results indicated that the Crataegi Fructus in different herb pairs had a roughly same function, and its therapy effect was different in different diseases. The prescriptions containing Crataegi Fructus in Chinese patent drug had the effect of digestion, and they were widely used in clinical application, often used together with spleen-strengthening medicines to achieve different treatment effects; the prescription rules reflected the prescription characteristics of Crataegi Fructus for different diseases, providing a basis for its clinically scientific application and the research and development of new medicines. Copyright© by the Chinese Pharmaceutical Association.
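To make the herb-pair analysis more tangible, the sketch below computes support and confidence for co-occurring pairs, which are the basic quantities behind association rules. It is not the TCMISS implementation, and the four prescriptions are invented toy data, not records from the study.

```python
from collections import Counter
from itertools import combinations


def pair_rules(prescriptions):
    """Compute support and confidence for every co-occurring herb pair.

    For a pair (a, b) in alphabetical order:
      support    = fraction of prescriptions containing both a and b
      confidence = fraction of prescriptions containing a that also contain b
    """
    n = len(prescriptions)
    item_counts = Counter(h for p in prescriptions for h in set(p))
    pair_counts = Counter(
        pair for p in prescriptions for pair in combinations(sorted(set(p)), 2)
    )
    return {
        (a, b): {"support": c / n, "confidence": c / item_counts[a]}
        for (a, b), c in pair_counts.items()
    }


# Toy prescriptions (hypothetical, for illustration only).
prescriptions = [
    ["Crataegi Fructus", "Poria"],
    ["Crataegi Fructus", "Citri Reticulatae Pericarpium"],
    ["Crataegi Fructus", "Citri Reticulatae Pericarpium", "Poria"],
    ["Poria"],
]
rules = pair_rules(prescriptions)
for pair, stats in sorted(rules.items()):
    print(pair, stats)
```

High-frequency herb pairs correspond to pairs with high support. Confidence additionally shows how strongly one herb's presence predicts the other.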
[Legal decisions on access to medicines in Pernambuco, Northeastern Brazil].
Stamford, Artur; Cavalcanti, Maísa
2012-10-01
To analyze decisions from the legal system concerning the population's access to medicines within the Brazilian Public Health System through judicial channels, with regard to decision-making criteria and possible political and economic pressure. This was a descriptive retrospective study on documents with a quantitative and qualitative approach. Data were gathered from the State of Pernambuco Superintendency for Pharmaceutical Care, and the data sources used were 105 lawsuits and administrative reports between January and June 2009. It was ascertained which medications have a patent or patent request in the database of the Brazilian Patent Office (INPI), in order to identify the frequency with which patents feature in lawsuits. The data obtained were classified according to Anatomical and Therapeutic Chemical System. To analyze the judicial decisions, the theory of autopoietic social systems was used. There were lawsuits involving 134 medications, with an estimated value of R$ 4.5 million for attending the treatments requested. 70.9% of the medications had a patent or a patent request and they were concentrated in three therapeutic classes: antineoplastic and immunomodulating agents; digestive tract and metabolism; and sensory organs. Six central ideas within judges' decision-making criteria were identified (the federal constitution and medical prescriptions), along with pressure between the legal, economic and political systems concerning access to medications. The analysis on judicial decisions based on the theory of autopoietic social systems made it possible to identify mutual stimulation (dependency) between the legal system and other social systems in relation to the issue of citizens' access to medications. This dependency was represented by the federal constitution and intellectual property. The federal constitution and medical prescription were identified as decision-making criteria in lawsuits. Intellectual property represented possible political and economic pressure, especially in cases of launching medications into the market.
NASA Astrophysics Data System (ADS)
Jafari, Mostafa; Zarghami, Hamid Reza
2016-07-01
This paper investigates the global nanotechnology and nanoscience (NN) indicators in a developmental context, during three 5-year periods from 2000 to 2014. Through bibliometric analyses of the longitudinal data from well-known databases, the growth patterns of NN articles and patents were investigated. Furthermore, the causal relationships among these indicators and some characteristics of the 105 countries studied were examined using regression and correlation analyses, leading to the identification of the top 20 "science and innovation giants" in terms of all indicators, as well as the existence of significant, yet different, correlations among the indicators in developing and developed countries. In general, China's growth rate (GR) in NN publications was found to surpass that of the USA from 2010 to 2014, leading to a change in the ranking of the top countries and moving China, with about 25% of the world's NN articles, to the top. A different trend was distinguished for patents in the area of nanotechnology, where the USA, as the origin of over half of the world's granted patents, has been the undisputed leader. The shares of developing countries (i.e., the percent ratios of the number of nanotech patents granted to the citizens of developing countries over the total number of nanotech patents granted worldwide) were found to be incompatible with the countries' shares in the total NN articles, indicating a poor correlation between the two factors. However, developing countries were found to be superior in the GR of both NN articles and patents. Finally, the top countries identified can be regarded as suitable for comparative studies and benchmarking by researchers and policy makers.
MIPS: analysis and annotation of proteins from whole genomes
Mewes, H. W.; Amid, C.; Arnold, R.; Frishman, D.; Güldener, U.; Mannhaupt, G.; Münsterkötter, M.; Pagel, P.; Strack, N.; Stümpflen, V.; Warfsmann, J.; Ruepp, A.
2004-01-01
The Munich Information Center for Protein Sequences (MIPS-GSF), Neuherberg, Germany, provides protein sequence-related information based on whole-genome analysis. The main focus of the work is directed toward the systematic organization of sequence-related attributes as gathered by a variety of algorithms, primary information from experimental data together with information compiled from the scientific literature. MIPS maintains automatically generated and manually annotated genome-specific databases, develops systematic classification schemes for the functional annotation of protein sequences and provides tools for the comprehensive analysis of protein sequences. This report updates the information on the yeast genome (CYGD), the Neurospora crassa genome (MNCDB), the database of complete cDNAs (German Human Genome Project, NGFN), the database of mammalian protein–protein interactions (MPPI), the database of FASTA homologies (SIMAP), and the interface for the fast retrieval of protein-associated information (QUIPOS). The Arabidopsis thaliana database, the rice database, the plant EST databases (MATDB, MOsDB, SPUTNIK), as well as the databases for the comprehensive set of genomes (PEDANT genomes) are described elsewhere in the 2003 and 2004 NAR database issues, respectively. All databases described, and the detailed descriptions of our projects can be accessed through the MIPS web server (http://mips.gsf.de). PMID:14681354
AgdbNet – antigen sequence database software for bacterial typing
Jolley, Keith A; Maiden, Martin CJ
2006-01-01
Background: Bacterial typing schemes based on the sequences of genes encoding surface antigens require databases that provide a uniform, curated, and widely accepted nomenclature of the variants identified. Due to the differences in typing schemes, imposed by the diversity of genes targeted, creating these databases has typically required the writing of one-off code to link the database to a web interface. Here we describe agdbNet, widely applicable web database software that facilitates simultaneous BLAST querying of multiple loci using either nucleotide or peptide sequences. Results: Databases are described by XML files that are parsed by a Perl CGI script. Each database can have any number of loci, which may be defined by nucleotide and/or peptide sequences. The software is currently in use on at least five public databases for the typing of Neisseria meningitidis, Campylobacter jejuni and Streptococcus equi and can be set up to query internal isolate tables or suitably-configured external isolate databases, such as those used for multilocus sequence typing. The style of the resulting website can be fully configured by modifying stylesheets and through the use of customised header and footer files that surround the output of the script. Conclusion: The software provides a rapid means of setting up customised Internet antigen sequence databases. The flexible configuration options enable typing schemes with differing requirements to be accommodated. PMID:16790057
Wan, Jian-bo; He, Chengwei; Hu, Yuanjia
2016-01-01
Despite the existence of available therapies, the Hepatitis B virus infection continues to be one of the most serious threats to human health, especially in developing countries such as China and India. To shed light on the improvement of current therapies and development of novel anti-HBV drugs, we thoroughly investigated 212 US patents of anti-HBV drugs and analyzed the technology flow in research and development of anti-HBV drugs based on data from IMS LifeCycle databases. Moreover, utilizing the patent citation method, which is an effective indicator of technology flow, we constructed patent citation network models and performed network analysis in order to reveal the features of different technology clusters. As a result, we identified the stagnant status of anti-HBV drug development and pointed the way for development of domestic pharmaceuticals in developing countries. We also discussed therapeutic vaccines as the potential next generation therapy for HBV infection. Lastly, we depicted the cooperation between entities and found that novel forms of cooperation added diversity to the conventional form of cooperation within the pharmaceutical industry. In summary, our study provides inspiring insights for investors, policy makers, researchers, and other readers interested in anti-HBV drug development. PMID:27727319
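A patent citation network of the kind mentioned above can be represented as a directed graph in which an edge runs from the citing patent to the cited one. The sketch below is a minimal illustration under that assumption, not the authors' network model; the patent identifiers are hypothetical placeholders.

```python
from collections import defaultdict


def citation_indegree(citations):
    """Count how often each patent is cited.

    citations: iterable of (citing_patent, cited_patent) pairs.
    Patents that are never cited are included with a count of 0.
    """
    indeg = defaultdict(int)
    nodes = set()
    for citing, cited in citations:
        nodes.update((citing, cited))
        indeg[cited] += 1
    return {n: indeg[n] for n in sorted(nodes)}


def most_cited(citations, k=1):
    """Return the k most-cited patents; ties are broken by identifier."""
    deg = citation_indegree(citations)
    return sorted(deg.items(), key=lambda kv: (-kv[1], kv[0]))[:k]


# Hypothetical citation pairs (citing, cited), for illustration only.
citations = [("P3", "P1"), ("P4", "P1"), ("P4", "P2"), ("P5", "P3")]
print(citation_indegree(citations))
print(most_cited(citations, k=2))
```

In-degree is only the simplest indicator of technology flow. Identifying technology clusters, as the study does, would need community detection or similar network analysis on top of this structure.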
New drugs or alternative therapy to blurring the symptoms of fibromyalgia-a patent review.
Oliveira, Marlange A; Guimarães, Adriana G; Araújo, Adriano A S; Quintans-Júnior, Lucindo J; Quintans, Jullyana S S
2017-10-01
Fibromyalgia (FM) is a musculoskeletal condition characterized by chronic widespread pain, tenderness and often accompanied by other comorbid conditions such as depression, anxiety, chronic fatigue, among others. Now, we aimed to survey the recent patents describing new drugs or alternative therapy for FM. Areas covered: This review covers the therapeutic patents published between 2010 and 2017 from specialized search databases (WIPO, DERWENT, INPI, ESPANET and USPTO) that report the discovery of new drugs or pharmacologic alternative for the treatment of FM. Expert opinion: New therapeutic substances have been proposed in the last seven years. At least as it has been found in our survey, most are still in the pre-clinical phase of the study, and its clinical applicability is unclear. However, other therapeutic approaches were found in patents such as well-established drugs in the market in combination or drug repositioning that combines the 'new analgesic' effects with the old side effects. Hence, it is a safe approach for pharmaceutical market, but poorer to patients who need a radical innovation. So, there is the emerging need for further studies on the safety and efficacy of such therapeutic measures and the search for improvement of side effects, as well as the development of new drugs that are unorthodox for different FM symptoms.
Ribosomal S6 kinase (RSK) modulators: a patent review.
Ludwik, Katarzyna A; Lannigan, Deborah A
2016-09-01
The p90 ribosomal S6 kinases (RSK) are a family of Ser/Thr protein kinases that are downstream effectors of MEK1/2-ERK1/2. Increased RSK activation is implicated in the etiology of multiple pathologies, including numerous types of cancers, cardiovascular disease, liver and lung fibrosis, and infections. The review summarizes the patent and scientific literature on small molecule modulators of RSK and their potential use as therapeutics. The patents were identified using World Intellectual Property Organization and United States Patent and Trademark Office databases. The compounds described are predominantly RSK inhibitors, but a RSK activator is also described. The majority of the inhibitors are not RSK-specific. Based on the overwhelming evidence that RSK is involved in a number of diseases that have high mortalities it seems surprising that there are no RSK modulators that have pharmacokinetic properties suitable for in vivo use. MEK1/2 inhibitors are in the clinic, but the efficacy of these compounds appears to be limited by their side effects. We hypothesize that targeting the downstream effectors of MEK1/2, like RSK, are an untapped source of drug targets and that they will generate less side effects than MEK1/2 inhibitors because they regulate fewer effectors.
Therapeutic and cosmetic applications of mangiferin: a patent review.
Telang, Manasi; Dhulap, Sivakami; Mandhare, Anita; Hirwani, Rajkumar
2013-12-01
Mangiferin, a natural C-glucoside xanthone [2-C-β-D-glucopyranosyl-1, 3, 6, 7-tetrahydroxyxanthone], is abundantly present in young leaves and stem bark of the mango tree. The xanthonoid structure of mangiferin with C-glycosyl linkage and polyhydroxy components contributes to its free radical-scavenging ability, leading to a potent antioxidant effect as well as multiple biological activities. An extensive search was carried out to collect patent information on mangiferin and its derivatives using various patent databases spanning all priority years to date. The patents claiming therapeutic and cosmetic applications of mangiferin and its derivatives were analyzed in detail. The technology areas covered in this article include metabolic disorders, cosmeceuticals, multiple uses of the same compound, miscellaneous uses, infectious diseases, inflammation, cancer and autoimmune disorders, and neurological disorders. Mangiferin has the potential to modulate multiple molecular targets including nuclear factor-kappa B (NF-κB) signaling and cyclooxygenase-2 (COX-2) protein expression. Mangiferin exhibits antioxidant, antidiabetic, antihyperuricemic, antiviral, anticancer and antiinflammatory activities. The molecular structure of mangiferin fulfils the four Lipinski's requisites reported to favor high bioavailability by oral administration. There is no evidence of adverse side effects of mangiferin so far. Mangiferin could thus be a promising candidate for development of a multipotent drug.
Dammarane triterpenoids for pharmaceutical use: a patent review (2005 - 2014).
Cao, Jiaqing; Zhang, Xiaoshu; Qu, Fanzhi; Guo, Zhenghong; Zhao, Yuqing
2015-07-01
Dammarane triterpenoids, the main secondary metabolites of Panax ginseng, are very important natural compounds with remarkable biological activity. They could be isolated from the plants of Panax or other genus, as well as through the modifications of certain natural products. This review is a collection of a number of patents (2005 - 2014) that describe the dammarane triterpenoids for therapeutic or preventive uses on numerous common diseases. In this review, patents from 2005 to 2014 on chemical structures and treatment of different diseases by dammarane triterpenoids have been summarized. The SciFinder and the World Intellectual Property Organisation databases have been used as main sources for the search. In the last decade, over 90 patents concerning dammarane derivatives for pharmaceutical have been published. These types of compounds could be used as agents for prevention and treatment of various kinds of diseases, such as cancer, diabetes mellitus and metabolic syndrome, hyperlipidemia, cardiovascular and cerebrovascular disease, aging, neurodegenerative disease, bone disease, liver disease, kidney disease, gastrointestinal disease, depression-type mental illness and skin aging. Rare plants, except for Panax genus, which contain dammarane triterpenoids should be studied extensively. In addition, more dammarane triterpenoids with good biological activity, especially the aglycones possessing novel side chain, should be prepared using chemical modification. Finally, pharmacological effects of dammarane triterpenoids should be further studied.
BioWarehouse: a bioinformatics database warehouse toolkit
Lee, Thomas J; Pouliot, Yannick; Wagner, Valerie; Gupta, Priyanka; Stringer-Calvert, David WJ; Tenenbaum, Jessica D; Karp, Peter D
2006-01-01
Background This article addresses the problem of interoperation of heterogeneous bioinformatics databases. Results We introduce BioWarehouse, an open source toolkit for constructing bioinformatics database warehouses using the MySQL and Oracle relational database managers. BioWarehouse integrates its component databases into a common representational framework within a single database management system, thus enabling multi-database queries using the Structured Query Language (SQL) but also facilitating a variety of database integration tasks such as comparative analysis and data mining. BioWarehouse currently supports the integration of a pathway-centric set of databases including ENZYME, KEGG, and BioCyc, and in addition the UniProt, GenBank, NCBI Taxonomy, and CMR databases, and the Gene Ontology. Loader tools, written in the C and JAVA languages, parse and load these databases into a relational database schema. The loaders also apply a degree of semantic normalization to their respective source data, decreasing semantic heterogeneity. The schema supports the following bioinformatics datatypes: chemical compounds, biochemical reactions, metabolic pathways, proteins, genes, nucleic acid sequences, features on protein and nucleic-acid sequences, organisms, organism taxonomies, and controlled vocabularies. As an application example, we applied BioWarehouse to determine the fraction of biochemically characterized enzyme activities for which no sequences exist in the public sequence databases. The answer is that no sequence exists for 36% of enzyme activities for which EC numbers have been assigned. These gaps in sequence data significantly limit the accuracy of genome annotation and metabolic pathway prediction, and are a barrier for metabolic engineering. Complex queries of this type provide examples of the value of the data warehousing approach to bioinformatics research. 
Conclusion BioWarehouse embodies significant progress on the database integration problem for bioinformatics. PMID:16556315
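The enzyme-coverage question described in this abstract (what fraction of EC-numbered enzyme activities have no sequence in the warehouse) can be sketched as a single SQL query. The following is only a minimal illustration, assuming Python's built-in sqlite3 module in place of MySQL or Oracle; the table names (enzyme_activity, protein_sequence), the column names and the toy rows are hypothetical and are not the BioWarehouse schema.

```python
import sqlite3


def build_toy_warehouse():
    """Create an in-memory database with a hypothetical two-table schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE enzyme_activity (ec_number TEXT PRIMARY KEY);
        CREATE TABLE protein_sequence (accession TEXT PRIMARY KEY, ec_number TEXT);
        """
    )
    # Four made-up enzyme activities; only three have at least one sequence.
    conn.executemany(
        "INSERT INTO enzyme_activity VALUES (?)",
        [("1.1.1.1",), ("2.7.1.1",), ("3.4.21.4",), ("4.2.1.1",)],
    )
    conn.executemany(
        "INSERT INTO protein_sequence VALUES (?, ?)",
        [("P1", "1.1.1.1"), ("P2", "1.1.1.1"), ("P3", "2.7.1.1"), ("P4", "3.4.21.4")],
    )
    return conn


def fraction_without_sequence(conn):
    """Return the fraction of EC numbers that have no associated sequence."""
    row = conn.execute(
        """
        SELECT SUM(CASE WHEN s.ec_number IS NULL THEN 1 ELSE 0 END) * 1.0 / COUNT(*)
        FROM enzyme_activity AS e
        LEFT JOIN (SELECT DISTINCT ec_number FROM protein_sequence) AS s
            ON s.ec_number = e.ec_number
        """
    ).fetchone()
    return row[0]
```

The LEFT JOIN keeps every enzyme activity, and the DISTINCT subquery ensures an activity with several sequences is counted once. Once all sources share one schema, a cross-database question of this kind reduces to one join.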
Clinical testing of BRCA1 and BRCA2: a worldwide snapshot of technological practices.
Toland, Amanda Ewart; Forman, Andrea; Couch, Fergus J; Culver, Julie O; Eccles, Diana M; Foulkes, William D; Hogervorst, Frans B L; Houdayer, Claude; Levy-Lahad, Ephrat; Monteiro, Alvaro N; Neuhausen, Susan L; Plon, Sharon E; Sharan, Shyam K; Spurdle, Amanda B; Szabo, Csilla; Brody, Lawrence C
2018-01-01
Clinical testing of BRCA1 and BRCA2 began over 20 years ago. With the expiration and overturning of the BRCA patents, limitations on which laboratories could offer commercial testing were lifted. These legal changes occurred approximately the same time as the widespread adoption of massively parallel sequencing (MPS) technologies. Little is known about how these changes impacted laboratory practices for detecting genetic alterations in hereditary breast and ovarian cancer genes. Therefore, we sought to examine current laboratory genetic testing practices for BRCA1/BRCA2. We employed an online survey of 65 questions covering four areas: laboratory characteristics, details on technological methods, variant classification, and client-support information. Eight United States (US) laboratories and 78 non-US laboratories completed the survey. Most laboratories (93%; 80/86) used MPS platforms to identify variants. Laboratories differed widely on: (1) technologies used for large rearrangement detection; (2) criteria for minimum read depths; (3) non-coding regions sequenced; (4) variant classification criteria and approaches; (5) testing volume ranging from 2 to 2.5 × 10^5 tests annually; and (6) deposition of variants into public databases. These data may be useful for national and international agencies to set recommendations for quality standards for BRCA1/BRCA2 clinical testing. These standards could also be applied to testing of other disease genes.
Fourment, Mathieu; Gibbs, Mark J
2008-01-01
Background Viruses of the Bunyaviridae have segmented negative-stranded RNA genomes and several of them cause significant disease. Many partial sequences have been obtained from the segments so that GenBank searches give complex results. Sequence databases usually use HTML pages to mediate remote sorting, but this approach can be limiting and may discourage a user from exploring a database. Results The VirusBanker database contains Bunyaviridae sequences and alignments and is presented as two spreadsheets generated by a Java program that interacts with a MySQL database on a server. Sequences are displayed in rows and may be sorted using information that is displayed in columns and includes data relating to the segment, gene, protein, species, strain, sequence length, terminal sequence and date and country of isolation. Bunyaviridae sequences and alignments may be downloaded from the second spreadsheet with titles defined by the user from the columns, or viewed when passed directly to the sequence editor, Jalview. Conclusion VirusBanker allows large datasets of aligned nucleotide and protein sequences from the Bunyaviridae to be compiled and winnowed rapidly using criteria that are formulated heuristically. PMID:18251994
Comet: an open-source MS/MS sequence database search tool.
Eng, Jimmy K; Jahan, Tahmina A; Hoopmann, Michael R
2013-01-01
Proteomics research routinely involves identifying peptides and proteins via MS/MS sequence database search. Thus the database search engine is an integral tool in many proteomics research groups. Here, we introduce the Comet search engine to the existing landscape of commercial and open-source database search tools. Comet is open source, freely available, and based on one of the original sequence database search tools that has been widely used for many years. © 2012 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Lee, Jennifer F.; Hesselberth, Jay R.; Meyers, Lauren Ancel; Ellington, Andrew D.
2004-01-01
The aptamer database is designed to contain comprehensive sequence information on aptamers and unnatural ribozymes that have been generated by in vitro selection methods. Such data are not normally collected in ‘natural’ sequence databases, such as GenBank. Besides serving as a storehouse of sequences that may have diagnostic or therapeutic utility, the database serves as a valuable resource for theoretical biologists who describe and explore fitness landscapes. The database is updated monthly and is publicly available at http://aptamer.icmb.utexas.edu/. PMID:14681367
JICST Factual Database JICST DNA Database
NASA Astrophysics Data System (ADS)
Shirokizawa, Yoshiko; Abe, Atsushi
Japan Information Center of Science and Technology (JICST) has started the on-line service of DNA database in October 1988. This database is composed of EMBL Nucleotide Sequence Library and Genetic Sequence Data Bank. The authors outline the database system, data items and search commands. Examples of retrieval session are presented.
NCBI-compliant genome submissions: tips and tricks to save time and money.
Pirovano, Walter; Boetzer, Marten; Derks, Martijn F L; Smit, Sandra
2017-03-01
Genome sequences nowadays play a central role in molecular biology and bioinformatics. These sequences are shared with the scientific community through sequence databases. The sequence repositories of the International Nucleotide Sequence Database Collaboration (INSDC, comprising GenBank, ENA and DDBJ) are the largest in the world. Preparing an annotated sequence in such a way that it will be accepted by the database is challenging because many validation criteria apply. In our opinion, it is an undesirable situation that researchers who want to submit their sequence need either a lot of experience or help from partners to get the job done. To save valuable time and money, we list a number of recommendations for people who want to submit an annotated genome to a sequence database, as well as for tool developers, who could help to ease the process. © The Author 2015. Published by Oxford University Press. For Permissions, please email: journals.permissions@oup.com.
ASGARD: an open-access database of annotated transcriptomes for emerging model arthropod species.
Zeng, Victor; Extavour, Cassandra G
2012-01-01
The increased throughput and decreased cost of next-generation sequencing (NGS) have shifted the bottleneck in genomic research from sequencing to annotation, analysis and accessibility. This is particularly challenging for research communities working on organisms that lack the basic infrastructure of a sequenced genome, or an efficient way to utilize whatever sequence data may be available. Here we present a new database, the Assembled Searchable Giant Arthropod Read Database (ASGARD). This database is a repository and search engine for transcriptomic data from arthropods that are of high interest to multiple research communities but currently lack sequenced genomes. We demonstrate the functionality and utility of ASGARD using de novo assembled transcriptomes from the milkweed bug Oncopeltus fasciatus, the cricket Gryllus bimaculatus and the amphipod crustacean Parhyale hawaiensis. We have annotated these transcriptomes to assign putative orthology, coding region determination, protein domain identification and Gene Ontology (GO) term annotation to all possible assembly products. ASGARD allows users to search all assemblies by orthology annotation, GO term annotation or Basic Local Alignment Search Tool. User-friendly features of ASGARD include search term auto-completion suggestions based on database content, the ability to download assembly product sequences in FASTA format, direct links to NCBI data for predicted orthologs and graphical representation of the location of protein domains and matches to similar sequences from the NCBI non-redundant database. ASGARD will be a useful repository for transcriptome data from future NGS studies on these and other emerging model arthropods, regardless of sequencing platform, assembly or annotation status. This database thus provides easy, one-stop access to multi-species annotated transcriptome information.
We anticipate that this database will be useful for members of multiple research communities, including developmental biology, physiology, evolutionary biology, ecology, comparative genomics and phylogenomics. Database URL: asgard.rc.fas.harvard.edu.
78 FR 8547 - Government-Owned Inventions; Availability for Licensing
Federal Register 2010, 2011, 2012, 2013, 2014
2013-02-06
... sequencing approaches to analyze the entire G protein coupled receptor (GPCR) gene family in melanoma, the... of the patent applications. SUPPLEMENTARY INFORMATION: Mutations in the G Protein Coupled Receptor... the pathway it activates, mitogen-activated protein kinase (MEK), for the treatment of melanoma...
[Establish research model of post-marketing clinical safety evaluation for Chinese patent medicine].
Zheng, Wen-ke; Liu, Zhi; Lei, Xiang; Tian, Ran; Zheng, Rui; Li, Nan; Ren, Jing-tian; Du, Xiao-xi; Shang, Hong-cai
2015-09-01
The safety of Chinese patent medicine has become a focus of public concern, and post-marketing clinical safety evaluation of these medicines is necessary. However, there are no criteria to guide such research, so a model and method to guide practice are urgently needed. Drawing on a series of clinical studies, we put forward several recommendations: the objective and content of clinical safety evaluation should be clearly defined, the workflow should be determined, a list of items for the safety evaluation project should be drawn up, and risk control should follow a three-level classification. We set up a model of post-marketing clinical safety evaluation for Chinese patent medicine. Based on this model, the list of items can be used to rank medicine risks, and measures can then be taken for the different risks with the aim of lowering the risk level. Finally, the medicine can be managed through five sequential steps: collection of risk signals, risk recognition, risk assessment, risk management, and aftereffect assessment. We hope to provide new ideas for future research.
Trends for nanotechnology development in China, Russia, and India
NASA Astrophysics Data System (ADS)
Liu, Xuan; Zhang, Pengzhu; Li, Xin; Chen, Hsinchun; Dang, Yan; Larson, Catherine; Roco, Mihail C.; Wang, Xianwen
2009-11-01
China, Russia, and India are playing an increasingly important role in global nanotechnology research and development (R&D). This paper comparatively inspects the paper and patent publications by these three countries in the Thomson Science Citation Index Expanded (SCI) database and United States Patent and Trademark Office (USPTO) database (1976-2007). Bibliographic, content map, and citation network analyses are used to evaluate country productivity, dominant research topics, and knowledge diffusion patterns. Significant and consistent growth in nanotechnology papers are noted in the three countries. Between 2000 and 2007, the average annual growth rate was 31.43% in China, 11.88% in Russia, and 33.51% in India. During the same time, the growth patterns were less consistent in patent publications: the corresponding average rates are 31.13, 10.41, and 5.96%. The three countries' paper impact measured by the average number of citations has been lower than the world average. However, from 2000 to 2007, it experienced rapid increases of about 12.8 times in China, 8 times in India, and 1.6 times in Russia. The Chinese Academy of Sciences (CAS), the Russian Academy of Sciences (RAS), and the Indian Institutes of Technology (IIT) were the most productive institutions in paper publication, with 12,334, 6,773, and 1,831 papers, respectively. The three countries emphasized some common research topics such as "Quantum dots," "Carbon nanotubes," "Atomic force microscopy," and "Scanning electron microscopy," while Russia and India reported more research on nano-devices as compared with China. CAS, RAS, and IIT played key roles in the respective domestic knowledge diffusion.
SW#db: GPU-Accelerated Exact Sequence Similarity Database Search.
Korpar, Matija; Šošić, Martin; Blažeka, Dino; Šikić, Mile
2015-01-01
In recent years we have witnessed growth in sequencing yield, in the number of samples sequenced and, as a result, in publicly maintained sequence databases. The increase in available data has placed high demands on protein similarity search algorithms, which face two opposing goals: keeping running times acceptable while maintaining a high enough level of sensitivity. The most time-consuming step of similarity search is the local alignment between query and database sequences. This step is usually performed using exact local alignment algorithms such as Smith-Waterman. Due to its quadratic time complexity, alignment of a query to the whole database is usually too slow. Therefore, the majority of protein similarity search methods apply heuristics before the exact local alignment to reduce the number of candidate sequences in the database. However, there is still a need for the alignment of a query sequence to a reduced database. In this paper we present the SW#db tool and a library for fast exact similarity search. Although its running times, as a standalone tool, are comparable to the running times of BLAST, it is primarily intended to be used for the exact local alignment phase in which the database of sequences has already been reduced. It uses both GPU and CPU parallelization and was 4-5 times faster than SSEARCH, 6-25 times faster than CUDASW++ and more than 20 times faster than SSW at the time of writing, using multiple queries on Swiss-prot and Uniref90 databases.
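To make the quadratic cost of exact local alignment concrete, here is a minimal CPU-only sketch of Smith-Waterman scoring. It is not the SW#db implementation: the match, mismatch and linear gap scores are assumed values chosen for illustration, whereas real protein searches typically use a substitution matrix and affine gap penalties.

```python
def smith_waterman_score(query, target, match=2, mismatch=-1, gap=-2):
    """Return the best local alignment score between two sequences.

    Uses a linear gap penalty and keeps only two rows of the dynamic
    programming matrix, so memory is O(len(target)) while time remains
    O(len(query) * len(target)).
    """
    prev = [0] * (len(target) + 1)
    best = 0
    for q in query:
        curr = [0]  # column 0 is always 0 in local alignment
        for j, t in enumerate(target, start=1):
            diag = prev[j - 1] + (match if q == t else mismatch)
            up = prev[j] + gap        # gap in the target
            left = curr[j - 1] + gap  # gap in the query
            cell = max(0, diag, up, left)  # 0 restarts the alignment
            curr.append(cell)
            best = max(best, cell)
        prev = curr
    return best
```

Every cell of the query-by-target matrix is visited once, which is why aligning one query against each sequence of a large database becomes expensive and why a prior filtering step is attractive.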
Using the structure-function linkage database to characterize functional domains in enzymes.
Brown, Shoshana; Babbitt, Patricia
2014-12-12
The Structure-Function Linkage Database (SFLD; http://sfld.rbvi.ucsf.edu/) is a Web-accessible database designed to link enzyme sequence, structure, and functional information. This unit describes the protocols by which a user may query the database to predict the function of uncharacterized enzymes and to correct misannotated functional assignments. The information in this unit is especially useful in helping a user discriminate functional capabilities of a sequence that is only distantly related to characterized sequences in publicly available databases. Copyright © 2014 John Wiley & Sons, Inc.
Specialized microbial databases for inductive exploration of microbial genome sequences
Fang, Gang; Ho, Christine; Qiu, Yaowu; Cubas, Virginie; Yu, Zhou; Cabau, Cédric; Cheung, Frankie; Moszer, Ivan; Danchin, Antoine
2005-01-01
Background The enormous amount of genome sequence data asks for user-oriented databases to manage sequences and annotations. Queries must include search tools permitting function identification through exploration of related objects. Methods The GenoList package for collecting and mining microbial genome databases has been rewritten using MySQL as the database management system. Functions that were not available in MySQL, such as nested subquery, have been implemented. Results Inductive reasoning in the study of genomes starts from "islands of knowledge", centered around genes with some known background. With this concept of "neighborhood" in mind, a modified version of the GenoList structure has been used for organizing sequence data from prokaryotic genomes of particular interest in China. GenoChore, a set of 17 specialized end-user-oriented microbial databases (including one instance of Microsporidia, Encephalitozoon cuniculi, a member of Eukarya) has been made publicly available. These databases allow the user to browse genome sequence and annotation data using standard queries. In addition they provide a weekly update of searches against the world-wide protein sequences data libraries, allowing one to monitor annotation updates on genes of interest. Finally, they allow users to search for patterns in DNA or protein sequences, taking into account a clustering of genes into formal operons, as well as providing extra facilities to query sequences using predefined sequence patterns. Conclusion This growing set of specialized microbial databases organize data created by the first Chinese bacterial genome programs (ThermaList, Thermoanaerobacter tencongensis, LeptoList, with two different genomes of Leptospira interrogans and SepiList, Staphylococcus epidermidis) associated to related organisms for comparison. PMID:15698474
Brandon Schlautman; Vera Pfeiffer; Juan Zalapa; Johanne Brunet
2014-01-01
Numerous microsatellite markers were developed for Aquilegia formosa from sequences deposited within the Expressed Sequence Tag (EST), Genomic Survey Sequence (GSS), and Nucleotide databases in NCBI. Microsatellites (SSRs) were identified and primers were designed for 9 SSR containing sequences in the Nucleotide database, 3803 sequences in the EST...
Reference System of DNA and Protein Sequences on CD-ROM
NASA Astrophysics Data System (ADS)
Nasu, Hisanori; Ito, Toshiaki
DNASIS-DBREF31 is a database of DNA and protein sequences on optical compact disc read-only memory (CD-ROM), developed and commercialized by Hitachi Software Engineering Co., Ltd. Both nucleic acid base sequences and protein amino acid sequences can be retrieved from a single CD-ROM. Existing databases are offered as on-line services, on floppy disks, or on magnetic tape, each of which has drawbacks in usability or storage capacity. DNASIS-DBREF31 adopts CD-ROM as the database medium to provide mass storage and personal use of the database.
Transterm—extended search facilities and improved integration with other databases
Jacobs, Grant H.; Stockwell, Peter A.; Tate, Warren P.; Brown, Chris M.
2006-01-01
Transterm has now been publicly available for >10 years. Major changes have been made since its last description in this database issue in 2002. The current database provides data for key regions of mRNA sequences, a curated database of mRNA motifs and tools to allow users to investigate their own motifs or mRNA sequences. The key mRNA regions database is derived computationally from Genbank. It contains 3′ and 5′ flanking regions, the initiation and termination signal context and coding sequence for annotated CDS features from Genbank and RefSeq. The database is non-redundant, enabling summary files and statistics to be prepared for each species. Advances include providing extended search facilities, the database may now be searched by BLAST in addition to regular expressions (patterns) allowing users to search for motifs such as known miRNA sequences, and the inclusion of RefSeq data. The database contains >40 motifs or structural patterns important for translational control. In this release, patterns from UTRsite and Rfam are also incorporated with cross-referencing. Users may search their sequence data with Transterm or user-defined patterns. The system is accessible at . PMID:16381889
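Searching an mRNA sequence with a regular expression, as this abstract describes, might look like the following minimal sketch. It does not use Transterm itself. The helper name find_motif is hypothetical, and the example pattern (the polyadenylation signal AAUAAA and its common variant AUUAAA) is chosen only for illustration rather than taken from the Transterm motif collection.

```python
import re


def find_motif(mrna, pattern):
    """Return (0-based start, matched text) for every motif occurrence.

    The input may be written as DNA or RNA; it is normalised to upper-case
    RNA first. A lookahead is used so that overlapping matches are reported.
    """
    seq = mrna.upper().replace("T", "U")
    regex = re.compile(f"(?=({pattern}))")
    return [(m.start(), m.group(1)) for m in regex.finditer(seq)]


if __name__ == "__main__":
    # Illustrative pattern: AAUAAA or AUUAAA.
    for start, hit in find_motif("GGAAUAAACCAUUAAAGG", "A[AU]UAAA"):
        print(start, hit)
```

A character class such as [AU] is what makes a pattern more expressive than a plain substring search, since one expression covers several sequence variants.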
NASA Astrophysics Data System (ADS)
Ignat, V.
2016-08-01
Advanced industrial countries are affected by technology theft. German industry loses more than 50 billion euros annually. The main causes are industrial espionage and the fraudulent copying of patents and industrial products. Many Asian countries profit from this, saving up to 65% of production costs. Most affected are small and medium-sized enterprises, which do not have sufficient economic power to assert themselves against some powerful countries. International organizations such as Interpol and the World Customs Organization (WCO) work together to combat international economic crime. Protection can be achieved by registering patents or by using specific technical methods for verifying product originality. More suitable protection methods have been developed, such as holograms, magnetic stripes, barcodes, CE marking, digital watermarks, DNA or nanotechnologies, security labels, radio frequency identification, micro color codes, matrix codes and cryptographic encodings. The automotive industry has developed the method “Manufacturers against Product Piracy”. A sticker on the package identifies original products and carries a verifiable Data Matrix barcode. The code can be read with a smartphone camera. The smartphone is connected via the Internet to a database in which the identification numbers of the original parts are stored.
Decelle, Johan; Romac, Sarah; Stern, Rowena F; Bendif, El Mahdi; Zingone, Adriana; Audic, Stéphane; Guiry, Michael D; Guillou, Laure; Tessier, Désiré; Le Gall, Florence; Gourvil, Priscillia; Dos Santos, Adriana L; Probert, Ian; Vaulot, Daniel; de Vargas, Colomban; Christen, Richard
2015-11-01
Photosynthetic eukaryotes have a critical role as the main producers in most ecosystems of the biosphere. The ongoing environmental metabarcoding revolution opens the perspective for holistic ecosystems biological studies of these organisms, in particular the unicellular microalgae that often lack distinctive morphological characters and have complex life cycles. To interpret environmental sequences, metabarcoding necessarily relies on taxonomically curated databases containing reference sequences of the targeted gene (or barcode) from identified organisms. To date, no such reference framework exists for photosynthetic eukaryotes. In this study, we built the PhytoREF database that contains 6490 plastidial 16S rDNA reference sequences that originate from a large diversity of eukaryotes representing all known major photosynthetic lineages. We compiled 3333 amplicon sequences available from public databases and 879 sequences extracted from plastidial genomes, and generated 411 novel sequences from cultured marine microalgal strains belonging to different eukaryotic lineages. A total of 1867 environmental Sanger 16S rDNA sequences were also included in the database. Stringent quality filtering and a phylogeny-based taxonomic classification were applied for each 16S rDNA sequence. The database mainly focuses on marine microalgae, but sequences from land plants (representing half of the PhytoREF sequences) and freshwater taxa were also included to broaden the applicability of PhytoREF to different aquatic and terrestrial habitats. PhytoREF, accessible via a web interface (http://phytoref.fr), is a new resource in molecular ecology to foster the discovery, assessment and monitoring of the diversity of photosynthetic eukaryotes using high-throughput sequencing. © 2015 John Wiley & Sons Ltd.
Benson, Dennis A.; Karsch-Mizrachi, Ilene; Lipman, David J.; Ostell, James; Wheeler, David L.
2008-01-01
GenBank (R) is a comprehensive database that contains publicly available nucleotide sequences for more than 260 000 named organisms, obtained primarily through submissions from individual laboratories and batch submissions from large-scale sequencing projects. Most submissions are made using the web-based BankIt or standalone Sequin programs and accession numbers are assigned by GenBank staff upon receipt. Daily data exchange with the European Molecular Biology Laboratory Nucleotide Sequence Database in Europe and the DNA Data Bank of Japan ensures worldwide coverage. GenBank is accessible through NCBI's retrieval system, Entrez, which integrates data from the major DNA and protein sequence databases along with taxonomy, genome, mapping, protein structure and domain information, and the biomedical journal literature via PubMed. BLAST provides sequence similarity searches of GenBank and other sequence databases. Complete bimonthly releases and daily updates of the GenBank database are available by FTP. To access GenBank and its related retrieval and analysis services, begin at the NCBI Homepage: www.ncbi.nlm.nih.gov PMID:18073190
RECOVIR Software for Identifying Viruses
NASA Technical Reports Server (NTRS)
Chakravarty, Sugoto; Fox, George E.; Zhu, Dianhui
2013-01-01
Most single-stranded RNA (ssRNA) viruses mutate rapidly to generate a large number of strains with highly divergent capsid sequences. Determining the capsid residues or nucleotides that uniquely characterize these strains is critical in understanding the strain diversity of these viruses. RECOVIR (an acronym for "recognize viruses") software predicts the strains of some ssRNA viruses from their limited sequence data. Novel phylogenetic-tree-based databases of protein or nucleic acid residues that uniquely characterize these virus strains are created. Strains of input virus sequences (partial or complete) are predicted through residue-wise comparisons with the databases. RECOVIR uses unique characterizing residues to identify automatically strains of partial or complete capsid sequences of picorna and caliciviruses, two of the most highly diverse ssRNA virus families. Partition-wise comparisons of the database residues with the corresponding residues of more than 300 complete and partial sequences of these viruses resulted in correct strain identification for all of these sequences. This study shows the feasibility of creating databases of hitherto unknown residues uniquely characterizing the capsid sequences of two of the most highly divergent ssRNA virus families. These databases enable automated strain identification from partial or complete capsid sequences of these human and animal pathogens.
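The residue-wise comparison described above can be illustrated with a minimal sketch. It is not the RECOVIR implementation: the signature table, strain names and positions below are hypothetical, and the sketch assumes the input sequence is already aligned to the same coordinates as the signature table.

```python
from typing import Dict, Optional

# Hypothetical signature table: strain -> {alignment position: characterizing residue}.
SIGNATURES: Dict[str, Dict[int, str]] = {
    "strain_A": {2: "K", 5: "T"},
    "strain_B": {2: "R", 5: "S"},
}


def predict_strain(aligned_seq: str,
                   signatures: Dict[str, Dict[int, str]] = SIGNATURES) -> Optional[str]:
    """Return the strain whose characterizing residues agree best with the sequence.

    Positions beyond the end of a partial sequence are skipped.
    Returns None when no signature residue matches at all.
    """
    best, best_score = None, 0
    for strain, signature in signatures.items():
        score = sum(
            1
            for pos, residue in signature.items()
            if pos < len(aligned_seq) and aligned_seq[pos] == residue
        )
        if score > best_score:  # strict comparison: the first strain wins ties
            best, best_score = strain, score
    return best
```

Because missing positions are skipped rather than penalized, a partial sequence can still be assigned as long as it covers at least one characterizing position.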
Tang, Shi-Huan; Shen, Dan; Yang, Hong-Jun
2017-08-24
This study analyzed the composition rules of oral prescriptions for the treatment of headache, stomachache and dysmenorrhea recorded in the National Standard for Chinese Patent Drugs (NSCPD), enacted by the Ministry of Public Health of China, and compared them to better understand pain treatment in different regions of the human body. The NSCPD database was constructed in 2014. Prescriptions treating the three pain-related diseases were searched and screened from the database. Data mining methods such as association rules analysis and the complex system entropy method, integrated in the data mining software Traditional Chinese Medicine Inheritance Support System (TCMISS), were then applied to process the data. The 25 most frequently used drugs in the treatment of each disease were selected, and 51, 33 and 22 core combinations treating headache, stomachache and dysmenorrhea, respectively, were mined. The composition rules of the oral prescriptions for treating headache, stomachache and dysmenorrhea recorded in the NSCPD have been summarized. Although there were similarities between them, formulas varied according to the location of pain. The results can serve as evidence and a reference for clinical treatment and new drug development.
Avalanche for shape and feature-based virtual screening with 3D alignment
NASA Astrophysics Data System (ADS)
Diller, David J.; Connell, Nancy D.; Welsh, William J.
2015-11-01
This report introduces a new ligand-based virtual screening tool called Avalanche that incorporates both shape- and feature-based comparison with three-dimensional (3D) alignment between the query molecule and test compounds residing in a chemical database. Avalanche proceeds in two steps. The first step is an extremely rapid shape/feature based comparison which is used to narrow the focus from potentially millions or billions of candidate molecules and conformations to a more manageable number that are then passed to the second step. The second step is a detailed yet still rapid 3D alignment of the remaining candidate conformations to the query conformation. Using the 3D alignment, these remaining candidate conformations are scored, re-ranked and presented to the user as the top hits for further visualization and evaluation. To provide further insight into the method, the results from two prospective virtual screens are presented which show the ability of Avalanche to identify hits from chemical databases that would likely be missed by common substructure-based or fingerprint-based search methods. The Avalanche method is extended to enable patent landscaping, i.e., structural refinements to improve the patentability of hits for deployment in drug discovery campaigns.
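The two-step structure described above (a cheap filter over everything, then a detailed re-scoring of the survivors) can be sketched generically. This is only an illustration of the funnel, not Avalanche itself: the function names, the `keep` and `top` parameters and the toy scoring functions are assumptions, and real shape/feature comparison and 3D alignment are far more involved.

```python
from typing import Callable, List, Sequence, Tuple

Candidate = Tuple[float, ...]  # hypothetical stand-in for a molecular conformation


def coarse_score(query: Candidate, cand: Candidate) -> float:
    """Toy cheap score: compares only the first descriptor."""
    return -abs(query[0] - cand[0])


def fine_score(query: Candidate, cand: Candidate) -> float:
    """Toy detailed score: negative squared distance over all descriptors."""
    return -sum((a - b) ** 2 for a, b in zip(query, cand))


def two_step_screen(
    query: Candidate,
    candidates: Sequence[Candidate],
    coarse: Callable[[Candidate, Candidate], float] = coarse_score,
    fine: Callable[[Candidate, Candidate], float] = fine_score,
    keep: int = 100,
    top: int = 10,
) -> List[Tuple[float, Candidate]]:
    # Step 1: rank every candidate with the cheap score and keep a shortlist.
    shortlist = sorted(candidates, key=lambda c: coarse(query, c), reverse=True)[:keep]
    # Step 2: re-score only the shortlist with the detailed score and re-rank.
    rescored = sorted(((fine(query, c), c) for c in shortlist),
                      key=lambda pair: pair[0], reverse=True)
    return rescored[:top]
```

The design point is that the expensive function is evaluated at most `keep` times, regardless of how many candidates the database holds.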
Use status and metabolism of realgar in Chinese patent medicine.
Li, Yongfang; Wang, Da; Xu, Yuanyuan; Liu, Boying; Zheng, Yi; Yang, Boyi; Fan, Shujun; Zhi, Xueyuan; Zheng, Quanmei; Sun, Guifan
2015-04-08
Realgar is widely used in combination with other herbs as Chinese patent medicine to treat a wide range of diseases in China. It is also a well known arsenical toxicant. Chronic arsenic poisoning events caused by long-term usage of realgar-containing medicines have been reported in the literature. Given the paradoxical role of realgar, a comprehensive outline of its usage status in Chinese patent medicine might provide baseline data for evaluating its toxicological risks in populations. Unfortunately, the relevant information is limited. Also, the metabolic process after intake of realgar-containing medicine in humans is poorly understood. The Traditional Chinese Patent Medicine Prescription Database was reviewed to obtain information on the usage status of realgar. Realgar powder was dissolved in solutions of different pH values (1, 3, 5, 7, 9 and 11) to determine the soluble arsenic concentrations from realgar. Ten volunteers aged 24-26 years were recruited to take four pills of Niu Huang Jie Du Pian (NHJDP), a very common Chinese patent medicine containing realgar, to analyze arsenic metabolism after exposure to realgar-containing medicine. The four pills were taken according to the medical instructions. Concentrations of soluble arsenic from realgar and urinary arsenic metabolites in humans were determined by hydride generation atomic absorption spectrometry. A total of 191 (2.25%) realgar-containing traditional Chinese patent medicines were obtained from the database, and almost 86.91% of them were for oral application. Seventy-three (38.22%) medicines were found to be available for children. The mass fraction of arsenic in realgar-containing medicines ranged from 0.11% to 27.52%. According to the medical instructions, the average daily arsenic intake ranged from 0.47 to 2895.53 mg. Nearly 86% of the medicines had a daily arsenic intake >10 mg. Only inorganic arsenic (iAs) was detected from realgar in the dissolution experiment, and the levels of soluble iAs increased with pH value.
After intake of NHJDP, arsenic excretion in urine significantly increased, with a maximum excretion of iAs and monomethylarsonic acid at 6h post-ingestion and a peak excretion of dimethylarsinic acid at 9h post-ingestion. Arsenic methylation capacity decreased after intake of NHJDP. Females showed a more efficient arsenic methylation process than males. Realgar is widely used in traditional Chinese medicine. The arsenic solubility from realgar may be enhanced under alkaline conditions. The levels of urinary arsenic metabolites significantly increased while the arsenic methylation capacity significantly decreased after intake of realgar-containing medicine, which may suggest that a potential health hazard exists if people use arsenical medicines long term. Copyright © 2015. Published by Elsevier Ireland Ltd.
O-GLYCBASE Version 3.0: a revised database of O-glycosylated proteins.
Hansen, J E; Lund, O; Nilsson, J; Rapacki, K; Brunak, S
1998-01-01
O-GLYCBASE is a revised database of information on glycoproteins and their O-linked glycosylation sites. Entries are compiled and revised from the literature, and from the sequence databases. Entries include information about species, sequence, glycosylation sites and glycan type and are fully cross-referenced. Compared to version 2.0 the number of entries has increased by 20%. Sequence logos displaying the acceptor specificity patterns for the GalNAc, mannose and GlcNAc transferases are shown. The O-GLYCBASE database is available through the WWW at http://www.cbs.dtu.dk/databases/OGLYCBASE/ PMID:9399880
Sphingosine kinase inhibitors: a review of patent literature (2006-2015).
Lynch, Kevin R; Thorpe, S Brandon; Santos, Webster L
2016-12-01
Sphingosine kinase (SphK1 & SphK2) is the sole source of the pleiotropic lipid mediator, sphingosine-1-phosphate (S1P). S1P has been implicated in a variety of diseases such as cancer, Alzheimer's disease, sickle cell disease and fibrosis and thus the biosynthetic route to S1P is a logical target for drug discovery. Areas covered: In this review, the authors consider the SphK inhibitor patent literature from 2006-2016 Q1 with the emphasis on composition of matter utility patents. The Espacenet database was queried with the search term 'sphingosine AND kinase' to identify relevant literature. Expert opinion: Early inhibitor discovery focused on SphK1 with a bias towards oncology indications. Structurally, the reported inhibitors occupy the sphingosine 'J-shaped' binding pocket. The lack of cytotoxicity with improved SphK1 inhibitors raises doubt about the enzyme as an oncology target. SphK2 inhibitors are featured in more recent patent applications. Interestingly, inhibition or gene 'knockout' of SphK1 and of SphK2 have opposing effects on circulating S1P levels: SphK1 inhibition/gene ablation decreases, while SphK2 inhibition/gene ablation increases, blood S1P. As understanding of S1P's physiological roles increases and more drug-like SphK inhibitors emerge, inhibiting one or both SphK isotypes could provide unique strategies for treating disease.
Gacesa, Ranko; Zucko, Jurica; Petursdottir, Solveig K; Gudmundsdottir, Elisabet Eik; Fridjonsson, Olafur H; Diminic, Janko; Long, Paul F; Cullum, John; Hranueli, Daslav; Hreggvidsson, Gudmundur O; Starcevic, Antonio
2017-06-01
The MEGGASENSE platform constructs relational databases of DNA or protein sequences. The default functional analysis uses 14 106 hidden Markov model (HMM) profiles based on sequences in the KEGG database. The Solr search engine allows sophisticated queries and a BLAST search function is also incorporated. These standard capabilities were used to generate the SCATT database from the predicted proteome of Streptomyces cattleya. The implementation of a specialised metagenome database (AMYLOMICS) for bioprospecting of carbohydrate-modifying enzymes is described. In addition to standard assembly of reads, a novel 'functional' assembly was developed, in which screening of reads with the HMM profiles occurs before the assembly. The AMYLOMICS database incorporates additional HMM profiles for carbohydrate-modifying enzymes and it is illustrated how the combination of HMM and BLAST analyses helps identify interesting genes. A variety of different proteome and metagenome databases have been generated by MEGGASENSE.
SalmonDB: a bioinformatics resource for Salmo salar and Oncorhynchus mykiss
Di Génova, Alex; Aravena, Andrés; Zapata, Luis; González, Mauricio; Maass, Alejandro; Iturra, Patricia
2011-01-01
SalmonDB is a new multiorganism database containing EST sequences from Salmo salar and Oncorhynchus mykiss and the whole genome sequences of Danio rerio, Gasterosteus aculeatus, Tetraodon nigroviridis, Oryzias latipes and Takifugu rubripes, built with core components from the GMOD project, the GOPArc system and the BioMart project. The information provided by this resource includes Gene Ontology terms, metabolic pathways, SNP prediction, CDS prediction, ortholog prediction, several precalculated BLAST searches and domains. It also provides a BLAST server for matching user-provided sequences to any of the databases and an advanced query tool (BioMart) that allows easy browsing of EST databases with user-defined criteria. These tools make the SalmonDB database a valuable resource for researchers searching for transcripts and genomic information regarding S. salar and other salmonid species. The database is expected to grow in the near future, particularly with the S. salar genome sequencing project. Database URL: http://genomicasalmones.dim.uchile.cl/ PMID:22120661
Protein sequence annotation in the genome era: the annotation concept of SWISS-PROT+TREMBL.
Apweiler, R; Gateau, A; Contrino, S; Martin, M J; Junker, V; O'Donovan, C; Lang, F; Mitaritonna, N; Kappus, S; Bairoch, A
1997-01-01
SWISS-PROT is a curated protein sequence database which strives to provide a high level of annotation, a minimal level of redundancy and a high level of integration with other databases. Ongoing genome sequencing projects have dramatically increased the number of protein sequences to be incorporated into SWISS-PROT. Since we do not want to dilute the quality standards of SWISS-PROT by incorporating sequences without proper sequence analysis and annotation, we cannot speed up the incorporation of new incoming data indefinitely. However, as we also want to make the sequences available as fast as possible, we introduced TREMBL (TRanslation of EMBL nucleotide sequence database), a supplement to SWISS-PROT. TREMBL consists of computer-annotated entries in SWISS-PROT format derived from the translation of all coding sequences (CDS) in the EMBL nucleotide sequence database, except for CDS already included in SWISS-PROT. While TREMBL is already of immense value, its computer-generated annotation does not match the quality of SWISS-PROT's. The main difference is in the protein functional information attached to sequences. With this in mind, we are dedicating substantial effort to developing and applying computer methods to enhance the functional information attached to TREMBL entries.
A 5.8S nuclear ribosomal RNA gene sequence database: applications to ecology and evolution
NASA Technical Reports Server (NTRS)
Cullings, K. W.; Vogler, D. R.
1998-01-01
We compiled a 5.8S nuclear ribosomal gene sequence database for animals, plants, and fungi using both newly generated and GenBank sequences. We demonstrate the utility of this database as an internal check to determine whether the target organism and not a contaminant has been sequenced, as a diagnostic tool for ecologists and evolutionary biologists to determine the placement of asexual fungi within larger taxonomic groups, and as a tool to help identify fungi that form ectomycorrhizae.
E-MSD: an integrated data resource for bioinformatics.
Velankar, S; McNeil, P; Mittard-Runte, V; Suarez, A; Barrell, D; Apweiler, R; Henrick, K
2005-01-01
The Macromolecular Structure Database (MSD) group (http://www.ebi.ac.uk/msd/) continues to enhance the quality and consistency of macromolecular structure data in the worldwide Protein Data Bank (wwPDB) and to work towards the integration of various bioinformatics data resources. One of the major obstacles to the improved integration of structural databases such as MSD and sequence databases like UniProt is the absence of up to date and well-maintained mapping between corresponding entries. We have worked closely with the UniProt group at the EBI to clean up the taxonomy and sequence cross-reference information in the MSD and UniProt databases. This information is vital for the reliable integration of the sequence family databases such as Pfam and Interpro with the structure-oriented databases of SCOP and CATH. This information has been made available to the eFamily group (http://www.efamily.org.uk/) and now forms the basis of the regular interchange of information between the member databases (MSD, UniProt, Pfam, Interpro, SCOP and CATH). This exchange of annotation information has enriched the structural information in the MSD database with annotation from wider sequence-oriented resources. This work was carried out under the 'Structure Integration with Function, Taxonomy and Sequences (SIFTS)' initiative (http://www.ebi.ac.uk/msd-srv/docs/sifts) in the MSD group.
Martin, Stanton L; Blackmon, Barbara P; Rajagopalan, Ravi; Houfek, Thomas D; Sceeles, Robert G; Denn, Sheila O; Mitchell, Thomas K; Brown, Douglas E; Wing, Rod A; Dean, Ralph A
2002-01-01
We have created a federated database for genome studies of Magnaporthe grisea, the causal agent of rice blast disease, by integrating end sequence data from BAC clones, genetic marker data and BAC contig assembly data. A library of 9216 BAC clones providing >25-fold coverage of the entire genome was end sequenced and fingerprinted by HindIII digestion. The Image/FPC software package was then used to generate an assembly of 188 contigs covering >95% of the genome. The database contains the results of this assembly integrated with hybridization data of genetic markers to the BAC library. AceDB was used for the core database engine and a MySQL relational database, populated with numerical representations of BAC clones within FPC contigs, was used to create appropriately scaled images. The database is being used to facilitate sequencing efforts. The database also gives researchers mapping known genes or other sequences of interest rapid and easy access to the fundamental organization of the M. grisea genome. This database, MagnaportheDB, can be accessed on the web at http://www.cals.ncsu.edu/fungal_genomics/mgdatabase/int.htm.
Jagtap, Pratik; Goslinga, Jill; Kooren, Joel A; McGowan, Thomas; Wroblewski, Matthew S; Seymour, Sean L; Griffin, Timothy J
2013-04-01
Large databases (>10^6 sequences) used in metaproteomic and proteogenomic studies present challenges in matching peptide sequences to MS/MS data using database-search programs. Most notably, strict filtering to avoid false-positive matches leads to more false negatives, thus constraining the number of peptide matches. To address this challenge, we developed a two-step method wherein matches derived from a primary search against a large database were used to create a smaller subset database. The second search was performed against a target-decoy version of this subset database merged with a host database. High confidence peptide sequence matches were then used to infer protein identities. Applying our two-step method for both metaproteomic and proteogenomic analysis resulted in twice the number of high confidence peptide sequence matches in each case, as compared to the conventional one-step method. The two-step method captured almost all of the same peptides matched by the one-step method, with a majority of the additional matches being false negatives from the one-step method. Furthermore, the two-step method improved results regardless of the database search program used. Our results show that our two-step method maximizes the peptide matching sensitivity for applications requiring large databases, especially valuable for proteogenomics and metaproteomics studies. © 2013 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
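The database-reduction logic of the two-step method can be sketched as follows. This is a toy illustration under strong assumptions: real search programs score MS/MS spectra against candidate peptides, whereas here a plain substring test stands in for the search engine, reversed sequences stand in for decoys, and all function names and identifiers are hypothetical.

```python
from typing import Dict, List, Tuple


def search(peptides: List[str], database: Dict[str, str]) -> Dict[str, List[str]]:
    """Toy 'search engine': map each peptide to the ids of proteins containing it."""
    return {
        pep: [pid for pid, seq in database.items() if pep in seq]
        for pep in peptides
    }


def two_step_search(
    peptides: List[str],
    large_db: Dict[str, str],
    host_db: Dict[str, str],
) -> Tuple[Dict[str, str], Dict[str, List[str]]]:
    # Step 1: primary search against the large database; every protein with
    # at least one match goes into the smaller subset database.
    first = search(peptides, large_db)
    subset = {pid: large_db[pid] for hits in first.values() for pid in hits}

    # Step 2: merge the subset with the host database, add reversed-sequence
    # decoys (an assumed decoy strategy), and search again.
    targets = {**subset, **host_db}
    decoys = {"DECOY_" + pid: seq[::-1] for pid, seq in targets.items()}
    second = search(peptides, {**targets, **decoys})
    return subset, second
```

Matches to `DECOY_` entries in the second search would be used to estimate the false-discovery rate; that filtering step is omitted here for brevity.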
Structator: fast index-based search for RNA sequence-structure patterns
2011-01-01
Background: The secondary structure of RNA molecules is intimately related to their function and often more conserved than the sequence. Hence, the important task of searching databases for RNAs requires matching sequence-structure patterns. Unfortunately, current tools for this task have, in the best case, a running time that is only linear in the size of sequence databases. Furthermore, established index data structures for fast sequence matching, like suffix trees or arrays, cannot benefit from the complementarity constraints introduced by the secondary structure of RNAs. Results: We present a novel method and readily applicable software for time efficient matching of RNA sequence-structure patterns in sequence databases. Our approach is based on affix arrays, a recently introduced index data structure, preprocessed from the target database. Affix arrays support bidirectional pattern search, which is required for efficiently handling the structural constraints of the pattern. Structural patterns like stem-loops can be matched inside out, such that the loop region is matched first and then the pairing bases on the boundaries are matched consecutively. This allows base pairing information to be exploited for search space reduction and leads to an expected running time that is sublinear in the size of the sequence database. The incorporation of a new chaining approach in the search of RNA sequence-structure patterns enables the description of molecules folding into complex secondary structures with multiple ordered patterns. The chaining approach removes spurious matches from the set of intermediate results, in particular of patterns with little specificity. In benchmark experiments on the Rfam database, our method runs up to two orders of magnitude faster than previous methods. Conclusions: The presented method's sublinear expected running time makes it well suited for RNA sequence-structure pattern matching in large sequence databases.
RNA molecules containing several stem-loop substructures can be described by multiple sequence-structure patterns and their matches are efficiently handled by a novel chaining method. Beyond our algorithmic contributions, we provide with Structator a complete and robust open-source software solution for index-based search of RNA sequence-structure patterns. The Structator software is available at http://www.zbh.uni-hamburg.de/Structator. PMID:21619640
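The inside-out matching order described in this abstract (loop first, then pairing bases outward) can be sketched without an index. This is not Structator's algorithm: it uses a plain linear scan instead of affix arrays, so it does not achieve sublinear running time, and the function name, the fixed stem length and the inclusion of G-U wobble pairs are assumptions made for illustration.

```python
from typing import List, Tuple

# Watson-Crick pairs plus G-U wobble pairs (an assumed pairing rule).
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def find_stem_loops(seq: str, loop: str, stem_len: int) -> List[Tuple[int, int]]:
    """Find hairpins made of an exact `loop` enclosed by `stem_len` base pairs.

    Returns half-open (start, end) intervals covering stem + loop + stem.
    """
    hits = []
    pos = seq.find(loop)
    while pos != -1:
        # The loop is matched first; now extend outward one base pair at a time.
        left, right = pos - 1, pos + len(loop)
        paired = 0
        while (paired < stem_len and left >= 0 and right < len(seq)
               and (seq[left], seq[right]) in PAIRS):
            paired += 1
            left -= 1
            right += 1
        if paired == stem_len:
            hits.append((left + 1, right))
        pos = seq.find(loop, pos + 1)
    return hits
```

Extending outward lets a candidate be rejected at the first non-complementary pair, which is the pruning that a bidirectional index can exploit during the search itself.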
Sönksen, Ute Wolff; Christensen, Jens Jørgen; Nielsen, Lisbeth; Hesselbjerg, Annemarie; Hansen, Dennis Schrøder; Bruun, Brita
2010-12-31
Taxonomy and identification of fastidious Gram negatives are evolving and challenging. We compared identifications achieved with the Vitek 2 Neisseria-Haemophilus (NH) card and partial 16S rRNA gene sequence (526 bp stretch) analysis with identifications obtained with extensive phenotypic characterization using 100 fastidious Gram negative bacteria. Seventy-five strains represented 21 of the 26 taxa included in the Vitek 2 NH database and 25 strains represented related species not included in the database. Of the 100 strains, 31 were the type strains of the species. Vitek 2 NH identification results: 48 of 75 database strains were correctly identified, 11 strains gave 'low discrimination', seven strains were unidentified, and nine strains were misidentified. Identification of 25 non-database strains resulted in 14 strains incorrectly identified as belonging to species in the database. Partial 16S rRNA gene sequence analysis results: For 76 strains phenotypic and sequencing identifications were identical, for 23 strains the sequencing identifications were either probable or possible, and for one strain only the genus was confirmed. Thus, the Vitek 2 NH system identifies most of the commonly occurring species included in the database. Some strains of rarely occurring species and strains of non-database species closely related to database species cause problems. Partial 16S rRNA gene sequence analysis performs well, but does not always suffice, additional phenotypical characterization being useful for final identification.
Sönksen, Ute Wolff; Christensen, Jens Jørgen; Nielsen, Lisbeth; Hesselbjerg, Annemarie; Hansen, Dennis Schrøder; Bruun, Brita
2010-01-01
Taxonomy and identification of fastidious Gram negatives are evolving and challenging. We compared identifications achieved with the Vitek 2 Neisseria-Haemophilus (NH) card and partial 16S rRNA gene sequence (526 bp stretch) analysis with identifications obtained with extensive phenotypic characterization using 100 fastidious Gram negative bacteria. Seventy-five strains represented 21 of the 26 taxa included in the Vitek 2 NH database and 25 strains represented related species not included in the database. Of the 100 strains, 31 were the type strains of the species. Vitek 2 NH identification results: 48 of 75 database strains were correctly identified, 11 strains gave 'low discrimination', seven strains were unidentified, and nine strains were misidentified. Identification of 25 non-database strains resulted in 14 strains incorrectly identified as belonging to species in the database. Partial 16S rRNA gene sequence analysis results: For 76 strains phenotypic and sequencing identifications were identical, for 23 strains the sequencing identifications were either probable or possible, and for one strain only the genus was confirmed. Thus, the Vitek 2 NH system identifies most of the commonly occurring species included in the database. Some strains of rarely occurring species and strains of non-database species closely related to database species cause problems. Partial 16S rRNA gene sequence analysis performs well, but does not always suffice, additional phenotypical characterization being useful for final identification. PMID:21347215
SinEx DB: a database for single exon coding sequences in mammalian genomes.
Jorquera, Roddy; Ortiz, Rodrigo; Ossandon, F; Cárdenas, Juan Pablo; Sepúlveda, Rene; González, Carolina; Holmes, David S
2016-01-01
Eukaryotic genes are typically interrupted by intragenic, noncoding sequences termed introns. However, some genes lack introns in their coding sequence (CDS) and are generally known as 'single exon genes' (SEGs). In this work, a SEG is defined as a nuclear, protein-coding gene that lacks introns in its CDS. Whereas many public databases of eukaryotic multi-exon genes are available, there are only two specialized databases for SEGs. The present work addresses the need for a more extensive and diverse database by creating SinEx DB, a publicly available, searchable database of predicted SEGs from 10 completely sequenced mammalian genomes including human. SinEx DB houses the DNA and protein sequence information of these SEGs and includes their functional predictions (KOG) and the relative distribution of these functions within species. The information is stored in a relational database built with MySQL Server 5.1.33 and the complete dataset of SEG sequences and their functional predictions is available for downloading. SinEx DB can be interrogated by: (i) a browsable phylogenetic schema, (ii) BLAST searches against the in-house SinEx DB of SEGs and (iii) an advanced search mode in which the database can be searched by key words and any combination of searches by species and predicted functions. SinEx DB provides a rich source of information for advancing our understanding of the evolution and function of SEGs. Database URL: www.sinex.cl. © The Author(s) 2016. Published by Oxford University Press.
muBLASTP: database-indexed protein sequence search on multicore CPUs.
Zhang, Jing; Misra, Sanchit; Wang, Hao; Feng, Wu-Chun
2016-11-04
The Basic Local Alignment Search Tool (BLAST) is a fundamental program in the life sciences that searches databases for sequences that are most similar to a query sequence. Currently, the BLAST algorithm utilizes a query-indexed approach. Although many approaches suggest that sequence search with a database index can achieve much higher throughput (e.g., BLAT, SSAHA, and CAFE), they cannot deliver the same level of sensitivity as the query-indexed BLAST, i.e., NCBI BLAST, or they can only support nucleotide sequence search, e.g., MegaBLAST. Because query indexing and database indexing pose different challenges and have different characteristics, the existing techniques for query-indexed search cannot be applied to database-indexed search. muBLASTP, a novel database-indexed BLAST for protein sequence search, delivers hits identical to those returned by NCBI BLAST. On Intel Haswell multicore CPUs, for a single query, the single-threaded muBLASTP achieves up to a 4.41-fold speedup for alignment stages, and up to a 1.75-fold end-to-end speedup over single-threaded NCBI BLAST. For a batch of queries, the multithreaded muBLASTP achieves up to a 5.7-fold speedup for alignment stages, and up to a 4.56-fold end-to-end speedup over multithreaded NCBI BLAST. With a newly designed index structure for the protein database and associated optimizations in the BLASTP algorithm, we re-factored the BLASTP algorithm for modern multicore processors, achieving much higher throughput with an acceptable memory footprint for the database index.
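To make the query-index versus database-index distinction concrete, here is a minimal sketch of a database-side k-mer index. It is not the muBLASTP index structure: it uses exact k-mer matches only (BLASTP also seeds on similar "neighboring" words above a score threshold), and the function names and the choice of k=3 are assumptions for illustration.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def build_db_index(db: Dict[str, str], k: int = 3) -> Dict[str, List[Tuple[str, int]]]:
    """Index every k-mer of every database sequence: k-mer -> [(seq_id, offset)].

    Built once and reused for all queries, which is what distinguishes a
    database index from a per-query index.
    """
    index: Dict[str, List[Tuple[str, int]]] = defaultdict(list)
    for seq_id, seq in db.items():
        for i in range(len(seq) - k + 1):
            index[seq[i:i + k]].append((seq_id, i))
    return index


def seed_hits(query: str, index: Dict[str, List[Tuple[str, int]]],
              k: int = 3) -> List[Tuple[str, int, int]]:
    """Return exact-match seeds as (seq_id, query_offset, subject_offset)."""
    hits = []
    for q in range(len(query) - k + 1):
        for seq_id, s in index.get(query[q:q + k], ()):
            hits.append((seq_id, q, s))
    return hits
```

The trade-off the abstract alludes to is visible even here: the index grows with the database rather than with the query, so memory footprint becomes a design constraint.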
Zhang, Xiao-Meng; Li, Fan; Zhang, Bing; Chen, Xiao-Fen; Piao, Jing-Zhu
2018-01-01
The common Aconitum herbs in clinical application mainly include Aconiti Radix (Chuanwu), Aconiti Kusnezoffii Radix (Caowu) and Aconiti Lateralis Radix Praeparata (Fuzi), all of which are toxic. Therefore, the safety of Chinese patent drugs containing Aconitum herbs has become a controversial topic in clinical practice. Using data-mining methods, this study explored the characteristics and causes of adverse drug reactions/events (ADR/ADE) of Chinese patent drugs containing Aconitum, in order to provide pharmacovigilance and rational drug use suggestions for clinical application. Detailed ADR/ADE reports about Chinese patent drugs containing Aconitum herbs were retrieved from the domestic literature databases from 1984 to the present. Information extraction and data mining were conducted using Microsoft Office Excel 2016, Clementine 12.0 and Cytoscape 3.3.0. Finally, 78 detailed ADR/ADE reports involving a total of 30 varieties were included. Of these ADR/ADEs, 92.31% were definitely or probably caused by the Chinese patent drugs containing Aconitum, mostly involving damage to multiple systems/organs with good prognosis, but including 1 death. The incidence of the included ADR/ADEs was associated with various factors such as patient idiosyncrasy, drug toxicity and clinical medication. Patient age was most closely related to ADR/ADEs, and those aged 60 to 69 were more susceptible to ADR/ADEs of Chinese patent drugs containing Aconitum. The probability of ADR/ADEs for drugs containing Chuanwu or Caowu was greater than for those containing Fuzi, and use beyond the instructed dose was the most important potential safety hazard in the clinical medication process.
For the regular and characteristics of ADR/ADEs led by Chinese patent drugs including Aconitum, special attention shall be paid to the elder patients or with the patients with allergies; strictly control the dosage and course of treatment, strengthen the safety medication education to public, and avoid misuse or abuse to ensure rational drug use. Copyright© by the Chinese Pharmaceutical Association.
BLAST and FASTA similarity searching for multiple sequence alignment.
Pearson, William R
2014-01-01
BLAST, FASTA, and other similarity searching programs seek to identify homologous proteins and DNA sequences based on excess sequence similarity. If two sequences share much more similarity than expected by chance, the simplest explanation for the excess similarity is common ancestry, i.e., homology. The most effective similarity searches compare protein sequences, rather than DNA sequences, for sequences that encode proteins, and use expectation values, rather than percent identity, to infer homology. The BLAST and FASTA packages of sequence comparison programs provide programs for comparing protein and DNA sequences to protein databases (the most sensitive searches). Protein and translated-DNA comparisons to protein databases routinely allow evolutionary look-back times from 1 to 2 billion years; DNA:DNA searches are 5-10-fold less sensitive. BLAST and FASTA can be run on popular web sites, but can also be downloaded and installed on local computers. With local installation, target databases can be customized for the sequence data being characterized. With today's very large protein databases, search sensitivity can also be improved by searching smaller comprehensive databases, for example, a complete protein set from an evolutionarily neighboring model organism. By default, BLAST and FASTA use scoring strategies targeted at distant evolutionary relationships; for comparisons involving short domains or queries, or searches that seek relatively close homologs (e.g. mouse-human), shallower scoring matrices will be more effective. Both BLAST and FASTA provide very accurate statistical estimates, which can be used to reliably identify protein sequences that diverged more than 2 billion years ago.
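The role of expectation values, and the reason a smaller database can improve sensitivity, can be sketched with the standard Karlin-Altschul relations E = K·m·n·exp(-λS) and E = m·n·2^(-S'), where S' is the bit score. The sketch below is illustrative only: the function names and the example lengths and parameter values are assumptions, and real programs additionally apply edge-effect corrections to the query and database lengths, which are omitted here.

```python
import math


def bit_score(raw_score, lam, k):
    """Convert a raw alignment score to a bit score: S' = (lambda*S - ln K) / ln 2."""
    return (lam * raw_score - math.log(k)) / math.log(2)


def expect_value(bits, query_len, db_len):
    """Expected number of chance alignments with at least this bit score."""
    return query_len * db_len * 2.0 ** (-bits)


# Hypothetical lengths: the same 50-bit alignment against a large database
# and against a database 1000 times smaller.
e_large = expect_value(50, query_len=300, db_len=1e10)
e_small = expect_value(50, query_len=300, db_len=1e7)
```

Because the expectation value scales linearly with database length, the identical alignment is 1000 times more significant in the smaller database in this example, which is consistent with the abstract's advice to search smaller comprehensive databases.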
The Protein Information Resource: an integrated public resource of functional annotation of proteins
Wu, Cathy H.; Huang, Hongzhan; Arminski, Leslie; Castro-Alvear, Jorge; Chen, Yongxing; Hu, Zhang-Zhi; Ledley, Robert S.; Lewis, Kali C.; Mewes, Hans-Werner; Orcutt, Bruce C.; Suzek, Baris E.; Tsugita, Akira; Vinayaka, C. R.; Yeh, Lai-Su L.; Zhang, Jian; Barker, Winona C.
2002-01-01
The Protein Information Resource (PIR) serves as an integrated public resource of functional annotation of protein data to support genomic/proteomic research and scientific discovery. The PIR, in collaboration with the Munich Information Center for Protein Sequences (MIPS) and the Japan International Protein Information Database (JIPID), produces the PIR-International Protein Sequence Database (PSD), the major annotated protein sequence database in the public domain, containing about 250 000 proteins. To improve protein annotation and the coverage of experimentally validated data, a bibliography submission system is developed for scientists to submit, categorize and retrieve literature information. Comprehensive protein information is available from iProClass, which includes family classification at the superfamily, domain and motif levels, structural and functional features of proteins, as well as cross-references to over 40 biological databases. To provide timely and comprehensive protein data with source attribution, we have introduced a non-redundant reference protein database, PIR-NREF. The database consists of about 800 000 proteins collected from PIR-PSD, SWISS-PROT, TrEMBL, GenPept, RefSeq and PDB, with composite protein names and literature data. To promote database interoperability, we provide XML data distribution and open database schema, and adopt common ontologies. The PIR web site (http://pir.georgetown.edu/) features data mining and sequence analysis tools for information retrieval and functional identification of proteins based on both sequence and annotation information. The PIR databases and other files are also available by FTP (ftp://nbrfa.georgetown.edu/pir_databases). PMID:11752247
Padliya, Neerav D; Garrett, Wesley M; Campbell, Kimberly B; Tabb, David L; Cooper, Bret
2007-11-01
LC-MS/MS has demonstrated potential for detecting plant pathogens. Unlike PCR or ELISA, LC-MS/MS does not require pathogen-specific reagents for the detection of pathogen-specific proteins and peptides. However, the MS/MS approach we and others have explored does require a protein sequence reference database and database-search software to interpret tandem mass spectra. To evaluate the limitations of database composition on pathogen identification, we analyzed proteins from cultured Ustilago maydis, Phytophthora sojae, Fusarium graminearum, and Rhizoctonia solani by LC-MS/MS. When the search database did not contain sequences for a target pathogen, or contained sequences to related pathogens, target pathogen spectra were reliably matched to protein sequences from nontarget organisms, giving an illusion that proteins from nontarget organisms were identified. Our analysis demonstrates that when database-search software is used as part of the identification process, a paradox exists whereby additional sequences needed to detect a wide variety of possible organisms may lead to more cross-species protein matches and misidentification of pathogens.
NASA Patent Abstracts Bibliography: A Continuing Bibliography. Supplement 58
NASA Technical Reports Server (NTRS)
2001-01-01
This report lists reports, articles and other documents recently announced in the NASA STI Database. Several thousand inventions result each year from the aeronautical and space research supported by the National Aeronautics and Space Administration. The inventions having important use in government programs or significant commercial potential are usually patented by NASA. These inventions cover practically all fields of technology and include many that have useful and valuable commercial application. NASA inventions best serve the interests of the United States when their benefits are available to the public. In many instances, the granting of nonexclusive or exclusive licenses for the practice of these inventions may assist in the accomplishment of this objective. This bibliography is published as a service to companies, firms, and individuals seeking new, licensable products for the commercial market. The NASA Patent Abstracts Bibliography is a semiannual NASA publication containing comprehensive abstracts of NASA owned inventions covered by U.S. patents. The citations included in the bibliography were originally published in NASA's Scientific and Technical Aerospace Reports (STAR) and cover STAR announcements made since May 1969. The citations published in this issue cover the period July 2000 through December 2000. This issue includes 10 major subject divisions separated into 76 specific categories and one general category/division. This scheme was devised in 1975 and revised in 1987 in lieu of the 34 category divisions which were utilized in supplements (01) through (06) covering STAR abstracts from May 1969 through January 1974. Each entry consists of a STAR citation accompanied by an abstract and, when appropriate, a key illustration taken from the patent or application for patent. Entries are arranged by subject category in ascending order.
A typical citation and abstract presents the various data elements included in most records cited. This appears after the table of contents.
Huo, Xiao-qian; He, Yu-su; Qiao, Lian-sheng; Sun, Zhi-yi; Zhang, Yan-ling
2014-12-01
The combined application of statins that inhibit HMG-CoA reductase and fibrates that activate PPAR-α can produce a better lipid-lowering effect than either drug alone, but with stronger adverse reactions at the same time. In the treatment of hyperlipidemia, the combined administration of TCMs and HMG-CoA reductase inhibitors shows stable efficacy and fewer adverse reactions, and provides a new option for the combined application of drugs. In this article, pharmacophore technology was used to search the chemical components of TCMs, trace their source herbs, and determine the potential common TCMs that could activate PPAR-α. Because there is no hyperlipidemia-related medication reference in modern TCM classics, to ensure the high safety and efficacy of all selected TCMs, we selected TCMs that are proved to be combined with statins in the World Traditional/Natural Medicine Patent Database, analyzed the corresponding drugs in the pharmacophore results on that basis, and finally obtained common TCMs that can act on PPAR-α and be combined with statins. Specifically, the pharmacophore model was based on eight receptor-ligand complexes of PPAR-α. The Receptor-Ligand Pharmacophore Generation module in the DS program was used to build the model, which was optimized with the Screen Library module to obtain the best sub-pharmacophore, consisting of two hydrogen bond acceptors, three hydrophobic groups and 19 excluded volumes, with an identification effectiveness index value N of 2.82 and a comprehensive evaluation index CAI value of 1.84. The model was used to screen the TCMD database, hitting 5,235 chemical components and 1,193 natural animals and plants, and finally determining 62 TCMs. Through patent retrieval, we found 38 TCMs; after comparing with the virtual screening results, we finally obtained seven TCMs.
Genomics dataset of unidentified disclosed isolates.
Rekadwad, Bhagwan N
2016-09-01
Analysis of DNA sequences is necessary for higher hierarchical classification of organisms. It gives clues about the characteristics of organisms and their taxonomic position. This dataset was chosen to find complexities in the unidentified DNA in the disclosed patents. A total of 17 unidentified DNA sequences were thoroughly analyzed. Quick response (QR) codes were generated, and the AT/GC content of the DNA sequences was analyzed. The QR codes are helpful for quick identification of isolates. AT/GC content is helpful for studying their stability at different temperatures. Additionally, a dataset on cleavage codes and enzyme codes from a restriction digestion study, which is helpful for studies using short DNA sequences, is reported. The dataset disclosed here provides new data for the exploration of unique DNA sequences for evaluation, identification, comparison and analysis.
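The AT/GC content calculation mentioned in this abstract can be sketched in a few lines of Python. This is a generic illustration, not the authors' procedure: the function name `gc_content`, the example sequence, and the choice to ignore ambiguous bases such as N are assumptions.

```python
def gc_content(seq):
    """Return the fraction of G and C among the unambiguous bases of a DNA sequence."""
    bases = [b for b in seq.upper() if b in "ACGT"]
    if not bases:
        raise ValueError("sequence contains no unambiguous A/C/G/T bases")
    gc = sum(1 for b in bases if b in "GC")
    return gc / len(bases)


# Hypothetical sequence; AT content is the complement of GC content.
example = "ATGCGCNNAT"
gc = gc_content(example)
at = 1.0 - gc
```

A higher GC fraction generally corresponds to a higher duplex melting temperature, which is the link to thermal stability that the abstract alludes to.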
Barta, Endre; Sebestyén, Endre; Pálfy, Tamás B.; Tóth, Gábor; Ortutay, Csaba P.; Patthy, László
2005-01-01
DoOP (http://doop.abc.hu/) is a database of eukaryotic promoter sequences (upstream regions) aiming to facilitate the recognition of regulatory sites conserved between species. The annotated first exons of human and Arabidopsis thaliana genes were used as queries in BLAST searches to collect the most closely related orthologous first exon sequences from Chordata and Viridiplantae species. Up to 3000 bp DNA segments upstream from these first exons constitute the clusters in the chordate and plant sections of the Database of Orthologous Promoters. Release 1.0 of DoOP contains 21 061 chordate clusters from 284 different species and 7548 plant clusters from 269 different species. The database can be used to find and retrieve promoter sequences of a given gene from various species and it is also suitable to see the most trivial conserved sequence blocks in the orthologous upstream regions. Users can search DoOP with either sequence or text (annotation) to find promoter clusters of various genes. In addition to the sequence data, the positions of the conserved sequence blocks derived from multiple alignments, the positions of repetitive elements and the positions of transcription start sites known from the Eukaryotic Promoter Database (EPD) can be viewed graphically. PMID:15608291
Barta, Endre; Sebestyén, Endre; Pálfy, Tamás B; Tóth, Gábor; Ortutay, Csaba P; Patthy, László
2005-01-01
DoOP (http://doop.abc.hu/) is a database of eukaryotic promoter sequences (upstream regions) aiming to facilitate the recognition of regulatory sites conserved between species. The annotated first exons of human and Arabidopsis thaliana genes were used as queries in BLAST searches to collect the most closely related orthologous first exon sequences from Chordata and Viridiplantae species. Up to 3000 bp DNA segments upstream from these first exons constitute the clusters in the chordate and plant sections of the Database of Orthologous Promoters. Release 1.0 of DoOP contains 21,061 chordate clusters from 284 different species and 7548 plant clusters from 269 different species. The database can be used to find and retrieve promoter sequences of a given gene from various species and it is also suitable to see the most trivial conserved sequence blocks in the orthologous upstream regions. Users can search DoOP with either sequence or text (annotation) to find promoter clusters of various genes. In addition to the sequence data, the positions of the conserved sequence blocks derived from multiple alignments, the positions of repetitive elements and the positions of transcription start sites known from the Eukaryotic Promoter Database (EPD) can be viewed graphically.
The Universal Protein Resource (UniProt): an expanding universe of protein information.
Wu, Cathy H; Apweiler, Rolf; Bairoch, Amos; Natale, Darren A; Barker, Winona C; Boeckmann, Brigitte; Ferro, Serenella; Gasteiger, Elisabeth; Huang, Hongzhan; Lopez, Rodrigo; Magrane, Michele; Martin, Maria J; Mazumder, Raja; O'Donovan, Claire; Redaschi, Nicole; Suzek, Baris
2006-01-01
The Universal Protein Resource (UniProt) provides a central resource on protein sequences and functional annotation with three database components, each addressing a key need in protein bioinformatics. The UniProt Knowledgebase (UniProtKB), comprising the manually annotated UniProtKB/Swiss-Prot section and the automatically annotated UniProtKB/TrEMBL section, is the preeminent storehouse of protein annotation. The extensive cross-references, functional and feature annotations and literature-based evidence attribution enable scientists to analyse proteins and query across databases. The UniProt Reference Clusters (UniRef) speed similarity searches via sequence space compression by merging sequences that are 100% (UniRef100), 90% (UniRef90) or 50% (UniRef50) identical. Finally, the UniProt Archive (UniParc) stores all publicly available protein sequences, containing the history of sequence data with links to the source databases. UniProt databases continue to grow in size and in availability of information. Recent and upcoming changes to database contents, formats, controlled vocabularies and services are described. New download availability includes all major releases of UniProtKB, sequence collections by taxonomic division and complete proteomes. A bibliography mapping service has been added, and an ID mapping service will be available soon. UniProt databases can be accessed online at http://www.uniprot.org or downloaded at ftp://ftp.uniprot.org/pub/databases/.
BIOPEP database and other programs for processing bioactive peptide sequences.
Minkiewicz, Piotr; Dziuba, Jerzy; Iwaniak, Anna; Dziuba, Marta; Darewicz, Małgorzata
2008-01-01
This review presents the potential for application of computational tools in peptide science based on a sample BIOPEP database and program as well as other programs and databases available via the World Wide Web. The BIOPEP application contains a database of biologically active peptide sequences and a program enabling construction of profiles of the potential biological activity of protein fragments, calculation of quantitative descriptors as measures of the value of proteins as potential precursors of bioactive peptides, and prediction of bonds susceptible to hydrolysis by endopeptidases in a protein chain. Other bioactive and allergenic peptide sequence databases are also presented. Programs enabling the construction of binary and multiple alignments between peptide sequences, the construction of sequence motifs attributed to a given type of bioactivity, searching for potential precursors of bioactive peptides, and the prediction of sites susceptible to proteolytic cleavage in protein chains are available via the Internet as are other approaches concerning secondary structure prediction and calculation of physicochemical features based on amino acid sequence. Programs for prediction of allergenic and toxic properties have also been developed. This review explores the possibilities of cooperation between various programs.
Benson, Dennis A; Karsch-Mizrachi, Ilene; Lipman, David J; Ostell, James; Sayers, Eric W
2010-01-01
GenBank is a comprehensive database that contains publicly available nucleotide sequences for more than 300,000 organisms named at the genus level or lower, obtained primarily through submissions from individual laboratories and batch submissions from large-scale sequencing projects, including whole genome shotgun (WGS) and environmental sampling projects. Most submissions are made using the web-based BankIt or standalone Sequin programs, and accession numbers are assigned by GenBank staff upon receipt. Daily data exchange with the European Molecular Biology Laboratory Nucleotide Sequence Database in Europe and the DNA Data Bank of Japan ensures worldwide coverage. GenBank is accessible through the NCBI Entrez retrieval system, which integrates data from the major DNA and protein sequence databases along with taxonomy, genome, mapping, protein structure and domain information, and the biomedical journal literature via PubMed. BLAST provides sequence similarity searches of GenBank and other sequence databases. Complete bi-monthly releases and daily updates of the GenBank database are available by FTP. To access GenBank and its related retrieval and analysis services, begin at the NCBI homepage: www.ncbi.nlm.nih.gov.
Benson, Dennis A; Karsch-Mizrachi, Ilene; Lipman, David J; Ostell, James; Sayers, Eric W
2009-01-01
GenBank is a comprehensive database that contains publicly available nucleotide sequences for more than 300,000 organisms named at the genus level or lower, obtained primarily through submissions from individual laboratories and batch submissions from large-scale sequencing projects. Most submissions are made using the web-based BankIt or standalone Sequin programs, and accession numbers are assigned by GenBank(R) staff upon receipt. Daily data exchange with the European Molecular Biology Laboratory Nucleotide Sequence Database in Europe and the DNA Data Bank of Japan ensures worldwide coverage. GenBank is accessible through the National Center for Biotechnology Information (NCBI) Entrez retrieval system, which integrates data from the major DNA and protein sequence databases along with taxonomy, genome, mapping, protein structure and domain information, and the biomedical journal literature via PubMed. BLAST provides sequence similarity searches of GenBank and other sequence databases. Complete bimonthly releases and daily updates of the GenBank database are available by FTP. To access GenBank and its related retrieval and analysis services, begin at the NCBI Homepage: www.ncbi.nlm.nih.gov.
The Porcelain Crab Transcriptome and PCAD, the Porcelain Crab Microarray and Sequence Database
DOE Office of Scientific and Technical Information (OSTI.GOV)
Tagmount, Abderrahmane; Wang, Mei; Lindquist, Erika
2010-01-27
Background: With the emergence of a completed genome sequence of the freshwater crustacean Daphnia pulex, construction of genomic-scale sequence databases for additional crustacean sequences is important for comparative genomics and annotation. Porcelain crabs, genus Petrolisthes, have been powerful crustacean models for environmental and evolutionary physiology with respect to thermal adaptation and understanding responses of marine organisms to climate change. Here, we present a large-scale EST sequencing and cDNA microarray database project for the porcelain crab Petrolisthes cinctipes. Methodology/Principal Findings: A set of ~30K unique sequences (UniSeqs) representing ~19K clusters were generated from ~98K high quality ESTs from a set of tissue specific non-normalized and mixed-tissue normalized cDNA libraries from the porcelain crab Petrolisthes cinctipes. Homology for each UniSeq was assessed using BLAST, InterProScan, GO and KEGG database searches. Approximately 66% of the UniSeqs had homology in at least one of the databases. All EST and UniSeq sequences along with annotation results and coordinated cDNA microarray datasets have been made publicly accessible at the Porcelain Crab Array Database (PCAD), a feature-enriched version of the Stanford and Longhorn Array Databases. Conclusions/Significance: The EST project presented here represents the third largest sequencing effort for any crustacean, and the largest effort for any crab species. Our assembly and clustering results suggest that our porcelain crab EST data set is equally diverse to the much larger EST set generated in the Daphnia pulex genome sequencing project, and thus will be an important resource to the Daphnia research community. Our homology results support the pancrustacea hypothesis and suggest that Malacostraca may be ancestral to Branchiopoda and Hexapoda. Our results also suggest that our cDNA microarrays cover as much of the transcriptome as can reasonably be captured in EST library sequencing approaches, and thus represent a rich resource for studies of environmental genomics.
Gaby, John Christian; Buckley, Daniel H
2014-01-01
We describe a nitrogenase gene sequence database that facilitates analysis of the evolution and ecology of nitrogen-fixing organisms. The database contains 32 954 aligned nitrogenase nifH sequences linked to phylogenetic trees and associated sequence metadata. The database includes 185 linked multigene entries including full-length nifH, nifD, nifK and 16S ribosomal RNA (rRNA) gene sequences. Evolutionary analyses enabled by the multigene entries support an ancient horizontal transfer of nitrogenase genes between Archaea and Bacteria and provide evidence that nifH has a different history of horizontal gene transfer from the nifDK enzyme core. Further analyses show that lineages in nitrogenase cluster I and cluster III have different rates of substitution within nifD, suggesting that nifD is under different selection pressure in these two lineages. Finally, we find that the genetic divergence of nifH and 16S rRNA genes does not correlate well at sequence dissimilarity values used commonly to define microbial species, as strains having <3% sequence dissimilarity in their 16S rRNA genes can have up to 23% dissimilarity in nifH. The nifH database has a number of uses including phylogenetic and evolutionary analyses, the design and assessment of primers/probes and the evaluation of nitrogenase sequence diversity. Database URL: http://www.css.cornell.edu/faculty/buckley/nifh.htm.
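The sequence dissimilarity values quoted in this abstract compare pairs of aligned sequences. The abstract does not say how dissimilarity was computed, so the sketch below assumes the simplest measure, the uncorrected p-distance (fraction of mismatched positions, skipping gapped columns); the function name `p_distance` and the toy alignment are hypothetical.

```python
def p_distance(seq_a, seq_b, gap="-"):
    """Fraction of differing positions between two aligned sequences, ignoring gaps."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    mismatches = 0
    for x, y in zip(seq_a.upper(), seq_b.upper()):
        if x == gap or y == gap:
            continue  # skip columns where either sequence has a gap
        compared += 1
        if x != y:
            mismatches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return mismatches / compared


# Hypothetical toy alignment: one mismatch in ten ungapped columns.
d = p_distance("ACGTACGTAC", "ACGTACGAAC")
```

Applied to the same pair of strains, such a function could return less than 0.03 for the aligned 16S rRNA genes and a much larger value for nifH, which is the kind of discordance the abstract reports.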
DOE Office of Scientific and Technical Information (OSTI.GOV)
Karpinets, Tatiana V; Park, Byung; Syed, Mustafa H
2010-01-01
The Carbohydrate-Active Enzyme (CAZy) database provides a rich set of manually annotated enzymes that degrade, modify, or create glycosidic bonds. Despite rich and invaluable information stored in the database, software tools utilizing this information for annotation of newly sequenced genomes by CAZy families are limited. We have employed two annotation approaches to fill the gap between manually curated high-quality protein sequences collected in the CAZy database and the growing number of other protein sequences produced by genome or metagenome sequencing projects. The first approach is based on a similarity search against the entire non-redundant sequences of the CAZy database. The second approach performs annotation using links or correspondences between the CAZy families and protein family domains. The links were discovered using the association rule learning algorithm applied to sequences from the CAZy database. The approaches complement each other and in combination achieved high specificity and sensitivity when cross-evaluated with the manually curated genomes of Clostridium thermocellum ATCC 27405 and Saccharophagus degradans 2-40. The capability of the proposed framework to predict the function of unknown protein domains (DUF) and of hypothetical proteins in the genome of Neurospora crassa is demonstrated. The framework is implemented as a Web service, the CAZymes Analysis Toolkit (CAT), and is available at http://cricket.ornl.gov/cgi-bin/cat.cgi.
Gaby, John Christian; Buckley, Daniel H.
2014-01-01
We describe a nitrogenase gene sequence database that facilitates analysis of the evolution and ecology of nitrogen-fixing organisms. The database contains 32 954 aligned nitrogenase nifH sequences linked to phylogenetic trees and associated sequence metadata. The database includes 185 linked multigene entries including full-length nifH, nifD, nifK and 16S ribosomal RNA (rRNA) gene sequences. Evolutionary analyses enabled by the multigene entries support an ancient horizontal transfer of nitrogenase genes between Archaea and Bacteria and provide evidence that nifH has a different history of horizontal gene transfer from the nifDK enzyme core. Further analyses show that lineages in nitrogenase cluster I and cluster III have different rates of substitution within nifD, suggesting that nifD is under different selection pressure in these two lineages. Finally, we find that the genetic divergence of nifH and 16S rRNA genes does not correlate well at sequence dissimilarity values used commonly to define microbial species, as strains having <3% sequence dissimilarity in their 16S rRNA genes can have up to 23% dissimilarity in nifH. The nifH database has a number of uses including phylogenetic and evolutionary analyses, the design and assessment of primers/probes and the evaluation of nitrogenase sequence diversity. Database URL: http://www.css.cornell.edu/faculty/buckley/nifh.htm PMID:24501396
Park, Byung H; Karpinets, Tatiana V; Syed, Mustafa H; Leuze, Michael R; Uberbacher, Edward C
2010-12-01
The Carbohydrate-Active Enzyme (CAZy) database provides a rich set of manually annotated enzymes that degrade, modify, or create glycosidic bonds. Despite rich and invaluable information stored in the database, software tools utilizing this information for annotation of newly sequenced genomes by CAZy families are limited. We have employed two annotation approaches to fill the gap between manually curated high-quality protein sequences collected in the CAZy database and the growing number of other protein sequences produced by genome or metagenome sequencing projects. The first approach is based on a similarity search against the entire nonredundant sequences of the CAZy database. The second approach performs annotation using links or correspondences between the CAZy families and protein family domains. The links were discovered using the association rule learning algorithm applied to sequences from the CAZy database. The approaches complement each other and in combination achieved high specificity and sensitivity when cross-evaluated with the manually curated genomes of Clostridium thermocellum ATCC 27405 and Saccharophagus degradans 2-40. The capability of the proposed framework to predict the function of unknown protein domains and of hypothetical proteins in the genome of Neurospora crassa is demonstrated. The framework is implemented as a Web service, the CAZymes Analysis Toolkit, and is available at http://cricket.ornl.gov/cgi-bin/cat.cgi.
Extracting and connecting chemical structures from text sources using chemicalize.org.
Southan, Christopher; Stracz, Andras
2013-04-23
Exploring bioactive chemistry requires navigating between structures and data from a variety of text-based sources. While PubChem currently includes approximately 16 million document-extracted structures (15 million from patents) the extent of public inter-document and document-to-database links is still well below any estimated total, especially for journal articles. A major expansion in access to text-entombed chemistry is enabled by chemicalize.org. This on-line resource can process IUPAC names, SMILES, InChI strings, CAS numbers and drug names from pasted text, PDFs or URLs to generate structures, calculate properties and launch searches. Here, we explore its utility for answering questions related to chemical structures in documents and where these overlap with database records. These aspects are illustrated using a common theme of Dipeptidyl Peptidase 4 (DPPIV) inhibitors. Full-text open URL sources facilitated the download of over 1400 structures from a DPPIV patent and the alignment of specific examples with IC50 data. Uploading the SMILES to PubChem revealed extensive linking to patents and papers, including prior submissions from chemicalize.org as submitting source. A DPPIV medicinal chemistry paper was completely extracted and structures were aligned to the activity results table, as well as linked to other documents via PubChem. In both cases, key structures with data were partitioned from common chemistry by dividing them into individual new PDFs for conversion. Over 500 structures were also extracted from a batch of PubMed abstracts related to DPPIV inhibition. The drug structures could be stepped through each text occurrence and included some converted MeSH-only IUPAC names not linked in PubChem. Performing set intersections proved effective for detecting compounds-in-common between documents and merged extractions. 
This work demonstrates the utility of chemicalize.org for the exploration of chemical structure connectivity between documents and databases, including structure searches in PubChem, InChIKey searches in Google and the chemicalize.org archive. It has the flexibility to extract text from any internal, external or Web source. It synergizes with other open tools and the application is undergoing continued development. It should thus facilitate progress in medicinal chemistry, chemical biology and other bioactive chemistry domains.
A meta-analysis of bacterial diversity in the feces of cattle
USDA-ARS's Scientific Manuscript database
In this study, we conducted a meta-analysis on 16S rRNA gene sequences of bovine fecal origin that are publicly available in the RDP database. A total of 13663 sequences including 603 isolate sequences were identified in the RDP database (Release 11, Update 1), where 13447 sequences were assigned t...
Chaudhary, Sakshi; Mishra, Bharat Kumar; Vivek, Thiruvettai; Magadum, Santoshkumar; Yasin, Jeshima Khan
2016-01-01
Simple Sequence Repeats (SSRs), or microsatellites, are resourceful molecular genetic markers. There are only a few reports of SSR identification and development in pineapple. The complete genome sequence of pineapple available in the public domain can be used to develop numerous novel SSRs. Therefore, an attempt was made to identify SSRs from genomic, chloroplast, mitochondrial and EST sequences of pineapple, which will help in deciphering the genetic makeup of its germplasm resources. A total of 359511 SSRs were identified in pineapple (356385 from the genome sequence, 45 from the chloroplast sequence, 249 from the mitochondrial sequence and 2832 from EST sequences). The list of EST-SSR markers and their details are available in the database. PineElm_SSRdb is an open source database available for non-commercial academic purposes at http://app.bioelm.com/ with a mapping tool that can develop circular maps of a selected marker set. This database will be of immense use to breeders, researchers and graduates working on Ananas spp. and to others working on cross-species transferability of markers, investigating diversity, mapping and DNA fingerprinting.
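SSR identification of the kind described in this abstract amounts to scanning a sequence for short motifs repeated in tandem. The abstract does not name the tool or thresholds used, so the following Python sketch is an assumption throughout: the function name `find_ssrs`, the motif length range of 1 to 6 bases, the minimum of four repeats, and the restriction to perfect (uninterrupted) repeats are illustrative choices.

```python
import re


def find_ssrs(seq, min_repeats=4, max_motif=6):
    """Return sorted (start, motif, repeat count) tuples for perfect tandem repeats."""
    seq = seq.upper()
    found = []
    for m in range(1, max_motif + 1):
        # A motif of length m followed by at least (min_repeats - 1) further copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (m, min_repeats - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            # Skip motifs that are themselves repeats of a shorter unit (e.g. "AA").
            if any(motif == motif[:d] * (m // d) for d in range(1, m) if m % d == 0):
                continue
            found.append((match.start(), motif, len(match.group(0)) // m))
    return sorted(found)


# Hypothetical test sequence with an (AT)5 and a (CAG)4 repeat.
example = "TTGC" + "AT" * 5 + "CCT" + "CAG" * 4 + "T"
ssrs = find_ssrs(example)
```

Production SSR miners usually apply different minimum repeat counts per motif length and can also report compound or imperfect repeats; this sketch omits those refinements.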
PlantCAZyme: a database for plant carbohydrate-active enzymes
Ekstrom, Alexander; Taujale, Rahil; McGinn, Nathan; Yin, Yanbin
2014-01-01
PlantCAZyme is a database built upon dbCAN (database for automated carbohydrate active enzyme annotation), aiming to provide pre-computed sequence and annotation data of carbohydrate active enzymes (CAZymes) to plant carbohydrate and bioenergy research communities. The current version contains data of 43 790 CAZymes of 159 protein families from 35 plants (including angiosperms, gymnosperms, lycophyte and bryophyte mosses) and chlorophyte algae with fully sequenced genomes. Useful features of the database include: (i) a BLAST server and a HMMER server that allow users to search against our pre-computed sequence data for annotation purpose, (ii) a download page to allow batch downloading data of a specific CAZyme family or species and (iii) protein browse pages to provide an easy access to the most comprehensive sequence and annotation data. Database URL: http://cys.bios.niu.edu/plantcazyme/ PMID:25125445
Plasmodium ovale infection in Malaysia: first imported case.
Lim, Yvonne A L; Mahmud, Rohela; Chew, Ching Hoong; T, Thiruventhiran; Chua, Kek Heng
2010-10-08
Plasmodium ovale infection is rarely reported in Malaysia. This is the first imported case of P. ovale infection in Malaysia, which was initially misdiagnosed as Plasmodium vivax. A peripheral blood sample was first examined by Giemsa-stained microscopy and further confirmed using a patented in-house multiplex PCR followed by sequencing. Initial results from peripheral blood smear examination diagnosed P. vivax infection. However, further analysis using a patented in-house multiplex PCR followed by sequencing confirmed the presence of P. ovale. Given that Anopheles maculatus and Anopheles dirus, vectors of P. ovale, are found in Malaysia, this finding has significant implications for Malaysia's public health sector. The current finding should serve as an alert to epidemiologists, clinicians and laboratory technicians to the possibility of finding P. ovale in Malaysia. P. ovale should be considered in the differential diagnosis of imported malaria cases in Malaysia due to the exponential increase in the number of visitors from P. ovale endemic regions and the long latent period of P. ovale. It is also timely that conventional diagnosis of malaria via microscopy should be coupled with more advanced molecular tools for effective diagnosis.
RNAcentral: an international database of ncRNA sequences
Williams, Kelly Porter
2014-10-28
The field of non-coding RNA biology has been hampered by the lack of availability of a comprehensive, up-to-date collection of accessioned RNA sequences. Here we present the first release of RNAcentral, a database that collates and integrates information from an international consortium of established RNA sequence databases. The initial release contains over 8.1 million sequences, including representatives of all major functional classes. A web portal (http://rnacentral.org) provides free access to data, search functionality, cross-references, source code and an integrated genome browser for selected species.
An Integrated Molecular Database on Indian Insects.
Pratheepa, Maria; Venkatesan, Thiruvengadam; Gracy, Gandhi; Jalali, Sushil Kumar; Rangheswaran, Rajagopal; Antony, Jomin Cruz; Rai, Anil
2018-01-01
MOlecular Database on Indian Insects (MODII) is an online database linking several databases such as Insect Pest Info, Insect Barcode Information System (IBIn), Insect Whole Genome sequence, Other Genomic Resources of National Bureau of Agricultural Insect Resources (NBAIR), Whole Genome sequencing of Honey bee viruses, Insecticide resistance gene database and Genomic tools. This database was developed with a holistic approach to collecting phenomic and genomic information on agriculturally important insects. This insect resource database is available online for free at http://cib.res.in/.
Renard, Bernhard Y.; Xu, Buote; Kirchner, Marc; Zickmann, Franziska; Winter, Dominic; Korten, Simone; Brattig, Norbert W.; Tzur, Amit; Hamprecht, Fred A.; Steen, Hanno
2012-01-01
Currently, the reliable identification of peptides and proteins is only feasible when thoroughly annotated sequence databases are available. Although sequencing capacities continue to grow, many organisms remain without reliable, fully annotated reference genomes required for proteomic analyses. Standard database search algorithms fail to identify peptides that are not exactly contained in a protein database. De novo searches are generally hindered by their restricted reliability, and current error-tolerant search strategies are limited by global, heuristic tradeoffs between database and spectral information. We propose a Bayesian information criterion-driven error-tolerant peptide search (BICEPS) and offer an open source implementation based on this statistical criterion to automatically balance the information of each single spectrum and the database, while limiting the run time. We show that BICEPS performs as well as current database search algorithms when such algorithms are applied to sequenced organisms, whereas BICEPS only uses a remotely related organism database. For instance, we use a chicken instead of a human database corresponding to an evolutionary distance of more than 300 million years (International Chicken Genome Sequencing Consortium (2004) Sequence and comparative analysis of the chicken genome provide unique perspectives on vertebrate evolution. Nature 432, 695–716). We demonstrate the successful application to cross-species proteomics with a 33% increase in the number of identified proteins for a filarial nematode sample of Litomosoides sigmodontis. PMID:22493179
E-MSD: an integrated data resource for bioinformatics
Velankar, S.; McNeil, P.; Mittard-Runte, V.; Suarez, A.; Barrell, D.; Apweiler, R.; Henrick, K.
2005-01-01
The Macromolecular Structure Database (MSD) group (http://www.ebi.ac.uk/msd/) continues to enhance the quality and consistency of macromolecular structure data in the worldwide Protein Data Bank (wwPDB) and to work towards the integration of various bioinformatics data resources. One of the major obstacles to the improved integration of structural databases such as MSD and sequence databases like UniProt is the absence of up to date and well-maintained mapping between corresponding entries. We have worked closely with the UniProt group at the EBI to clean up the taxonomy and sequence cross-reference information in the MSD and UniProt databases. This information is vital for the reliable integration of the sequence family databases such as Pfam and Interpro with the structure-oriented databases of SCOP and CATH. This information has been made available to the eFamily group (http://www.efamily.org.uk/) and now forms the basis of the regular interchange of information between the member databases (MSD, UniProt, Pfam, Interpro, SCOP and CATH). This exchange of annotation information has enriched the structural information in the MSD database with annotation from wider sequence-oriented resources. This work was carried out under the ‘Structure Integration with Function, Taxonomy and Sequences (SIFTS)’ initiative (http://www.ebi.ac.uk/msd-srv/docs/sifts) in the MSD group. PMID:15608192
MAGIC database and interfaces: an integrated package for gene discovery and expression.
Cordonnier-Pratt, Marie-Michèle; Liang, Chun; Wang, Haiming; Kolychev, Dmitri S; Sun, Feng; Freeman, Robert; Sullivan, Robert; Pratt, Lee H
2004-01-01
The rapidly increasing rate at which biological data is being produced requires a corresponding growth in relational databases and associated tools that can help laboratories contend with that data. With this need in mind, we describe here a Modular Approach to a Genomic, Integrated and Comprehensive (MAGIC) Database. This Oracle 9i database derives from an initial focus in our laboratory on gene discovery via production and analysis of expressed sequence tags (ESTs), and subsequently on gene expression as assessed by both EST clustering and microarrays. The MAGIC Gene Discovery portion of the database focuses on information derived from DNA sequences and on its biological relevance. In addition to MAGIC SEQ-LIMS, which is designed to support activities in the laboratory, it contains several additional subschemas. The latter include MAGIC Admin for database administration, MAGIC Sequence for sequence processing as well as sequence and clone attributes, MAGIC Cluster for the results of EST clustering, MAGIC Polymorphism in support of microsatellite and single-nucleotide-polymorphism discovery, and MAGIC Annotation for electronic annotation by BLAST and BLAT. The MAGIC Microarray portion is a MIAME-compliant database with two components at present. These are MAGIC Array-LIMS, which makes possible remote entry of all information into the database, and MAGIC Array Analysis, which provides data mining and visualization. Because all aspects of interaction with the MAGIC Database are via a web browser, it is ideally suited not only for individual research laboratories but also for core facilities that serve clients at any distance.
Metagenomic Taxonomy-Guided Database-Searching Strategy for Improving Metaproteomic Analysis.
Xiao, Jinqiu; Tanca, Alessandro; Jia, Ben; Yang, Runqing; Wang, Bo; Zhang, Yu; Li, Jing
2018-04-06
Metaproteomics provides a direct measure of the functional information by investigating all proteins expressed by a microbiota. However, due to the complexity and heterogeneity of microbial communities, it is very hard to construct a sequence database suitable for a metaproteomic study. Using a public database, researchers might not be able to identify proteins from poorly characterized microbial species, while a sequencing-based metagenomic database may not provide adequate coverage for all potentially expressed protein sequences. To address this challenge, we propose a metagenomic taxonomy-guided database-search strategy (MT), in which a merged database is employed, consisting of both taxonomy-guided reference protein sequences from public databases and proteins from metagenome assembly. By applying our MT strategy to a mock microbial mixture, about two times as many peptides were detected as with the metagenomic database only. According to the evaluation of the reliability of taxonomic attribution, the rate of misassignments was comparable to that obtained using an a priori matched database. We also evaluated the MT strategy with a human gut microbial sample, and we found 1.7 times as many peptides as using a standard metagenomic database. In conclusion, our MT strategy allows the construction of databases able to provide high sensitivity and precision in peptide identification in metaproteomic studies, enabling the detection of proteins from poorly characterized species within the microbiota.
Contamination of sequence databases with adaptor sequences
DOE Office of Scientific and Technical Information (OSTI.GOV)
Yoshikawa, Takeo; Sanders, A.R.; Detera-Wadleigh, S.D.
Because of the exponential increase in the amount of DNA sequences being added to the public databases on a daily basis, it has become imperative to identify sources of contamination rapidly. Previously, contaminations of sequence databases have been reported to alert the scientific community to the problem. These contaminations can be divided into two categories. The first category comprises host sequences that have been difficult for submitters to manage or control. Examples include anomalous sequences derived from Escherichia coli, which are inserted into the chromosomes (and plasmids) of the bacterial hosts. Insertion sequences are highly mobile and are capable of transposing themselves into plasmids during cloning manipulation. Another example of the first category is the infection with yeast genomic DNA or with bacterial DNA of some commercially available cDNA libraries from Clontech. The second category of database contamination is due to the inadvertent inclusion of nonhost sequences. This category includes incorporation of cloning-vector sequences and multicloning sites in the database submission. M13-derived artifacts have been common, since M13-based vectors have been widely used for subcloning DNA fragments. Recognizing this problem, the National Center for Biotechnology Information (NCBI) started to screen, in April 1994, all sequences directly submitted to GenBank, against a set of vector data retrieved from GenBank by use of key-word searches, such as "vector." In this report, we present evidence for another sequence artifact that is widespread but that, to our knowledge, has not yet been reported. 11 refs., 1 tab.
ESTree db: a Tool for Peach Functional Genomics
Lazzari, Barbara; Caprera, Andrea; Vecchietti, Alberto; Stella, Alessandra; Milanesi, Luciano; Pozzi, Carlo
2005-01-01
Background The ESTree db represents a collection of Prunus persica expressed sequenced tags (ESTs) and is intended as a resource for peach functional genomics. A total of 6,155 successful EST sequences were obtained from four in-house prepared cDNA libraries from Prunus persica mesocarps at different developmental stages. Another 12,475 peach EST sequences were downloaded from public databases and added to the ESTree db. An automated pipeline was prepared to process EST sequences using public software integrated by in-house developed Perl scripts and data were collected in a MySQL database. A php-based web interface was developed to query the database. Results The ESTree db version as of April 2005 encompasses 18,630 sequences representing eight libraries. Contig assembly was performed with CAP3. Putative single nucleotide polymorphism (SNP) detection was performed with the AutoSNP program and a search engine was implemented to retrieve results. All the sequences and all the contig consensus sequences were annotated both with blastx against the GenBank nr db and with GOblet against the viridiplantae section of the Gene Ontology db. Links to NiceZyme (Expasy) and to the KEGG metabolic pathways were provided. A local BLAST utility is available. A text search utility allows querying and browsing the database. Statistics were provided on Gene Ontology occurrences to assign sequences to Gene Ontology categories. Conclusion The resulting database is a comprehensive resource of data and links related to peach EST sequences. The Sequence Report and Contig Report pages work as the web interface core structures, giving quick access to data related to each sequence/contig. PMID:16351742
ESTree db: a tool for peach functional genomics.
Lazzari, Barbara; Caprera, Andrea; Vecchietti, Alberto; Stella, Alessandra; Milanesi, Luciano; Pozzi, Carlo
2005-12-01
The ESTree db http://www.itb.cnr.it/estree/ represents a collection of Prunus persica expressed sequenced tags (ESTs) and is intended as a resource for peach functional genomics. A total of 6,155 successful EST sequences were obtained from four in-house prepared cDNA libraries from Prunus persica mesocarps at different developmental stages. Another 12,475 peach EST sequences were downloaded from public databases and added to the ESTree db. An automated pipeline was prepared to process EST sequences using public software integrated by in-house developed Perl scripts and data were collected in a MySQL database. A php-based web interface was developed to query the database. The ESTree db version as of April 2005 encompasses 18,630 sequences representing eight libraries. Contig assembly was performed with CAP3. Putative single nucleotide polymorphism (SNP) detection was performed with the AutoSNP program and a search engine was implemented to retrieve results. All the sequences and all the contig consensus sequences were annotated both with blastx against the GenBank nr db and with GOblet against the viridiplantae section of the Gene Ontology db. Links to NiceZyme (Expasy) and to the KEGG metabolic pathways were provided. A local BLAST utility is available. A text search utility allows querying and browsing the database. Statistics were provided on Gene Ontology occurrences to assign sequences to Gene Ontology categories. The resulting database is a comprehensive resource of data and links related to peach EST sequences. The Sequence Report and Contig Report pages work as the web interface core structures, giving quick access to data related to each sequence/contig.
Jordan, James B; Tu, Xiang
2008-01-01
The aim of this review is to critically examine the clinical trial research on Traditional Chinese Medicine (TCM) as an intervention in treating heroin addiction in People's Republic of China. This review examines Chinese-language-only publications for the patent medicines: Shenfu Tuodu, Fukang Pian, and Shifu Sheng. Other compound medicines will be reviewed in future publications. A systematic review of the literature was conducted in Western and Chinese databases. Most trials were excluded because they did not declare randomization and had poor methodology or reporting. The majority of clinical evidence in the random controlled trials demonstrates good evidence for TCM patent medicines in heroin addiction treatment. When compared to typical Western medications, TCMs demonstrate fewer side-effects, in addition to equal measures of treatment efficacy and safety.
Govindaraj, Mahalingam
2015-01-01
The number of sequenced crop genomes and associated genomic resources is growing rapidly with the advent of inexpensive next generation sequencing methods. Databases have become an integral part of all aspects of science research, including basic and applied plant and animal sciences. The importance of databases keeps increasing as the volume of datasets from direct and indirect genomics, as well as other omics approaches, keeps expanding in recent years. The databases and associated web portals provide at a minimum a uniform set of tools and automated analysis across a wide range of crop plant genomes. This paper reviews some basic terms and considerations in dealing with crop plant databases utilization in advancing genomic era. The utilization of databases for variation analysis with other comparative genomics tools, and data interpretation platforms are well described. The major focus of this review is to provide knowledge on platforms and databases for genome-based investigations of agriculturally important crop plants. The utilization of these databases in applied crop improvement program is still being achieved widely; otherwise, the end for sequencing is not far away. PMID:25874133
Domain fusion analysis by applying relational algebra to protein sequence and domain databases
Truong, Kevin; Ikura, Mitsuhiko
2003-01-01
Background Domain fusion analysis is a useful method to predict functionally linked proteins that may be involved in direct protein-protein interactions or in the same metabolic or signaling pathway. As separate domain databases like BLOCKS, PROSITE, Pfam, SMART, PRINTS-S, ProDom, TIGRFAMs, and amalgamated domain databases like InterPro continue to grow in size and quality, a computational method to perform domain fusion analysis that leverages on these efforts will become increasingly powerful. Results This paper proposes a computational method employing relational algebra to find domain fusions in protein sequence databases. The feasibility of this method was illustrated on the SWISS-PROT+TrEMBL sequence database using domain predictions from the Pfam HMM (hidden Markov model) database. We identified 235 and 189 putative functionally linked protein partners in H. sapiens and S. cerevisiae, respectively. From scientific literature, we were able to confirm many of these functional linkages, while the remainder offer testable experimental hypothesis. Results can be viewed at . Conclusion As the analysis can be computed quickly on any relational database that supports standard SQL (structured query language), it can be dynamically updated along with the sequence and domain databases, thereby improving the quality of predictions over time. PMID:12734020
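The relational idea in the abstract above can be illustrated with a small sketch. It is not the authors' query: the `hit` table, its columns and the function name `find_domain_fusions` are assumptions made for illustration, and the fusion rule is simplified to "two separate proteins from one organism whose distinct domains co-occur in a third, composite protein".

```python
import sqlite3

def find_domain_fusions(rows):
    """Return (protein_1, protein_2, composite) triples.

    rows: iterable of (protein, organism, domain) tuples. This schema is a
    hypothetical stand-in for domain predictions on a sequence database.
    """
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE hit (protein TEXT, organism TEXT, domain TEXT)")
    con.executemany("INSERT INTO hit VALUES (?, ?, ?)", rows)
    query = """
        SELECT DISTINCT a.protein AS p1, b.protein AS p2, c1.protein AS composite
        FROM hit AS a
        JOIN hit AS b  ON a.organism = b.organism AND a.protein < b.protein
        JOIN hit AS c1 ON c1.domain = a.domain
        JOIN hit AS c2 ON c2.domain = b.domain AND c2.protein = c1.protein
        WHERE a.domain <> b.domain
          AND c1.protein NOT IN (a.protein, b.protein)
          -- each partner must lack the other partner's domain
          AND NOT EXISTS (SELECT 1 FROM hit AS x
                          WHERE x.protein = a.protein AND x.domain = b.domain)
          AND NOT EXISTS (SELECT 1 FROM hit AS x
                          WHERE x.protein = b.protein AND x.domain = a.domain)
        ORDER BY p1, p2, composite
    """
    result = con.execute(query).fetchall()
    con.close()
    return result

# Toy data: P1 and P2 are separate yeast proteins; C1 carries both domains.
example_rows = [
    ("P1", "yeast", "kinase"),
    ("P2", "yeast", "SH2"),
    ("P3", "yeast", "zinc_finger"),
    ("C1", "human", "kinase"),
    ("C1", "human", "SH2"),
]
fusions = find_domain_fusions(example_rows)
```

Because the whole analysis is a single self-join, it can in principle be re-run whenever the underlying tables are refreshed, which is the property the abstract emphasizes.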
Büssow, Konrad; Hoffmann, Steve; Sievert, Volker
2002-12-19
Functional genomics involves the parallel experimentation with large sets of proteins. This requires management of large sets of open reading frames as a prerequisite of the cloning and recombinant expression of these proteins. A Java program was developed for retrieval of protein and nucleic acid sequences and annotations from NCBI GenBank, using the XML sequence format. Annotations retrieved by ORFer include sequence name, organism and also the completeness of the sequence. The program has a graphical user interface, although it can be used in a non-interactive mode. For protein sequences, the program also extracts the open reading frame sequence, if available, and checks its correct translation. ORFer accepts user input in the form of single or lists of GenBank GI identifiers or accession numbers. It can be used to extract complete sets of open reading frames and protein sequences from any kind of GenBank sequence entry, including complete genomes or chromosomes. Sequences are either stored with their features in a relational database or can be exported as text files in Fasta or tabulator delimited format. The ORFer program is freely available at http://www.proteinstrukturfabrik.de/orfer. The ORFer program allows for fast retrieval of DNA sequences, protein sequences and their open reading frames and sequence annotations from GenBank. Furthermore, storage of sequences and features in a relational database is supported. Such a database can supplement a laboratory information system (LIMS) with appropriate sequence information.
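The translation check mentioned in the abstract above can be sketched as follows. ORFer itself is a Java program and its internals are not described here, so this Python sketch only illustrates the idea. The function names are hypothetical, and the standard genetic code (NCBI translation table 1) is assumed.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), codons ordered T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

def translate(cds):
    """Translate a coding sequence; unknown codons (e.g. with N) become 'X'."""
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    return "".join(CODON_TABLE.get(cds[i:i + 3], "X") for i in range(0, len(cds), 3))

def orf_matches_protein(cds, protein):
    """True if the ORF translates to the protein with no internal stop codon."""
    translated = translate(cds)
    if translated.endswith("*"):
        translated = translated[:-1]  # drop the terminal stop codon
    return "*" not in translated and translated == protein.upper()
```

For example, `orf_matches_protein("ATGGCCTAA", "MA")` holds, while a sequence with an internal stop codon is rejected.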
Partial DNA sequencing of Douglas-fir cDNAs used in RFLP mapping
K.D. Jermstad; D.L. Bassoni; C.S. Kinlaw; D.B. Neale
1998-01-01
DNA sequences from 87 Douglas-fir (Pseudotsuga menziesii [Mirb.] Franco) cDNA RFLP probes were determined. Sequences were submitted to the GenBank dbEST database and searched for similarity against nucleotide and protein databases using the BLASTn and BLASTx programs. Twenty-one sequences (24%) were assigned putative functions; 18 of which...
Application of cytochrome b DNA sequences for the authentication of endangered snake species.
Wong, Ka-Lok; Wang, Jun; But, Paul Pui-Hay; Shaw, Pang-Chui
2004-01-06
To enforce the conservation program and curb the illegal trading and consumption of endangered snake species, the value of cytochrome b sequences in the authentication of snake species was evaluated. As an illustration, DNA was extracted, and selected cytochrome b DNA sequences were amplified and sequenced, from six snakes commonly consumed in Hong Kong. By cataloging these together with publicly available sequences, a cytochrome b database containing 90 species of snakes was constructed. In this database, sequence homology between snakes ranged from 70.68 to 95.11%. On the other hand, intraspecific variation of three tested snakes was 0-0.98%. Using the database, we were able to determine the identity of six meat samples confiscated by the Agriculture, Fisheries and Conservation Department, HKSAR.
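The percentages quoted above are pairwise sequence comparisons. A minimal sketch of such a comparison is given below, assuming the two sequences are already aligned to the same length and that gap columns are ignored. The abstract does not state how the authors computed their values, so the function names and the gap handling are assumptions.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent identity over aligned columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == "-" or y == "-":
            continue  # skip gap columns
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * matches / compared

def closest_reference(query, references):
    """Return (name, identity) of the reference most similar to the query.

    references: dict mapping a species name to a sequence aligned to the query.
    """
    scored = ((name, percent_identity(query, seq)) for name, seq in references.items())
    return max(scored, key=lambda item: item[1])
```

An unknown sample would then be assigned to a species only when its best identity falls within the intraspecific range rather than the interspecific one.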
The Importance of Biological Databases in Biological Discovery.
Baxevanis, Andreas D; Bateman, Alex
2015-06-19
Biological databases play a central role in bioinformatics. They offer scientists the opportunity to access a wide variety of biologically relevant data, including the genomic sequences of an increasingly broad range of organisms. This unit provides a brief overview of major sequence databases and portals, such as GenBank, the UCSC Genome Browser, and Ensembl. Model organism databases, including WormBase, The Arabidopsis Information Resource (TAIR), and those made available through the Mouse Genome Informatics (MGI) resource, are also covered. Non-sequence-centric databases, such as Online Mendelian Inheritance in Man (OMIM), the Protein Data Bank (PDB), MetaCyc, and the Kyoto Encyclopedia of Genes and Genomes (KEGG), are also discussed. Copyright © 2015 John Wiley & Sons, Inc.
Orexin research: patent news from 2016.
Boss, Christoph; Roch, Catherine
2017-10-01
The orexin system consists of two G-protein-coupled receptors, orexin 1 and orexin 2, and two endogenous ligands, orexin A and orexin B. It is evolutionarily highly conserved. It is involved in the promotion of wakefulness as well as in anxiety and addictive disorders. In addition, its activation via the Ox1 receptor triggers apoptosis in several cancer cell lines. Dual orexin receptor antagonists are successfully used to treat primary insomnia. The major open questions are now related to the clinical validation of Ox1 selective antagonists. A strong rationale exists for orexin agonism in the treatment of narcolepsy with cataplexy. Areas covered: The patent applications from Thomson Reuters Integrity Database added in 2016 are summarized and discussed together with the most important findings published in the scientific literature. Expert opinion: The large number of patents shows the continuing interest in the orexin receptors as targets. The structural scope covered is narrow. Questions about novelty and inventiveness are evident. The additional information published on X-ray structures on both orexin receptors opens new ways of optimizing antagonists. It might also influence the efforts in the identification of orexin receptor agonists, which are potential treatments for narcolepsy with cataplexy.
Improvement on upper limb body-powered prostheses (1921-2016): A systematic review.
Hashim, Nur Afiqah; Abd Razak, Nasrul Anuar; Abu Osman, Noor Azuan; Gholizadeh, Hossein
2018-01-01
Body-powered prostheses are known for their advantages of cost, reliability, training period, maintenance, and proprioceptive feedback. This study primarily aims to analyze the work related to the improvement of upper limb body-powered prostheses prior to 2016. A systematic review conducted via the search of the Web of Science electronic database, Google Scholar, and Google Patents identified 155 papers from 1921 to 2016. Sackett's initial rules of evidence were used to determine the levels of evidence, and only papers categorized in the design and development category and patents were analyzed. A total of 40 papers in the sixth level of "Design and Development" of an upper limb body-powered prosthesis were found. Approximately 81% were categorized under mechanical alteration. Most papers were patent-type documents (48%), with the Journal of Rehabilitation Research and Development publishing most of the articles related to the design and development of body-powered prostheses. Papers in the scope of the study were published once every 3 years in almost a century, proving that only a few studies were conducted to improve body-powered arms compared with myoelectric technology. Further research should be carried out mainly in areas that have received less attention.
Liu, Kunmeng; Lin, Hui-Heng; Pi, Rongbiao; Mak, Shinghung; Han, Yifan; Hu, Yuanjia
2018-04-01
Today, over 20 million people suffer from Alzheimer's disease (AD) worldwide. AD has become a critical issue to human health, especially in aging societies, and therefore it is a research hotspot in the global scientific community. The technology flow method differs from traditional reviews generating an informative overview of the research and development (R&D) landscape in a specific technological area. We need such an updated method to get a general overview of the R&D of anti-AD drugs in light of the dramatic developments in this area in recent years. Areas covered: This study collects patent data from the Integrity database. A total of 399 patents with 821 internal citation pairs in the US from 1978 to 2017 were analyzed. Patent citation network analysis was used to visualize the technology relationship. Expert opinion: For better production of anti-AD drugs, governments should emphasize the multi-target drug design, provide policy support for private companies, and encourage multilateral cooperation. The β-amyloid peptide (Aβ) theory leaves much to be desired; neurotransmitter and tau protein hypotheses are worth further examination. The use of old drugs for new indications is promising, as are traditional herbal medicines.
Gao, Xiang; Lin, Huaiying; Revanna, Kashi; Dong, Qunfeng
2017-05-10
Species-level classification for 16S rRNA gene sequences remains a serious challenge for microbiome researchers, because existing taxonomic classification tools for 16S rRNA gene sequences either do not provide species-level classification, or their classification results are unreliable. The unreliable results are due to the limitations in the existing methods which either lack solid probabilistic-based criteria to evaluate the confidence of their taxonomic assignments, or use nucleotide k-mer frequency as the proxy for sequence similarity measurement. We have developed a method that shows significantly improved species-level classification results over existing methods. Our method calculates true sequence similarity between query sequences and database hits using pairwise sequence alignment. Taxonomic classifications are assigned from the species to the phylum levels based on the lowest common ancestors of multiple database hits for each query sequence, and further classification reliabilities are evaluated by bootstrap confidence scores. The novelty of our method is that the contribution of each database hit to the taxonomic assignment of the query sequence is weighted by a Bayesian posterior probability based upon the degree of sequence similarity of the database hit to the query sequence. Our method does not need any training datasets specific for different taxonomic groups. Instead only a reference database is required for aligning to the query sequences, making our method easily applicable for different regions of the 16S rRNA gene or other phylogenetic marker genes. Reliable species-level classification for 16S rRNA or other phylogenetic marker genes is critical for microbiome research. Our software shows significantly higher classification accuracy than the existing tools and we provide probabilistic-based confidence scores to evaluate the reliability of our taxonomic classification assignments based on multiple database matches to query sequences. 
Despite its higher computational costs, our method is still suitable for analyzing large-scale microbiome datasets for practical purposes. Furthermore, our method can be applied for taxonomic classification of any phylogenetic marker gene sequences. Our software, called BLCA, is freely available at https://github.com/qunfengdong/BLCA .
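The lowest-common-ancestor step described in the abstract above can be sketched in a few lines. This is a deliberately reduced illustration and not the BLCA implementation: it omits the Bayesian weighting of hits and the bootstrap confidence scores, and the function name and lineage format (a list of rank names from phylum down to species) are assumptions.

```python
def lowest_common_ancestor(lineages):
    """Return the shared leading ranks of several taxonomic lineages.

    lineages: list of lists, each ordered from the highest rank (e.g. phylum)
    down to species. The result stops at the first rank where hits disagree.
    """
    if not lineages:
        return []
    common = []
    for ranks in zip(*lineages):
        if len(set(ranks)) != 1:
            break
        common.append(ranks[0])
    return common

# Two hypothetical database hits that agree down to genus but not species.
hits = [
    ["Firmicutes", "Bacilli", "Lactobacillales", "Lactobacillaceae",
     "Lactobacillus", "Lactobacillus casei"],
    ["Firmicutes", "Bacilli", "Lactobacillales", "Lactobacillaceae",
     "Lactobacillus", "Lactobacillus paracasei"],
]
assignment = lowest_common_ancestor(hits)
```

In this toy case the query would be classified to genus level only, since the two hits disagree at species level.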
Kumar, Pankaj; Chaitanya, Pasumarthy S; Nagarajaram, Hampapathalu A
2011-01-01
PSSRdb (Polymorphic Simple Sequence Repeats database) (http://www.cdfd.org.in/PSSRdb/) is a relational database of polymorphic simple sequence repeats (PSSRs) extracted from 85 different species of prokaryotes. Simple sequence repeats (SSRs) are tandem repeats of nucleotide motifs 1-6 bp in size and are highly polymorphic. SSR mutations in and around coding regions affect transcription and translation of genes. Such changes underpin the phase variation and antigenic variation seen in some bacteria. Although SSR-mediated phase variation and antigenic variation have been well studied in some bacteria, many other species of prokaryotes have yet to be investigated for SSR-mediated adaptive and other evolutionary advantages. As a part of our ongoing studies on SSR polymorphism in prokaryotes, we compared the genome sequences of various strains and isolates available for 85 different species of prokaryotes, extracted a number of SSRs showing length variations and created a relational database called PSSRdb. This database gives useful information such as the location of PSSRs in genomes, length variation across genomes, the regions harboring PSSRs, etc. The information provided in this database is very useful for further research and analysis of SSRs in prokaryotes.
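The definition of an SSR given above (tandem repeats of a 1-6 bp motif) maps directly onto a regular expression with a backreference. The sketch below is illustrative only: the function name and the single `min_repeats` threshold are assumptions, and it is not the detection method used to build PSSRdb. Dedicated SSR tools typically apply separate thresholds per motif length.

```python
import re

def find_ssrs(sequence, min_repeats=3, max_motif=6):
    """Return (start, motif, repeat_count) for perfect tandem repeats.

    The lazy quantifier makes the regex prefer the shortest motif, so
    'ATATAT' is reported as (AT)3 rather than as a longer composite motif.
    """
    if min_repeats < 2:
        raise ValueError("min_repeats must be at least 2")
    pattern = re.compile(r"([ACGT]{1,%d}?)\1{%d,}" % (max_motif, min_repeats - 1))
    results = []
    for match in pattern.finditer(sequence.upper()):
        motif = match.group(1)
        repeat_count = len(match.group(0)) // len(motif)
        results.append((match.start(), motif, repeat_count))
    return results

ssrs = find_ssrs("GGCATATATATCCGAAAAAT")
```

Running the same scan on the corresponding locus in several strains and comparing `repeat_count` values is one simple way to flag a repeat as polymorphic in length.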
Thomas, Paul D; Kejariwal, Anish; Campbell, Michael J; Mi, Huaiyu; Diemer, Karen; Guo, Nan; Ladunga, Istvan; Ulitsky-Lazareva, Betty; Muruganujan, Anushya; Rabkin, Steven; Vandergriff, Jody A; Doremieux, Olivier
2003-01-01
The PANTHER database was designed for high-throughput analysis of protein sequences. One of the key features is a simplified ontology of protein function, which allows browsing of the database by biological functions. Biologist curators have associated the ontology terms with groups of protein sequences rather than individual sequences. Statistical models (Hidden Markov Models, or HMMs) are built from each of these groups. The advantage of this approach is that new sequences can be automatically classified as they become available. To ensure accurate functional classification, HMMs are constructed not only for families, but also for functionally distinct subfamilies. Multiple sequence alignments and phylogenetic trees, including curator-assigned information, are available for each family. The current version of the PANTHER database includes training sequences from all organisms in the GenBank non-redundant protein database, and the HMMs have been used to classify gene products across the entire genomes of human and Drosophila melanogaster. The ontology terms and protein families and subfamilies, as well as Drosophila gene classifications, can be browsed and searched for free. Due to outstanding contractual obligations, access to human gene classifications and to protein family trees and multiple sequence alignments will temporarily require a nominal registration fee. PANTHER is publicly available on the web at http://panther.celera.com.
Version VI of the ESTree db: an improved tool for peach transcriptome analysis
Lazzari, Barbara; Caprera, Andrea; Vecchietti, Alberto; Merelli, Ivan; Barale, Francesca; Milanesi, Luciano; Stella, Alessandra; Pozzi, Carlo
2008-01-01
Background The ESTree database (db) is a collection of Prunus persica and Prunus dulcis EST sequences that in its current version encompasses 75,404 sequences from 3 almond and 19 peach libraries. Nine peach genotypes and four peach tissues are represented, from four fruit developmental stages. The aim of this work was to implement the already existing ESTree db by adding new sequences and analysis programs. Particular care was given to the implementation of the web interface, that allows querying each of the database features. Results A Perl modular pipeline is the backbone of sequence analysis in the ESTree db project. Outputs obtained during the pipeline steps are automatically arrayed into the fields of a MySQL database. Apart from standard clustering and annotation analyses, version VI of the ESTree db encompasses new tools for tandem repeat identification, annotation against genomic Rosaceae sequences, and positioning on the database of oligomer sequences that were used in a peach microarray study. Furthermore, known protein patterns and motifs were identified by comparison to PROSITE. Based on data retrieved from sequence annotation against the UniProtKB database, a script was prepared to track positions of homologous hits on the GO tree and build statistics on the ontologies distribution in GO functional categories. EST mapping data were also integrated in the database. The PHP-based web interface was upgraded and extended. The aim of the authors was to enable querying the database according to all the biological aspects that can be investigated from the analysis of data available in the ESTree db. This is achieved by allowing multiple searches on logical subsets of sequences that represent different biological situations or features. Conclusions The version VI of ESTree db offers a broad overview on peach gene expression. 
Sequence analyses results contained in the database, extensively linked to external related resources, represent a large amount of information that can be queried via the tools offered in the web interface. Flexibility and modularity of the ESTree analysis pipeline and of the web interface allowed the authors to set up similar structures for different datasets, with limited manual intervention. PMID:18387211
cDNA encoding a polypeptide including a hevein sequence
Raikhel, Natasha V.; Broekaert, Willem F.; Chua, Nam-Hai; Kush, Anil
1993-02-16
A cDNA clone (HEV1) encoding hevein was isolated via polymerase chain reaction (PCR) using mixed oligonucleotides corresponding to two regions of hevein as primers and a Hevea brasiliensis latex cDNA library as a template. HEV1 is 1018 nucleotides long and includes an open reading frame of 204 amino acids. The deduced amino acid sequence contains a pu GOVERNMENT RIGHTS This application was funded under Department of Energy Contract DE-AC02-76ER01338. The U.S. Government has certain rights under this application and any patent issuing thereon.
Ruppitsch, W; Stöger, A; Indra, A; Grif, K; Schabereiter-Gurtner, C; Hirschl, A; Allerberger, F
2007-03-01
In a bioterrorism event a rapid tool is needed to identify relevant dangerous bacteria. The aim of the study was to assess the usefulness of partial 16S rRNA gene sequence analysis and the suitability of diverse databases for identifying dangerous bacterial pathogens. For rapid identification purposes a 500-bp fragment of the 16S rRNA gene of 28 isolates comprising Bacillus anthracis, Brucella melitensis, Burkholderia mallei, Burkholderia pseudomallei, Francisella tularensis, Yersinia pestis, and eight genus-related and unrelated control strains was amplified and sequenced. The obtained sequence data were submitted to three public and two commercial sequence databases for species identification. The most frequent reason for incorrect identification was the lack of the respective 16S rRNA gene sequences in the database. Sequence analysis of a 500-bp 16S rDNA fragment allows the rapid identification of dangerous bacterial species. However, for discrimination of closely related species sequencing of the entire 16S rRNA gene, additional sequencing of the 23S rRNA gene or sequencing of the 16S-23S rRNA intergenic spacer is essential. This work provides comprehensive information on the suitability of partial 16S rDNA analysis and diverse databases for rapid and accurate identification of dangerous bacterial pathogens.
Retrosynthetic Reaction Prediction Using Neural Sequence-to-Sequence Models
2017-01-01
We describe a fully data driven model that learns to perform a retrosynthetic reaction prediction task, which is treated as a sequence-to-sequence mapping problem. The end-to-end trained model has an encoder–decoder architecture that consists of two recurrent neural networks, which has previously shown great success in solving other sequence-to-sequence prediction tasks such as machine translation. The model is trained on 50,000 experimental reaction examples from the United States patent literature, which span 10 broad reaction types that are commonly used by medicinal chemists. We find that our model performs comparably with a rule-based expert system baseline model, and also overcomes certain limitations associated with rule-based expert systems and with any machine learning approach that contains a rule-based expert system component. Our model provides an important first step toward solving the challenging problem of computational retrosynthetic analysis. PMID:29104927
Meiler, Arno; Klinger, Claudia; Kaufmann, Michael
2012-09-08
The COG database is the most popular collection of orthologous proteins from many different completely sequenced microbial genomes. Per definition, a cluster of orthologous groups (COG) within this database exclusively contains proteins that most likely achieve the same cellular function. Recently, the COG database was extended by assigning to every protein both the corresponding amino acid and its encoding nucleotide sequence resulting in the NUCOCOG database. This extended version of the COG database is a valuable resource connecting sequence features with the functionality of the respective proteins. Here we present ANCAC, a web tool and MySQL database for the analysis of amino acid, nucleotide, and codon frequencies in COGs on the basis of freely definable phylogenetic patterns. We demonstrate the usefulness of ANCAC by analyzing amino acid frequencies, codon usage, and GC-content in a species- or function-specific context. With respect to amino acids we, at least in part, confirm the cognate bias hypothesis by using ANCAC's NUCOCOG dataset as the largest one available for that purpose thus far. Using the NUCOCOG datasets, ANCAC connects taxonomic, amino acid, and nucleotide sequence information with the functional classification via COGs and provides a GUI for flexible mining for sequence-bias. Thereby, to our knowledge, it is the only tool for the analysis of sequence composition in the light of physiological roles and phylogenetic context without requirement of substantial programming-skills.
2012-01-01
Background The COG database is the most popular collection of orthologous proteins from many different completely sequenced microbial genomes. Per definition, a cluster of orthologous groups (COG) within this database exclusively contains proteins that most likely achieve the same cellular function. Recently, the COG database was extended by assigning to every protein both the corresponding amino acid and its encoding nucleotide sequence resulting in the NUCOCOG database. This extended version of the COG database is a valuable resource connecting sequence features with the functionality of the respective proteins. Results Here we present ANCAC, a web tool and MySQL database for the analysis of amino acid, nucleotide, and codon frequencies in COGs on the basis of freely definable phylogenetic patterns. We demonstrate the usefulness of ANCAC by analyzing amino acid frequencies, codon usage, and GC-content in a species- or function-specific context. With respect to amino acids we, at least in part, confirm the cognate bias hypothesis by using ANCAC’s NUCOCOG dataset as the largest one available for that purpose thus far. Conclusions Using the NUCOCOG datasets, ANCAC connects taxonomic, amino acid, and nucleotide sequence information with the functional classification via COGs and provides a GUI for flexible mining for sequence-bias. Thereby, to our knowledge, it is the only tool for the analysis of sequence composition in the light of physiological roles and phylogenetic context without requirement of substantial programming-skills. PMID:22958836
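The sequence statistics named in this abstract (codon frequencies and GC-content) are simple to compute for a single coding sequence. The sketch below is a generic illustration and not ANCAC code; the function names and the example sequence are hypothetical.

```python
from collections import Counter

def gc_content(sequence):
    """Fraction of G and C bases in a non-empty nucleotide sequence."""
    sequence = sequence.upper()
    return (sequence.count("G") + sequence.count("C")) / len(sequence)

def codon_frequencies(cds):
    """Relative frequency of each codon in an in-frame coding sequence.

    Trailing bases that do not fill a complete codon are ignored.
    """
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3
    codons = [cds[i:i + 3] for i in range(0, usable, 3)]
    counts = Counter(codons)
    return {codon: n / len(codons) for codon, n in counts.items()}

# Hypothetical four-codon sequence: ATG GCT GCT TAA.
frequencies = codon_frequencies("ATGGCTGCTTAA")
```

Aggregating such per-sequence values over all members of a COG, or over a chosen set of species, would give the kind of function- or species-specific view the abstract describes.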
Entomopathogen ID: a curated sequence resource for entomopathogenic fungi
USDA-ARS's Scientific Manuscript database
We report the development of a publicly accessible, curated database of Hypocrealean entomopathogenic fungi sequence data. The goal is to provide a platform for users to easily access sequence data from reference strains. The database can be used to accurately identify unknown entomopathogenic fungi...
GOBASE—a database of mitochondrial and chloroplast information
O'Brien, Emmet A.; Badidi, Elarbi; Barbasiewicz, Ania; deSousa, Cristina; Lang, B. Franz; Burger, Gertraud
2003-01-01
GOBASE is a relational database containing integrated sequence, RNA secondary structure and biochemical and taxonomic information about organelles. GOBASE release 6 (summer 2002) contains over 130 000 mitochondrial sequences, an increase of 37% over the previous release, and more than 30 000 chloroplast sequences in a new auxiliary database. To handle this flood of new data, we have designed and implemented GOpop, a Java system for population and verification of the database. We have also implemented a more powerful and flexible user interface using the PHP programming language. http://megasun.bch.umontreal.ca/gobase/gobase.html. PMID:12519975
Benson, Dennis A.; Karsch-Mizrachi, Ilene; Lipman, David J.; Ostell, James; Wheeler, David L.
2007-01-01
GenBank (R) is a comprehensive database that contains publicly available nucleotide sequences for more than 240 000 named organisms, obtained primarily through submissions from individual laboratories and batch submissions from large-scale sequencing projects. Most submissions are made using the web-based BankIt or standalone Sequin programs and accession numbers are assigned by GenBank staff upon receipt. Daily data exchange with the EMBL Data Library in Europe and the DNA Data Bank of Japan ensures worldwide coverage. GenBank is accessible through NCBI's retrieval system, Entrez, which integrates data from the major DNA and protein sequence databases along with taxonomy, genome, mapping, protein structure and domain information, and the biomedical journal literature via PubMed. BLAST provides sequence similarity searches of GenBank and other sequence databases. Complete bimonthly releases and daily updates of the GenBank database are available by FTP. To access GenBank and its related retrieval and analysis services, begin at the NCBI Homepage (). PMID:17202161
Dölz, R; Mossé, M O; Slonimski, P P; Bairoch, A; Linder, P
1996-01-01
We continued our effort to make a comprehensive database (LISTA) for the yeast Saccharomyces cerevisiae. As in previous editions the genetic names are consistently associated with each sequence with a known and confirmed ORF. If necessary, synonyms are given in the case of allelic duplicated sequences. Although the first publication of a sequence gives, according to our rules, the genetic name of a gene, in some instances more commonly used names are given to avoid nomenclature problems and the use of old designations which are no longer used. In these cases the old designation is given as a synonym. Thus sequences can be found either by the name or by synonyms given in LISTA. Each entry contains the genetic name, the mnemonic from the EMBL data bank, the codon bias, the reference of the publication of the sequence, the chromosomal location as far as known, and the SWISSPROT and EMBL accession numbers. New entries will also contain the name from the systematic sequencing efforts. Since the release of LISTA4.1 we have updated the database continuously. To obtain more information on the included sequences, each entry has been screened against non-redundant nucleotide and protein data bank collections, resulting in LISTA-HON and LISTA-HOP. This release includes reports from full Smith and Waterman peptide-level searches against a non-redundant protein sequence database. The LISTA database can be linked to the associated data sets or to nucleotide and protein banks by the Sequence Retrieval System (SRS). The database is available by FTP and on the World Wide Web. PMID:8594599
RNAcentral: A comprehensive database of non-coding RNA sequences
Williams, Kelly Porter; Lau, Britney Yan
2016-10-28
RNAcentral is a database of non-coding RNA (ncRNA) sequences that aggregates data from specialised ncRNA resources and provides a single entry point for accessing ncRNA sequences of all ncRNA types from all organisms. Since its launch in 2014, RNAcentral has integrated twelve new resources, taking the total number of collaborating databases to 22, and began importing new types of data, such as modified nucleotides from MODOMICS and PDB. We created new species-specific identifiers that refer to unique RNA sequences within the context of a single species. Furthermore, the website has been subject to continuous improvements focusing on text and sequence similarity searches as well as genome browsing functionality.
RNAcentral: A comprehensive database of non-coding RNA sequences
DOE Office of Scientific and Technical Information (OSTI.GOV)
Williams, Kelly Porter; Lau, Britney Yan
RNAcentral is a database of non-coding RNA (ncRNA) sequences that aggregates data from specialised ncRNA resources and provides a single entry point for accessing ncRNA sequences of all ncRNA types from all organisms. Since its launch in 2014, RNAcentral has integrated twelve new resources, taking the total number of collaborating databases to 22, and began importing new types of data, such as modified nucleotides from MODOMICS and PDB. We created new species-specific identifiers that refer to unique RNA sequences within the context of a single species. Furthermore, the website has been subject to continuous improvements focusing on text and sequence similarity searches as well as genome browsing functionality.
Development and applications of the EntomopathogenID MLSA database for use in agricultural systems
USDA-ARS's Scientific Manuscript database
The current study reports the development and application of a publicly accessible, curated database of Hypocrealean entomopathogenic fungi sequence data. The goal was to provide a platform for users to easily access sequence data from reference strains. The database can be used to accurately identi...
Abdulkadir, Mohammed; Abdulkadir, Zainab
2016-06-01
Congenital heart diseases cause significant childhood morbidity and mortality. Several restricted studies have been conducted on the epidemiology in Nigeria. No truly nationwide data on patterns of congenital heart disease exists. To determine the patterns of congenital heart disease in children in Nigeria and examine trends in the occurrence of individual defects across 5 decades. We searched PubMed database, Google scholar, TRIP database, World Health Organisation libraries and reference lists of selected articles for studies on patterns of congenital heart disease among children in Nigeria between 1964 and 2015. Two researchers reviewed the papers independently and extracted the data. Seventeen studies were selected that included 2,953 children with congenital heart disease. The commonest congenital heart diseases in Nigeria are ventricular septal defect (40.6%), patent ductus arteriosus (18.4%), atrial septal defect (11.3%) and tetralogy of Fallot (11.8%). There has been a 6% increase in the burden of VSD in every decade for the 5 decades studied and a decline in the occurrence of pulmonary stenosis. Studies conducted in Northern Nigeria demonstrated higher proportions of atrial septal defects than patent ductus arteriosus. Ventricular septal defects are the commonest congenital heart diseases in Nigeria with a rising burden.
Supervised Learning for Detection of Duplicates in Genomic Sequence Databases.
Chen, Qingyu; Zobel, Justin; Zhang, Xiuzhen; Verspoor, Karin
2016-01-01
First identified as an issue in 1996, duplication in biological databases introduces redundancy and even leads to inconsistency when contradictory information appears. The amount of data makes purely manual de-duplication impractical, and existing automatic systems cannot detect duplicates as precisely as can experts. Supervised learning has the potential to address such problems by building automatic systems that learn from expert curation to detect duplicates precisely and efficiently. While machine learning is a mature approach in other duplicate detection contexts, it has seen only preliminary application in genomic sequence databases. We developed and evaluated a supervised duplicate detection method based on an expert curated dataset of duplicates, containing over one million pairs across five organisms derived from genomic sequence databases. We selected 22 features to represent distinct attributes of the database records, and developed a binary model and a multi-class model. Both models achieve promising performance; under cross-validation, the binary model had over 90% accuracy in each of the five organisms, while the multi-class model maintains high accuracy and is more robust in generalisation. We performed an ablation study to quantify the impact of different sequence record features, finding that features derived from meta-data, sequence identity, and alignment quality impact performance most strongly. The study demonstrates machine learning can be an effective additional tool for de-duplication of genomic sequence databases. All Data are available as described in the supplementary material.
Predictive genomics DNA profiling for athletic performance.
Kambouris, Marios; Ntalouka, Foteini; Ziogas, Georgios; Maffulli, Nicola
2012-12-01
Genes control biological processes such as muscle, cartilage and bone formation, muscle energy production and metabolism (mitochondriogenesis, lactic acid removal), blood and tissue oxygenation (erythropoiesis, angiogenesis, vasodilatation), all essential in sport and athletic performance. DNA sequence variations in such genes confer genetic advantages that can be exploited, or genetic 'barriers' that could be overcome to achieve optimal athletic performance. Predictive Genomic DNA Profiling for athletic performance reveals genetic variations that may be associated with better suitability for endurance, strength and speed sports, vulnerability to sports-related injuries and individualized nutritional requirements. Knowledge of genetic 'suitability' in respect to endurance capacity or strength and speed would lead to appropriate sport and athletic activity selection. Knowledge of genetic advantages and barriers would 'direct' an individualized training program, nutritional plan and nutritional supplementation to achieving optimal performance, overcoming 'barriers' that result from intense exercise and pressure under competition with minimum waste of time and energy and avoidance of health risks (hypertension, cardiovascular disease, inflammation, and musculoskeletal injuries) related to exercise, training and competition. Predictive Genomics DNA profiling for Athletics and Sports performance is developing into a tool for athletic activity and sport selection and for the formulation of individualized and personalized training and nutritional programs to optimize health and performance for the athlete. Human DNA sequences are patentable in some countries, while in others DNA testing methodologies (unless proprietary) are not patentable. On the other hand, gene and variant selection, genotype interpretation and the risk and suitability assigning algorithms based on the specific genomic variants used are amenable to patent protection.
Ebbie: automated analysis and storage of small RNA cloning data using a dynamic web server
Ebhardt, H Alexander; Wiese, Kay C; Unrau, Peter J
2006-01-01
Background DNA sequencing is used ubiquitously: from deciphering genomes[1] to determining the primary sequence of small RNAs (smRNAs) [2-5]. The cloning of smRNAs is currently the most conventional method to determine the actual sequence of these important regulators of gene expression. Typical smRNA cloning projects involve the sequencing of hundreds to thousands of smRNA clones that are delimited at their 5' and 3' ends by fixed sequence regions. These primers result from the biochemical protocol used to isolate and convert the smRNA into clonable PCR products. Recently we completed a smRNA cloning project involving tobacco plants, where analysis was required for ~700 smRNA sequences[6]. Finding no easily accessible research tool to enter and analyze smRNA sequences, we developed Ebbie to assist us with our study. Results Ebbie is a semi-automated smRNA cloning data processing algorithm, which initially searches a DNA sequencing text file for any substring that is flanked by two constant strings. The substring, also termed smRNA or insert, is stored in a MySQL and BlastN database. These inserts are then compared using BlastN to locally installed databases, allowing the rapid comparison of the insert to both the growing smRNA database and to other static sequence databases. Our laboratory used Ebbie to analyze scores of DNA sequencing data originating from an smRNA cloning project[6]. Through its built-in instant analysis of all inserts using BlastN, we were able to quickly identify 33 groups of smRNAs from ~700 database entries. This clustering allowed the easy identification of novel and highly expressed clusters of smRNAs. Ebbie is available under GNU GPL and currently implemented on Conclusion Ebbie was designed for medium sized smRNA cloning projects with about 1,000 database entries [6-8]. Ebbie can be used for any type of sequence analysis where two constant primer regions flank a sequence of interest. 
The reliable storage of inserts, and their annotation in a MySQL database, BlastN[9] comparison of new inserts to dynamic and static databases make it a powerful new tool in any laboratory using DNA sequencing. Ebbie also prevents manual mistakes during the excision process and speeds up annotation and data-entry. Once the server is installed locally, its access can be restricted to protect sensitive new DNA sequencing data. Ebbie was primarily designed for smRNA cloning projects, but can be applied to a variety of RNA and DNA cloning projects[2,3,10,11]. PMID:16584563
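The excision step described above (finding a substring flanked by two constant strings) can be sketched as follows. This is an illustration only, not Ebbie's code: the function name and the flanking sequences are hypothetical, and real reads would also need handling of reverse-complemented inserts and sequencing errors.

```python
import re

def extract_insert(read, five_prime, three_prime):
    """Return the shortest non-empty substring lying between the two
    constant flanking strings, or None if the flanks are not both found."""
    pattern = re.escape(five_prime) + r"(.+?)" + re.escape(three_prime)
    match = re.search(pattern, read)
    return match.group(1) if match else None

# Hypothetical read: vector sequence, 5' flank, insert, 3' flank, vector.
insert = extract_insert("NNNNGATTACATTGGCAAGTCCTAGGNNN", "GATTACA", "CCTAGG")
```

The extracted insert could then be written to a database table and passed to a similarity search, corresponding to the storage and BlastN steps the abstract mentions.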
Brandstätter, Anita; Peterson, Christine T; Irwin, Jodi A; Mpoke, Solomon; Koech, Davy K; Parson, Walther; Parsons, Thomas J
2004-10-01
Large forensic mtDNA databases which adhere to strict guidelines for generation and maintenance, are not available for many populations outside of the United States and western Europe. We have established a high quality mtDNA control region sequence database for urban Nairobi as both a reference database for forensic investigations, and as a tool to examine the genetic variation of Kenyan sequences in the context of known African variation. The Nairobi sequences exhibited high variation and a low random match probability, indicating utility for forensic testing. Haplogroup identification and frequencies were compared with those reported from other published studies on African, or African-origin populations from Mozambique, Sierra Leone, and the United States, and suggest significant differences in the mtDNA compositions of the various populations. The quality of the sequence data in our study was investigated and supported using phylogenetic measures. Our data demonstrate the diversity and distinctiveness of African populations, and underline the importance of establishing additional forensic mtDNA databases of indigenous African populations.
mESAdb: microRNA Expression and Sequence Analysis Database
Kaya, Koray D.; Karakülah, Gökhan; Yakıcıer, Cengiz M.; Acar, Aybar C.; Konu, Özlen
2011-01-01
microRNA expression and sequence analysis database (http://konulab.fen.bilkent.edu.tr/mirna/) (mESAdb) is a regularly updated database for the multivariate analysis of sequences and expression of microRNAs from multiple taxa. mESAdb is modular and has a user interface implemented in PHP and JavaScript and coupled with statistical analysis and visualization packages written for the R language. The database primarily comprises mature microRNA sequences and their target data, along with selected human, mouse and zebrafish expression data sets. mESAdb analysis modules allow (i) mining of microRNA expression data sets for subsets of microRNAs selected manually or by motif; (ii) pair-wise multivariate analysis of expression data sets within and between taxa; and (iii) association of microRNA subsets with annotation databases, HUGE Navigator, KEGG and GO. The use of existing and customized R packages facilitates future addition of data sets and analysis tools. Furthermore, the ability to upload and analyze user-specified data sets makes mESAdb an interactive and expandable analysis tool for microRNA sequence and expression data. PMID:21177657
mESAdb: microRNA expression and sequence analysis database.
Kaya, Koray D; Karakülah, Gökhan; Yakicier, Cengiz M; Acar, Aybar C; Konu, Ozlen
2011-01-01
microRNA expression and sequence analysis database (http://konulab.fen.bilkent.edu.tr/mirna/) (mESAdb) is a regularly updated database for the multivariate analysis of sequences and expression of microRNAs from multiple taxa. mESAdb is modular and has a user interface implemented in PHP and JavaScript and coupled with statistical analysis and visualization packages written for the R language. The database primarily comprises mature microRNA sequences and their target data, along with selected human, mouse and zebrafish expression data sets. mESAdb analysis modules allow (i) mining of microRNA expression data sets for subsets of microRNAs selected manually or by motif; (ii) pair-wise multivariate analysis of expression data sets within and between taxa; and (iii) association of microRNA subsets with annotation databases, HUGE Navigator, KEGG and GO. The use of existing and customized R packages facilitates future addition of data sets and analysis tools. Furthermore, the ability to upload and analyze user-specified data sets makes mESAdb an interactive and expandable analysis tool for microRNA sequence and expression data.
Hassan, Ali
2006-06-01
RNA interference (RNAi) in eukaryotes is a recently identified phenomenon in which small double-stranded RNA molecules called short interfering RNA (siRNA) interact with messenger RNA (mRNA) containing homologous sequences in a sequence-specific manner. Ultimately, this interaction results in degradation of the target mRNA. Because of the high sequence specificity of the RNAi process, and the apparently ubiquitous expression of the endogenous protein components necessary for RNAi, there appears to be little limitation to the genes that can be targeted for silencing by RNAi. Thus, RNAi has enormous potential, both as a research tool and as a mode of therapy. Several recent patents have described advances in RNAi technology that are likely to lead to new treatments for cardiovascular disease. These patents have described methods for increased delivery of siRNA to cardiovascular target tissues, chemical modifications of siRNA that improve their pharmacokinetic characteristics, and expression vectors capable of expressing RNAi effectors in situ. Though RNAi has only recently been demonstrated to occur in mammalian tissues, work has advanced rapidly in the development of RNAi-based therapeutics. Recently, therapeutic silencing of apolipoprotein B, the ligand for the low density lipoprotein receptor, has been demonstrated in adult mice by systemic administration of chemically modified siRNA. This demonstrates the potential for RNAi-based therapeutics, and suggests that the future for RNAi in the treatment of cardiovascular disease is bright.
BrEPS 2.0: Optimization of sequence pattern prediction for enzyme annotation.
Dudek, Christian-Alexander; Dannheim, Henning; Schomburg, Dietmar
2017-01-01
The prediction of gene functions is crucial for a large number of different life science areas. Faster high throughput sequencing techniques generate more and larger datasets. The manual annotation by classical wet-lab experiments is not suitable for these large amounts of data. We showed earlier that the automatic sequence pattern-based BrEPS protocol, based on manually curated sequences, can be used for the prediction of enzymatic functions of genes. The growing sequence databases provide the opportunity for more reliable patterns, but are also a challenge for the implementation of automatic protocols. We reimplemented and optimized the BrEPS pattern generation to be applicable for larger datasets in an acceptable timescale. Primary improvement of the new BrEPS protocol is the enhanced data selection step. Manually curated annotations from Swiss-Prot are used as reliable source for function prediction of enzymes observed on protein level. The pool of sequences is extended by highly similar sequences from TrEMBL and SwissProt. This allows us to restrict the selection of Swiss-Prot entries, without losing the diversity of sequences needed to generate significant patterns. Additionally, a supporting pattern type was introduced by extending the patterns at semi-conserved positions with highly similar amino acids. Extended patterns have an increased complexity, increasing the chance to match more sequences, without losing the essential structural information of the pattern. To enhance the usability of the database, we introduced enzyme function prediction based on consensus EC numbers and IUBMB enzyme nomenclature. BrEPS is part of the Braunschweig Enzyme Database (BRENDA) and is available on a completely redesigned website and as download. The database can be downloaded and used with the BrEPScmd command line tool for large scale sequence analysis. 
The BrEPS website and downloads for the database creation tool, command line tool and database are freely accessible at http://breps.tu-bs.de.
BrEPS 2.0: Optimization of sequence pattern prediction for enzyme annotation
Schomburg, Dietmar
2017-01-01
The prediction of gene functions is crucial for a large number of different life science areas. Faster high-throughput sequencing techniques generate more and larger datasets. The manual annotation by classical wet-lab experiments is not suitable for these large amounts of data. We showed earlier that the automatic sequence pattern-based BrEPS protocol, based on manually curated sequences, can be used for the prediction of enzymatic functions of genes. The growing sequence databases provide the opportunity for more reliable patterns, but are also a challenge for the implementation of automatic protocols. We reimplemented and optimized the BrEPS pattern generation to be applicable for larger datasets in an acceptable timescale. The primary improvement of the new BrEPS protocol is the enhanced data selection step. Manually curated annotations from Swiss-Prot are used as a reliable source for function prediction of enzymes observed at the protein level. The pool of sequences is extended by highly similar sequences from TrEMBL and Swiss-Prot. This allows us to restrict the selection of Swiss-Prot entries, without losing the diversity of sequences needed to generate significant patterns. Additionally, a supporting pattern type was introduced by extending the patterns at semi-conserved positions with highly similar amino acids. Extended patterns have an increased complexity, increasing the chance to match more sequences, without losing the essential structural information of the pattern. To enhance the usability of the database, we introduced enzyme function prediction based on consensus EC numbers and IUBMB enzyme nomenclature. BrEPS is part of the Braunschweig Enzyme Database (BRENDA) and is available on a completely redesigned website and as a download. The database can be downloaded and used with the BrEPScmd command line tool for large scale sequence analysis.
The BrEPS website and downloads for the database creation tool, command line tool and database are freely accessible at http://breps.tu-bs.de. PMID:28750104
DNA nanomapping using CRISPR-Cas9 as a programmable nanoparticle.
Mikheikin, Andrey; Olsen, Anita; Leslie, Kevin; Russell-Pavier, Freddie; Yacoot, Andrew; Picco, Loren; Payton, Oliver; Toor, Amir; Chesney, Alden; Gimzewski, James K; Mishra, Bud; Reed, Jason
2017-11-21
Progress in whole-genome sequencing using short-read (e.g., <150 bp), next-generation sequencing technologies has reinvigorated interest in high-resolution physical mapping to fill technical gaps that are not well addressed by sequencing. Here, we report two technical advances in DNA nanotechnology and single-molecule genomics: (1) we describe a labeling technique (CRISPR-Cas9 nanoparticles) for high-speed AFM-based physical mapping of DNA and (2) the first successful demonstration of using DVD optics to image DNA molecules with high-speed AFM. As a proof of principle, we used this new "nanomapping" method to detect and map precisely BCL2-IGH translocations present in lymph node biopsies of follicular lymphoma patients. This HS-AFM "nanomapping" technique can be complementary to both sequencing and other physical mapping approaches.
Domain fusion analysis by applying relational algebra to protein sequence and domain databases.
Truong, Kevin; Ikura, Mitsuhiko
2003-05-06
Domain fusion analysis is a useful method to predict functionally linked proteins that may be involved in direct protein-protein interactions or in the same metabolic or signaling pathway. As separate domain databases like BLOCKS, PROSITE, Pfam, SMART, PRINTS-S, ProDom, TIGRFAMs, and amalgamated domain databases like InterPro continue to grow in size and quality, a computational method to perform domain fusion analysis that leverages these efforts will become increasingly powerful. This paper proposes a computational method employing relational algebra to find domain fusions in protein sequence databases. The feasibility of this method was illustrated on the SWISS-PROT+TrEMBL sequence database using domain predictions from the Pfam HMM (hidden Markov model) database. We identified 235 and 189 putative functionally linked protein partners in H. sapiens and S. cerevisiae, respectively. From scientific literature, we were able to confirm many of these functional linkages, while the remainder offer testable experimental hypotheses. Results can be viewed at http://calcium.uhnres.utoronto.ca/pi. As the analysis can be computed quickly on any relational database that supports standard SQL (structured query language), it can be dynamically updated along with the sequence and domain databases, thereby improving the quality of predictions over time.
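The relational formulation described in this abstract can be illustrated with a small self-join. The following is only a minimal sketch under stated assumptions, not the authors' schema or query: the table name `hits`, its columns, the function name `find_domain_fusions` and the example protein and domain identifiers are all hypothetical. It reports pairs of proteins from one organism that each carry one of two domains found together in a single "composite" protein.

```python
import sqlite3


def find_domain_fusions(rows):
    """Return (protein_a, protein_b, composite) triples.

    rows: iterable of (protein, organism, domain) domain assignments.
    The schema and names are illustrative assumptions, not the paper's.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE hits (protein TEXT, organism TEXT, domain TEXT)")
    conn.executemany("INSERT INTO hits VALUES (?, ?, ?)", rows)
    query = """
        SELECT DISTINCT a.protein, b.protein, c1.protein
        FROM hits AS c1
        -- c1/c2: two different domains on the same composite protein
        JOIN hits AS c2 ON c2.protein = c1.protein AND c1.domain < c2.domain
        -- a and b: other proteins carrying one domain each
        JOIN hits AS a ON a.domain = c1.domain AND a.protein <> c1.protein
        JOIN hits AS b ON b.domain = c2.domain AND b.protein <> c1.protein
        WHERE a.organism = b.organism
          AND a.protein <> b.protein
          -- neither component may itself contain the partner's domain
          AND NOT EXISTS (SELECT 1 FROM hits AS x
                          WHERE x.protein = a.protein AND x.domain = c2.domain)
          AND NOT EXISTS (SELECT 1 FROM hits AS y
                          WHERE y.protein = b.protein AND y.domain = c1.domain)
        ORDER BY a.protein, b.protein, c1.protein
    """
    result = conn.execute(query).fetchall()
    conn.close()
    return result


# Hypothetical toy data: one fused protein and two single-domain proteins.
EXAMPLE_ROWS = [
    ("fusAB", "organism_1", "DOM_A"),
    ("fusAB", "organism_1", "DOM_B"),
    ("yA", "organism_2", "DOM_A"),
    ("yB", "organism_2", "DOM_B"),
    ("yC", "organism_2", "DOM_C"),
]
```

Because the whole analysis is a single SQL statement, it could in principle be re-run whenever the underlying domain assignments are refreshed, which is the property the abstract highlights.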
iMETHYL: an integrative database of human DNA methylation, gene expression, and genomic variation.
Komaki, Shohei; Shiwa, Yuh; Furukawa, Ryohei; Hachiya, Tsuyoshi; Ohmomo, Hideki; Otomo, Ryo; Satoh, Mamoru; Hitomi, Jiro; Sobue, Kenji; Sasaki, Makoto; Shimizu, Atsushi
2018-01-01
We launched an integrative multi-omics database, iMETHYL (http://imethyl.iwate-megabank.org). iMETHYL provides whole-DNA methylation (~24 million autosomal CpG sites), whole-genome (~9 million single-nucleotide variants), and whole-transcriptome (>14 000 genes) data for CD4+ T-lymphocytes, monocytes, and neutrophils collected from approximately 100 subjects. These data were obtained from whole-genome bisulfite sequencing, whole-genome sequencing, and whole-transcriptome sequencing, making iMETHYL a comprehensive database.
D.J. Glass; N. Takebayashi; L. Olson; D.L. Taylor
2013-01-01
The number of sequences from both formally described taxa and uncultured environmental DNA deposited in the International Nucleotide Sequence Databases has increased substantially over the last two decades. Although the majority of these sequences represent authentic gene copies, there is evidence of DNA artifacts in these databases as well. These include lab artifacts...
REFGEN and TREENAMER: Automated Sequence Data Handling for Phylogenetic Analysis in the Genomic Era
Leonard, Guy; Stevens, Jamie R.; Richards, Thomas A.
2009-01-01
The phylogenetic analysis of nucleotide sequences and increasingly that of amino acid sequences is used to address a number of biological questions. Access to extensive datasets, including numerous genome projects, means that standard phylogenetic analyses can include many hundreds of sequences. Unfortunately, most phylogenetic analysis programs do not tolerate the sequence naming conventions of genome databases. Managing large numbers of sequences and standardizing sequence labels for use in phylogenetic analysis programs can be a time consuming and laborious task. Here we report the availability of an online resource for the management of gene sequences recovered from public access genome databases such as GenBank. These web utilities include the facility for renaming every sequence in a FASTA alignment file, with each sequence label derived from a user-defined combination of the species name and/or database accession number. This facility enables the user to keep track of the branching order of the sequences/taxa during multiple tree calculations and re-optimisations. Post phylogenetic analysis, these webpages can then be used to rename every label in the subsequent tree files (with a user-defined combination of species name and/or database accession number). Together these programs drastically reduce the time required for managing sequence alignments and labelling phylogenetic figures. Additional features of our platform include the automatic removal of identical accession numbers (recorded in the report file) and generation of species and accession number lists for use in supplementary materials or figure legends. PMID:19812722
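The label-renaming step described in this abstract can be sketched in a few lines. This is a hedged illustration rather than the REFGEN implementation: the function name `rename_fasta_labels`, the accession-to-species mapping and the choice of replacing every run of non-alphanumeric characters with an underscore are all assumptions made for the example.

```python
import re


def rename_fasta_labels(fasta_text, species_by_accession):
    """Rewrite each FASTA header as '<species>_<accession>'.

    The accession is assumed to be the first whitespace-delimited token
    of the header; unknown accessions get the species 'unknown'.
    """
    renamed = []
    for line in fasta_text.splitlines():
        if line.startswith(">"):
            tokens = line[1:].split()
            accession = tokens[0] if tokens else ""
            species = species_by_accession.get(accession, "unknown")
            # Keep only characters that tree-building programs tolerate.
            label = re.sub(r"[^A-Za-z0-9]+", "_", f"{species}_{accession}")
            renamed.append(">" + label.strip("_"))
        else:
            renamed.append(line)  # sequence lines pass through unchanged
    return "\n".join(renamed)
```

Keeping the accession in the new label lets the short names be mapped back to database records after tree building.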
PASS2: an automated database of protein alignments organised as structural superfamilies.
Bhaduri, Anirban; Pugalenthi, Ganesan; Sowdhamini, Ramanathan
2004-04-02
The functional selection and three-dimensional structural constraints of proteins in nature often relate to the retention of significant sequence similarity between proteins of similar fold and function despite poor sequence identity. Organization of structure-based sequence alignments for distantly related proteins provides a map of the conserved and critical regions of the protein universe that is useful for the analysis of folding principles, for the evolutionary unification of protein families and for maximizing the information return from experimental structure determination. The Protein Alignment organised as Structural Superfamily (PASS2) database represents continuously updated structural alignments for evolutionarily related, sequentially distant proteins. An automated and updated version of PASS2 is in direct correspondence with SCOP 1.63 and consists of sequences having identity below 40% among themselves. Protein domains have been grouped into 628 multi-member superfamilies and 566 single-member superfamilies. Structure-based sequence alignments for the superfamilies have been obtained using COMPARER, while initial equivalencies have been derived from a preliminary superposition using LSQMAN or STAMP 4.0. The final sequence alignments have been annotated for structural features using JOY4.0. The database is supplemented with sequence relatives belonging to different genomes, conserved spatially interacting and structural motifs, probabilistic hidden Markov models of superfamilies based on the alignments and useful links to other databases. Probabilistic models and sensitive position-specific profiles obtained from reliable superfamily alignments aid annotation of remote homologues and are useful tools in structural and functional genomics. PASS2 presents the phylogeny of its members based on both sequence and structural dissimilarities. Clustering of members allows us to understand diversification of the family members.
The search engine has been improved for simpler browsing of the database. The database resolves alignments among the structural domains consisting of evolutionarily diverged set of sequences. Availability of reliable sequence alignments of distantly related proteins despite poor sequence identity and single-member superfamilies permit better sampling of structures in libraries for fold recognition of new sequences and for the understanding of protein structure-function relationships of individual superfamilies. PASS2 is accessible at http://www.ncbs.res.in/~faculty/mini/campass/pass2.html
Using the TIGR gene index databases for biological discovery.
Lee, Yuandan; Quackenbush, John
2003-11-01
The TIGR Gene Index web pages provide access to analyses of ESTs and gene sequences for nearly 60 species, as well as a number of resources derived from these. Each species-specific database is presented using a common format with a homepage. A variety of methods exist that allow users to search each species-specific database. Methods implemented currently include nucleotide or protein sequence queries using WU-BLAST, text-based searches using various sequence identifiers, searches by gene, tissue and library name, and searches using functional classes through Gene Ontology assignments. This protocol provides guidance for using the Gene Index Databases to extract information.
Nakayama, Hiroshi; Akiyama, Misaki; Taoka, Masato; Yamauchi, Yoshio; Nobe, Yuko; Ishikawa, Hideaki; Takahashi, Nobuhiro; Isobe, Toshiaki
2009-04-01
We present here a method to correlate tandem mass spectra of sample RNA nucleolytic fragments with an RNA nucleotide sequence in a DNA/RNA sequence database, thereby allowing tandem mass spectrometry (MS/MS)-based identification of RNA in biological samples. Ariadne, a unique web-based database search engine, identifies RNA by two probability-based evaluation steps of MS/MS data. In the first step, the software evaluates the matches between the masses of product ions generated by MS/MS of an RNase digest of sample RNA and those calculated from a candidate nucleotide sequence in a DNA/RNA sequence database, which then predicts the nucleotide sequences of these RNase fragments. In the second step, the candidate sequences are mapped for all RNA entries in the database, and each entry is scored for a function of occurrences of the candidate sequences to identify a particular RNA. Ariadne can also predict post-transcriptional modifications of RNA, such as methylation of nucleotide bases and/or ribose, by estimating mass shifts from the theoretical mass values. The method was validated with MS/MS data of RNase T1 digests of in vitro transcripts. It was applied successfully to identify an unknown RNA component in a tRNA mixture and to analyze post-transcriptional modification in yeast tRNA(Phe-1).
Novel primers for complete mitochondrial cytochrome b gene sequencing in mammals
Naidu, Ashwin; Fitak, Robert R.; Munguia-Vega, Adrian; Culver, Melanie
2011-01-01
Sequence-based species identification relies on the extent and integrity of sequence data available in online databases such as GenBank. When identifying species from a sample of unknown origin, partial DNA sequences obtained from the sample are aligned against existing sequences in databases. When the sequence from the matching species is not present in the database, high-scoring alignments with closely related sequences might produce unreliable results on species identity. For species identification in mammals, the cytochrome b (cyt b) gene has been identified to be highly informative; thus, large amounts of reference sequence data from the cyt b gene are much needed. To enhance availability of cyt b gene sequence data on a large number of mammalian species in GenBank and other such publicly accessible online databases, we identified a primer pair for complete cyt b gene sequencing in mammals. Using this primer pair, we successfully PCR amplified and sequenced the complete cyt b gene from 40 of 44 mammalian species representing 10 orders of mammals. We submitted 40 complete, correctly annotated, cyt b protein coding sequences to GenBank. To our knowledge, this is the first single primer pair to amplify the complete cyt b gene in a broad range of mammalian species. This primer pair can be used for the addition of new cyt b gene sequences and to enhance data available on species represented in GenBank. The availability of novel and complete gene sequences as high-quality reference data can improve the reliability of sequence-based species identification.
Equivalent Indels – Ambiguous Functional Classes and Redundancy in Databases
Assmus, Jens; Kleffe, Jürgen; Schmitt, Armin O.; Brockmann, Gudrun A.
2013-01-01
There is considerable interest in studying sequenced variations. However, while the positions of substitutions are uniquely identifiable by sequence alignment, the location of insertions and deletions still poses problems. Each insertion and deletion causes a change of sequence. Yet, due to low complexity or repetitive sequence structures, the same indel can sometimes be annotated in different ways. Two indels which differ in allele sequence and position can be one and the same, i.e. the alternative sequence of the whole chromosome is identical in both cases and, therefore, the two deletions are biologically equivalent. In such a case, it is impossible to identify the exact position of an indel merely based on sequence alignment. Thus, variation entries in a mutation database are not necessarily uniquely defined. We prove the existence of a contiguous region around an indel in which all deletions of the same length are biologically identical. Databases often show only one of several possible locations for a given variation. Furthermore, different database entries can represent equivalent variation events. We identified 1,045,590 such problematic entries of insertions and deletions out of 5,860,408 indel entries in the current human database of Ensembl. Equivalent indels are found in sequence regions of different functions like exons, introns or 5' and 3' UTRs. One and the same variation can be assigned to several different functional classifications of which only one is correct. We implemented an algorithm that determines for each indel database entry its complete set of equivalent indels which is uniquely characterized by the indel itself and a given interval of the reference sequence. PMID:23658777
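The contiguous region of equivalent deletions mentioned in this abstract can be made concrete with a short sketch. It is an illustration under assumptions, not the authors' algorithm: the function name and the 0-based coordinates are choices made here. It relies on the observation that moving a deletion of length L one base to the right leaves the resulting sequence unchanged exactly when the base entering the deleted window equals the base leaving it.

```python
def equivalent_deletion_starts(ref, start, length):
    """Return all 0-based start positions whose deletion of `length`
    bases yields the same sequence as deleting ref[start:start+length]."""
    lo = start
    # Shift left while the base entering the window on the left matches
    # the base leaving it on the right.
    while lo > 0 and ref[lo - 1] == ref[lo + length - 1]:
        lo -= 1
    hi = start
    # Shift right by the mirrored condition.
    while hi + length < len(ref) and ref[hi] == ref[hi + length]:
        hi += 1
    return list(range(lo, hi + 1))


def apply_deletion(ref, start, length):
    """Delete `length` bases starting at 0-based position `start`."""
    return ref[:start] + ref[start + length:]
```

For example, in the toy reference `GACACACT`, deleting two bases at any start from 1 to 5 gives `GACACT`, so a database could list the same event under five different positions.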
SALAD database: a motif-based database of protein annotations for plant comparative genomics
Mihara, Motohiro; Itoh, Takeshi; Izawa, Takeshi
2010-01-01
Proteins often have several motifs with distinct evolutionary histories. Proteins with similar motifs have similar biochemical properties and thus related biological functions. We constructed a unique comparative genomics database termed the SALAD database (http://salad.dna.affrc.go.jp/salad/) from plant-genome-based proteome data sets. We extracted evolutionarily conserved motifs by MEME software from 209 529 protein-sequence annotation groups selected by BLASTP from the proteome data sets of 10 species: rice, sorghum, Arabidopsis thaliana, grape, a lycophyte, a moss, 3 algae, and yeast. Similarity clustering of each protein group was performed by pairwise scoring of the motif patterns of the sequences. The SALAD database provides a user-friendly graphical viewer that displays a motif pattern diagram linked to the resulting bootstrapped dendrogram for each protein group. Amino-acid-sequence-based and nucleotide-sequence-based phylogenetic trees for motif combination alignment, a logo comparison diagram for each clade in the tree, and a Pfam-domain pattern diagram are also available. We also developed a viewer named ‘SALAD on ARRAYs’ to view arbitrary microarray data sets of paralogous genes linked to the same dendrogram in a window. The SALAD database is a powerful tool for comparing protein sequences and can provide valuable hints for biological analysis. PMID:19854933
Assembly: a resource for assembled genomes at NCBI
Kitts, Paul A.; Church, Deanna M.; Thibaud-Nissen, Françoise; Choi, Jinna; Hem, Vichet; Sapojnikov, Victor; Smith, Robert G.; Tatusova, Tatiana; Xiang, Charlie; Zherikov, Andrey; DiCuccio, Michael; Murphy, Terence D.; Pruitt, Kim D.; Kimchi, Avi
2016-01-01
The NCBI Assembly database (www.ncbi.nlm.nih.gov/assembly/) provides stable accessioning and data tracking for genome assembly data. The model underlying the database can accommodate a range of assembly structures, including sets of unordered contig or scaffold sequences, bacterial genomes consisting of a single complete chromosome, or complex structures such as a human genome with modeled allelic variation. The database provides an assembly accession and version to unambiguously identify the set of sequences that make up a particular version of an assembly, and tracks changes to updated genome assemblies. The Assembly database reports metadata such as assembly names, simple statistical reports of the assembly (number of contigs and scaffolds, contiguity metrics such as contig N50, total sequence length and total gap length) as well as the assembly update history. The Assembly database also tracks the relationship between an assembly submitted to the International Nucleotide Sequence Database Consortium (INSDC) and the assembly represented in the NCBI RefSeq project. Users can find assemblies of interest by querying the Assembly Resource directly or by browsing available assemblies for a particular organism. Links in the Assembly Resource allow users to easily download sequence and annotations for current versions of genome assemblies from the NCBI genomes FTP site. PMID:26578580
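Contig N50, one of the contiguity metrics named in this abstract, is commonly defined as the length of the shortest contig in the smallest set of longest contigs that together cover at least half of the total assembly length. A minimal sketch of that common definition follows; the function name is an assumption, and NCBI's own reporting code is not shown here.

```python
def contig_n50(lengths):
    """Return the N50 of a collection of contig lengths."""
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        # The first contig that pushes the running sum to half of the
        # total assembly length defines N50.
        if running >= half_total:
            return length
    raise ValueError("no contig lengths given")
```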
SALAD database: a motif-based database of protein annotations for plant comparative genomics.
Mihara, Motohiro; Itoh, Takeshi; Izawa, Takeshi
2010-01-01
Proteins often have several motifs with distinct evolutionary histories. Proteins with similar motifs have similar biochemical properties and thus related biological functions. We constructed a unique comparative genomics database termed the SALAD database (http://salad.dna.affrc.go.jp/salad/) from plant-genome-based proteome data sets. We extracted evolutionarily conserved motifs by MEME software from 209,529 protein-sequence annotation groups selected by BLASTP from the proteome data sets of 10 species: rice, sorghum, Arabidopsis thaliana, grape, a lycophyte, a moss, 3 algae, and yeast. Similarity clustering of each protein group was performed by pairwise scoring of the motif patterns of the sequences. The SALAD database provides a user-friendly graphical viewer that displays a motif pattern diagram linked to the resulting bootstrapped dendrogram for each protein group. Amino-acid-sequence-based and nucleotide-sequence-based phylogenetic trees for motif combination alignment, a logo comparison diagram for each clade in the tree, and a Pfam-domain pattern diagram are also available. We also developed a viewer named 'SALAD on ARRAYs' to view arbitrary microarray data sets of paralogous genes linked to the same dendrogram in a window. The SALAD database is a powerful tool for comparing protein sequences and can provide valuable hints for biological analysis.
NASA Astrophysics Data System (ADS)
Jun, LIU; Huang, Wei; Hongjie, Fan
2016-02-01
A novel method for finding the initial structure parameters of an optical system via the genetic algorithm (GA) is proposed in this research. Usually, optical designers start their designs from the commonly used structures from a patent database; however, it is time consuming to modify the patented structures to meet the specification. A high-performance design result largely depends on the choice of the starting point. Accordingly, it would be highly desirable to be able to calculate the initial structure parameters automatically. In this paper, a method that combines a genetic algorithm and aberration analysis is used to determine an appropriate initial structure of an optical system. We use a three-mirror system as an example to demonstrate the validity and reliability of this method. On-axis and off-axis telecentric three-mirror systems are obtained based on this method.
USDA-ARS's Scientific Manuscript database
The ARS Culture Collection (NRRL) currently contains 7569 strains within the family Streptomycetaceae but 4368 of them have not been characterized to the species level. A gene sequence database using the Bacterial Isolate Genomic Sequence Database package (BIGSdb) (Jolley & Maiden, 2010) is availabl...
Engel, Stacia R.; Cherry, J. Michael
2013-01-01
The first completed eukaryotic genome sequence was that of the yeast Saccharomyces cerevisiae, and the Saccharomyces Genome Database (SGD; http://www.yeastgenome.org/) is the original model organism database. SGD remains the authoritative community resource for the S. cerevisiae reference genome sequence and its annotation, and continues to provide comprehensive biological information correlated with S. cerevisiae genes and their products. A diverse set of yeast strains have been sequenced to explore commercial and laboratory applications, and a brief history of those strains is provided. The publication of these new genomes has motivated the creation of new tools, and SGD will annotate and provide comparative analyses of these sequences, correlating changes with variations in strain phenotypes and protein function. We are entering a new era at SGD, as we incorporate these new sequences and make them accessible to the scientific community, all in an effort to continue in our mission of educating researchers and facilitating discovery. Database URL: http://www.yeastgenome.org/ PMID:23487186
probeBase—an online resource for rRNA-targeted oligonucleotide probes and primers: new features 2016
Greuter, Daniel; Loy, Alexander; Horn, Matthias; Rattei, Thomas
2016-01-01
probeBase http://www.probebase.net is a manually maintained and curated database of rRNA-targeted oligonucleotide probes and primers. Contextual information and multiple options for evaluating in silico hybridization performance against the most recent rRNA sequence databases are provided for each oligonucleotide entry, which makes probeBase an important and frequently used resource for microbiology research and diagnostics. Here we present a major update of probeBase, which was last featured in the NAR Database Issue 2007. This update describes a complete remodeling of the database architecture and environment to accommodate computationally efficient access. Improved search functions, sequence match tools and data output now extend the opportunities for finding suitable hierarchical probe sets that target an organism or taxon at different taxonomic levels. To facilitate the identification of complementary probe sets for organisms represented by short rRNA sequence reads generated by amplicon sequencing or metagenomic analysis with next generation sequencing technologies such as Illumina and IonTorrent, we introduce a novel tool that recovers surrogate near full-length rRNA sequences for short query sequences and finds matching oligonucleotides in probeBase. PMID:26586809
Large-Scale Concatenation cDNA Sequencing
Yu, Wei; Andersson, Björn; Worley, Kim C.; Muzny, Donna M.; Ding, Yan; Liu, Wen; Ricafrente, Jennifer Y.; Wentland, Meredith A.; Lennon, Greg; Gibbs, Richard A.
1997-01-01
A total of 100 kb of DNA derived from 69 individual human brain cDNA clones of 0.7–2.0 kb were sequenced by concatenated cDNA sequencing (CCS), whereby multiple individual DNA fragments are sequenced simultaneously in a single shotgun library. The method yielded accurate sequences and a similar efficiency compared with other shotgun libraries constructed from single DNA fragments (>20 kb). Computer analyses were carried out on 65 cDNA clone sequences and their corresponding end sequences to examine both nucleic acid and amino acid sequence similarities in the databases. Thirty-seven clones revealed no DNA database matches, 12 clones generated exact matches (≥98% identity), and 16 clones generated nonexact matches (57%–97% identity) to either known human or other species genes. Of those 28 matched clones, 8 had corresponding end sequences that failed to identify similarities. In a protein similarity search, 27 clone sequences displayed significant matches, whereas only 20 of the end sequences had matches to known protein sequences. Our data indicate that full-length cDNA insert sequences provide significantly more nucleic acid and protein sequence similarity matches than expressed sequence tags (ESTs) for database searching. [All 65 cDNA clone sequences described in this paper have been submitted to the GenBank data library under accession nos. U79240–U79304.] PMID:9110174
[Integrated DNA barcoding database for identifying Chinese animal medicine].
Shi, Lin-Chun; Yao, Hui; Xie, Li-Fang; Zhu, Ying-Jie; Song, Jing-Yuan; Zhang, Hui; Chen, Shi-Lin
2014-06-01
To construct an integrated DNA barcoding database for identifying Chinese animal medicine, the authors and their collaborators carried out extensive research on identifying Chinese animal medicines using DNA barcoding technology. Sequences from GenBank were analyzed in parallel. Three methods (BLAST, barcoding gap and tree building) were used to confirm the reliability of barcode records in the database. The integrated database was constructed from three parts: specimen, sequence and literature information. It contains about 800 animal medicines together with their adulterants and closely related species. Unknown specimens can be identified by pasting their sequence record into the window on the ID page of the species identification system for traditional Chinese medicine (www.tcmbarcode.cn). The database is important for animal species identification, conservation of rare and endangered species and sustainable utilization of animal resources.
ClusterMine360: a database of microbial PKS/NRPS biosynthesis
Conway, Kyle R.; Boddy, Christopher N.
2013-01-01
ClusterMine360 (http://www.clustermine360.ca/) is a database of microbial polyketide and non-ribosomal peptide gene clusters. It takes advantage of crowd-sourcing by allowing members of the community to make contributions while automation is used to help achieve high data consistency and quality. The database currently has >200 gene clusters from >185 compound families. It also features a unique sequence repository containing >10 000 polyketide synthase/non-ribosomal peptide synthetase domains. The sequences are filterable and downloadable as individual or multiple sequence FASTA files. We are confident that this database will be a useful resource for members of the polyketide synthases/non-ribosomal peptide synthetases research community, enabling them to keep up with the growing number of sequenced gene clusters and rapidly mine these clusters for functional information. PMID:23104377
Dölz, R; Mossé, M O; Slonimski, P P; Bairoch, A; Linder, P
1994-01-01
We continued our effort to make a comprehensive database (LISTA) for the yeast Saccharomyces cerevisiae. In this database each sequence has been attributed a single genetic name. In the case of duplicated sequences a simple method has been applied to distinguish between sequences of one and the same gene and non-allelic sequences of duplicated genes. If necessary, synonyms are given in the case of allelic duplicated sequences. Thus sequences can be found either by the name or by synonyms given in LISTA. Each entry contains the genetic name, the mnemonic from the EMBL data bank, the codon bias, the reference of the publication of the sequence, the chromosomal location as far as known, and the Swissprot and EMBL accession numbers. To obtain more information on the included sequences, each entry has been screened against non-redundant nucleotide and protein data bank collections resulting in LISTA-HON and LISTA-HOP. The LISTA database can be linked to the associated data sets or to nucleotide and protein banks by the Sequence Retrieval System (SRS). PMID:7937046
Transgenic Wheat, Barley and Oats: Future Prospects
NASA Astrophysics Data System (ADS)
Dunwell, Jim M.
Following the success of transgenic maize and rice, methods have now been developed for the efficient introduction of genes into wheat, barley and oats. This review summarizes the present position in relation to these three species, and also uses information from field trial databases and the patent literature to assess the future trends in the exploitation of transgenic material. This analysis includes agronomic traits and also discusses opportunities in expanding areas such as biofuels and biopharming.
Systems Biology of the Immune Response to Live and Inactivated Dengue Virus Vaccines
2017-09-01
Financial support; In-kind support (e.g., partner makes software, computers, equipment, etc., available to project staff); Facilities (e.g...reprints of manuscripts and abstracts, a curriculum vitae, patent applications, study questionnaires, and surveys, etc. Organization name: Walter...memory B-cells and the isotype usage of the antibody response. 9. A project-specific SQL database has been set up on a server based at URI. Major
Gene Unprediction with Spurio: A tool to identify spurious protein sequences.
Höps, Wolfram; Jeffryes, Matt; Bateman, Alex
2018-01-01
We now have access to the sequences of tens of millions of proteins. These protein sequences are essential for modern molecular biology and computational biology. The vast majority of protein sequences are derived from gene prediction tools and have no experimental supporting evidence for their translation. Despite the increasing accuracy of gene prediction tools there likely exists a large number of spurious protein predictions in the sequence databases. We have developed the Spurio tool to help identify spurious protein predictions in prokaryotes. Spurio searches the query protein sequence against a prokaryotic nucleotide database using tblastn and identifies homologous sequences. The tblastn matches are used to score the query sequence's likelihood of being a spurious protein prediction using a Gaussian process model. The most informative feature is the appearance of stop codons within the presumed translation of homologous DNA sequences. Benchmarking shows that the Spurio tool is able to distinguish spurious from true proteins. However, transposon proteins are prone to be predicted as spurious because of the frequency of degraded homologs found in the DNA sequence databases. Our initial experiments suggest that less than 1% of the proteins in the UniProtKB sequence database are likely to be spurious and that Spurio is able to identify over 60 times more spurious proteins than the AntiFam resource. The Spurio software and source code is available under an MIT license at the following URL: https://bitbucket.org/bateman-group/spurio.
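The feature singled out in this abstract as most informative, stop codons inside the presumed translation of homologous DNA, can be illustrated with a simple counter. This sketch is not Spurio's scoring (which, per the abstract, uses tblastn matches and a Gaussian process model); the function name is hypothetical, and it assumes the three stop codons of the standard genetic code, which does not hold for every prokaryotic translation table.

```python
# Stop codons of the standard genetic code (an assumption; some
# organisms reassign TGA).
STOP_CODONS = {"TAA", "TAG", "TGA"}


def count_in_frame_stops(dna, frame=0):
    """Count stop codons when reading `dna` in the given frame (0, 1 or 2)."""
    seq = dna.upper()
    return sum(
        1
        for i in range(frame, len(seq) - 2, 3)
        if seq[i:i + 3] in STOP_CODONS
    )
```

A region that aligns to a query protein but shows several internal stops in the aligned frame is the kind of evidence that would count against the query being a real protein.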
Differentially Private Frequent Sequence Mining via Sampling-based Candidate Pruning
Xu, Shengzhi; Cheng, Xiang; Li, Zhengyi; Xiong, Li
2016-01-01
In this paper, we study the problem of mining frequent sequences under the rigorous differential privacy model. We explore the possibility of designing a differentially private frequent sequence mining (FSM) algorithm which can achieve both high data utility and a high degree of privacy. We found, in differentially private FSM, the amount of required noise is proportionate to the number of candidate sequences. If we could effectively reduce the number of unpromising candidate sequences, the utility and privacy tradeoff can be significantly improved. To this end, by leveraging a sampling-based candidate pruning technique, we propose a novel differentially private FSM algorithm, which is referred to as PFS2. The core of our algorithm is to utilize sample databases to further prune the candidate sequences generated based on the downward closure property. In particular, we use the noisy local support of candidate sequences in the sample databases to estimate which sequences are potentially frequent. To improve the accuracy of such private estimations, a sequence shrinking method is proposed to enforce the length constraint on the sample databases. Moreover, to decrease the probability of misestimating frequent sequences as infrequent, a threshold relaxation method is proposed to relax the user-specified threshold for the sample databases. Through formal privacy analysis, we show that our PFS2 algorithm is ε-differentially private. Extensive experiments on real datasets illustrate that our PFS2 algorithm can privately find frequent sequences with high accuracy. PMID:26973430
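The observation that the required noise grows with the number of candidate sequences can be made concrete with a generic Laplace-mechanism sketch in Python. This is not the PFS2 algorithm: the function names are hypothetical, the sampling, sequence shrinking and threshold relaxation steps are omitted, and the sketch assumes each input sequence changes the support of every candidate by at most 1, so the noise scale is set to (number of candidates) / ε.

```python
import random


def is_subsequence(candidate, sequence):
    """True if `candidate` occurs in `sequence` in order (gaps allowed)."""
    it = iter(sequence)
    # `item in it` advances the iterator, so matches must occur in order.
    return all(item in it for item in candidate)


def support(candidate, database):
    """Number of database sequences that contain the candidate."""
    return sum(1 for seq in database if is_subsequence(candidate, seq))


def laplace_noise(scale, rng):
    """Draw Laplace(0, scale) noise as a difference of two exponentials."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def noisy_supports(candidates, database, epsilon, rng=None):
    """Return a noisy support for each candidate (illustrative only).

    Assumption: one sequence affects each of the k candidates by at most 1,
    so the L1 sensitivity is k and the Laplace scale is k / epsilon.
    """
    rng = rng or random.Random()
    scale = len(candidates) / epsilon
    return {
        tuple(c): support(c, database) + laplace_noise(scale, rng)
        for c in candidates
    }


if __name__ == "__main__":
    db = [("a", "b", "c"), ("c", "a"), ("a", "c")]
    print(noisy_supports([("a", "c"), ("c", "a")], db, epsilon=1.0))
```

Halving the candidate set halves the noise scale at the same ε, which is why pruning unpromising candidates before adding noise improves the utility and privacy tradeoff.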
A comprehensive and scalable database search system for metaproteomics.
Chatterjee, Sandip; Stupp, Gregory S; Park, Sung Kyu Robin; Ducom, Jean-Christophe; Yates, John R; Su, Andrew I; Wolan, Dennis W
2016-08-16
Mass spectrometry-based shotgun proteomics experiments rely on accurate matching of experimental spectra against a database of protein sequences. Existing computational analysis methods are limited in the size of their sequence databases, which severely restricts the proteomic sequencing depth and functional analysis of highly complex samples. The growing amount of public high-throughput sequencing data will only exacerbate this problem. We designed a broadly applicable metaproteomic analysis method (ComPIL) that addresses protein database size limitations. Our approach to overcome this significant limitation in metaproteomics was to design a scalable set of sequence databases assembled for optimal library querying speeds. ComPIL was integrated with a modified version of the search engine ProLuCID (termed "Blazmass") to permit rapid matching of experimental spectra. Proof-of-principle analysis of human HEK293 lysate with a ComPIL database derived from high-quality genomic libraries was able to detect nearly all of the same peptides as a search with a human database (~500x fewer peptides in the database), with a small reduction in sensitivity. We were also able to detect proteins from the adenovirus used to immortalize these cells. We applied our method to a set of healthy human gut microbiome proteomic samples and showed a substantial increase in the number of identified peptides and proteins compared to previous metaproteomic analyses, while retaining a high degree of protein identification accuracy and allowing for a more in-depth characterization of the functional landscape of the samples. The combination of ComPIL with Blazmass allows proteomic searches to be performed with database sizes much larger than previously possible. These large database searches can be applied to complex meta-samples with unknown composition or proteomic samples where unexpected proteins may be identified. 
The protein database, proteomic search engine, and the proteomic data files for the 5 microbiome samples characterized and discussed herein are open source and available for use and additional analysis.
Navigating through the Jungle of Allergens: Features and Applications of Allergen Databases.
Radauer, Christian
2017-01-01
The increasing number of available data on allergenic proteins demanded the establishment of structured, freely accessible allergen databases. In this review article, features and applications of 6 of the most widely used allergen databases are discussed. The WHO/IUIS Allergen Nomenclature Database is the official resource of allergen designations. Allergome is the most comprehensive collection of data on allergens and allergen sources. AllergenOnline is aimed at providing a peer-reviewed database of allergen sequences for prediction of allergenicity of proteins, such as those planned to be inserted into genetically modified crops. The Structural Database of Allergenic Proteins (SDAP) provides a database of allergen sequences, structures, and epitopes linked to bioinformatics tools for sequence analysis and comparison. The Immune Epitope Database (IEDB) is the largest repository of T-cell, B-cell, and major histocompatibility complex protein epitopes including epitopes of allergens. AllFam classifies allergens into families of evolutionarily related proteins using definitions from the Pfam protein family database. These databases contain mostly overlapping data, but also show differences in terms of their targeted users, the criteria for including allergens, data shown for each allergen, and the availability of bioinformatics tools. © 2017 S. Karger AG, Basel.
Internet-accessible DNA sequence database for identifying fusaria from human and animal infections.
O'Donnell, Kerry; Sutton, Deanna A; Rinaldi, Michael G; Sarver, Brice A J; Balajee, S Arunmozhi; Schroers, Hans-Josef; Summerbell, Richard C; Robert, Vincent A R G; Crous, Pedro W; Zhang, Ning; Aoki, Takayuki; Jung, Kyongyong; Park, Jongsun; Lee, Yong-Hwan; Kang, Seogchan; Park, Bongsoo; Geiser, David M
2010-10-01
Because less than one-third of clinically relevant fusaria can be accurately identified to species level using phenotypic data (i.e., morphological species recognition), we constructed a three-locus DNA sequence database to facilitate molecular identification of the 69 Fusarium species associated with human or animal mycoses encountered in clinical microbiology laboratories. The database comprises partial sequences from three nuclear genes: translation elongation factor 1α (EF-1α), the largest subunit of RNA polymerase (RPB1), and the second largest subunit of RNA polymerase (RPB2). These three gene fragments can be amplified by PCR and sequenced using primers that are conserved across the phylogenetic breadth of Fusarium. Phylogenetic analyses of the combined data set reveal that, with the exception of two monotypic lineages, all clinically relevant fusaria are nested in one of eight variously sized and strongly supported species complexes. The monophyletic lineages have been named informally to facilitate communication of an isolate's clade membership and genetic diversity. To identify isolates to the species included within the database, partial DNA sequence data from one or more of the three genes can be used as a BLAST query against the database which is Web accessible at FUSARIUM-ID (http://isolate.fusariumdb.org) and the Centraalbureau voor Schimmelcultures (CBS-KNAW) Fungal Biodiversity Center (http://www.cbs.knaw.nl/fusarium). Alternatively, isolates can be identified via phylogenetic analysis by adding sequences of unknowns to the DNA sequence alignment, which can be downloaded from the two aforementioned websites. 
The utility of this database should increase significantly as members of the clinical microbiology community deposit in internationally accessible culture collections (e.g., CBS-KNAW or the Fusarium Research Center) cultures of novel mycosis-associated fusaria, along with associated, corrected sequence chromatograms and data, so that the sequence results can be verified and isolates are made available for future study.
PrionHome: A Database of Prions and Other Sequences Relevant to Prion Phenomena
Harbi, Djamel; Parthiban, Marimuthu; Gendoo, Deena M. A.; Ehsani, Sepehr; Kumar, Manish; Schmitt-Ulms, Gerold; Sowdhamini, Ramanathan; Harrison, Paul M.
2012-01-01
Prions are units of propagation of an altered state of a protein or proteins; prions can propagate from organism to organism, through cooption of other protein copies. Prions contain no necessary nucleic acids, and are important both as pathogenic agents and as a potential force in epigenetic phenomena. The original prions were derived from a misfolded form of the mammalian Prion Protein PrP. Infection by these prions causes neurodegenerative diseases. Other prions cause non-Mendelian inheritance in budding yeast, and sometimes act as diseases of yeast. We report the bioinformatic construction of the PrionHome, a database of >2000 prion-related sequences. The data was collated from various public and private resources and filtered for redundancy. The data was then processed according to a transparent classification system of prionogenic sequences (i.e., sequences that can make prions), prionoids (i.e., proteins that propagate like prions between individual cells), and other prion-related phenomena. There are eight PrionHome classifications for sequences. The first four classifications are derived from experimental observations: prionogenic sequences, prionoids, other prion-related phenomena, and prion interactors. The second four classifications are derived from sequence analysis: orthologs, paralogs, pseudogenes, and candidate-prionogenic sequences. Database entries list: supporting information for PrionHome classifications, prion-determinant areas (where relevant), and disordered and compositionally-biased regions. Also included are literature references for the PrionHome classifications, transcripts and genomic coordinates, and structural data (including comparative models made for the PrionHome from manually curated alignments). We provide database usage examples for both vertebrate and fungal prion contexts. 
Using the database data, we have performed a detailed analysis of the compositional biases in known budding-yeast prionogenic sequences, showing that the only abundant bias pattern is for asparagine bias with subsidiary serine bias. We anticipate that this database will be a useful experimental aid and reference resource. It is freely available at: http://libaio.biol.mcgill.ca/prion. PMID:22363733
PROFESS: a PROtein Function, Evolution, Structure and Sequence database
Triplet, Thomas; Shortridge, Matthew D.; Griep, Mark A.; Stark, Jaime L.; Powers, Robert; Revesz, Peter
2010-01-01
The proliferation of biological databases and the easy access enabled by the Internet is having a beneficial impact on biological sciences and transforming the way research is conducted. There are ∼1100 molecular biology databases dispersed throughout the Internet. To assist in the functional, structural and evolutionary analysis of the abundant number of novel proteins continually identified from whole-genome sequencing, we introduce the PROFESS (PROtein Function, Evolution, Structure and Sequence) database. Our database is designed to be versatile and expandable and will not confine analysis to a pre-existing set of data relationships. A fundamental component of this approach is the development of an intuitive query system that incorporates a variety of similarity functions capable of generating data relationships not conceived during the creation of the database. The utility of PROFESS is demonstrated by the analysis of the structural drift of homologous proteins and the identification of potential pancreatic cancer therapeutic targets based on the observation of protein–protein interaction networks. Database URL: http://cse.unl.edu/∼profess/ PMID:20624718
Faster sequence homology searches by clustering subsequences.
Suzuki, Shuji; Kakuta, Masanori; Ishida, Takashi; Akiyama, Yutaka
2015-04-15
Sequence homology searches are used in various fields. New sequencing technologies produce huge amounts of sequence data, which continuously increase the size of sequence databases. As a result, homology searches require large amounts of computational time, especially for metagenomic analysis. We developed a fast homology search method based on database subsequence clustering, and implemented it as GHOSTZ. This method clusters similar subsequences from a database to perform an efficient seed search and ungapped extension by reducing alignment candidates based on triangle inequality. The database subsequence clustering technique achieved an ∼2-fold increase in speed without a large decrease in search sensitivity. When measured with metagenomic data, GHOSTZ was ∼2.2-2.8 times faster than RAPSearch and ∼185-261 times faster than BLASTX. The source code is freely available for download at http://www.bi.cs.titech.ac.jp/ghostz/. Contact: akiyama@cs.titech.ac.jp. Supplementary data are available at Bioinformatics online. © The Author 2014. Published by Oxford University Press.
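The idea of reducing alignment candidates with the triangle inequality can be sketched in Python as follows. This is a generic illustration rather than the GHOSTZ implementation: it assumes equal-length subsequences compared by Hamming distance, a precomputed distance from each cluster member to its representative, and hypothetical names (`hamming`, `search_cluster`, `max_dist`). For a metric d, d(q, m) ≥ |d(q, r) − d(r, m)|, so a member m can be skipped whenever that lower bound already exceeds the threshold.

```python
def hamming(a, b):
    """Hamming distance between two equal-length strings."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


def search_cluster(query, representative, members, max_dist):
    """Find cluster members within `max_dist` of the query.

    `members` is a list of (subsequence, distance_to_representative) pairs.
    Returns (hits, skipped), where `skipped` counts the members that were
    ruled out by the triangle inequality without a direct comparison.
    """
    d_qr = hamming(query, representative)  # one comparison per cluster
    hits, skipped = [], 0
    for member, d_rm in members:
        # Lower bound on d(query, member) from the triangle inequality.
        if abs(d_qr - d_rm) > max_dist:
            skipped += 1
            continue
        if hamming(query, member) <= max_dist:
            hits.append(member)
    return hits, skipped


if __name__ == "__main__":
    rep = "AAAAAA"
    cluster = [("AAAAAT", 1), ("AAAATT", 2), ("TTTAAA", 3)]
    print(search_cluster("TTTTTT", rep, cluster, max_dist=2))
    print(search_cluster("AAAATT", rep, cluster, max_dist=1))
```

In the first call the query is far from the representative, so every member is discarded after a single distance computation; in the second the bound is inconclusive and each member is compared directly.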
Benson, Dennis A; Karsch-Mizrachi, Ilene; Lipman, David J; Ostell, James; Wheeler, David L
2007-01-01
GenBank® is a comprehensive database that contains publicly available nucleotide sequences for more than 240 000 named organisms, obtained primarily through submissions from individual laboratories and batch submissions from large-scale sequencing projects. Most submissions are made using the web-based BankIt or standalone Sequin programs and accession numbers are assigned by GenBank staff upon receipt. Daily data exchange with the EMBL Data Library in Europe and the DNA Data Bank of Japan ensures worldwide coverage. GenBank is accessible through NCBI's retrieval system, Entrez, which integrates data from the major DNA and protein sequence databases along with taxonomy, genome, mapping, protein structure and domain information, and the biomedical journal literature via PubMed. BLAST provides sequence similarity searches of GenBank and other sequence databases. Complete bimonthly releases and daily updates of the GenBank database are available by FTP. To access GenBank and its related retrieval and analysis services, begin at the NCBI Homepage (www.ncbi.nlm.nih.gov).
Benson, Dennis A; Karsch-Mizrachi, Ilene; Lipman, David J; Ostell, James; Wheeler, David L
2005-01-01
GenBank is a comprehensive database that contains publicly available DNA sequences for more than 165,000 named organisms, obtained primarily through submissions from individual laboratories and batch submissions from large-scale sequencing projects. Most submissions are made using the web-based BankIt or standalone Sequin programs and accession numbers are assigned by GenBank staff upon receipt. Daily data exchange with the EMBL Data Library in the UK and the DNA Data Bank of Japan helps to ensure worldwide coverage. GenBank is accessible through NCBI's retrieval system, Entrez, which integrates data from the major DNA and protein sequence databases along with taxonomy, genome, mapping, protein structure and domain information, and the biomedical journal literature via PubMed. BLAST provides sequence similarity searches of GenBank and other sequence databases. Complete bimonthly releases and daily updates of the GenBank database are available by FTP. To access GenBank and its related retrieval and analysis services, go to the NCBI Homepage at http://www.ncbi.nlm.nih.gov.
Benson, Dennis A; Karsch-Mizrachi, Ilene; Lipman, David J; Ostell, James; Wheeler, David L
2006-01-01
GenBank® is a comprehensive database that contains publicly available DNA sequences for more than 205 000 named organisms, obtained primarily through submissions from individual laboratories and batch submissions from large-scale sequencing projects. Most submissions are made using the Web-based BankIt or standalone Sequin programs and accession numbers are assigned by GenBank staff upon receipt. Daily data exchange with the EMBL Data Library in Europe and the DNA Data Bank of Japan ensures worldwide coverage. GenBank is accessible through NCBI's retrieval system, Entrez, which integrates data from the major DNA and protein sequence databases along with taxonomy, genome, mapping, protein structure and domain information, and the biomedical journal literature via PubMed. BLAST provides sequence similarity searches of GenBank and other sequence databases. Complete bimonthly releases and daily updates of the GenBank database are available by FTP. To access GenBank and its related retrieval and analysis services, go to the NCBI Homepage at www.ncbi.nlm.nih.gov.
Fokkema, Ivo F A C; den Dunnen, Johan T; Taschner, Peter E M
2005-08-01
The completion of the human genome project has initiated, as well as provided the basis for, the collection and study of all sequence variation between individuals. Direct access to up-to-date information on sequence variation is currently provided most efficiently through web-based, gene-centered, locus-specific databases (LSDBs). We have developed the Leiden Open (source) Variation Database (LOVD) software approaching the "LSDB-in-a-Box" idea for the easy creation and maintenance of a fully web-based gene sequence variation database. LOVD is platform-independent and uses PHP and MySQL open source software only. The basic gene-centered and modular design of the database follows the recommendations of the Human Genome Variation Society (HGVS) and focuses on the collection and display of DNA sequence variations. With minimal effort, the LOVD platform is extendable with clinical data. The open set-up should both facilitate and promote functional extension with scripts written by the community. The LOVD software is freely available from the Leiden Muscular Dystrophy pages (www.DMD.nl/LOVD/). To promote the use of LOVD, we currently offer curators the possibility to set up an LSDB on our Leiden server. (c) 2005 Wiley-Liss, Inc.
Arnold, Roland; Goldenberg, Florian; Mewes, Hans-Werner; Rattei, Thomas
2014-01-01
The Similarity Matrix of Proteins (SIMAP, http://mips.gsf.de/simap/) database has been designed to massively accelerate computationally expensive protein sequence analysis tasks in bioinformatics. It provides pre-calculated sequence similarities interconnecting the entire known protein sequence universe, complemented by pre-calculated protein features and domains, similarity clusters and functional annotations. SIMAP covers all major public protein databases as well as many consistently re-annotated metagenomes from different repositories. As of September 2013, SIMAP contains >163 million proteins corresponding to ∼70 million non-redundant sequences. SIMAP uses the sensitive FASTA search heuristics, the Smith–Waterman alignment algorithm, the InterPro database of protein domain models and the BLAST2GO functional annotation algorithm. SIMAP assists biologists by facilitating the interactive exploration of the protein sequence universe. Web-Service and DAS interfaces allow connecting SIMAP with any other bioinformatic tool and resource. All-against-all protein sequence similarity matrices of project-specific protein collections are generated on request. Recent improvements allow SIMAP to cover the rapidly growing sequenced protein sequence universe. New Web-Service interfaces enhance the connectivity of SIMAP. Novel tools for interactive extraction of protein similarity networks have been added. Open access to SIMAP is provided through the web portal; the portal also contains instructions and links for software access and flat file downloads. PMID:24165881
Plasmodium ovale infection in Malaysia: first imported case
2010-01-01
Background: Plasmodium ovale infection is rarely reported in Malaysia. This is the first imported case of P. ovale infection in Malaysia, which was initially misdiagnosed as Plasmodium vivax. Methods: A peripheral blood sample was first examined by Giemsa-stained microscopy and further confirmed using a patented in-house multiplex PCR followed by sequencing. Results and Discussion: Initial results from peripheral blood smear examination diagnosed P. vivax infection. However, further analysis using a patented in-house multiplex PCR followed by sequencing confirmed the presence of P. ovale. Given that Anopheles maculatus and Anopheles dirus, vectors of P. ovale, are found in Malaysia, this finding has significant implications for Malaysia's public health sector. Conclusions: The current finding should alert epidemiologists, clinicians and laboratory technicians to the possibility of finding P. ovale in Malaysia. P. ovale should be considered in the differential diagnosis of imported malaria cases in Malaysia due to the exponential increase in the number of visitors from P. ovale endemic regions and the long latent period of P. ovale. It is also timely that conventional diagnosis of malaria via microscopy be coupled with more advanced molecular tools for effective diagnosis. PMID:20929588
Lawyers' delights and geneticists' nightmares: at forty, the double helix shows some wrinkles.
Sgaramella, V
1993-12-15
The National Institutes of Health (NIH) request to patent the base sequences of incomplete and uncharacterized fragments of DNA copied on messenger RNAs (cDNAs) extracted from human tissues, the refusal by the patent office, and the appeal placed by NIH, have incited a violent controversy, fueled by rational, as well as emotional elements. In a compromising mode between liberalism and protectionism, I propose that legal protection be considered only for those RNA/DNA sequences, either natural or artificial, which can generate practical applications per se, and not through their expression products. Another controversy is developing around a popular tool for genomic research: the fidelity of yeast artificial chromosome (YAC) libraries being distributed worldwide for physical mapping is being questioned. Some of these libraries have been shown to be affected by surprisingly high levels of co-cloning, in addition to more common gene reshuffling instances. Also in this case, scientific as well as non-scientific components have to be considered. Possible remedies for the underlying problems may be found in the proper use of kinetic, enzymatic and microbiological variables in the production of YACs. Here too, a sharper distinction between the secular and scientific gratifications of research could help.
Trends in Compulsory Licensing of Pharmaceuticals Since the Doha Declaration: A Database Analysis
Beall, Reed; Kuhn, Randall
2012-01-01
Background: It is now a decade since the World Trade Organization (WTO) adopted the “Declaration on the TRIPS Agreement and Public Health” at its 4th Ministerial Conference in Doha. Many anticipated that these actions would lead nations to claim compulsory licenses (CLs) for pharmaceutical products with greater regularity. A CL is the use of a patented innovation that has been licensed by a state without the permission of the patent title holder. Skeptics doubted that many CLs would occur, given political pressure against CL activity and continued health system weakness in poor countries. The subsequent decade has seen little systematic assessment of the Doha Declaration's impact. Methods and Findings: We assembled a database of all episodes in which a CL was publicly entertained or announced by a WTO member state since 1995. Broad searches of CL activity were conducted using media, academic, and legal databases, yielding 34 potential CL episodes in 26 countries. Country- and product-specific searches were used to verify government participation, resulting in a final database of 24 verified CLs in 17 nations. We coded CL episodes in terms of outcome, national income, and disease group over three distinct periods of CL activity. Most CL episodes occurred between 2003 and 2005, involved drugs for HIV/AIDS, and occurred in upper-middle-income countries (UMICs). Aside from HIV/AIDS, few CL episodes involved communicable disease, and none occurred in least-developed or low-income countries. Conclusions: Given skepticism about the Doha Declaration's likely impact, we note the relatively high occurrence of CLs, yet CL activity has diminished markedly since 2006. While UMICs have high CL activity and strong incentives to use CLs compared to other countries, we note considerable countervailing pressures against CL use even in UMICs. We conclude that there is a low probability of continued CL activity. 
We highlight the need for further systematic evaluation of global health governance actions. Please see later in the article for the Editors' Summary. PMID:22253577
Nakagawa, So; Takahashi, Mahoko Ueda
2016-01-01
In mammals, approximately 10% of genome sequences correspond to endogenous viral elements (EVEs), which are derived from ancient viral infections of germ cells. Although most EVEs have been inactivated, some open reading frames (ORFs) of EVEs obtained functions in the hosts. However, EVE ORFs usually remain unannotated in the genomes, and no databases are available for EVE ORFs. To investigate the function and evolution of EVEs in mammalian genomes, we developed EVE ORF databases for 20 genomes of 19 mammalian species. A total of 736,771 non-overlapping EVE ORFs were identified and archived in a database named gEVE (http://geve.med.u-tokai.ac.jp). The gEVE database provides nucleotide and amino acid sequences, genomic loci and functional annotations of EVE ORFs for all 20 genomes. In analyzing RNA-seq data with the gEVE database, we successfully identified the expressed EVE genes, suggesting that the gEVE database facilitates studies of the genomic analyses of various mammalian species. Database URL: http://geve.med.u-tokai.ac.jp. © The Author(s) 2016. Published by Oxford University Press.
Nakagawa, So; Takahashi, Mahoko Ueda
2016-01-01
In mammals, approximately 10% of genome sequences correspond to endogenous viral elements (EVEs), which are derived from ancient viral infections of germ cells. Although most EVEs have been inactivated, some open reading frames (ORFs) of EVEs obtained functions in the hosts. However, EVE ORFs usually remain unannotated in the genomes, and no databases are available for EVE ORFs. To investigate the function and evolution of EVEs in mammalian genomes, we developed EVE ORF databases for 20 genomes of 19 mammalian species. A total of 736,771 non-overlapping EVE ORFs were identified and archived in a database named gEVE (http://geve.med.u-tokai.ac.jp). The gEVE database provides nucleotide and amino acid sequences, genomic loci and functional annotations of EVE ORFs for all 20 genomes. In analyzing RNA-seq data with the gEVE database, we successfully identified the expressed EVE genes, suggesting that the gEVE database facilitates studies of the genomic analyses of various mammalian species. Database URL: http://geve.med.u-tokai.ac.jp PMID:27242033
Tandem Mass Spectrum Sequencing: An Alternative to Database Search Engines in Shotgun Proteomics.
Muth, Thilo; Rapp, Erdmann; Berven, Frode S; Barsnes, Harald; Vaudel, Marc
2016-01-01
Protein identification via database searches has become the gold standard in mass spectrometry based shotgun proteomics. However, as the quality of tandem mass spectra improves, direct mass spectrum sequencing gains interest as a database-independent alternative. In this chapter, the general principle of this so-called de novo sequencing is introduced along with pitfalls and challenges of the technique. The main tools available are presented with a focus on user friendly open source software which can be directly applied in everyday proteomic workflows.
Ndhlovu, Andrew; Durand, Pierre M.; Hazelhurst, Scott
2015-01-01
The evolutionary rate at codon sites across protein-coding nucleotide sequences represents a valuable tier of information for aligning sequences, inferring homology and constructing phylogenetic profiles. However, a comprehensive resource for cataloguing the evolutionary rate at codon sites and their corresponding nucleotide and protein domain sequence alignments has not been developed. To address this gap in knowledge, EvoDB (an Evolutionary rates DataBase) was compiled. Nucleotide sequences and their corresponding protein domain data including the associated seed alignments from the PFAM-A (protein family) database were used to estimate evolutionary rate (ω = dN/dS) profiles at codon sites for each entry. EvoDB contains 98.83% of the gapped nucleotide sequence alignments and 97.1% of the evolutionary rate profiles for the corresponding information in PFAM-A. As the identification of codon sites under positive selection and their position in a sequence profile is usually the most sought after information for molecular evolutionary biologists, evolutionary rate profiles were determined under the M2a model using the CODEML algorithm in the PAML (Phylogenetic Analysis by Maximum Likelihood) suite of software. Validation of nucleotide sequences against amino acid data was implemented to ensure high data quality. EvoDB is a catalogue of the evolutionary rate profiles and provides the corresponding phylogenetic trees, PFAM-A alignments and annotated accession identifier data. In addition, the database can be explored and queried using known evolutionary rate profiles to identify domains under similar evolutionary constraints and pressures. EvoDB is a resource for evolutionary, phylogenetic studies and presents a tier of information untapped by current databases. Database URL: http://www.bioinf.wits.ac.za/software/fire/evodb PMID:26140928
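As a rough illustration of how a per-codon ω = dN/dS profile of the kind described above might be read, the sketch below labels each site by comparing ω with 1. The function name, the tolerance and the example values are hypothetical; this is a naive threshold reading and not the M2a likelihood procedure run with CODEML.

```python
def classify_sites(omega_profile, tol=0.05):
    """Label each codon site by its omega (dN/dS) value.

    omega < 1 suggests purifying selection, omega near 1 neutral
    evolution, and omega > 1 positive selection. `tol` is an arbitrary
    illustrative tolerance around 1, not a statistical test.
    """
    labels = []
    for omega in omega_profile:
        if omega > 1 + tol:
            labels.append("positive")
        elif omega < 1 - tol:
            labels.append("purifying")
        else:
            labels.append("neutral")
    return labels


# Hypothetical omega values for four codon sites.
profile = [0.12, 0.98, 2.40, 0.35]
print(classify_sites(profile))
```

In practice, sites under positive selection are usually called from posterior probabilities under a codon model rather than from a fixed cutoff, so the threshold here only conveys the interpretation of ω.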
TFBSshape: a motif database for DNA shape features of transcription factor binding sites
Yang, Lin; Zhou, Tianyin; Dror, Iris; Mathelier, Anthony; Wasserman, Wyeth W.; Gordân, Raluca; Rohs, Remo
2014-01-01
Transcription factor binding sites (TFBSs) are most commonly characterized by the nucleotide preferences at each position of the DNA target. Whereas these sequence motifs are quite accurate descriptions of DNA binding specificities of transcription factors (TFs), proteins recognize DNA as a three-dimensional object. DNA structural features refine the description of TF binding specificities and provide mechanistic insights into protein–DNA recognition. Existing motif databases contain extensive nucleotide sequences identified in binding experiments based on their selection by a TF. To utilize DNA shape information when analysing the DNA binding specificities of TFs, we developed a new tool, the TFBSshape database (available at http://rohslab.cmb.usc.edu/TFBSshape/), for calculating DNA structural features from nucleotide sequences provided by motif databases. The TFBSshape database can be used to generate heat maps and quantitative data for DNA structural features (i.e., minor groove width, roll, propeller twist and helix twist) for 739 TF datasets from 23 different species derived from the motif databases JASPAR and UniPROBE. As demonstrated for the basic helix-loop-helix and homeodomain TF families, our TFBSshape database can be used to compare, qualitatively and quantitatively, the DNA binding specificities of closely related TFs and, thus, uncover differential DNA binding specificities that are not apparent from nucleotide sequence alone. PMID:24214955
Liu, Yongchao; Maskell, Douglas L; Schmidt, Bertil
2009-01-01
Background: The Smith-Waterman algorithm is one of the most widely used tools for searching biological sequence databases due to its high sensitivity. Unfortunately, the Smith-Waterman algorithm is computationally demanding, which is further compounded by the exponential growth of sequence databases. The recent emergence of many-core architectures, and their associated programming interfaces, provides an opportunity to accelerate sequence database searches using commonly available and inexpensive hardware. Findings: Our CUDASW++ implementation (benchmarked on a single-GPU NVIDIA GeForce GTX 280 graphics card and a dual-GPU GeForce GTX 295 graphics card) provides a significant performance improvement compared to other publicly available implementations, such as SWPS3, CBESW, SW-CUDA, and NCBI-BLAST. CUDASW++ supports query sequences of length up to 59K and for query sequences ranging in length from 144 to 5,478 in Swiss-Prot release 56.6, the single-GPU version achieves an average performance of 9.509 GCUPS with a lowest performance of 9.039 GCUPS and a highest performance of 9.660 GCUPS, and the dual-GPU version achieves an average performance of 14.484 GCUPS with a lowest performance of 10.660 GCUPS and a highest performance of 16.087 GCUPS. Conclusion: CUDASW++ is publicly available open-source software. It provides a significant performance improvement for Smith-Waterman-based protein sequence database searches by fully exploiting the compute capability of commonly used CUDA-enabled low-cost GPUs. PMID:19416548
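To make the Smith-Waterman recurrence and the GCUPS unit concrete, here is a minimal CPU sketch. It assumes a simple match/mismatch score and a linear gap penalty chosen for illustration; it is not the CUDASW++ implementation, which targets GPUs and protein scoring schemes. The function names and parameter values are hypothetical.

```python
def smith_waterman_score(query, target, match=2, mismatch=-1, gap=-2):
    """Return the best local alignment score (linear gap penalty).

    Only two rows of the dynamic-programming matrix are kept, since
    each cell depends on its left, upper and upper-left neighbours.
    """
    prev = [0] * (len(target) + 1)
    best = 0
    for q in query:
        curr = [0]
        for j, t in enumerate(target, start=1):
            diag = prev[j - 1] + (match if q == t else mismatch)
            # Local alignment: scores are clamped at zero.
            score = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            curr.append(score)
            best = max(best, score)
        prev = curr
    return best


def gcups(query_len, db_residues, seconds):
    """Giga cell updates per second: DP cells filled / time / 1e9."""
    return query_len * db_residues / seconds / 1e9


print(smith_waterman_score("ACGT", "TTACGTTT"))
print(gcups(500, 2_000_000_000, 100))
```

The cell count in `gcups` is the query length times the total number of database residues, which is why throughput figures are reported per query length range.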
Tachyon search speeds up retrieval of similar sequences by several orders of magnitude.
Tan, Joshua; Kuchibhatla, Durga; Sirota, Fernanda L; Sherman, Westley A; Gattermayer, Tobias; Kwoh, Chia Yee; Eisenhaber, Frank; Schneider, Georg; Maurer-Stroh, Sebastian
2012-06-15
Current sequence search tools become increasingly slow as databases of protein sequences continue to grow exponentially. Tachyon, a new algorithm that identifies closely related protein sequences ~200 times faster than standard BLAST, circumvents this limitation with a reduced database and an oligopeptide matching heuristic. The tool is publicly accessible as a webserver at http://tachyon.bii.a-star.edu.sg and can also be accessed programmatically through SOAP.
Viral Genome DataBase: storing and analyzing genes and proteins from complete viral genomes.
Hiscock, D; Upton, C
2000-05-01
The Viral Genome DataBase (VGDB) contains detailed information on the genes and predicted protein sequences from 15 completely sequenced genomes of large (>100 kb) viruses (2847 genes). The stored data include DNA sequence, protein sequence, GenBank and user-entered notes, molecular weight (MW), isoelectric point (pI), amino acid content, A + T%, nucleotide frequency, dinucleotide frequency and codon use. The VGDB is a MySQL database with a user-friendly Java GUI. Results of queries can be easily sorted by any of the individual parameters. The software and additional figures and information are available at http://athena.bioc.uvic.ca/genomes/index.html.
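Several of the per-gene parameters listed above (A + T%, nucleotide frequency, dinucleotide frequency) follow directly from the DNA sequence. The sketch below shows one way they could be computed; the function name and return layout are hypothetical and are not taken from the VGDB software.

```python
from collections import Counter


def composition(seq):
    """Return (A+T percent, nucleotide frequencies, dinucleotide frequencies)."""
    seq = seq.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    n = len(seq)
    counts = Counter(seq)
    at_percent = 100.0 * (counts["A"] + counts["T"]) / n
    nucleotide_freq = {base: counts[base] / n for base in "ACGT"}
    # Overlapping dinucleotides: positions (0,1), (1,2), ...
    dinucs = Counter(seq[i:i + 2] for i in range(n - 1))
    total = sum(dinucs.values())
    dinucleotide_freq = {d: c / total for d, c in dinucs.items()}
    return at_percent, nucleotide_freq, dinucleotide_freq


at, mono, di = composition("ATGC")
print(at, mono, di)
```

Dinucleotides are counted with overlap here; a non-overlapping count would be an equally valid convention, and the abstract does not say which one is used.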
Adverse drug reaction, patent blue V dye and anaesthesia.
Tripathy, Swagata; Nair, Priya V
2012-11-01
Patent blue vital (PBV) dye is used for varied perioperative indications, and has a potential for causing life-threatening allergic reactions. In this retrospective case series study, at a tertiary level neurosciences centre, we analysed the nature, management and outcome of adverse drug reactions to the preoperative use of PBV for marking vertebral level prior to back surgeries. Patients were identified from the theatre and radiology database. Data were collected from the patients' notes retrieved from the medical records division. Eleven of 1247 (0.88%) patients experienced adverse reactions: 6 (0.48%) patients had minor grade I reactions (urticaria, blue hives, pruritus or generalised rash), 4 (0.32%) had grade II reactions (transient hypotension/bronchospasm/laryngospasm) and a grade III reaction (hypotension requiring prolonged vasopressor support) was noted in 1 (0.08%) patient. No mortality was seen. The time of onset (range 10-45 min) frequently coincided with induction of anaesthesia or prone positioning of the patient. Seven (63.6%) cases were cancelled or postponed (range 2-63 days). Treatment varied independent of the grade of reaction. Allergy workup (often incomplete) was done for 6 (54%) patients. An awareness of the time of onset and infrequency of life-threatening reactions to patent blue dye may result in better management, less postponement, more complete workup and referral of these events.
Parkinson's Disease, Diabetes and Cognitive Impairment.
Ashraghi, Mohammad R; Pagano, Gennaro; Polychronis, Sotirios; Niccolini, Flavia; Politis, Marios
2016-01-01
Parkinson's disease is a chronic neurodegenerative disorder characterized by a progressive loss of dopaminergic neurons. The pathophysiological mechanisms underlying Parkinson's are still unknown. Mitochondrial dysfunction, abnormal protein aggregation, increased neuroinflammation and impairment of brain glucose metabolism are shared processes among insulin resistance, diabetes and neurodegeneration and have been suggested as key mechanisms in the development of Parkinson's and cognitive impairment. The aim was to review experimental and clinical evidence of underlying Parkinson's pathophysiology in common with diabetes and cognitive impairment. Anti-diabetic agents and recent patents for insulin resistance that might be repositioned in the treatment of Parkinson's have also been included in this review. A narrative review was conducted using the MEDLINE database. Common antidiabetic treatments such as DPP-4 inhibitors, GLP-1 agonists and metformin have shown promise in the treatment of Parkinson's disease and cognitive impairment in animals and humans. Study of the pathophysiology of neurodegeneration common between diabetes and Parkinson's disease has given rise to new treatment possibilities. Patents published in the last 5 years could be used in novel approaches to Parkinson's treatment by targeting specific pathophysiology proteins, such as Nurr1, PINK1 and Nrf2, while patents to improve penetration of the blood brain barrier could allow improved efficacy of existing treatments. Further studies using GLP-1 agonists and DPP-4 inhibitors to treat PD are warranted as they have shown promise.
Viscum articulatum Burm. f.: a review on its phytochemistry, pharmacology and traditional uses.
Patel, Bhishma P; Singh, Pawan K
2018-02-01
The aim of this study was to review and highlight traditional and ethnobotanical uses, phytochemical constituents, IP status, biological activity and pharmacological activity of Viscum articulatum. Thorough literature searches were performed on Viscum articulatum, and data were analysed for reported traditional uses, pharmacological activity, phytochemicals present and patents filed. Scientific and patent databases such as PubMed, Science Direct, Google Scholar, Google Patents, USPTO and Espacenet were searched using different keywords. Viscum articulatum has been traditionally used in different parts of the world for treatment of various ailments. Almost all parts of the plant, such as leaves, root, stem and bark, have medicinal value and are reported for their uses in the Ayurvedic and Chinese systems of medicine for the management of various diseases. Modern scientific studies demonstrate efficacy of this plant against hypertension, ulcer, epilepsy, inflammation, wound, nephrotoxicity, HIV, cancer, etc. Major bioactive phytochemicals include oleanolic acid, betulinic acid, eriodictyol, naringenin, β-amyrin acetate, visartisides, etc. Side effects of allopathic medicines have created a global opportunity, acceptance and demand for phytomedicines. Viscum articulatum could be an excellent source of effective and safe phytomedicine for various ailments if focused translational efforts are undertaken by integrating the existing research outcomes. © 2017 Royal Pharmaceutical Society.
Changing Face of Wood Science in Modern Era: Contribution of Nanotechnology.
Mishra, Pawan Kumar; Giagli, Kyriaki; Tsalagkas, Dimitrios; Mishra, Harshita; Talegaonkar, Sushma; Gryc, Vladimír; Wimmer, Rupert
2018-02-14
Wood science and nanomaterials science interact in two different ways: a) fabrication of lignocellulosic nanomaterials derived from wood and plant-based sources and b) surface or bulk wood modification by nanoparticles. In this review, we attempt to visualize the impact of nanoparticles on wood coating and preservation treatments based on a thorough search of the patent databases. The study was carried out as an overview of the most closely followed scientific trends in nanoparticle utilization in wood science and wood protection, as depicted by recently filed patents worldwide. This review is exclusively targeted on solid (timber) wood as the subject material. Utilization of mainly metal nanoparticles as photoprotective, antibacterial, antifungal, antiabrasive and functional components in wood modification treatments was found to be widely patented. Additionally, an apparent reduction in the emission of volatile organic compounds (VOCs) has been achieved. Bulk wood preservation and, more importantly, wood coating cover the range of treatments that strengthen wood dimensional stability against moisture absorption and resistance to biological degradation by fungi. Nanoparticle materials have addressed various issues of wood science in a more efficient and environmentally friendly way than the traditional methods. Nevertheless, extensive testing and regulation are still needed before industrializing or recycling these products. Copyright© Bentham Science Publishers.
REDIdb: the RNA editing database.
Picardi, Ernesto; Regina, Teresa Maria Rosaria; Brennicke, Axel; Quagliariello, Carla
2007-01-01
The RNA Editing Database (REDIdb) is an interactive, web-based database created and designed with the aim to allocate RNA editing events such as substitutions, insertions and deletions occurring in a wide range of organisms. The database contains both fully and partially sequenced DNA molecules for which editing information is available either by experimental inspection (in vitro) or by computational detection (in silico). Each record of REDIdb is organized in a specific flat-file containing a description of the main characteristics of the entry, a feature table with the editing events and related details and a sequence zone with both the genomic sequence and the corresponding edited transcript. REDIdb is a relational database in which the browsing and identification of editing sites has been simplified by means of two facilities to either graphically display genomic or cDNA sequences or to show the corresponding alignment. In both cases, all editing sites are highlighted in colour and their relative positions are detailed by mousing over. New editing positions can be directly submitted to REDIdb after a user-specific registration to obtain authorized secure access. This first version of REDIdb database stores 9964 editing events and can be freely queried at http://biologia.unical.it/py_script/search.html.
Robasky, Kimberly; Bulyk, Martha L
2011-01-01
The Universal PBM Resource for Oligonucleotide-Binding Evaluation (UniPROBE) database is a centralized repository of information on the DNA-binding preferences of proteins as determined by universal protein-binding microarray (PBM) technology. Each entry for a protein (or protein complex) in UniPROBE provides the quantitative preferences for all possible nucleotide sequence variants ('words') of length k ('k-mers'), as well as position weight matrix (PWM) and graphical sequence logo representations of the k-mer data. In this update, we describe >130% expansion of the database content, incorporation of a protein BLAST (blastp) tool for finding protein sequence matches in UniPROBE, the introduction of UniPROBE accession numbers and additional database enhancements. The UniPROBE database is available at http://uniprobe.org.
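The 'words' of length k and the matrix representation mentioned above can be illustrated with a short sketch. It enumerates every k-mer over the DNA alphabet and builds a position frequency matrix, the simplest precursor of a PWM, from a handful of aligned sites. The function names and the example sites are hypothetical, and this is not how UniPROBE derives its matrices from PBM data.

```python
from itertools import product


def all_kmers(k, alphabet="ACGT"):
    """Enumerate every possible word of length k (4**k words for DNA)."""
    return ["".join(p) for p in product(alphabet, repeat=k)]


def position_frequency_matrix(sites, alphabet="ACGT"):
    """Per-position base frequencies from equal-length aligned sites."""
    if not sites:
        raise ValueError("at least one site is required")
    length = len(sites[0])
    if any(len(s) != length for s in sites):
        raise ValueError("all sites must have the same length")
    matrix = []
    for i in range(length):
        column = [s[i] for s in sites]
        matrix.append({b: column.count(b) / len(sites) for b in alphabet})
    return matrix


print(len(all_kmers(8)))
# Hypothetical aligned binding sites.
for row in position_frequency_matrix(["ACG", "ACT", "AGG", "ACG"]):
    print(row)
```

A log-odds PWM would additionally divide each frequency by a background frequency and take a logarithm, usually with pseudocounts to avoid zeros.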
Nascimento, Marcio L F; Araújo, Evando S; Cordeiro, Erlon R; de Oliveira, Ariadne H P; de Oliveira, Helinando P
2015-01-01
The development of new fibrillar materials based on the electrospinning (ES) technique has a notable history of nearly four centuries of discoveries and results. Electrospinning is one of the most widely reported methods for nanofiber (NF) manufacturing, providing security, high quality and productivity. Although the first patent on electrospinning was filed on April 5th, 1900 by John Francis Cooley, a historical perspective (since the 1600s) on this discovery represents an important step for future applications. Nanofibers have been considered among the most interesting objects of fundamental study for academics and among the most intriguing business materials for modern industry. As a consequence, commercial organizations and companies have explored the relevance of nanofibers. In this paper, the quantity of published manuscripts and patented inventions is presented and the correlation of research activities with the production of new electrospinning materials is shown. China and the United States have been leading in electrospinning and nanofiber development. Company success is mostly dependent on improvements in applications relevant to the broader business community. A dramatic rise of interest in nanofibers produced by the electrospinning technique has been confirmed by the publication data, authors' affiliations, keywords, and essential characterization procedures. It has been shown that the number of publications on electrospinning and nanofiber research from academic institutions is higher than that from industrial laboratories. More than 1,891 patents using the term "electrospinning" and 2,960 with the term "nanofibers" in the title or abstract, according to the European Patent Office, had been filed around the world up to 2013. These numbers continue to increase along with worldwide ES-related sales. Curiously, for the same period 11,973 electrospinning documents and 18,679 nanofiber-related documents (mainly manuscripts) were published according to the Scopus database, with the same terms in the title, abstract or keywords. Thus, statistically, there are more published manuscripts worldwide than patents for both keywords.
Analysis of commercial and public bioactivity databases.
Tiikkainen, Pekka; Franke, Lutz
2012-02-27
Activity data for small molecules are invaluable in chemoinformatics. Various bioactivity databases exist containing detailed information of target proteins and quantitative binding data for small molecules extracted from journals and patents. In the current work, we have merged several public and commercial bioactivity databases into one bioactivity metabase. The molecular presentation, target information, and activity data of the vendor databases were standardized. The main motivation of the work was to create a single relational database which allows fast and simple data retrieval by in-house scientists. Second, we wanted to know the amount of overlap between databases by commercial and public vendors to see whether the former contain data complementing the latter. Third, we quantified the degree of inconsistency between data sources by comparing data points derived from the same scientific article cited by more than one vendor. We found that each data source contains unique data which is due to different scientific articles cited by the vendors. When comparing data derived from the same article we found that inconsistencies between the vendors are common. In conclusion, using databases of different vendors is still useful since the data overlap is not complete. It should be noted that this can be partially explained by the inconsistencies and errors in the source data.