Machado, Helena; Silva, Susana
2014-01-01
The creation and expansion of forensic DNA databases might involve potential threats to the protection of a range of human rights. At the same time, such databases have social benefits. Based on data collected through an online questionnaire applied to 628 individuals in Portugal, this paper aims to analyze citizens' willingness to voluntarily donate a sample for profiling and inclusion in the National Forensic DNA Database and the views underpinning such a decision. Nearly one-quarter of the respondents would indicate 'no', and this negative response increased significantly with age and education. The overriding willingness to accept the inclusion of the individual genetic profile indicates an acknowledgement of the investigative potential of forensic DNA technologies and a relegation of civil liberties and human rights to the background, owing to the perceived benefits of protecting both society and the individual from crime. This rationale is mostly expressed by the idea that all citizens should contribute to the expansion of the National Forensic DNA Database for reasons that range from the more abstract assumption that donating a sample for profiling would be helpful in fighting crime to the more concrete suggestion that everyone (criminals and non-criminals) should be in the database. The concerns about the risks of donating a sample for genetic profiling and inclusion in the National Forensic DNA Database are mostly related to lack of control and insufficient or unclear regulations concerning safeguarding individuals' data and supervising the access to and uses of genetic data. By providing an empirically grounded understanding of attitudes regarding willingness to voluntarily donate a sample for profiling and inclusion in a National Forensic DNA Database, this study also considers the citizens' perceived benefits and risks of operating forensic DNA databases. 
These collective views might be useful for the formation of international common ethical standards for the development and governance of DNA databases in a framework in which the citizens' perspectives are taken into consideration. Copyright © 2013 Elsevier Ireland Ltd. All rights reserved.
Machado, Helena; Santos, Filipe; Silva, Susana
2011-07-15
In this paper we aim to discuss what Portuguese prisoners know and how they feel about surveillance mechanisms related to the inclusion and deletion of the DNA profiles of convicted criminals in the national forensic database. Through a set of interviews with individuals currently imprisoned, we focus on the ways this group perceives forensic DNA technologies. While the institutional and political discourses maintain that the restricted use and application of DNA profiles within the national forensic database protects individuals' rights, the prisoners claim that police misuse of such technologies potentially makes it difficult to escape from surveillance and acts as a means of reinforcing the stigma of delinquency. The prisoners also argue that additional intensive and extensive use of surveillance devices might be more protective of their own individual rights and might possibly increase potential for exoneration. Crown Copyright © 2011. Published by Elsevier Ireland Ltd. All rights reserved.
[Current status of DNA databases in the forensic field: new progress, new legal needs].
Baeta, Miriam; Martínez-Jarreta, Begoña
2009-01-01
One of the most contentious issues regarding the use of deoxyribonucleic acid (DNA) in the legal sphere is the creation of DNA databases. Until relatively recently, Spain did not have a law to support the establishment of a national DNA profile bank for forensic purposes and to preserve the fundamental rights of subjects whose data are archived therein. The law regulating police databases of identifiers obtained from DNA, approved in 2007, fills this void in Spanish legislation and responds to the continuing need to adapt the law to scientific and technological progress.
Ilgili, Önder; Arda, Berna
This paper presents and analyses, in terms of privacy and confidentiality, the Turkish Draft Law on National DNA Database prepared in 2004, which concerns the use of DNA analysis for forensic objectives and identity verification in Turkey. After a short introduction including related concepts, we evaluate the draft law and present its articles about confidentiality. The evaluation reminded us of some topics that are important at the international level for developing countries. As a result, the need for sophisticated legislation about DNA databases, for solutions to issues related to the education of employees, and the technological dependence on other countries emerged as the main challenges in terms of confidentiality for developing countries. As seen in the Turkish Draft Law on National DNA Database, the protection of fundamental rights and freedoms requires more care during legislative efforts.
The National DNA Data Bank of Canada: a Quebecer perspective
Milot, Emmanuel; Lecomte, Marie M. J.; Germain, Hugo; Crispino, Frank
2013-01-01
The Canadian National DNA Database was created in 1998 and first used in the mid-2000. Under management by the RCMP, the National DNA Data Bank of Canada offers each year satisfactory reported statistics for its use and efficiency. Built on two indexes (convicted offenders and crime scene indexes), the database not only provides increasing matches to offenders or linked traces to the various police forces of the nation, but offers a memory repository for cold cases. Despite these achievements, the data bank is now facing new challenges that will inevitably defy the way the database is currently used. These arise from the increasing power of detection of DNA traces, the diversity of demands from police investigators and the growth of the bank itself. Examples of new requirements from the database now include familial searches, low-copy-number analyses and the correct interpretation of mixed samples. This paper aims to develop on the original way set in Québec to address some of these challenges. Nevertheless, analytic and technological advances will inevitably lead to the introduction of new technologies in forensic laboratories, such as single cell sequencing, phenotyping, and proteomics. Furthermore, it will not only request a new holistic/global approach of the forensic molecular biology sciences (through academia and a more investigative role in the laboratory), but also new legal developments. Far from being exhaustive, this paper highlights some of the current use of the database, its potential for the future, and opportunity to expand as a result of recent technological developments in molecular biology, including, but not limited to DNA identification. PMID:24312124
The National DNA Data Bank of Canada: a Quebecer perspective.
Milot, Emmanuel; Lecomte, Marie M J; Germain, Hugo; Crispino, Frank
2013-11-20
The Canadian National DNA Database was created in 1998 and first used in the mid-2000. Under management by the RCMP, the National DNA Data Bank of Canada offers each year satisfactory reported statistics for its use and efficiency. Built on two indexes (convicted offenders and crime scene indexes), the database not only provides increasing matches to offenders or linked traces to the various police forces of the nation, but offers a memory repository for cold cases. Despite these achievements, the data bank is now facing new challenges that will inevitably defy the way the database is currently used. These arise from the increasing power of detection of DNA traces, the diversity of demands from police investigators and the growth of the bank itself. Examples of new requirements from the database now include familial searches, low-copy-number analyses and the correct interpretation of mixed samples. This paper aims to develop on the original way set in Québec to address some of these challenges. Nevertheless, analytic and technological advances will inevitably lead to the introduction of new technologies in forensic laboratories, such as single cell sequencing, phenotyping, and proteomics. Furthermore, it will not only request a new holistic/global approach of the forensic molecular biology sciences (through academia and a more investigative role in the laboratory), but also new legal developments. Far from being exhaustive, this paper highlights some of the current use of the database, its potential for the future, and opportunity to expand as a result of recent technological developments in molecular biology, including, but not limited to DNA identification.
Teodorović, Smilja; Mijović, Dragan; Radovanović Nenadić, Una; Savić, Marina
2017-05-01
Worldwide, the establishment of national forensic DNA databases has transformed personal identification in the criminal justice system over the past two decades. It has also stimulated much debate centering on ethical issues, human rights, individual privacy, lack of safeguards and other standards. Therefore, a balance between effectiveness and intrusiveness of a national DNA repository is an imperative and needs to be achieved through a suitable legal framework. On its path to the European Union (EU), the Republic of Serbia is required to harmonize its national policies and legislation with the EU. Specifically, Chapter 24 of the EU acquis communautaire (Justice, Freedom and Security) stipulates the compulsory creation of a forensic DNA registry and adoption of corresponding legislation. This process is expected to occur in 2016. Thus, in light of launching the national DNA database, the goal of this work is to instigate a consultation with the Serbian public regarding their views on various aspects of the forensic DNA databank. Importantly, this study specifically assessed the opinions of distinct categories of citizens, including the general public, the prosecutors' offices staff, prisoners, prison guards, and students majoring in criminalistics. Our findings set a baseline for Serbian attitudes towards DNA databank custody, DNA sample and profile inclusion and retention criteria, ethical issues and concerns. Furthermore, results clearly demonstrate a permissive outlook of the respondents who are professional "beneficiaries" of genetic profiling and a restrictive position taken by the respondents whose genetic material has been acquired by the government. We believe that this opinion poll will be essential in discussions regarding a national DNA database, as well as in motivating further research on the reasons behind the observed views and subsequent development of educational strategies. 
All of these are, in turn, expected to aid the creation of suitable legislation and to increase societal confidence that the repository will be used in the legal system without interference with individual rights and freedoms. Copyright © 2017 Elsevier B.V. All rights reserved.
Launching the Greek forensic DNA database. The legal framework and arising ethical issues.
Voultsos, Polychronis; Njau, Samuel; Tairis, Nikolaos; Psaroulis, Dimitrios; Kovatsi, Leda
2011-11-01
Since the creation of the first national DNA database in Europe in 1995, many European countries have passed laws to establish and regulate their own databases. The Greek government passed a law in 2008 by which the National DNA Database of Greece was founded and regulated. According to this law, only DNA profiles from convicted criminals were recorded. Nevertheless, a year later, in 2009, the law was amended to permit the creation of an expanded database including innocent people and children. Unfortunately, the new law is very vague in many aspects and does not respect the principle of proportionality. Therefore, in our opinion, it will soon need to be re-amended. Furthermore, prior to passing the new law, there was no debate with the community itself in order to clarify what system would best suit Greece and what the citizens would be willing to accept. We present the current legal framework in Greece, highlight issues that need to be clarified, and discuss possible ethical issues that may arise. Copyright © 2010 Elsevier Ireland Ltd. All rights reserved.
Forensic DNA databases in Western Balkan region: retrospectives, perspectives, and initiatives
Marjanović, Damir; Konjhodžić, Rijad; Butorac, Sara Sanela; Drobnič, Katja; Merkaš, Siniša; Lauc, Gordan; Primorac, Damir; Anđelinović, Šimun; Milosavljević, Mladen; Karan, Željko; Vidović, Stojko; Stojković, Oliver; Panić, Bojana; Vučetić Dragović, Anđelka; Kovačević, Sandra; Jakovski, Zlatko; Asplen, Chris; Primorac, Dragan
2011-01-01
The European Network of Forensic Science Institutes (ENFSI) recommended the establishment of forensic DNA databases and specific implementation and management legislations for all EU/ENFSI members. Therefore, forensic institutions from Bosnia and Herzegovina, Serbia, Montenegro, and Macedonia launched a wide set of activities to support these recommendations. To assess the current state, a regional expert team completed detailed screening and investigation of the existing forensic DNA data repositories and associated legislation in these countries. The scope also included relevant concurrent projects and a wide spectrum of different activities in relation to forensics DNA use. The state of forensic DNA analysis was also determined in the neighboring Slovenia and Croatia, which already have functional national DNA databases. There is a need for a ‘regional supplement’ to the current documentation and standards pertaining to forensic application of DNA databases, which should include regional-specific preliminary aims and recommendations. PMID:21674821
Forensic DNA databases in Western Balkan region: retrospectives, perspectives, and initiatives.
Marjanović, Damir; Konjhodzić, Rijad; Butorac, Sara Sanela; Drobnic, Katja; Merkas, Sinisa; Lauc, Gordan; Primorac, Damir; Andjelinović, Simun; Milosavljević, Mladen; Karan, Zeljko; Vidović, Stojko; Stojković, Oliver; Panić, Bojana; Vucetić Dragović, Andjelka; Kovacević, Sandra; Jakovski, Zlatko; Asplen, Chris; Primorac, Dragan
2011-06-01
The European Network of Forensic Science Institutes (ENFSI) recommended the establishment of forensic DNA databases and specific implementation and management legislations for all EU/ENFSI members. Therefore, forensic institutions from Bosnia and Herzegovina, Serbia, Montenegro, and Macedonia launched a wide set of activities to support these recommendations. To assess the current state, a regional expert team completed detailed screening and investigation of the existing forensic DNA data repositories and associated legislation in these countries. The scope also included relevant concurrent projects and a wide spectrum of different activities in relation to forensics DNA use. The state of forensic DNA analysis was also determined in the neighboring Slovenia and Croatia, which already have functional national DNA databases. There is a need for a 'regional supplement' to the current documentation and standards pertaining to forensic application of DNA databases, which should include regional-specific preliminary aims and recommendations.
ERIC Educational Resources Information Center
Harzbecker, Joseph, Jr.
1993-01-01
Describes the National Institute of Health's GenBank DNA sequence database and how it can be accessed through the Internet. A real reference question, which was answered successfully using the database, is reproduced to illustrate and elaborate on the potential of the Internet for information retrieval. (10 references) (KRN)
Three Decades of Recombinant DNA.
ERIC Educational Resources Information Center
Palmer, Jackie
1985-01-01
Discusses highlights in the development of genetic engineering, examining techniques with recombinant DNA, legal and ethical issues, GenBank (a national database of nucleic acid sequences), and other topics. (JN)
The Israel DNA database--the establishment of a rapid, semi-automated analysis system.
Zamir, Ashira; Dell'Ariccia-Carmon, Aviva; Zaken, Neomi; Oz, Carla
2012-03-01
The Israel Police DNA database, also known as IPDIS (Israel Police DNA Index System), has been operating since February 2007. During that time more than 135,000 reference samples have been uploaded and more than 2000 hits reported. We have developed an effective semi-automated system that includes two automated punchers, three liquid handler robots and four genetic analyzers. An inhouse LIMS program enables full tracking of every sample through the entire process of registration, pre-PCR handling, analysis of profiles, uploading to the database, hit reports and ultimately storage. The LIMS is also responsible for the future tracking of samples and their profiles to be expunged from the database according to the Israeli DNA legislation. The database is administered by an in-house developed software program, where reference and evidentiary profiles are uploaded, stored, searched and matched. The DNA database has proven to be an effective investigative tool which has gained the confidence of the Israeli public and on which the Israel National Police force has grown to rely. Copyright © 2011 Elsevier Ireland Ltd. All rights reserved.
Development of a 20-locus fluorescent multiplex system as a valuable tool for national DNA database.
Jiang, Xianhua; Guo, Fei; Jia, Fei; Jin, Ping; Sun, Zhu
2013-02-01
The multiplex system allows the detection of 19 autosomal short tandem repeat (STR) loci [including all Combined DNA Index System (CODIS) STR loci as well as D2S1338, D6S1043, D12S391, D19S433, Penta D and Penta E] plus the sex-determining locus Amelogenin in a single reaction, comprising all STR loci in various commercial kits used in the China national DNA database (NDNAD). Primers are designed so that the amplicons range from 90 base pairs (bp) to 450 bp within a five-dye fluorescent design, with the fifth dye reserved for the internal size standard. With 30 cycles, 125 pg to 2 ng of DNA template gave optimal profiling results, while robust profiles could also be achieved by adjusting the cycle number for DNA template beyond that optimal input range. Mixture studies showed that 83% and 87% of minor alleles were detected at 9:1 and 1:9 ratios, respectively. When 4 ng of degraded DNA was digested by 2-min DNase and 1 ng undegraded DNA was added to 400 μM haematin, complete profiles were still observed. Polymerase chain reaction (PCR)-based procedures were examined and optimized, including the concentrations of the primer set, magnesium and Taq polymerase as well as volume, cycle number and annealing temperature. In addition, the system has been validated with 3000 bloodstain samples and 35 common case samples in line with the Chinese National Standards and Scientific Working Group on DNA Analysis Methods (SWGDAM) guidelines. The total probability of identity (TPI) can reach 8×10^-24, so the system can support a DNA database of 10 million profiles or more, because the expected number of matches (4×10^-10) is far below one and is negligible. Further, our system also performs well on case samples, and it will be an ideal tool for forensic DNA typing and databasing with potential application. Copyright © 2012 Elsevier Ireland Ltd. All rights reserved.
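The figures quoted in this abstract (a TPI of 8×10^-24, a database of 10 million profiles, and an expected match count of 4×10^-10) are consistent with counting every pair of profiles in the database. The sketch below reproduces that arithmetic; the pairwise interpretation and the function name `expected_adventitious_pairs` are assumptions made for illustration, since the abstract does not state how the figure was derived.

```python
from math import comb


def expected_adventitious_pairs(n_profiles: int, match_probability: float) -> float:
    """Expected number of coincidentally matching pairs in a database.

    Assumption (not stated in the abstract): every unordered pair of
    profiles is compared once, and each pair matches independently with
    the given probability.
    """
    return comb(n_profiles, 2) * match_probability


# Values taken from the abstract: 10 million profiles, TPI = 8e-24.
expected = expected_adventitious_pairs(10_000_000, 8e-24)
print(f"{expected:.1e}")  # prints 4.0e-10
```

Under this reading, about 5×10^13 pairwise comparisons multiplied by 8×10^-24 gives roughly 4×10^-10, matching the value in the abstract.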
Benschop, Corina C G; van de Merwe, Linda; de Jong, Jeroen; Vanvooren, Vanessa; Kempenaers, Morgane; Kees van der Beek, C P; Barni, Filippo; Reyes, Eusebio López; Moulin, Léa; Pene, Laurent; Haned, Hinda; Sijen, Titia
2017-07-01
Searching a national DNA database with complex and incomplete profiles usually yields very large numbers of possible matches that can present many candidate suspects to be further investigated by the forensic scientist and/or police. Current practice in most forensic laboratories consists of ordering these 'hits' based on the number of matching alleles with the searched profile. Thus, candidate profiles that share the same number of matching alleles are not differentiated and, due to the lack of other ranking criteria for the candidate list, it may be difficult to discern a true match from the false positives or notice that all candidates are in fact false positives. SmartRank was developed to put forward only relevant candidates and rank them accordingly. The SmartRank software computes a likelihood ratio (LR) for the searched profile and each profile in the DNA database and ranks database entries above a defined LR threshold according to the calculated LR. In this study, we examined for mixed DNA profiles of variable complexity whether the true donors are retrieved, what the number of false positives above an LR threshold is and the ranking position of the true donors. Using 343 mixed DNA profiles, over 750 SmartRank searches were performed. In addition, the performance of SmartRank and CODIS was compared for DNA database searches, and SmartRank was found to be complementary to CODIS. We also describe the applicable domain of SmartRank and provide guidelines. The SmartRank software is open-source and freely available. Using the best practice guidelines, SmartRank enables obtaining investigative leads in criminal cases lacking a suspect. Copyright © 2017 Elsevier B.V. All rights reserved.
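The ranking step described in this abstract (keep database entries whose LR exceeds a threshold, then order them by LR) can be sketched as follows. This is not SmartRank's code: the function `rank_candidates`, the profile identifiers and the LR values are hypothetical, and the LRs are assumed to have been computed elsewhere.

```python
from typing import Dict, List, Tuple


def rank_candidates(
    lr_by_profile: Dict[str, float], lr_threshold: float
) -> List[Tuple[str, float]]:
    """Keep entries with LR above the threshold, highest LR first."""
    kept = [(pid, lr) for pid, lr in lr_by_profile.items() if lr > lr_threshold]
    return sorted(kept, key=lambda item: item[1], reverse=True)


# Hypothetical LRs for three database profiles (illustrative values only).
lrs = {"profile_A": 5.0, "profile_B": 2000.0, "profile_C": 1e6}
print(rank_candidates(lrs, lr_threshold=1000.0))
# [('profile_C', 1000000.0), ('profile_B', 2000.0)]
```

An empty result is itself informative in this scheme: it signals that no database entry reached the threshold, which an allele-count ordering would not show.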
Gill, Peter; Haned, Hinda; Bleka, Oyvind; Hansson, Oskar; Dørum, Guro; Egeland, Thore
2015-09-01
The introduction of Short Tandem Repeat (STR) DNA was a revolution within a revolution that transformed forensic DNA profiling into a tool that could be used, for the first time, to create National DNA databases. This transformation would not have been possible without the concurrent development of fluorescent automated sequencers, combined with the ability to multiplex several loci together. Use of the polymerase chain reaction (PCR) increased the sensitivity of the method to enable the analysis of a handful of cells. The first multiplexes were simple: 'the quad', introduced by the defunct UK Forensic Science Service (FSS) in 1994, rapidly followed by a more discriminating 'six-plex' (Second Generation Multiplex) in 1995 that was used to create the world's first national DNA database. The success of the database rapidly outgrew the functionality of the original system - by the year 2000 a new multiplex of ten-loci was introduced to reduce the chance of adventitious matches. The technology was adopted world-wide, albeit with different loci. The political requirement to introduce pan-European databases encouraged standardisation - the development of European Standard Set (ESS) of markers comprising twelve-loci is the latest iteration. Although development has been impressive, the methods used to interpret evidence have lagged behind. For example, the theory to interpret complex DNA profiles (low-level mixtures), had been developed fifteen years ago, but only in the past year or so, are the concepts starting to be widely adopted. A plethora of different models (some commercial and others non-commercial) have appeared. This has led to a confusing 'debate' about the 'best' to use. The different models available are described along with their advantages and disadvantages. A section discusses the development of national DNA databases, along with details of an associated controversy to estimate the strength of evidence of matches. 
Current methodology is limited to searches of complete profiles - another example where the interpretation of matches has not kept pace with development of theory. STRs have also transformed the area of Disaster Victim Identification (DVI) which frequently requires kinship analysis. However, genotyping efficiency is complicated by complex, degraded DNA profiles. Finally, there is now a detailed understanding of the causes of stochastic effects that cause DNA profiles to exhibit the phenomena of drop-out and drop-in, along with artefacts such as stutters. The phenomena discussed include: heterozygote balance; stutter; degradation; the effect of decreasing quantities of DNA; the dilution effect. Copyright © 2015 Elsevier Ireland Ltd. All rights reserved.
Chaitanya, Lakshmi; van Oven, Mannis; Brauer, Silke; Zimmermann, Bettina; Huber, Gabriela; Xavier, Catarina; Parson, Walther; de Knijff, Peter; Kayser, Manfred
2016-03-01
The use of mitochondrial DNA (mtDNA) for maternal lineage identification often marks the last resort when investigating forensic and missing-person cases involving highly degraded biological materials. As with all comparative DNA testing, a match between evidence and reference sample requires a statistical interpretation, for which high-quality mtDNA population frequency data are crucial. Here, we determined, under high quality standards, the complete mtDNA control-region sequences of 680 individuals from across the Netherlands sampled at 54 sites, covering the entire country with 10 geographic sub-regions. The complete mtDNA control region (nucleotide positions 16,024-16,569 and 1-576) was amplified with two PCR primers and sequenced with ten different sequencing primers using the EMPOP protocol. Haplotype diversity of the entire sample set was very high at 99.63% and, accordingly, the random-match probability was 0.37%. No population substructure within the Netherlands was detected with our dataset. Phylogenetic analyses were performed to determine mtDNA haplogroups. Inclusion of these high-quality data in the EMPOP database (accession number: EMP00666) will improve its overall data content and geographic coverage in the interest of all EMPOP users worldwide. Moreover, this dataset will serve as (the start of) a national reference database for mtDNA applications in forensic and missing person casework in the Netherlands. Copyright © 2015 Elsevier Ireland Ltd. All rights reserved.
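The two summary statistics in this abstract, a haplotype diversity of 99.63% and a random-match probability of 0.37%, sum to 100%, which is what the uncorrected definition (diversity = 1 minus the sum of squared haplotype frequencies) produces. The sketch below computes both from a list of haplotype labels; whether the authors applied a sample-size correction is not stated, so the `unbiased` option and the toy data are assumptions for illustration.

```python
from collections import Counter
from typing import Sequence


def random_match_probability(haplotypes: Sequence[str]) -> float:
    """Sum of squared haplotype frequencies in the sample."""
    n = len(haplotypes)
    counts = Counter(haplotypes)
    return sum((c / n) ** 2 for c in counts.values())


def haplotype_diversity(haplotypes: Sequence[str], unbiased: bool = False) -> float:
    """1 - RMP, optionally scaled by n / (n - 1) as a sample-size correction."""
    n = len(haplotypes)
    diversity = 1.0 - random_match_probability(haplotypes)
    return diversity * n / (n - 1) if unbiased else diversity


# Toy sample of four individuals (hypothetical haplotype labels).
sample = ["H1", "H1", "H2", "H3"]
print(random_match_probability(sample))  # 0.375
print(haplotype_diversity(sample))       # 0.625
```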
BioBarcode: a general DNA barcoding database and server platform for Asian biodiversity resources.
Lim, Jeongheui; Kim, Sang-Yoon; Kim, Sungmin; Eo, Hae-Seok; Kim, Chang-Bae; Paek, Woon Kee; Kim, Won; Bhak, Jong
2009-12-03
DNA barcoding provides a rapid, accurate, and standardized method for species-level identification using short DNA sequences. Such a standardized identification method is useful for mapping all the species on Earth, particularly when DNA sequencing technology is cheaply available. There are many nations in Asia with many biodiversity resources that need to be mapped and registered in databases. We have built a general DNA barcode data processing system, BioBarcode, with open source software - which is a general purpose database and server. It uses mySQL RDBMS 5.0, BLAST2, and Apache httpd server. An exemplary database of BioBarcode has around 11,300 specimen entries (including GenBank data) and registers the biological species to map their genetic relationships. The BioBarcode database contains a chromatogram viewer which improves the performance in DNA sequence analyses. Asia has a very high degree of biodiversity and the BioBarcode database server system aims to provide an efficient bioinformatics protocol that can be freely used by Asian researchers and research organizations interested in DNA barcoding. The BioBarcode promotes the rapid acquisition of biological species DNA sequence data that meet global standards by providing specialized services, and provides useful tools that will make barcoding cheaper and faster in the biodiversity community such as standardization, depository, management, and analysis of DNA barcode data. The system can be downloaded upon request, and an exemplary server has been constructed with which to build an Asian biodiversity system http://www.asianbarcode.org.
Mashima, Jun; Kodama, Yuichi; Fujisawa, Takatomo; Katayama, Toshiaki; Okuda, Yoshihiro; Kaminuma, Eli; Ogasawara, Osamu; Okubo, Kousaku; Nakamura, Yasukazu; Takagi, Toshihisa
2017-01-01
The DNA Data Bank of Japan (DDBJ) (http://www.ddbj.nig.ac.jp) has been providing public data services for thirty years (since 1987). We are collecting nucleotide sequence data from researchers as a member of the International Nucleotide Sequence Database Collaboration (INSDC, http://www.insdc.org), in collaboration with the US National Center for Biotechnology Information (NCBI) and European Bioinformatics Institute (EBI). The DDBJ Center also services Japanese Genotype-phenotype Archive (JGA), with the National Bioscience Database Center to collect human-subjected data from Japanese researchers. Here, we report our database activities for INSDC and JGA over the past year, and introduce retrieval and analytical services running on our supercomputer system and their recent modifications. Furthermore, with the Database Center for Life Science, the DDBJ Center improves semantic web technologies to integrate and to share biological data, for providing the RDF version of the sequence data. PMID:27924010
Gill, P; Bleka, Ø; Egeland, T
2014-11-01
Likelihood ratio (LR) methods to interpret multi-contributor, low template, complex DNA mixtures are becoming standard practice. The next major development will be to introduce search engines based on the new methods to interrogate very large national DNA databases, such as those held by China, the USA and the UK. Here we describe a rapid method that was used to assign an LR to each individual member of a database of 5 million genotypes, which can then be ranked in order. Previous authors have only considered database trawls in the context of binary match or non-match criteria. However, the concept of match/non-match no longer applies within the new paradigm introduced, since the distribution of resultant LRs is continuous for practical purposes. An English appeal court decision allows scientists to routinely report complex DNA profiles using nothing more than their subjective personal 'experience of casework' and 'observations' in order to apply an expression of the rarity of an evidential sample. This ruling must be considered in the context of a recent high profile English case, where an individual was extracted from a database and wrongly accused of a serious crime. In this case the DNA evidence was used to negate the overwhelming exculpatory (non-DNA) evidence. Demonstrable confirmation bias, also known as the 'CSI-effect', seriously affected the investigation. The case demonstrated that in practice, databases could be used to select and prosecute an individual, simply because he ranked high in the list of possible matches. We have identified this phenomenon as a cognitive error which we term 'the naïve investigator effect'. We take the opportunity to test the performance of database extraction strategies either by using a simple matching allele count (MAC) method or LR. The example heard by the appeal court is used as the exemplar case. 
It is demonstrated that the LR search-method offers substantial benefits compared to searches based on simple matching allele count (MAC) methods. Copyright © 2014 Elsevier Ireland Ltd. All rights reserved.
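For readers unfamiliar with the matching allele count (MAC) baseline mentioned in this abstract, the sketch below shows one simple way such a score can be computed. The abstract does not give the authors' exact definition, so the counting rule here (each candidate allele found in the evidence at that locus scores one, homozygotes counted twice), the function name and the genotype data are all assumptions for illustration.

```python
from typing import Dict, Set, Tuple

Evidence = Dict[str, Set[str]]               # locus -> alleles seen in the sample
Genotype = Dict[str, Tuple[str, str]]        # locus -> candidate's two alleles


def matching_allele_count(evidence: Evidence, candidate: Genotype) -> int:
    """Count candidate alleles that appear in the evidence at the same locus."""
    count = 0
    for locus, alleles in candidate.items():
        observed = evidence.get(locus, set())
        count += sum(1 for allele in alleles if allele in observed)
    return count


# Hypothetical two-locus mixture and two database candidates.
evidence = {"D3S1358": {"15", "16", "17"}, "vWA": {"14", "18"}}
candidate_a = {"D3S1358": ("15", "16"), "vWA": ("14", "18")}
candidate_b = {"D3S1358": ("15", "15"), "vWA": ("16", "18")}

print(matching_allele_count(evidence, candidate_a))  # 4
print(matching_allele_count(evidence, candidate_b))  # 3
```

A score of this kind ignores allele frequencies and drop-out, which is one reason candidates with equal counts cannot be told apart and an LR-based ranking can be more informative.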
Database extraction strategies for low-template evidence.
Bleka, Øyvind; Dørum, Guro; Haned, Hinda; Gill, Peter
2014-03-01
Often in forensic cases, the profile of at least one of the contributors to a DNA evidence sample is unknown and a database search is needed to discover possible perpetrators. In this article we consider two types of search strategies to extract suspects from a database using methods based on probability arguments. The performance of the proposed match scores is demonstrated by carrying out a study of each match score relative to the level of allele drop-out in the crime sample, simulating low-template DNA. The efficiency was measured by random man simulation and we compared the performance using the SGM Plus kit and the ESX 17 kit for the Norwegian population, demonstrating that the latter has greatly enhanced power to discover perpetrators of crime in large national DNA databases. The code for the database extraction strategies will be prepared for release in the R-package forensim. Copyright © 2013 Elsevier Ireland Ltd. All rights reserved.
Maguire, C N; McCallum, L A; Storey, C; Whitaker, J P
2014-01-01
The National DNA Database (NDNAD) of England and Wales was established on April 10th 1995. The NDNAD is governed by a variety of legislative instruments that mean that DNA samples can be taken if an individual is arrested and detained in a police station. The biological samples and the DNA profiles derived from them can be used for purposes related to the prevention and detection of crime, the investigation of an offence and for the conduct of a prosecution. Following the South East Asian Tsunami of December 2004, the legislation was amended to allow the use of the NDNAD to assist in the identification of a deceased person or of a body part where death has occurred from natural causes or from a natural disaster. The UK NDNAD now contains the DNA profiles of approximately 6 million individuals representing 9.6% of the UK population. As the science of DNA profiling advanced, the National DNA Database provided a potential resource for increased intelligence beyond the direct matching for which it was originally created. The familial searching service offered to the police by several UK forensic science providers exploits the size and geographic coverage of the NDNAD and the fact that close relatives of an offender may share a significant proportion of that offender's DNA profile and will often reside in close geographic proximity to him or her. Between 2002 and 2011 Forensic Science Service Ltd. (FSS) provided familial search services to support 188 police investigations, 70 of which are still active cases. This technique, which may be used in serious crime cases or in 'cold case' reviews when there are few or no investigative leads, has led to the identification of 41 perpetrators or suspects. In this paper we discuss the processes, utility, and governance of the familial search service in which the NDNAD is searched for close genetic relatives of an offender who has left DNA evidence at a crime scene, but whose DNA profile is not represented within the NDNAD. 
We discuss the scientific basis of the familial search approach, other DNA-based methods for eliminating individuals from the candidate lists generated by these NDNAD searches, the value of filtering these lists by age, ethnic appearance and geography and the governance required by the NDNAD Strategy Board when a police force commissions a familial search. We present the FSS data in relation to the utility of the familial searching service and demonstrate the power of the technique by reference to casework examples. We comment on the uptake of familial searching of DNA databases in the USA, the Netherlands, Australia, and New Zealand. Finally, following the adverse ruling by the European Court of Human Rights against the UK in regard to the S & Marper cases and the consequent introduction of the Protection of Freedoms Act (2012), we discuss the impact that changes to regulations concerning the storage of DNA samples will have on the continuing provision of familial searching of the National DNA Database in England and Wales. Published by Elsevier Ireland Ltd.
Genetics and Forensics: Making the National DNA Database
Johnson, Paul; Williams, Robin; Martin, Paul
2005-01-01
This paper is based on a current study of the growing police use of the epistemic authority of molecular biology for the identification of criminal suspects in support of crime investigation. It discusses the development of DNA profiling and the establishment and development of the UK National DNA Database (NDNAD) as an instance of the ‘scientification of police work’ (Ericson and Shearing 1986) in which the police uses of science and technology have a recursive effect on their future development. The NDNAD, owned by the Association of Chief Police Officers of England and Wales, is the first of its kind in the world and currently contains the genetic profiles of more than 2 million people. The paper provides a framework for the examination of this socio-technical innovation, begins to tease out the dense and compact history of the database and accounts for the way in which changes and developments across disparate scientific, governmental and policing contexts, have all contributed to the range of uses to which it is put. PMID:16467921
BioBarcode: a general DNA barcoding database and server platform for Asian biodiversity resources
2009-01-01
Background: DNA barcoding provides a rapid, accurate, and standardized method for species-level identification using short DNA sequences. Such a standardized identification method is useful for mapping all the species on Earth, particularly when DNA sequencing technology is cheaply available. Many nations in Asia have rich biodiversity resources that need to be mapped and registered in databases. Results: We have built BioBarcode, a general-purpose DNA barcode data processing system (database and server) based on open source software. It uses MySQL RDBMS 5.0, BLAST2, and Apache httpd server. An exemplary BioBarcode database has around 11,300 specimen entries (including GenBank data) and registers the biological species to map their genetic relationships. The BioBarcode database contains a chromatogram viewer, which improves performance in DNA sequence analyses. Conclusion: Asia has a very high degree of biodiversity, and the BioBarcode database server system aims to provide an efficient bioinformatics protocol that can be freely used by Asian researchers and research organizations interested in DNA barcoding. BioBarcode promotes the rapid acquisition of biological species DNA sequence data that meet global standards by providing specialized services, and provides tools for the standardization, deposition, management, and analysis of DNA barcode data that will make barcoding cheaper and faster in the biodiversity community. The system can be downloaded upon request, and an exemplary server has been constructed with which to build an Asian biodiversity system (http://www.asianbarcode.org). PMID:19958506
Yoo, Seong Yeon; Cho, Nam Soo; Park, Myung Jin; Seong, Ki Min; Hwang, Jung Ho; Song, Seok Bean; Han, Myun Soo; Lee, Won Tae; Chung, Ki Wha
2011-01-01
Genotyping of highly polymorphic short tandem repeat (STR) markers is widely used for the genetic identification of individuals in forensic DNA analyses and in paternity disputes. The National DNA Profile Databank recently established by the DNA Identification Act in Korea contains the computerized STR DNA profiles of individuals convicted of crimes. For the establishment of a large autosomal STR loci population database, 1805 samples were obtained at random from Korean individuals and 15 autosomal STR markers were analyzed using the AmpFlSTR Identifiler PCR Amplification kit. For the 15 autosomal STR markers, no deviations from the Hardy-Weinberg equilibrium were observed. The most informative locus in our data set was the D2S1338 with a discrimination power of 0.9699. The combined matching probability was 1.521 × 10^-17. This large STR profile dataset including atypical alleles will be important for the establishment of the Korean DNA database and for forensic applications. PMID:21597912
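The discrimination power and combined matching probability quoted above follow from genotype frequencies. The sketch below shows the arithmetic under stated assumptions: genotype frequencies are derived from allele frequencies assuming Hardy-Weinberg proportions (studies may instead use observed genotype counts), loci are treated as independent, and the function names and toy frequencies are hypothetical rather than taken from the Korean data set.

```python
from itertools import combinations_with_replacement
from math import isclose, prod
from typing import Dict, Tuple


def genotype_frequencies(allele_freqs: Dict[str, float]) -> Dict[Tuple[str, str], float]:
    """Expected genotype frequencies under Hardy-Weinberg: p^2 or 2pq."""
    freqs = {}
    for a, b in combinations_with_replacement(sorted(allele_freqs), 2):
        p, q = allele_freqs[a], allele_freqs[b]
        freqs[(a, b)] = p * p if a == b else 2 * p * q
    return freqs


def matching_probability(allele_freqs: Dict[str, float]) -> float:
    """Chance that two unrelated people share a genotype: sum of squared genotype frequencies."""
    return sum(f * f for f in genotype_frequencies(allele_freqs).values())


def discrimination_power(allele_freqs: Dict[str, float]) -> float:
    return 1.0 - matching_probability(allele_freqs)


def combined_matching_probability(loci: Dict[str, Dict[str, float]]) -> float:
    """Product of per-locus matching probabilities (assumes independent loci)."""
    return prod(matching_probability(freqs) for freqs in loci.values())


# Toy example: two loci, each with two equally common alleles.
toy_loci = {"locus1": {"a": 0.5, "b": 0.5}, "locus2": {"a": 0.5, "b": 0.5}}
toy_pd = discrimination_power(toy_loci["locus1"])      # 1 - 0.375 = 0.625
toy_cmp = combined_matching_probability(toy_loci)      # 0.375 * 0.375 = 0.140625
```

With 15 highly polymorphic loci the per-locus matching probabilities are far smaller than in this toy case, which is how the product can reach values of the order reported in the abstract.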
STRBase: a short tandem repeat DNA database for the human identity testing community
Ruitberg, Christian M.; Reeder, Dennis J.; Butler, John M.
2001-01-01
The National Institute of Standards and Technology (NIST) has compiled and maintained a Short Tandem Repeat DNA Internet Database (http://www.cstl.nist.gov/biotech/strbase/) since 1997 commonly referred to as STRBase. This database is an information resource for the forensic DNA typing community with details on commonly used short tandem repeat (STR) DNA markers. STRBase consolidates and organizes the abundant literature on this subject to facilitate on-going efforts in DNA typing. Observed alleles and annotated sequence for each STR locus are described along with a review of STR analysis technologies. Additionally, commercially available STR multiplex kits are described, published polymerase chain reaction (PCR) primer sequences are reported, and validation studies conducted by a number of forensic laboratories are listed. To supplement the technical information, addresses for scientists and hyperlinks to organizations working in this area are available, along with the comprehensive reference list of over 1300 publications on STRs used for DNA typing purposes. PMID:11125125
Williams, Robin; Johnson, Paul
2005-01-01
This paper examines the increasing police use of DNA profiling and databasing as a developing instrumentality of modern state surveillance. It briefly notes previously published work on a variety of surveillance technologies and their role in the governance of social action and social order. It then argues that there are important differences amongst the ways in which several such technologies construct and use identificatory artefacts, their orientations to human subjectivity, and their role in the governmentality of citizens and others. The paper then describes the novel and powerful form of bio-surveillance offered by DNA profiling and illustrates this by reference to an ongoing empirical study of the police uses of the UK National DNA Database for the investigation of crime. It is argued that DNA profiling and databasing enable the construction of a ‘closed circuit’ of surveillance of a defined population. PMID:16467920
Information resources at the National Center for Biotechnology Information.
Woodsmall, R M; Benson, D A
1993-01-01
The National Center for Biotechnology Information (NCBI), part of the National Library of Medicine, was established in 1988 to perform basic research in the field of computational molecular biology as well as build and distribute molecular biology databases. The basic research has led to new algorithms and analysis tools for interpreting genomic data and has been instrumental in the discovery of human disease genes for neurofibromatosis and Kallmann syndrome. The principal database responsibility is the National Institutes of Health (NIH) genetic sequence database, GenBank. NCBI, in collaboration with international partners, builds, distributes, and provides online and CD-ROM access to over 112,000 DNA sequences. Another major program is the integration of multiple sequences databases and related bibliographic information and the development of network-based retrieval systems for Internet access. PMID:8374583
Mansel, Charlotte; Davies, Sharon
2012-10-01
There are currently over 250,000 children between the ages of 10 and 18 years who have their genetic information stored on the National DNA Database. This paper explores the legal and ethical issues surrounding this controversial subject, with particular focus on juvenile capacity and the potential results of criminalizing young children and adolescents. The implications of the adverse legal judgement of the European Court of Human Rights in S and Marper v UK (2008) and the violation of Article 8 of the Convention are discussed. The authors have considered the requirement to balance the rights of the individual, particularly those of minors, against the need to protect the public and have compared the position in Scotland to that of the rest of the UK. The authors conclude that a more ethically acceptable alternative could be the creation of a separate forensic database for children aged 10-18 years, set up to safeguard the interests of those who have not been convicted of any crime.
USDA-ARS's Scientific Manuscript database
High density genotyping techniques are needed for investigating antimicrobial resistance especially in the case of multi-drug resistant (MDR) isolates. To achieve this all antimicrobial resistance genes in the NCBI Genbank database were identified by key word searches of sequence annotations and the...
Insect barcode information system.
Pratheepa, Maria; Jalali, Sushil Kumar; Arokiaraj, Robinson Silvester; Venkatesan, Thiruvengadam; Nagesh, Mandadi; Panda, Madhusmita; Pattar, Sharath
2014-01-01
The Insect Barcode Information System, called Insect Barcode Informática (IBIn), is an online database resource developed by the National Bureau of Agriculturally Important Insects, Bangalore. This database provides acquisition, storage, analysis and publication of DNA barcode records of agriculturally important insects for researchers in India and other countries. It bridges a gap in bioinformatics by integrating molecular, morphological and distribution details of agriculturally important insects. IBIn was developed in PHP/MySQL using relational database management concepts. The database is based on a client-server architecture, in which many clients can access data simultaneously. IBIn is freely available online and is user-friendly. It allows registered users to input new information and to search and view information related to DNA barcodes of agriculturally important insects. This paper describes the current status of insect barcoding in India and gives a brief introduction to the IBIn database. http://www.nabg-nbaii.res.in/barcode.
DNA Data Bank of Japan: 30th anniversary.
Kodama, Yuichi; Mashima, Jun; Kosuge, Takehide; Kaminuma, Eli; Ogasawara, Osamu; Okubo, Kousaku; Nakamura, Yasukazu; Takagi, Toshihisa
2018-01-04
The DNA Data Bank of Japan (DDBJ) Center (http://www.ddbj.nig.ac.jp) has been providing public data services for 30 years since 1987. We are collecting nucleotide sequence data and associated biological information from researchers as a member of the International Nucleotide Sequence Database Collaboration (INSDC), in collaboration with the US National Center for Biotechnology Information and the European Bioinformatics Institute. The DDBJ Center also services the Japanese Genotype-phenotype Archive (JGA) with the National Bioscience Database Center to collect genotype and phenotype data of human individuals. Here, we outline our database activities for INSDC and JGA over the past year, and introduce submission, retrieval and analysis services running on our supercomputer system and their recent developments. Furthermore, we highlight our responses to the amended Japanese rules for the protection of personal information and the launch of the DDBJ Group Cloud service for sharing pre-publication data among research groups. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.
2007-06-01
short period of time. When web search organizations canvas the web looking for sites to catalog, they will discover your systems and create registry... Fingerprint & DNA Databases, INTERPOL & National Law Enforcement Communication Systems, Firearms Registration Records, Drivers License, Birth
The UK National DNA Database: Implementation of the Protection of Freedoms Act 2012.
Amankwaa, Aaron Opoku; McCartney, Carole
2018-03-01
In 2008, the European Court of Human Rights, in S and Marper v the United Kingdom, ruled that a retention regime that permits the indefinite retention of DNA records of both convicted and non-convicted ("innocent") individuals is disproportionate. The court noted that there was inadequate evidence to justify the retention of DNA records of the innocent. Since the Marper ruling, the laws governing the taking, use, and retention of forensic DNA in England and Wales have changed with the enactment of the Protection of Freedoms Act 2012 (PoFA). This Act, put briefly, permits the indefinite retention of DNA profiles of most convicted individuals and temporal retention for some first-time convicted minors and innocent individuals on the National DNA Database (NDNAD). The PoFA regime was implemented in October 2013. This paper examines ten post-implementation reports of the NDNAD Strategy Board (3), the NDNAD Ethics Group (3) and the Office of the Biometrics Commissioner (OBC) (4). Overall, the reports highlight a considerable improvement in the performance of the database, with a current match rate of 63.3%. Further, the new regime has strengthened the genetic privacy protection of UK citizens. The OBC reports detail implementation challenges ranging from technical, legal and procedural issues to sufficient understanding of the requirements of PoFA by police forces. Risks highlighted in these reports include the deletion of some "retainable" profiles, which could potentially lead to future crimes going undetected. A further risk is the illegal retention of some profiles from innocent individuals, which may lead to privacy issues and legal challenges. In conclusion, the PoFA regime appears to be working well, however, critical research is still needed to evaluate its overall efficacy compared to other retention regimes. Copyright © 2018 Elsevier B.V. All rights reserved.
Wheeler, David
2007-01-01
GenBank(R) is a comprehensive database of publicly available DNA sequences for more than 205,000 named organisms and for more than 60,000 within the embryophyta, obtained through submissions from individual laboratories and batch submissions from large-scale sequencing projects. Daily data exchange with the European Molecular Biology Laboratory (EMBL) in Europe and the DNA Data Bank of Japan ensures worldwide coverage. GenBank is accessible through the National Center for Biotechnology Information (NCBI) retrieval system, Entrez, which integrates data from the major DNA and protein sequence databases with taxonomy, genome, mapping, protein structure, and domain information and the biomedical journal literature through PubMed. BLAST provides sequence similarity searches of GenBank and other sequence databases. Complete bimonthly releases and daily updates of the GenBank database are available through FTP. GenBank usage scenarios ranging from local analyses of the data available through FTP to online analyses supported by the NCBI Web-based tools are discussed. To access GenBank and its related retrieval and analysis services, go to the NCBI Homepage at http://www.ncbi.nlm.nih.gov.
Hicks, T; Taroni, F; Curran, J; Buckleton, J; Castella, V; Ribaux, O
2010-10-01
Familial searching consists of searching for a full profile left at a crime scene in a National DNA Database (NDNAD). In this paper we are interested in the circumstance where no full match is returned, but a partial match is found between a database member's profile and the crime stain. Because close relatives share more of their DNA than unrelated persons, this partial match may indicate that the crime stain was left by a close relative of the person with whom the partial match was found. This approach has successfully solved important crimes in the UK and the USA. In a previous paper, a model, which takes into account substructure and siblings, was used to simulate a NDNAD. In this paper, we have used this model to test the usefulness of familial searching and offer guidelines for pre-assessment of the cases based on the likelihood ratio. Siblings of "persons" present in the simulated Swiss NDNAD were created. These profiles (N=10,000) were used as traces and were then compared to the whole database (N=100,000). The statistical results obtained show that the technique has great potential confirming the findings of previous studies. However, effectiveness of the technique is only one part of the story. Familial searching has juridical and ethical aspects that should not be ignored. In Switzerland for example, there are no specific guidelines to the legality or otherwise of familial searching. This article both presents statistical results, and addresses criminological and civil liberties aspects to take into account risks and benefits of familial searching. Copyright © 2009 Elsevier Ireland Ltd. All rights reserved.
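The likelihood ratio used for pre-assessment can be illustrated for a single locus. The sketch below is not the model used in the paper (which also accounts for population substructure); it assumes Hardy-Weinberg proportions, no substructure, no mutation, and the standard full-sibling identity-by-descent probabilities of 1/4, 1/2 and 1/4 for sharing 0, 1 or 2 alleles. Function names and the toy allele frequencies are hypothetical.

```python
from math import isclose
from typing import Dict, Tuple

Genotype = Tuple[str, str]
SIB_IBD = (0.25, 0.5, 0.25)  # P(0, 1, 2 alleles identical by descent) for full siblings


def genotype_prob(g: Genotype, freqs: Dict[str, float]) -> float:
    """Hardy-Weinberg genotype probability: p^2 or 2pq."""
    a, b = g
    return freqs[a] ** 2 if a == b else 2 * freqs[a] * freqs[b]


def prob_given_one_ibd(d: Genotype, c: Genotype, freqs: Dict[str, float]) -> float:
    """P(genotype d | relative has genotype c and exactly one allele is shared by descent)."""
    total = 0.0
    for shared in c:  # each allele of c is equally likely to be the shared one
        if shared in d:
            other = d[1] if d[0] == shared else d[0]
            total += 0.5 * freqs[other]
    return total


def sibling_lr(crime: Genotype, member: Genotype, freqs: Dict[str, float]) -> float:
    """Single-locus LR: 'the two are full siblings' versus 'the two are unrelated'."""
    k0, k1, k2 = SIB_IBD
    p_member = genotype_prob(member, freqs)
    identical = 1.0 if sorted(crime) == sorted(member) else 0.0
    return k0 + k1 * prob_given_one_ibd(member, crime, freqs) / p_member + k2 * identical / p_member


toy_freqs = {"a": 0.1, "b": 0.2, "c": 0.7}
lr_shared = sibling_lr(("a", "b"), ("a", "b"), toy_freqs)    # both alleles shared
lr_disjoint = sibling_lr(("a", "b"), ("c", "c"), toy_freqs)  # no allele shared
```

Under the independence assumption, multiplying such per-locus values gives a profile-wide LR that can be used to rank database members as potential siblings of the unknown donor. Note that a pair sharing no alleles still has a non-zero LR (0.25 per locus), which is why sibling searches cannot rely on simple allele-sharing filters alone.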
Contamination of sequence databases with adaptor sequences
DOE Office of Scientific and Technical Information (OSTI.GOV)
Yoshikawa, Takeo; Sanders, A.R.; Detera-Wadleigh, S.D.
Because of the exponential increase in the amount of DNA sequences being added to the public databases on a daily basis, it has become imperative to identify sources of contamination rapidly. Previously, contaminations of sequence databases have been reported to alert the scientific community to the problem. These contaminations can be divided into two categories. The first category comprises host sequences that have been difficult for submitters to manage or control. Examples include anomalous sequences derived from Escherichia coli, which are inserted into the chromosomes (and plasmids) of the bacterial hosts. Insertion sequences are highly mobile and are capable of transposing themselves into plasmids during cloning manipulation. Another example of the first category is the infection with yeast genomic DNA or with bacterial DNA of some commercially available cDNA libraries from Clontech. The second category of database contamination is due to the inadvertent inclusion of nonhost sequences. This category includes incorporation of cloning-vector sequences and multicloning sites in the database submission. M13-derived artifacts have been common, since M13-based vectors have been widely used for subcloning DNA fragments. Recognizing this problem, the National Center for Biotechnology Information (NCBI) started to screen, in April 1994, all sequences directly submitted to GenBank, against a set of vector data retrieved from GenBank by use of key-word searches, such as "vector." In this report, we present evidence for another sequence artifact that is widespread but that, to our knowledge, has not yet been reported. 11 refs., 1 tab.
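A screen of the kind described above can be sketched as an exact-match scan on both strands. This is a deliberately simplified illustration and not the NCBI procedure: real screens use alignment tools that tolerate mismatches, the function names are hypothetical, and the "adaptor" motif below is a made-up placeholder rather than a real adaptor sequence.

```python
from typing import Dict, List, Tuple

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    return seq.translate(_COMPLEMENT)[::-1]


def find_contaminants(sequence: str, contaminants: Dict[str, str]) -> List[Tuple[str, str, int]]:
    """Return (name, strand, 0-based start) for every exact occurrence of each motif.

    A palindromic motif would be reported on both strands at the same position.
    """
    seq = sequence.upper()
    hits = []
    for name, motif in contaminants.items():
        motif = motif.upper()
        for strand, pattern in (("+", motif), ("-", reverse_complement(motif))):
            start = seq.find(pattern)
            while start != -1:
                hits.append((name, strand, start))
                start = seq.find(pattern, start + 1)
    return sorted(hits, key=lambda hit: hit[2])


# Placeholder motif and a toy submission containing it once on each strand.
placeholder = {"adaptor": "GGATCCAAGT"}
submission = "TTTT" + "GGATCCAAGT" + "CCCC" + "ACTTGGATCC" + "AA"
hits = find_contaminants(submission, placeholder)
```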
New taxonomy and old collections: integrating DNA barcoding into the collection curation process.
Puillandre, N; Bouchet, P; Boisselier-Dubayle, M-C; Brisset, J; Buge, B; Castelin, M; Chagnoux, S; Christophe, T; Corbari, L; Lambourdière, J; Lozouet, P; Marani, G; Rivasseau, A; Silva, N; Terryn, Y; Tillier, S; Utge, J; Samadi, S
2012-05-01
Because they house large biodiversity collections and are also research centres with sequencing facilities, natural history museums are well placed to develop DNA barcoding best practices. The main difficulty is generally the vouchering system: it must ensure that all data produced remain attached to the corresponding specimen, from the field to publication in articles and online databases. The Museum National d'Histoire Naturelle in Paris is one of the leading laboratories in the Marine Barcode of Life (MarBOL) project, which was used as a pilot programme to include barcode collections for marine molluscs and crustaceans. The system is based on two relational databases. The first one classically records the data (locality and identification) attached to the specimens. In the second one, tissue-clippings, DNA extractions (both preserved in 2D barcode tubes) and PCR data (including primers) are linked to the corresponding specimen. All the steps of the process [sampling event, specimen identification, molecular processing, data submission to Barcode Of Life Database (BOLD) and GenBank] are thus linked together. Furthermore, we have developed several web-based tools to automatically upload data into the system, control the quality of the sequences produced and facilitate the submission to online databases. This work is the result of a joint effort from several teams in the Museum National d'Histoire Naturelle (MNHN), but also from a collaborative network of taxonomists and molecular systematists outside the museum, resulting in the vouchering so far of ∼41,000 sequences and the production of ∼11,000 COI sequences. © 2012 Blackwell Publishing Ltd.
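The chain of links described above (specimen, tissue clipping, DNA extraction, PCR) can be sketched as a relational schema. The example below is an assumption-based illustration, not the MNHN system: it uses a single in-memory SQLite database rather than two separate databases, and all table names, column names and sample values are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # enforce the links between tables
conn.executescript("""
CREATE TABLE specimen (
    specimen_id    TEXT PRIMARY KEY,
    locality       TEXT,
    identification TEXT
);
CREATE TABLE tissue_clipping (
    tube_barcode TEXT PRIMARY KEY,      -- 2D barcode on the tube
    specimen_id  TEXT NOT NULL REFERENCES specimen(specimen_id)
);
CREATE TABLE dna_extraction (
    tube_barcode   TEXT PRIMARY KEY,
    tissue_barcode TEXT NOT NULL REFERENCES tissue_clipping(tube_barcode)
);
CREATE TABLE pcr (
    pcr_id             INTEGER PRIMARY KEY,
    extraction_barcode TEXT NOT NULL REFERENCES dna_extraction(tube_barcode),
    marker             TEXT,
    forward_primer     TEXT,
    reverse_primer     TEXT
);
""")

# Hypothetical sample records.
conn.execute("INSERT INTO specimen VALUES ('SPEC-0001', 'hypothetical locality', 'Genus sp.')")
conn.execute("INSERT INTO tissue_clipping VALUES ('TUBE-T1', 'SPEC-0001')")
conn.execute("INSERT INTO dna_extraction VALUES ('TUBE-E1', 'TUBE-T1')")
conn.execute("INSERT INTO pcr VALUES (1, 'TUBE-E1', 'COI', 'FWD_PRIMER', 'REV_PRIMER')")


def specimen_for_pcr(connection: sqlite3.Connection, pcr_id: int):
    """Walk the links back from a PCR record to its voucher specimen."""
    return connection.execute(
        """
        SELECT s.specimen_id, s.locality, p.marker
        FROM pcr p
        JOIN dna_extraction e  ON e.tube_barcode = p.extraction_barcode
        JOIN tissue_clipping t ON t.tube_barcode = e.tissue_barcode
        JOIN specimen s        ON s.specimen_id = t.specimen_id
        WHERE p.pcr_id = ?
        """,
        (pcr_id,),
    ).fetchone()


traced = specimen_for_pcr(conn, 1)
```

Keeping every molecular record tied to a specimen identifier through foreign keys is what lets a sequence be traced back to its voucher, which is the property the abstract identifies as the main difficulty.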
Benson, Dennis A; Karsch-Mizrachi, Ilene; Lipman, David J; Ostell, James; Sayers, Eric W
2009-01-01
GenBank is a comprehensive database that contains publicly available nucleotide sequences for more than 300,000 organisms named at the genus level or lower, obtained primarily through submissions from individual laboratories and batch submissions from large-scale sequencing projects. Most submissions are made using the web-based BankIt or standalone Sequin programs, and accession numbers are assigned by GenBank(R) staff upon receipt. Daily data exchange with the European Molecular Biology Laboratory Nucleotide Sequence Database in Europe and the DNA Data Bank of Japan ensures worldwide coverage. GenBank is accessible through the National Center for Biotechnology Information (NCBI) Entrez retrieval system, which integrates data from the major DNA and protein sequence databases along with taxonomy, genome, mapping, protein structure and domain information, and the biomedical journal literature via PubMed. BLAST provides sequence similarity searches of GenBank and other sequence databases. Complete bimonthly releases and daily updates of the GenBank database are available by FTP. To access GenBank and its related retrieval and analysis services, begin at the NCBI Homepage: www.ncbi.nlm.nih.gov.
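Besides the web pages and FTP releases mentioned above, GenBank records can be retrieved programmatically through NCBI's Entrez E-utilities. The sketch below only builds a request URL and makes no network call. The helper name is hypothetical, the accession is an arbitrary example, and the parameter choices (the `nuccore` database, FASTA text output) are one reasonable configuration rather than the only one; NCBI's usage guidelines should be checked before sending requests.

```python
from urllib.parse import urlencode

EUTILS_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"


def efetch_url(accession: str, db: str = "nuccore",
               rettype: str = "fasta", retmode: str = "text") -> str:
    """Build an E-utilities efetch URL for one nucleotide record (no request is sent)."""
    params = {"db": db, "id": accession, "rettype": rettype, "retmode": retmode}
    return f"{EUTILS_BASE}/efetch.fcgi?{urlencode(params)}"


example_url = efetch_url("NM_000546")  # example accession only
```

The resulting URL can be passed to any HTTP client. For bulk analyses, the FTP releases described in the abstract are usually the more appropriate route.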
U.S. initiatives to strengthen forensic science & international standards in forensic DNA.
Butler, John M
2015-09-01
A number of initiatives are underway in the United States in response to the 2009 critique of forensic science by a National Academy of Sciences committee. This article provides a broad review of activities including efforts of the White House National Science and Technology Council Subcommittee on Forensic Science and a partnership between the Department of Justice (DOJ) and the National Institute of Standards and Technology (NIST) to create the National Commission on Forensic Science and the Organization of Scientific Area Committees. These initiatives are seeking to improve policies and practices of forensic science. Efforts to fund research activities and aid technology transition and training in forensic science are also covered. The second portion of the article reviews standards in place or in development around the world for forensic DNA. Documentary standards are used to help define written procedures to perform testing. Physical standards serve as reference materials for calibration and traceability purposes when testing is performed. Both documentary and physical standards enable reliable data comparison, and standard data formats and common markers or testing regions are crucial for effective data sharing. Core DNA markers provide a common framework and currency for constructing DNA databases with compatible data. Recent developments in expanding core DNA markers in Europe and the United States are discussed. Published by Elsevier Ireland Ltd.
U.S. initiatives to strengthen forensic science & international standards in forensic DNA
Butler, John M.
2015-01-01
A number of initiatives are underway in the United States in response to the 2009 critique of forensic science by a National Academy of Sciences committee. This article provides a broad review of activities including efforts of the White House National Science and Technology Council Subcommittee on Forensic Science and a partnership between the Department of Justice (DOJ) and the National Institute of Standards and Technology (NIST) to create the National Commission on Forensic Science and the Organization of Scientific Area Committees. These initiatives are seeking to improve policies and practices of forensic science. Efforts to fund research activities and aid technology transition and training in forensic science are also covered. The second portion of the article reviews standards in place or in development around the world for forensic DNA. Documentary standards are used to help define written procedures to perform testing. Physical standards serve as reference materials for calibration and traceability purposes when testing is performed. Both documentary and physical standards enable reliable data comparison, and standard data formats and common markers or testing regions are crucial for effective data sharing. Core DNA markers provide a common framework and currency for constructing DNA databases with compatible data. Recent developments in expanding core DNA markers in Europe and the United States are discussed. PMID:26164236
Development of forensic-quality full mtGenome haplotypes: success rates with low template specimens.
Just, Rebecca S; Scheible, Melissa K; Fast, Spence A; Sturk-Andreaggi, Kimberly; Higginbotham, Jennifer L; Lyons, Elizabeth A; Bush, Jocelyn M; Peck, Michelle A; Ring, Joseph D; Diegoli, Toni M; Röck, Alexander W; Huber, Gabriela E; Nagl, Simone; Strobl, Christina; Zimmermann, Bettina; Parson, Walther; Irwin, Jodi A
2014-05-01
Forensic mitochondrial DNA (mtDNA) testing requires appropriate, high quality reference population data for estimating the rarity of questioned haplotypes and, in turn, the strength of the mtDNA evidence. Available reference databases (SWGDAM, EMPOP) currently include information from the mtDNA control region; however, novel methods that quickly and easily recover mtDNA coding region data are becoming increasingly available. Though these assays promise to both facilitate the acquisition of mitochondrial genome (mtGenome) data and maximize the general utility of mtDNA testing in forensics, the appropriate reference data and database tools required for their routine application in forensic casework are lacking. To address this deficiency, we have undertaken an effort to: (1) increase the large-scale availability of high-quality entire mtGenome reference population data, and (2) improve the information technology infrastructure required to access/search mtGenome data and employ them in forensic casework. Here, we describe the application of a data generation and analysis workflow to the development of more than 400 complete, forensic-quality mtGenomes from low DNA quantity blood serum specimens as part of a U.S. National Institute of Justice funded reference population databasing initiative. We discuss the minor modifications made to a published mtGenome Sanger sequencing protocol to maintain a high rate of throughput while minimizing manual reprocessing with these low template samples. The successful use of this semi-automated strategy on forensic-like samples provides practical insight into the feasibility of producing complete mtGenome data in a routine casework environment, and demonstrates that large (>2kb) mtDNA fragments can regularly be recovered from high quality but very low DNA quantity specimens. 
Further, the detailed empirical data we provide on the amplification success rates across a range of DNA input quantities will be useful moving forward as PCR-based strategies for mtDNA enrichment are considered for targeted next-generation sequencing workflows. Copyright © 2014 The Authors. Published by Elsevier Ireland Ltd.. All rights reserved.
TFBSshape: a motif database for DNA shape features of transcription factor binding sites
Yang, Lin; Zhou, Tianyin; Dror, Iris; Mathelier, Anthony; Wasserman, Wyeth W.; Gordân, Raluca; Rohs, Remo
2014-01-01
Transcription factor binding sites (TFBSs) are most commonly characterized by the nucleotide preferences at each position of the DNA target. Whereas these sequence motifs are quite accurate descriptions of DNA binding specificities of transcription factors (TFs), proteins recognize DNA as a three-dimensional object. DNA structural features refine the description of TF binding specificities and provide mechanistic insights into protein–DNA recognition. Existing motif databases contain extensive nucleotide sequences identified in binding experiments based on their selection by a TF. To utilize DNA shape information when analysing the DNA binding specificities of TFs, we developed a new tool, the TFBSshape database (available at http://rohslab.cmb.usc.edu/TFBSshape/), for calculating DNA structural features from nucleotide sequences provided by motif databases. The TFBSshape database can be used to generate heat maps and quantitative data for DNA structural features (i.e., minor groove width, roll, propeller twist and helix twist) for 739 TF datasets from 23 different species derived from the motif databases JASPAR and UniPROBE. As demonstrated for the basic helix-loop-helix and homeodomain TF families, our TFBSshape database can be used to compare, qualitatively and quantitatively, the DNA binding specificities of closely related TFs and, thus, uncover differential DNA binding specificities that are not apparent from nucleotide sequence alone. PMID:24214955
The DNA database search controversy revisited: bridging the Bayesian-frequentist gap.
Storvik, Geir; Egeland, Thore
2007-09-01
Two different quantities have been suggested for quantification of evidence in cases where a suspect is found by a search through a database of DNA profiles. The likelihood ratio, typically motivated from a Bayesian setting, is preferred by most experts in the field. The so-called np rule has been motivated through frequentist arguments and has been suggested by the American National Research Council and Stockmarr (1999, Biometrics 55, 671-677). The two quantities differ substantially and have given rise to the DNA database search controversy. Although several authors have criticized the different approaches, a full explanation of why these differences appear is still lacking. In this article we show that a P-value in a frequentist hypothesis setting is approximately equal to the result of the np rule. We argue, however, that a more reasonable procedure in this case is to use conditional testing, in which case a P-value directly related to posterior probabilities and the likelihood ratio is obtained. This way of viewing the problem bridges the gap between the Bayesian and frequentist approaches. At the same time it indicates that the np rule should not be used to quantify evidence.
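To make the size of the disagreement concrete, the following is a minimal numerical sketch rather than the authors' derivation. It assumes a random match probability p, a database of n profiles containing exactly one match, a population of N equally likely possible sources (uniform prior), independent profiles, and no typing errors. The function names and the example numbers are hypothetical and do not come from the article.

```python
def np_rule(n: int, p: float) -> float:
    """Expected number of chance matches when n profiles are searched (the 'np' quantity)."""
    return n * p


def single_comparison_lr(p: float) -> float:
    """Likelihood ratio for one matching profile, ignoring the search: 1 / p."""
    return 1.0 / p


def posterior_source_probability(N: int, n: int, p: float) -> float:
    """Posterior probability that the single matching database member is the source.

    Assumption: uniform prior over N possible sources. The n - 1 non-matching
    database members are excluded, so N - n people outside the database remain
    as alternative sources, each matching by chance with probability p.
    """
    return 1.0 / (1.0 + (N - n) * p)


if __name__ == "__main__":
    N, n, p = 1_000_000, 10_000, 1e-6  # illustrative values only
    print("np rule:", np_rule(n, p))
    print("1/p:", single_comparison_lr(p))
    print("posterior:", posterior_source_probability(N, n, p))
```

Under these assumptions, enlarging the database while still finding a single match raises the posterior probability, because more alternative sources are excluded, whereas the np quantity grows with n and so suggests weaker evidence.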
Hunter, Susan B.; Vauterin, Paul; Lambert-Fair, Mary Ann; Van Duyne, M. Susan; Kubota, Kristy; Graves, Lewis; Wrigley, Donna; Barrett, Timothy; Ribot, Efrain
2005-01-01
The PulseNet National Database, established by the Centers for Disease Control and Prevention in 1996, consists of pulsed-field gel electrophoresis (PFGE) patterns obtained from isolates of food-borne pathogens (currently Escherichia coli O157:H7, Salmonella, Shigella, and Listeria) and textual information about the isolates. Electronic images and accompanying text are submitted from over 60 U.S. public health and food regulatory agency laboratories. The PFGE patterns are generated according to highly standardized PFGE protocols. Normalization and accurate comparison of gel images require the use of a well-characterized size standard in at least three lanes of each gel. Originally, a well-characterized strain of each organism was chosen as the reference standard for that particular database. The increasing number of databases, difficulty in identifying an organism-specific standard for each database, the increased range of band sizes generated by the use of additional restriction endonucleases, and the maintenance of many different organism-specific strains encouraged us to search for a more versatile and universal DNA size marker. A Salmonella serotype Braenderup strain (H9812) was chosen as the universal size standard. This strain was subjected to rigorous testing in our laboratories to ensure that it met the desired criteria, including coverage of a wide range of DNA fragment sizes, even distribution of bands, and stability of the PFGE pattern. The strategy used to convert and compare data generated by the new and old reference standards is described. PMID:15750058
DOE Office of Scientific and Technical Information (OSTI.GOV)
Velsko, S. P.
The microbial DNA Index System (MiDIS) is a concept for a microbial forensic database and investigative decision support system that can be used to help investigators identify the sources of microbial agents that have been used in a criminal or terrorist incident. The heart of the proposed system is a rigorous method for calculating source probabilities by using certain fundamental sampling distributions associated with the propagation and mutation of microbes on disease transmission networks. This formalism has a close relationship to mitochondrial and Y-chromosomal human DNA forensics, and the proposed decision support system is somewhat analogous to the CODIS and SWGDAM mtDNA databases. The MiDIS concept does not involve the use of opportunistic collections of microbial isolates and phylogenetic tree building as a basis for inference. A staged approach can be used to build MiDIS as an enduring capability, beginning with a pilot demonstration program that must meet user expectations for performance and validation before evolving into a continuing effort. Because MiDIS requires input from a broad array of expertise including outbreak surveillance, field microbial isolate collection, microbial genome sequencing, disease transmission networks, and laboratory mutation rate studies, it will be necessary to assemble a national multi-laboratory team to develop such a system. The MiDIS effort would lend direction and focus to the national microbial genetics research program for microbial forensics, and would provide an appropriate forensic framework for interfacing to future national and international disease surveillance efforts.
Enhancing the DNA Patent Database
DOE Office of Scientific and Technical Information (OSTI.GOV)
Walters, LeRoy B.
Final Report on Award No. DE-FG0201ER63171. Principal Investigator: LeRoy B. Walters. February 18, 2008. This project successfully completed its goal of surveying and reporting on the DNA patenting and licensing policies at 30 major U.S. academic institutions. The report of survey results was published in the January 2006 issue of Nature Biotechnology under the title “The Licensing of DNA Patents by US Academic Institutions: An Empirical Survey.” Lori Pressman was the lead author on this feature article. A PDF reprint of the article will be submitted to our Program Officer under separate cover. The project team has continued to update the DNA Patent Database on a weekly basis since the conclusion of the project. The database can be accessed at dnapatents.georgetown.edu. This database provides a valuable research tool for academic researchers, policymakers, and citizens. A report entitled Reaping the Benefits of Genomic and Proteomic Research: Intellectual Property Rights, Innovation, and Public Health was published in 2006 by the Committee on Intellectual Property Rights in Genomic and Protein Research and Innovation, Board on Science, Technology, and Economic Policy at the National Academies. The report was edited by Stephen A. Merrill and Anne-Marie Mazza. This report employed and then adapted the methodology developed by our research project and quoted our findings at several points. (The full report can be viewed online at the following URL: http://www.nap.edu/openbook.php?record_id=11487&page=R1). My colleagues and I are grateful for the research support of the ELSI program at the U.S. Department of Energy.
The Development of Vocational Vehicle Drive Cycles and Segmentation
DOE Office of Scientific and Technical Information (OSTI.GOV)
Duran, Adam W.; Phillips, Caleb T.; Konan, Arnaud M.
Under a collaborative interagency agreement between the U.S. Environmental Protection Agency and the U.S. Department of Energy (DOE), the National Renewable Energy Laboratory (NREL) performed a series of in-depth analyses to characterize the on-road driving behavior including distributions of vehicle speed, idle time, accelerations and decelerations, and other driving metrics of medium- and heavy-duty vocational vehicles operating within the United States. As part of this effort, NREL researchers segmented U.S. medium- and heavy-duty vocational vehicle driving characteristics into three distinct operating groups or clusters using real world drive cycle data collected at 1 Hz and stored in NREL's Fleet DNA database. The Fleet DNA database contains millions of miles of historical real-world drive cycle data captured from medium- and heavy-duty vehicles operating across the United States. The data encompass data from existing DOE activities as well as contributions from valued industry stakeholder participants. For this project, data captured from 913 unique vehicles comprising 16,250 days of operation were drawn from the Fleet DNA database and examined. The Fleet DNA data used as a source for this analysis have been collected from a total of 30 unique fleets/data providers operating across 22 unique geographic locations spread across the United States. This includes locations with topology ranging from the foothills of Denver, Colorado, to the flats of Miami, Florida. The range of fleets, geographic locations, and total number of vehicles analyzed ensures results that include the influence of these factors. While no analysis will be perfect without unlimited resources and data, it is the researchers' understanding that the Fleet DNA database is the largest and most thorough publicly accessible vocational vehicle usage database currently in operation. 
This report includes an introduction to the Fleet DNA database and the data contained within, a presentation of the results of the statistical analysis performed by NREL, a review of the logistic model developed to predict cluster membership, and a discussion and detailed summary of the development of the vocational drive cycle weights and representative transient drive cycles for testing and simulation. Additional discussion of known limitations and potential future work is also included in the report content.
A Comparison of Proposal Processing Forms and Databases at Twelve Universities.
ERIC Educational Resources Information Center
Olsen, Leslie A.; Beattie, Robert R.
1990-01-01
Institutions varied widely in amount and type of data collected in tracking research proposals. Many do not collect predicted data or much data of national interest. Most do not require prior approval of use of human subjects, recombinant DNA, or hazardous substances. Only one required certification of integrity of scholarship. (Author/MSE)
Towards computational improvement of DNA database indexing and short DNA query searching.
Stojanov, Done; Koceski, Sašo; Mileva, Aleksandra; Koceska, Nataša; Bande, Cveta Martinovska
2014-09-03
In order to facilitate and speed up the search of massive DNA databases, the database is indexed at the beginning, employing a mapping function. By searching through the indexed data structure, exact query hits can be identified. If the database is searched against an annotated DNA query, such as a known promoter consensus sequence, then the starting locations and the number of potential genes can be determined. This is particularly relevant if unannotated DNA sequences have to be functionally annotated. However, indexing a massive DNA database and searching an indexed data structure with millions of entries is a time-demanding process. In this paper, we propose a fast DNA database indexing and searching approach, identifying all query hits in the database, without having to examine all entries in the indexed data structure, limiting the maximum length of a query that can be searched against the database. By applying the proposed indexing equation, the whole human genome could be indexed in 10 hours on a personal computer, under the assumption that there is enough RAM to store the indexed data structure. Analysing the methodology proposed by Reneker, we observed that hits at starting positions [Formula: see text] are not reported, if the database is searched against a query shorter than [Formula: see text] nucleotides, such that [Formula: see text] is the length of the DNA database words being mapped and [Formula: see text] is the length of the query. A solution to this drawback is also presented.
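The general idea of indexing a database once and then answering exact-match queries from the index can be illustrated with a small sketch. This is not the indexing equation proposed in the paper, which the abstract does not reproduce. It is a plain dictionary of overlapping fixed-length words, and the names build_index, search, and the word length w are hypothetical. Queries shorter than w cannot be looked up through the index, so the sketch falls back to a linear scan for them, which is an assumption made here for completeness.

```python
from collections import defaultdict


def build_index(db: str, w: int) -> dict[str, list[int]]:
    """Map every overlapping word of length w to the list of its start positions."""
    index: dict[str, list[int]] = defaultdict(list)
    for i in range(len(db) - w + 1):
        index[db[i:i + w]].append(i)
    return index


def scan(db: str, query: str) -> list[int]:
    """Linear scan that reports all (possibly overlapping) exact hits."""
    hits, i = [], db.find(query)
    while i != -1:
        hits.append(i)
        i = db.find(query, i + 1)
    return hits


def search(db: str, index: dict[str, list[int]], w: int, query: str) -> list[int]:
    """Return all start positions at which the non-empty query occurs exactly in db."""
    if not query:
        raise ValueError("query must be non-empty")
    if len(query) < w:
        # Short queries have no index key; fall back to a scan (assumption).
        return scan(db, query)
    # Look up the first word, then verify the rest of the query at each candidate.
    return [s for s in index.get(query[:w], []) if db.startswith(query, s)]


if __name__ == "__main__":
    db = "ACGTACGTTACG"
    idx = build_index(db, 4)
    print(search(db, idx, 4, "ACGT"))
    print(search(db, idx, 4, "CG"))
```

Only the candidate positions stored under the query's first word are examined, which is what lets an indexed search avoid reading the whole database for each query.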
Post-conviction DNA testing: the UK's first ‘exoneration’ case?
Johnson, Paul; Williams, Robin
2005-01-01
The routine incorporation of forensic DNA profiling into the criminal justice systems of the United Kingdom has been widely promoted as a device for improving the quality of investigative and prosecutorial processes. From its first uses in the 1980s, in cases of serious crime, to the now daily collection, analysis and comparison of genetic samples in the National DNA Database, DNA profiling has become a standard instrument of policing and a powerful evidential resource for prosecutors. However, the use of post-conviction DNA testing has, until recently, been uncommon in the United Kingdom. This paper explores the first case, in England, of the contribution of DNA profiling to a successful appeal against conviction by an imprisoned offender. Analysis of the details of this case is used to emphasise the ways in which novel forms of scientific evidence remain subject to traditional and heterogeneous tests of relevance and credibility. PMID:15112595
An affinity-structure database of helix-turn-helix: DNA complexes with a universal coordinate system
AlQuraishi, Mohammed; Tang, Shengdong; Xia, Xide
2015-11-19
Molecular interactions between proteins and DNA molecules underlie many cellular processes, including transcriptional regulation, chromosome replication, and nucleosome positioning. Computational analyses of protein-DNA interactions rely on experimental data characterizing known protein-DNA interactions structurally and biochemically. While many databases exist that contain either structural or biochemical data, few integrate these two data sources in a unified fashion. Such integration is becoming increasingly critical with the rapid growth of structural and biochemical data, and the emergence of algorithms that rely on the synthesis of multiple data types to derive computational models of molecular interactions. We have developed an integrated affinity-structure database in which the experimental and quantitative DNA binding affinities of helix-turn-helix proteins are mapped onto the crystal structures of the corresponding protein-DNA complexes. This database provides access to: (i) protein-DNA structures, (ii) quantitative summaries of protein-DNA binding affinities using position weight matrices, and (iii) raw experimental data of protein-DNA binding instances. Critically, this database establishes a correspondence between experimental structural data and quantitative binding affinity data at the single basepair level. Furthermore, we present a novel alignment algorithm that structurally aligns the protein-DNA complexes in the database and creates a unified residue-level coordinate system for comparing the physico-chemical environments at the interface between complexes. Using this unified coordinate system, we compute the statistics of atomic interactions at the protein-DNA interface of helix-turn-helix proteins. We provide an interactive website for visualization, querying, and analyzing this database, and a downloadable version to facilitate programmatic analysis. 
Lastly, this database will facilitate the analysis of protein-DNA interactions and the development of programmatic computational methods that capitalize on integration of structural and biochemical datasets. The database can be accessed at http://ProteinDNA.hms.harvard.edu.
MICA: desktop software for comprehensive searching of DNA databases
Stokes, William A; Glick, Benjamin S
2006-01-01
Background: Molecular biologists work with DNA databases that often include entire genomes. A common requirement is to search a DNA database to find exact matches for a nondegenerate or partially degenerate query. The software programs available for such purposes are normally designed to run on remote servers, but an appealing alternative is to work with DNA databases stored on local computers. We describe a desktop software program termed MICA (K-Mer Indexing with Compact Arrays) that allows large DNA databases to be searched efficiently using very little memory. Results: MICA rapidly indexes a DNA database. On a Macintosh G5 computer, the complete human genome could be indexed in about 5 minutes. The indexing algorithm recognizes all 15 characters of the DNA alphabet and fully captures the information in any DNA sequence, yet for a typical sequence of length L, the index occupies only about 2L bytes. The index can be searched to return a complete list of exact matches for a nondegenerate or partially degenerate query of any length. A typical search of a long DNA sequence involves reading only a small fraction of the index into memory. As a result, searches are fast even when the available RAM is limited. Conclusion: MICA is suitable as a search engine for desktop DNA analysis software. PMID:17018144
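What it means to match a partially degenerate query can be shown with a short sketch. It does not reproduce MICA's k-mer index or its compact arrays. It is a naive position-by-position scan, given only to illustrate the 15-character DNA alphabet (A, C, G, T plus the 11 IUPAC ambiguity codes). The bitmask encoding, the name find_matches, and the rule that a database symbol matches when every base it can stand for is allowed by the query symbol are assumptions of this example.

```python
# One bit per base: A=1, C=2, G=4, T=8; ambiguity codes are unions of these bits.
IUPAC = {
    "A": 1, "C": 2, "G": 4, "T": 8,
    "R": 5, "Y": 10, "S": 6, "W": 9, "K": 12, "M": 3,
    "B": 14, "D": 13, "H": 11, "V": 7, "N": 15,
}


def symbol_matches(db_symbol: str, query_symbol: str) -> bool:
    """True if every base the database symbol can represent is allowed by the query symbol."""
    d, q = IUPAC[db_symbol], IUPAC[query_symbol]
    return d & q == d


def find_matches(seq: str, query: str) -> list[int]:
    """Naive scan returning every start position where the query matches seq."""
    m = len(query)
    return [
        i
        for i in range(len(seq) - m + 1)
        if all(symbol_matches(seq[i + j], query[j]) for j in range(m))
    ]


if __name__ == "__main__":
    print(find_matches("ACGTAGGT", "RGGT"))  # R stands for A or G
    print(find_matches("ACGTAGGT", "NGT"))   # N stands for any base
```

A real search engine would pair this matching rule with an index so that only a small fraction of the database is read per query, as the abstract describes.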
Heninger, Michael; Hanzlick, Randy
2011-03-01
Medical examiners and coroners occasionally encounter unidentified human bodies, which remain unidentified for extended periods. In such cases, when traditional methods of identification have failed or cannot be used, DNA profiling may be used. The Federal Bureau of Investigation has a National Missing Person DNA database (NMPDD) laboratory to which samples may be submitted on such cases and from possible relatives or environments of unidentified decedents. This article describes the experience of the Fulton County Medical Examiner (FCME) in submitting samples to the NMPDD laboratory. A database was established at the FCME to track the submission of samples from unidentified decedents to the NMPDD laboratory for DNA testing along with the results and turnaround times. In December 2004, the FCME inventoried all cases for which samples were available and began to submit them to the NMPDD laboratory for testing. DNA testing and isolation rates, sample type, and turnaround times were tabulated in October 2006 for samples submitted between December 16, 2004 and December 16, 2005. An overall summary of data was also prepared concerning the status of all samples submitted as of April 17, 2007. During the 1-year study period, samples from 77 unidentified decedents were submitted to the laboratory. As of October 2006 (22 months after submission of the first samples and 10 months after submission of the last samples), testing had been completed on 53% of the samples submitted, and 68% of those tested resulted in a mitochondrial DNA profile. Turnaround times ranged from 66 to 557 days, improved with time, and had a mean of 107 days for specimens submitted during the latter part of the study period. As of April 17, 2007, we had submitted samples involving 84 unidentified decedents. Seventy-five percent of the samples have now been tested. 
Data from the NMPDD laboratory have resulted in 4 identifications by comparison with putative relatives, 4 exclusions, and no cold hits through comparison with NMPDD DNA profiles from missing persons. More extensive data are presented in the body of this article. The NMPDD laboratory provides useful and free services to medical examiners, coroners, and law enforcement agencies that require DNA services regarding missing and unidentified persons. Turnaround times have improved. The success of the system in getting cold hits will be heavily dependent on law enforcement filing missing persons reports and submission of reference samples from putative relatives of the decedent. We recommend collecting specimens for DNA analysis early on in the postmortem investigation, submitting samples to the NMPDD laboratory or one of its participating laboratories when traditional methods for identification cannot be used or have failed, not burying bodies until a DNA profile has been obtained, and not cremating unidentified remains.
RICD: a rice indica cDNA database resource for rice functional genomics.
Lu, Tingting; Huang, Xuehui; Zhu, Chuanrang; Huang, Tao; Zhao, Qiang; Xie, Kabing; Xiong, Lizhong; Zhang, Qifa; Han, Bin
2008-11-26
The Oryza sativa L. indica subspecies is the most widely cultivated rice. During the last few years, we have collected over 20,000 putative full-length cDNAs and over 40,000 ESTs isolated from various cDNA libraries of two indica varieties Guangluai 4 and Minghui 63. A database of the rice indica cDNAs was therefore built to provide a comprehensive web data source for searching and retrieving the indica cDNA clones. Rice Indica cDNA Database (RICD) is an online MySQL-PHP driven database with a user-friendly web interface. It allows investigators to query the cDNA clones by keyword, genome position, nucleotide or protein sequence, and putative function. It also provides a series of information, including sequences, protein domain annotations, similarity search results, SNPs and InDels information, and hyperlinks to gene annotation in both The Rice Annotation Project Database (RAP-DB) and The TIGR Rice Genome Annotation Resource, expression atlas in RiceGE and variation report in Gramene of each cDNA. The online rice indica cDNA database provides cDNA resource with comprehensive information to researchers for functional analysis of indica subspecies and for comparative genomics. The RICD database is available through our website http://www.ncgr.ac.cn/ricd.
MitBASE : a comprehensive and integrated mitochondrial DNA database. The present status
Attimonelli, M.; Altamura, N.; Benne, R.; Brennicke, A.; Cooper, J. M.; D’Elia, D.; Montalvo, A. de; Pinto, B. de; De Robertis, M.; Golik, P.; Knoop, V.; Lanave, C.; Lazowska, J.; Licciulli, F.; Malladi, B. S.; Memeo, F.; Monnerot, M.; Pasimeni, R.; Pilbout, S.; Schapira, A. H. V.; Sloof, P.; Saccone, C.
2000-01-01
MitBASE is an integrated and comprehensive database of mitochondrial DNA data which collects, under a single interface, databases for Plant, Vertebrate, Invertebrate, Human, Protist and Fungal mtDNA and a Pilot database on nuclear genes involved in mitochondrial biogenesis in Saccharomyces cerevisiae. MitBASE reports all available information from different organisms and from intraspecies variants and mutants. Data have been drawn from the primary databases and from the literature; value-adding information has been structured, e.g., editing information on protist mtDNA genomes, pathological information for human mtDNA variants, etc. The different databases, some of which are structured using commercial packages (Microsoft Access, File Maker Pro) while others use a flat-file format, have been integrated under ORACLE. Ad hoc retrieval systems have been devised for some of the above listed databases, taking into account their peculiarities. The database is resident at the EBI and is available at the following site: http://www3.ebi.ac.uk/Research/Mitbase/mitbase.pl. The impact of this project is intended for both basic and applied research. The study of mitochondrial genetic diseases and mitochondrial DNA intraspecies diversity are key topics in several biotechnological fields. The database has been funded within the EU Biotechnology programme. PMID:10592207
Tozzo, Pamela; Fassina, Antonio; Caenazzo, Luciana
2017-12-01
Current policy approaches to social and ethical issues surrounding biobanks manifest lack of public information given by researchers and government, despite the evidence that Italian citizens are well informed about technical and other public perspectives of biotechnologies. For this reason, the focus of our survey was to interview our University's students on these aspects. The sample consisted of Padua University students (N = 959), who were administered a questionnaire comprising eight questions covering their knowledge about biobanks, their perception of the related benefits and risks, their willingness to donate samples to a biobank for research purposes, their attitude to having their own DNA profile included in a forensic DNA database, and the reasons behind their answers. The vast majority of the students invited to take part in the survey completed the questionnaire, and the number of participants sufficed to be considered representative of the target population. Despite the respondents' unfamiliarity with the topics explored, suggested by the large group of respondents answering "I don't know" to the questions regarding Italian regulation and reality, their answers demonstrate a general agreement to participate in a biobanking scheme for research purposes, as expressed by the 91% of respondents who were reportedly willing to donate their samples. As for the idea of a forensic DNA database, 35% of respondents said they would agree to having their profile included in such a database, even if they were not fully aware of the benefits and risks of such action. This study shows that Italian people with a higher education take a generally positive attitude to the idea of donating biological samples. It contributes to empirical evidence of what Italy's citizens understand about biobanking, and of their willingness to donate samples for research purposes, and also to have their genetic profiles included in a national forensic DNA database. 
Our findings may have clear implications for the policy discussion on biobanks in Italy. In particular, it is important to take into account the Italian population's limited awareness of forensic DNA databases, in order to ensure a better interaction between policy makers and citizens and to make them more aware of the need to balance the individual's rights and the security of society.
Short Tandem Repeat DNA Internet Database
National Institute of Standards and Technology Data Gateway
SRD 130 Short Tandem Repeat DNA Internet Database (Web, free access) Short Tandem Repeat DNA Internet Database is intended to benefit research and application of short tandem repeat DNA markers for human identity testing. Facts and sequence information on each STR system, population data, commonly used multiplex STR systems, PCR primers and conditions, and a review of various technologies for analysis of STR alleles have been included.
High-throughput STR analysis for DNA database using direct PCR.
Sim, Jeong Eun; Park, Su Jeong; Lee, Han Chul; Kim, Se-Yong; Kim, Jong Yeol; Lee, Seung Hwan
2013-07-01
Since the Korean criminal DNA database was launched in 2010, we have focused on establishing an automated DNA database profiling system that analyzes short tandem repeat loci in a high-throughput and cost-effective manner. We established a DNA database profiling system without DNA purification using a direct PCR buffer system. The quality of direct PCR procedures was compared with that of conventional PCR system under their respective optimized conditions. The results revealed not only perfect concordance but also an excellent PCR success rate, good electropherogram quality, and an optimal intra/inter-loci peak height ratio. In particular, the proportion of DNA extraction required due to direct PCR failure could be minimized to <3%. In conclusion, the newly developed direct PCR system can be adopted for automated DNA database profiling systems to replace or supplement conventional PCR system in a time- and cost-saving manner. © 2013 American Academy of Forensic Sciences Published 2013. This article is a U.S. Government work and is in the public domain in the U.S.A.
[Integrated DNA barcoding database for identifying Chinese animal medicine].
Shi, Lin-Chun; Yao, Hui; Xie, Li-Fang; Zhu, Ying-Jie; Song, Jing-Yuan; Zhang, Hui; Chen, Shi-Lin
2014-06-01
In order to construct an integrated DNA barcoding database for identifying Chinese animal medicine, the authors and their collaborators have completed numerous studies on identifying Chinese animal medicines using DNA barcoding technology. Sequences from GenBank have been analyzed simultaneously. Three different methods, BLAST, barcoding gap and tree building, have been used to confirm the reliability of barcode records in the database. The integrated DNA barcoding database for identifying Chinese animal medicine has been constructed from three different parts: specimen, sequence and literature information. This database contained about 800 animal medicines and the adulterants and closely related species. Unknown specimens can be identified by pasting their sequence record into the window on the ID page of the species identification system for traditional Chinese medicine (www.tcmbarcode.cn). The integrated DNA barcoding database for identifying Chinese animal medicine is important for animal species identification, rare and endangered species conservation and sustainable utilization of animal resources.
The Protein-DNA Interface database.
Norambuena, Tomás; Melo, Francisco
2010-05-18
The Protein-DNA Interface database (PDIdb) is a repository containing relevant structural information of Protein-DNA complexes solved by X-ray crystallography and available at the Protein Data Bank. The database includes a simple functional classification of the protein-DNA complexes that consists of three hierarchical levels: Class, Type and Subtype. This classification has been defined and manually curated by humans based on the information gathered from several sources that include PDB, PubMed, CATH, SCOP and COPS. The current version of the database contains only structures with resolution of 2.5 Å or higher, accounting for a total of 922 entries. The major aim of this database is to contribute to the understanding of the main rules that underlie the molecular recognition process between DNA and proteins. To this end, the database is focused on each specific atomic interface rather than on the separated binding partners. Therefore, each entry in this database consists of a single and independent protein-DNA interface. We hope that PDIdb will be useful to many researchers working in fields such as the prediction of transcription factor binding sites in DNA, the study of specificity determinants that mediate enzyme recognition events, engineering and design of new DNA binding proteins with distinct binding specificity and affinity, among others. Finally, due to its friendly and easy-to-use web interface, we hope that PDIdb will also serve educational and teaching purposes. PMID:20482798
Evaluation of DNA mixtures from database search.
Chung, Yuk-Ka; Hu, Yue-Qing; Fung, Wing K
2010-03-01
With the aim of bridging the gap between DNA mixture analysis and DNA database search, a novel approach is proposed to evaluate the forensic evidence of DNA mixtures when the suspect is identified by the search of a database of DNA profiles. General formulae are developed for the calculation of the likelihood ratio for a two-person mixture under general situations including multiple matches and imperfect evidence. The influence of the prior probabilities on the weight of evidence under the scenario of multiple matches is demonstrated by a numerical example based on Hong Kong data. Our approach is shown to be capable of presenting the forensic evidence of DNA mixtures in a comprehensive way when the suspect is identified through database search.
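The dependence of the weight of evidence on prior probabilities, mentioned in the abstract above, can be illustrated with Bayes' rule in odds form. This is only a minimal sketch and not the authors' general formulae: it assumes a likelihood ratio has already been computed, and the function name and all numbers are hypothetical.

```python
def posterior_probability(likelihood_ratio: float, prior_probability: float) -> float:
    """Posterior probability of the prosecution hypothesis.

    Uses Bayes' rule in odds form: posterior odds = LR * prior odds.
    """
    if not 0.0 < prior_probability < 1.0:
        raise ValueError("prior_probability must be strictly between 0 and 1")
    prior_odds = prior_probability / (1.0 - prior_probability)
    posterior_odds = likelihood_ratio * prior_odds
    return posterior_odds / (1.0 + posterior_odds)


# Hypothetical numbers: the same LR combined with two different priors.
lr = 1e6
for prior in (1e-3, 1e-7):
    print(f"prior={prior:g} -> posterior={posterior_probability(lr, prior):.4f}")
```

With the same likelihood ratio, a much smaller prior (for example, a suspect found only through a search of a large database) gives a much smaller posterior, which is why the choice of prior matters when several profiles match.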
Charoute, Hicham; Nahili, Halima; Abidi, Omar; Gabi, Khalid; Rouba, Hassan; Fakiri, Malika; Barakat, Abdelhamid
2014-03-01
National and ethnic mutation databases provide comprehensive information about genetic variations reported in a population or an ethnic group. In this paper, we present the Moroccan Genetic Disease Database (MGDD), a catalogue of genetic data related to diseases identified in the Moroccan population. We used the PubMed, Web of Science and Google Scholar databases to identify available articles published until April 2013. The database is designed and implemented on a three-tier model using a MySQL relational database and the PHP programming language. To date, the database contains 425 mutations and 208 polymorphisms found in 301 genes and 259 diseases. Most Mendelian diseases in the Moroccan population follow an autosomal recessive mode of inheritance (74.17%) and affect endocrine, nutritional and metabolic physiology. The MGDD database provides reference information for researchers, clinicians and health professionals through a user-friendly Web interface. Its content should be useful to improve research in human molecular genetics, disease diagnoses and design of association studies. MGDD can be publicly accessed at http://mgdd.pasteur.ma.
JICST Factual Database JICST DNA Database
NASA Astrophysics Data System (ADS)
Shirokizawa, Yoshiko; Abe, Atsushi
The Japan Information Center of Science and Technology (JICST) started an online DNA database service in October 1988. The database is composed of the EMBL Nucleotide Sequence Library and the Genetic Sequence Data Bank. The authors outline the database system, data items and search commands. Examples of retrieval sessions are presented.
DNA barcoding in the media: does coverage of cool science reflect its social context?
Geary, Janis; Camicioli, Emma; Bubela, Tania
2016-09-01
Paul Hebert and colleagues first described DNA barcoding in 2003, which led to international efforts to promote and coordinate its use. Since its inception, DNA barcoding has generated considerable media coverage. We analysed whether this coverage reflected both the scientific and social mandates of international barcoding organizations. We searched newspaper databases to identify 900 English-language articles from 2003 to 2013. Coverage of the science of DNA barcoding was highly positive but lacked context for key topics. Coverage omissions pose challenges for public understanding of the science and applications of DNA barcoding; these included coverage of governance structures and issues related to the sharing of genetic resources across national borders. Our analysis provided insight into how barcoding communication efforts have translated into media coverage; more targeted communication efforts may focus media attention on previously omitted, but important topics. Our analysis is timely as the DNA barcoding community works to establish the International Society for the Barcode of Life.
DNA Barcoding the Native Flowering Plants and Conifers of Wales
de Vere, Natasha; Rich, Tim C. G.; Ford, Col R.; Trinder, Sarah A.; Long, Charlotte; Moore, Chris W.; Satterthwaite, Danielle; Davies, Helena; Allainguillaume, Joel; Ronca, Sandra; Tatarinova, Tatiana; Garbett, Hannah; Walker, Kevin; Wilkinson, Mike J.
2012-01-01
We present the first national DNA barcode resource that covers the native flowering plants and conifers for the nation of Wales (1143 species). Using the plant DNA barcode markers rbcL and matK, we have assembled 97.7% coverage for rbcL, 90.2% for matK, and a dual-locus barcode for 89.7% of the native Welsh flora. We have sampled multiple individuals for each species, resulting in 3304 rbcL and 2419 matK sequences. The majority of our samples (85%) are from DNA extracted from herbarium specimens. Recoverability of DNA barcodes is lower using herbarium specimens, compared to freshly collected material, mostly due to lower amplification success, but this is balanced by the increased efficiency of sampling species that have already been collected, identified, and verified by taxonomic experts. The effectiveness of the DNA barcodes for identification (level of discrimination) is assessed using four approaches: the presence of a barcode gap (using pairwise and multiple alignments), formation of monophyletic groups using Neighbour-Joining trees, and sequence similarity in BLASTn searches. These approaches yield similar results, providing relative discrimination levels of 69.4 to 74.9% of all species and 98.6 to 99.8% of genera using both markers. Species discrimination can be further improved using spatially explicit sampling. Mean species discrimination using barcode gap analysis (with a multiple alignment) is 81.6% within 10×10 km squares and 93.3% for 2×2 km squares. Our database of DNA barcodes for Welsh native flowering plants and conifers represents the most complete coverage of any national flora, and offers a valuable platform for a wide range of applications that require accurate species identification. PMID:22701588
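The barcode gap criterion mentioned in the abstract above can be sketched as follows. This is a simplified illustration, not the authors' pipeline: it assumes pre-aligned sequences of equal length, uses an uncorrected p-distance, and the species labels and sequences are toy data invented for the example.

```python
from itertools import combinations


def p_distance(a: str, b: str) -> float:
    """Proportion of differing sites between two aligned, equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def has_barcode_gap(records, species):
    """True if every within-species distance for `species` is smaller than
    every distance between `species` and any other species."""
    intra, inter = [], []
    for (sp1, seq1), (sp2, seq2) in combinations(records, 2):
        if species not in (sp1, sp2):
            continue
        distance = p_distance(seq1, seq2)
        (intra if sp1 == sp2 else inter).append(distance)
    if not inter:
        return False  # no other species to discriminate against
    return max(intra, default=0.0) < min(inter)


# Toy (hypothetical) aligned barcodes: (species label, sequence).
records = [
    ("A", "ACGTACGT"), ("A", "ACGTACGA"),
    ("B", "ACCTTCGT"),
    ("C", "TTTTTTTT"), ("C", "TTTTAAAA"),
    ("D", "TTTTTTTA"),
]
print(has_barcode_gap(records, "A"))  # species A is separated from the rest
print(has_barcode_gap(records, "C"))  # species C overlaps with species D
```

The share of species for which this check succeeds is one way to express a discrimination level; restricting `records` to the species found in one grid square would correspond to the spatially explicit variant.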
Song, Wen Jun; Qin, Qi Wei; Qiu, Jin; Huang, Can Hua; Wang, Fan; Hew, Choy Leong
2004-01-01
Here we report the complete genome sequence of Singapore grouper iridovirus (SGIV). Sequencing of the random shotgun and restriction endonuclease genomic libraries showed that the entire SGIV genome consists of 140,131 nucleotide bp. One hundred sixty-two open reading frames (ORFs) from the sense and antisense DNA strands, coding for lengths varying from 41 to 1,268 amino acids, were identified. Computer-assisted analyses of the deduced amino acid sequences revealed that 77 of the ORFs exhibited homologies to known virus genes, 23 of which matched functional iridovirus proteins. Forty-two putative conserved domains or signatures were detected in the National Center for Biotechnology Information CD-Search database and PROSITE database. An assortment of enzyme activities involved in DNA replication, transcription, nucleotide metabolism, cell signaling, etc., were identified. Viruses were cultured on a cell line derived from the embryonated egg of the grouper Epinephelus tauvina, isolated, and purified by sucrose gradient ultracentrifugation. The protein extract from the purified virions was analyzed by polyacrylamide gel electrophoresis followed by in-gel digestion of protein bands. Matrix-assisted laser desorption ionization-time of flight mass spectrometry and database searching led to identification of 26 proteins. Twenty of these represented novel or previously unidentified genes, which were further confirmed by reverse transcription-PCR (RT-PCR) and DNA sequencing of their respective RT-PCR products. PMID:15507645
Searching mixed DNA profiles directly against profile databases.
Bright, Jo-Anne; Taylor, Duncan; Curran, James; Buckleton, John
2014-03-01
DNA databases have revolutionised forensic science. They are a powerful investigative tool as they have the potential to identify persons of interest in criminal investigations. Routinely, a DNA profile generated from a crime sample could only be searched for in a database of individuals if the stain was from single contributor (single source) or if a contributor could unambiguously be determined from a mixed DNA profile. This meant that a significant number of samples were unsuitable for database searching. The advent of continuous methods for the interpretation of DNA profiles offers an advanced way to draw inferential power from the considerable investment made in DNA databases. Using these methods, each profile on the database may be considered a possible contributor to a mixture and a likelihood ratio (LR) can be formed. Those profiles which produce a sufficiently large LR can serve as an investigative lead. In this paper empirical studies are described to determine what constitutes a large LR. We investigate the effect on a database search of complex mixed DNA profiles with contributors in equal proportions with dropout as a consideration, and also the effect of an incorrect assignment of the number of contributors to a profile. In addition, we give, as a demonstration of the method, the results using two crime samples that were previously unsuitable for database comparison. We show that effective management of the selection of samples for searching and the interpretation of the output can be highly informative. Copyright © 2013 Elsevier Ireland Ltd. All rights reserved.
Haddow, Gillian; Laurie, Graeme; Cunningham-Burley, Sarah; Hunter, Kathryn G
2007-01-01
In recent years, there has been a rise in the creation of DNA databases promising a range of health benefits to individuals and populations. This development has been accompanied by an interest in, and concern for the ethical, legal and social aspects of such collections. In terms of policy solutions, much of the focus of these debates has been on issues of consent, confidentiality and research governance. However, there are broader concerns, such as those associated with commercialisation, which cannot be adequately addressed by these foci. In this article, we focus on the health-wealth benefits that DNA databases promise by considering the views of 10 focus groups on Generation Scotland, Scotland's first national genetic database. As in previous studies, our qualitative research on public/s and stakeholders' views of DNA databases show the prospect of utilising donated samples and information derived for wealth-related ends (i.e. for private profit), irrespective of whether there is an associated health-related benefit, arouses considerable reaction. While health-wealth benefits are not mutually exclusive ideals, the tendency has been to cast 'public' benefits as exclusively health-related, while 'private' commercial benefits for funders and/or researchers are held out as a necessary pay-off. We argue for a less polarised approach that reconsiders what is meant by 'public benefits' and questions the exclusivity of commercial interests. We believe accommodation can be achieved via the mobilisation of a grass roots solution known as 'benefit-sharing' or a 'profit pay-off'. We propose a sociologically informed model that has a pragmatic, legal framework, which responds seriously to public concerns.
hPDI: a database of experimental human protein-DNA interactions.
Xie, Zhi; Hu, Shaohui; Blackshaw, Seth; Zhu, Heng; Qian, Jiang
2010-01-15
The human protein DNA Interactome (hPDI) database holds experimental protein-DNA interaction data for humans identified by protein microarray assays. The unique characteristics of hPDI are that it contains consensus DNA-binding sequences not only for nearly 500 human transcription factors but also for >500 unconventional DNA-binding proteins, which were previously completely uncharacterized. Users can browse, search and download a subset or the entire data via a web interface. This database is freely accessible for any academic purposes. http://bioinfo.wilmer.jhu.edu/PDI/.
2005-01-01
Précis The rapid implementation and continuing expansion of forensic DNA databases around the world has been supported by claims about their effectiveness in criminal investigations and challenged by assertions of the resulting intrusiveness into individual privacy. These two competing perspectives provide the basis for ongoing considerations about the categories of persons who should be subject to nonconsensual DNA sampling and profile retention as well as the uses to which such profiles should be put. This paper uses the example of the current arrangements for forensic DNA databasing in England & Wales to discuss the ways in which the legislative and operational basis for police DNA databasing is reliant upon continuous deliberations over these and other matters by a range of key stakeholders. We also assess the effects of the recent innovative use of DNA databasing for ‘familial searching’ in this jurisdiction in order to show how agreed understandings about the appropriate uses of DNA can become unsettled and reformulated even where their investigative effectiveness is uncontested. We conclude by making some observations about the future of what is recognised to be the largest forensic DNA database in the world. PMID:16240734
DNA profiles, computer searches, and the Fourth Amendment.
Kimel, Catherine W
2013-01-01
Pursuant to federal statutes and to laws in all fifty states, the United States government has assembled a database containing the DNA profiles of over eleven million citizens. Without judicial authorization, the government searches each of these profiles one-hundred thousand times every day, seeking to link database subjects to crimes they are not suspected of committing. Yet, courts and scholars that have addressed DNA databasing have focused their attention almost exclusively on the constitutionality of the government's seizure of the biological samples from which the profiles are generated. This Note fills a gap in the scholarship by examining the Fourth Amendment problems that arise when the government searches its vast DNA database. This Note argues that each attempt to match two DNA profiles constitutes a Fourth Amendment search because each attempted match infringes upon database subjects' expectations of privacy in their biological relationships and physical movements. The Note further argues that database searches are unreasonable as they are currently conducted, and it suggests an adaptation of computer-search procedures to remedy the constitutional deficiency.
Tucker, Valerie C; Hopwood, Andrew J; Sprecher, Cynthia J; McLaren, Robert S; Rabbach, Dawn R; Ensenberger, Martin G; Thompson, Jonelle M; Storts, Douglas R
2011-11-01
In response to the ENFSI and EDNAP groups' call for new STR multiplexes for Europe, Promega® developed a suite of four new DNA profiling kits. This paper describes the developmental validation study performed on the PowerPlex® ESI 16 (European Standard Investigator 16) and the PowerPlex® ESI 17 Systems. The PowerPlex® ESI 16 System combines the 11 loci compatible with the UK National DNA Database®, contained within the AmpFlSTR® SGM Plus® PCR Amplification Kit, with five additional loci: D2S441, D10S1248, D22S1045, D1S1656 and D12S391. The multiplex was designed to reduce the amplicon size of the loci found in the AmpFlSTR® SGM Plus® kit. This design facilitates increased robustness and amplification success for the loci used in the national DNA databases created in many countries, when analyzing degraded DNA samples. The PowerPlex® ESI 17 System amplifies the same loci as the PowerPlex® ESI 16 System, but with the addition of a primer pair for the SE33 locus. Tests were designed to address the developmental validation guidelines issued by the Scientific Working Group on DNA Analysis Methods (SWGDAM), and those of the DNA Advisory Board (DAB). Samples processed include DNA mixtures, PCR reactions spiked with inhibitors, a sensitivity series, and 306 United Kingdom donor samples to determine concordance with data generated with the AmpFlSTR® SGM Plus® kit. Allele frequencies from 242 white Caucasian samples collected in the United Kingdom are also presented. The PowerPlex® ESI 16 and ESI 17 Systems are robust and sensitive tools, suitable for the analysis of forensic DNA samples. Full profiles were routinely observed with 62.5 pg of a fully heterozygous single source DNA template. This high level of sensitivity was found to impact on mixture analyses, where 54-86% of unique minor contributor alleles were routinely observed in a 1:19 mixture ratio. 
Improved sensitivity combined with the robustness afforded by smaller amplicons has substantially improved the quantity of data obtained from degraded samples, and the improved chemistry confers exceptional tolerance to high levels of laboratory prepared inhibitors. Copyright © 2010 Elsevier Ireland Ltd. All rights reserved.
Biomedical Requirements for High Productivity Computing Systems
2005-04-01
server at http://www.ncbi.nlm.nih.gov/BLAST/. There are many variants of BLAST, including: 1. BLASTN - Compares a DNA query to a DNA database. Searches ...database (3 reading frames from each strand of the DNA) searching. 4. TBLASTN - Compares a protein query to a DNA database, in the 6 possible...the molecular during this phase. After eliminating molecules that could not match the query, an atom-by-atom search for the molecules is conducted
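The phrase "3 reading frames from each strand of the DNA" in the excerpt above refers to translating a nucleotide sequence in six reading frames before comparing it with protein sequences. The following is a minimal sketch of that translation step only; it is not BLAST code, it assumes the standard genetic code, and all function names are hypothetical.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def reverse_complement(dna: str) -> str:
    """Reverse complement of an upper-case DNA string."""
    return dna.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(dna: str) -> str:
    """Translate complete codons; unknown codons become 'X', stops become '*'."""
    codons = (dna[i:i + 3] for i in range(0, len(dna) - 2, 3))
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)


def six_frame_translations(dna: str) -> list:
    """Three forward frames followed by three reverse-complement frames."""
    dna = dna.upper()
    strands = (dna, reverse_complement(dna))
    return [translate(strand[offset:]) for strand in strands for offset in range(3)]


print(six_frame_translations("ATGGCC"))
```

A translated search would then compare the protein query against each of the six translated sequences rather than against the nucleotides directly.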
The collation of forensic DNA case data into a multi-dimensional intelligence database.
Walsh, S J; Moss, D S; Kliem, C; Vintiner, G M
2002-01-01
The primary aim of any DNA Database is to link individuals to unsolved offenses and unsolved offenses to each other via DNA profiling. This aim has been successfully realised during the operation of the New Zealand (NZ) DNA Databank over the past five years. The DNA Intelligence Project (DIP), a collaborative project involving NZ forensic and law enforcement agencies, interrogated the forensic case data held on the NZ DNA databank and collated it into a functional intelligence database. This database has been used to identify significant trends which direct Police and forensic personnel towards the most appropriate use of DNA technology. Intelligence is being provided in areas such as the level of usage of DNA techniques in criminal investigation, the relative success of crime scene samples and the geographical distribution of crimes. The DIP has broadened the dimensions of the information offered through the NZ DNA Databank and has furthered the understanding and investigative capability of both Police and forensic scientists. The outcomes of this research fit soundly with the current policies of 'intelligence led policing', which are being adopted by Police jurisdictions locally and overseas.
Parson, W; Gusmão, L; Hares, D R; Irwin, J A; Mayr, W R; Morling, N; Pokorak, E; Prinz, M; Salas, A; Schneider, P M; Parsons, T J
2014-11-01
The DNA Commission of the International Society of Forensic Genetics (ISFG) regularly publishes guidelines and recommendations concerning the application of DNA polymorphisms to the question of human identification. Previous recommendations published in 2000 addressed the analysis and interpretation of mitochondrial DNA (mtDNA) in forensic casework. While the foundations set forth in the earlier recommendations still apply, new approaches to the quality control, alignment and nomenclature of mitochondrial sequences, as well as the establishment of mtDNA reference population databases, have been developed. Here, we describe these developments and discuss their application to both mtDNA casework and mtDNA reference population databasing applications. While the generation of mtDNA for forensic casework has always been guided by specific standards, it is now well-established that data of the same quality are required for the mtDNA reference population data used to assess the statistical weight of the evidence. As a result, we introduce guidelines regarding sequence generation, as well as quality control measures based on the known worldwide mtDNA phylogeny, that can be applied to ensure the highest quality population data possible. For both casework and reference population databasing applications, the alignment and nomenclature of haplotypes is revised here and the phylogenetic alignment proffered as acceptable standard. In addition, the interpretation of heteroplasmy in the forensic context is updated, and the utility of alignment-free database searches for unbiased probability estimates is highlighted. Finally, we discuss statistical issues and define minimal standards for mtDNA database searches. Copyright © 2014 Elsevier Ireland Ltd. All rights reserved.
Buxbaum, Joseph D; Bolshakova, Nadia; Brownfeld, Jessica M; Anney, Richard Jl; Bender, Patrick; Bernier, Raphael; Cook, Edwin H; Coon, Hilary; Cuccaro, Michael; Freitag, Christine M; Hallmayer, Joachim; Geschwind, Daniel; Klauck, Sabine M; Nurnberger, John I; Oliveira, Guiomar; Pinto, Dalila; Poustka, Fritz; Scherer, Stephen W; Shih, Andy; Sutcliffe, James S; Szatmari, Peter; Vicente, Astrid M; Vieland, Veronica; Gallagher, Louise
2014-01-01
There is an urgent need for expanding and enhancing autism spectrum disorder (ASD) samples, in order to better understand causes of ASD. In a unique public-private partnership, 13 sites with extensive experience in both the assessment and diagnosis of ASD embarked on an ambitious, 2-year program to collect samples for genetic and phenotypic research and begin analyses on these samples. The program was called The Autism Simplex Collection (TASC). TASC sample collection began in 2008 and was completed in 2010, and included nine sites from North America and four sites from Western Europe, as well as a centralized Data Coordinating Center. Over 1,700 trios are part of this collection, with DNA from transformed cells now available through the National Institute of Mental Health (NIMH). Autism Diagnostic Interview-Revised (ADI-R) and Autism Diagnostic Observation Schedule-Generic (ADOS-G) measures are available for all probands, as are standardized IQ measures, Vineland Adaptive Behavioral Scales (VABS), the Social Responsiveness Scale (SRS), Peabody Picture Vocabulary Test (PPVT), and physical measures (height, weight, and head circumference). At almost every site, additional phenotypic measures were collected, including the Broad Autism Phenotype Questionnaire (BAPQ) and Repetitive Behavior Scale-Revised (RBS-R), as well as the non-word repetition scale, Communication Checklist (Children's or Adult), and Aberrant Behavior Checklist (ABC). Moreover, for nearly 1,000 trios, the Autism Genome Project Consortium (AGP) has carried out Illumina 1 M SNP genotyping and called copy number variation (CNV) in the samples, with data being made available through the National Institutes of Health (NIH). Whole exome sequencing (WES) has been carried out in over 500 probands, together with ancestry matched controls, and this data is also available through the NIH. 
Additional WES is being carried out by the Autism Sequencing Consortium (ASC), where the focus is on sequencing complete trios. ASC sequencing for the first 1,000 samples (all from whole-blood DNA) is complete and data will be released in 2014. Data is being made available through NIH databases (database of Genotypes and Phenotypes (dbGaP) and National Database for Autism Research (NDAR)) with DNA released in Dist 11.0. Primary funding for the collection, genotyping, sequencing and distribution of TASC samples was provided by Autism Speaks and the NIH, including the National Institute of Mental Health (NIMH) and the National Human Genome Research Institute (NHGRI). TASC represents an important sample set that leverages expert sites. Similar approaches, leveraging expert sites and ongoing studies, represent an important path towards further enhancing available ASD samples.
The effect of wild card designations and rare alleles in forensic DNA database searches.
Tvedebrink, Torben; Bright, Jo-Anne; Buckleton, John S; Curran, James M; Morling, Niels
2015-05-01
Forensic DNA databases are powerful tools used for the identification of persons of interest in criminal investigations. Typically, they consist of two parts: (1) a database containing DNA profiles of known individuals and (2) a database of DNA profiles associated with crime scenes. The risk of adventitious or chance matches between crimes and innocent people increases as the number of profiles within a database grows and more data is shared between various forensic DNA databases, e.g. from different jurisdictions. The DNA profiles obtained from crime scenes are often partial because crime samples may be compromised in quantity or quality. When an individual's profile cannot be resolved from a DNA mixture, ambiguity is introduced. A wild card, F, may be used in place of an allele that has dropped out or when an ambiguous profile is resolved from a DNA mixture. Variant alleles that do not correspond to any marker in the allelic ladder or appear above or below the extent of the allelic ladder range are assigned the allele designation R for rare allele. R alleles are position specific with respect to the observed/unambiguous allele. The F and R designations are made when the exact genotype has not been determined. The F and R designations are treated as wild cards for searching, which results in an increased chance of adventitious matches. We investigated the probability of adventitious matches given these two types of wild cards. Copyright © 2014 Elsevier Ireland Ltd. All rights reserved.
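To illustrate why wild cards raise the chance of adventitious matches, the following is a minimal sketch of a wild-card search, not the method used in the study. It assumes that each locus holds a pair of allele labels and that both F and R match any allele; this simplifies the position-specific behaviour of R described above. The function names, locus names and allele values are hypothetical.

```python
# Hypothetical sketch: F and R are both treated as "match anything" labels.
WILDCARDS = {"F", "R"}

def locus_matches(crime, reference):
    """True if the two-allele crime genotype is compatible with the reference genotype."""
    a, b = crime
    # Try both orderings of the reference alleles, since genotypes are unordered.
    for x, y in (reference, reference[::-1]):
        if (a in WILDCARDS or a == x) and (b in WILDCARDS or b == y):
            return True
    return False

def profile_matches(crime_profile, reference_profile):
    """Profiles map locus name -> (allele, allele); only shared loci are compared."""
    shared = crime_profile.keys() & reference_profile.keys()
    return bool(shared) and all(
        locus_matches(crime_profile[locus], reference_profile[locus])
        for locus in shared
    )

# Illustrative, made-up profiles: one allele at D3S1358 has dropped out (F).
crime = {"D3S1358": ("15", "F"), "vWA": ("17", "18")}
person_a = {"D3S1358": ("16", "15"), "vWA": ("18", "17")}
person_b = {"D3S1358": ("16", "17"), "vWA": ("17", "18")}
```

In this toy setting `profile_matches(crime, person_a)` is true because the F stands in for allele 16, whereas `person_b` is excluded at D3S1358. Every wild card removes one constraint from the comparison, which is the source of the extra chance matches.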
Wang, Zheng; Zhou, Di; Jia, Zhenjun; Li, Luyao; Wu, Wei; Li, Chengtao; Hou, Yiping
2016-01-01
STRs, scattered throughout the genome and characterized by a high mutation rate, are attractive for genetic applications such as forensic, anthropological and population genetics studies. STR profiling has now been applied in various aspects of human identification in forensic investigations. This work described the developmental validation of a novel and universal assay, the Huaxia Platinum System, which amplifies all markers in the expanded CODIS core loci and the Chinese National Database in one single PCR system. Developmental validation demonstrated that this novel assay is accurate, sensitive, reproducible and robust. No discordant calls were observed between the Huaxia Platinum System and other STR systems. Full genotypes could be achieved even with 250 pg of human DNA. Additionally, 402 unrelated individuals from 3 main ethnic groups of China (Han, Uygur and Tibetan) were genotyped to investigate the effectiveness of this novel assay. The CMP were 2.3094 × 10⁻²⁷, 4.3791 × 10⁻²⁸ and 6.9118 × 10⁻²⁷, respectively, and the CPE were 0.99999999939059, 0.99999999989653 and 0.99999999976386, respectively. These results suggest that the Huaxia Platinum System is polymorphic and informative, providing an efficient tool for national DNA databases and facilitating international data sharing. PMID:27498550
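The abstract does not define CMP and CPE. Assuming they denote the combined match probability and the combined power of exclusion as commonly used in forensic genetics, and assuming independent loci, a minimal sketch of how per-locus values combine could look like this. The function names and input values are hypothetical and are not taken from the study.

```python
import math

def combined_match_probability(locus_mp):
    """Product of per-locus match probabilities (assumes independent loci)."""
    return math.prod(locus_mp)

def combined_power_of_exclusion(locus_pe):
    """1 minus the product of per-locus non-exclusion probabilities."""
    return 1.0 - math.prod(1.0 - pe for pe in locus_pe)

# Made-up per-locus values, for illustration only.
example_mp = [0.1, 0.05]
example_pe = [0.5, 0.5]
cmp_value = combined_match_probability(example_mp)
cpe_value = combined_power_of_exclusion(example_pe)
```

Because the match probability is a product of values below one, each added locus shrinks it multiplicatively, which is how a single multiplex covering many loci can reach values on the order reported above.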
Update on Genomic Databases and Resources at the National Center for Biotechnology Information.
Tatusova, Tatiana
2016-01-01
The National Center for Biotechnology Information (NCBI), as a primary public repository of genomic sequence data, collects and maintains enormous amounts of heterogeneous data. Data for genomes, genes, gene expressions, gene variation, gene families, proteins, and protein domains are integrated with the analytical, search, and retrieval resources through the NCBI website. A text-based search and retrieval system provides a fast and easy way to navigate across diverse biological databases. Comparative genome analysis tools lead to further understanding of evolution processes, quickening the pace of discovery. Recent technological innovations have ignited an explosion in genome sequencing that has fundamentally changed our understanding of the biology of living organisms. This huge increase in DNA sequence data presents new challenges for the information management system and the visualization tools. New strategies have been designed to bring order to this genome sequence shockwave and improve the usability of associated data.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Duran, Adam W; Phillips, Caleb T; Perr-Sauer, Jordan
Under a collaborative interagency agreement between the U.S. Environmental Protection Agency and the U.S. Department of Energy (DOE), the National Renewable Energy Laboratory (NREL) performed a series of in-depth analyses to characterize on-road driving behavior including distributions of vehicle speed, idle time, accelerations and decelerations, and other driving metrics of medium- and heavy-duty vocational vehicles operating within the United States. As part of this effort, NREL researchers segmented U.S. medium- and heavy-duty vocational vehicle driving characteristics into three distinct operating groups or clusters using real-world drive cycle data collected at 1 Hz and stored in NREL's Fleet DNA database. The Fleet DNA database contains millions of miles of historical drive cycle data captured from medium- and heavy-duty vehicles operating across the United States. The data encompass existing DOE activities as well as contributions from valued industry stakeholder participants. For this project, data captured from 913 unique vehicles comprising 16,250 days of operation were drawn from the Fleet DNA database and examined. The Fleet DNA data used as a source for this analysis has been collected from a total of 30 unique fleets/data providers operating across 22 unique geographic locations spread across the United States. This includes locations with topographies ranging from the foothills of Denver, Colorado, to the flats of Miami, Florida. This paper includes the results of the statistical analysis performed by NREL and a discussion and detailed summary of the development of the vocational drive cycle weights and representative transient drive cycles for testing and simulation. Additional discussion of known limitations and potential future work is also included.
R-Syst::diatom: an open-access and curated barcode database for diatoms and freshwater monitoring.
Rimet, Frédéric; Chaumeil, Philippe; Keck, François; Kermarrec, Lenaïg; Vasselon, Valentin; Kahlert, Maria; Franc, Alain; Bouchez, Agnès
2016-01-01
Diatoms are micro-algal indicators of freshwater pollution. Current standardized methodologies are based on microscopic determinations, which is time consuming and prone to identification uncertainties. The use of DNA-barcoding has been proposed as a way to avoid these flaws. Combining barcoding with next-generation sequencing enables collection of a large quantity of barcodes from natural samples. These barcodes are identified as certain diatom taxa by comparing the sequences to a reference barcoding library using algorithms. Proof of concept was recently demonstrated for synthetic and natural communities and underlined the importance of the quality of this reference library. We present an open-access and curated reference barcoding database for diatoms, called R-Syst::diatom, developed in the framework of R-Syst, the network of systematic supported by INRA (French National Institute for Agricultural Research), see http://www.rsyst.inra.fr/en. R-Syst::diatom links DNA-barcodes to their taxonomical identifications, and is dedicated to identify barcodes from natural samples. The data come from two sources, a culture collection of freshwater algae maintained in INRA in which new strains are regularly deposited and barcoded and from the NCBI (National Center for Biotechnology Information) nucleotide database. Two kinds of barcodes were chosen to support the database: 18S (18S ribosomal RNA) and rbcL (Ribulose-1,5-bisphosphate carboxylase/oxygenase), because of their efficiency. Data are curated using innovative (Declic) and classical bioinformatic tools (Blast, classical phylogenies) and up-to-date taxonomy (Catalogues and peer reviewed papers). Every 6 months R-Syst::diatom is updated. The database is available through the R-Syst microalgae website (http://www.rsyst.inra.fr/) and a platform dedicated to next-generation sequencing data analysis, virtual_BiodiversityL@b (https://galaxy-pgtp.pierroton.inra.fr/). 
We present here the content of the library regarding the number of barcodes and diatom taxa. In addition to these information, morphological features (e.g. biovolumes, chloroplasts…), life-forms (mobility, colony-type) or ecological features (taxa preferenda to pollution) are indicated in R-Syst::diatom. Database URL: http://www.rsyst.inra.fr/. © The Author(s) 2016. Published by Oxford University Press. PMID:26989149
Jia, Ying; Cantu, Bruno A; Sánchez, Elda E; Pérez, John C
2008-06-15
To advance our knowledge of snake venom composition and of the transcripts expressed in the venom gland at the molecular level, we constructed a cDNA library from the venom gland of Agkistrodon piscivorus leucostoma to generate an expressed sequence tag (EST) database. From the randomly sequenced 2112 independent clones, we have obtained ESTs for 1309 (62%) cDNAs, which showed significant deduced amino acid sequence similarity (scores >80) to previously characterized proteins in the National Center for Biotechnology Information (NCBI) database. Ribosomal proteins make up 47 clones (2%) and the remaining 756 (36%) cDNAs either are of unknown identity or show BLASTX sequence identity scores of <80 with known GenBank accessions. The most highly expressed gene, encoding phospholipase A(2) (PLA(2)) and accounting for 35% of A. p. leucostoma venom gland cDNAs, was identified and further confirmed by applying crude venom to sodium dodecyl sulfate/polyacrylamide gel electrophoresis (SDS-PAGE) and protein sequencing. A total of 180 representative genes were obtained from the sequence assemblies and deposited in the EST database. Clones showing sequence identity to disintegrins, thrombin-like enzymes, hemorrhagic toxins, fibrinogen clotting inhibitors and plasminogen activators were also identified in our EST database. These data can be used to develop a research program that will help us identify genes encoding proteins that are of medical importance or proteins involved in the mechanisms of the toxin venom.
The Web-Based DNA Vaccine Database DNAVaxDB and Its Usage for Rational DNA Vaccine Design.
Racz, Rebecca; He, Yongqun
2016-01-01
A DNA vaccine is a vaccine that uses a mammalian expression vector to express one or more protein antigens and is administered in vivo to induce an adaptive immune response. Since the 1990s, a significant amount of research has been performed on DNA vaccines and the mechanisms behind them. To meet the needs of the DNA vaccine research community, we created DNAVaxDB (http://www.violinet.org/dnavaxdb), the first Web-based database and analysis resource of experimentally verified DNA vaccines. All the data in DNAVaxDB, which include plasmids, antigens, vaccines, and sources, are manually curated and experimentally verified. This chapter describes the DNAVaxDB system in detail and shows how the DNA vaccine database, combined with the Vaxign vaccine design tool, can be used for rational design of a DNA vaccine against a pathogen, such as Mycobacterium bovis.
Normand, A C; Packeu, A; Cassagne, C; Hendrickx, M; Ranque, S; Piarroux, R
2018-05-01
Conventional dermatophyte identification is based on morphological features. However, recent studies have proposed to use the nucleotide sequences of the rRNA internal transcribed spacer (ITS) region as an identification barcode of all fungi, including dermatophytes. Several nucleotide databases are available to compare sequences and thus identify isolates; however, these databases often contain mislabeled sequences that impair sequence-based identification. We evaluated five of these databases on a clinical isolate panel. We selected 292 clinical dermatophyte strains that were prospectively subjected to an ITS2 nucleotide sequence analysis. Sequences were analyzed against the databases, and the results were compared to clusters obtained via DNA alignment of sequence segments. The DNA tree served as the identification standard throughout the study. According to the ITS2 sequence identification, the majority of strains (255/292) belonged to the genus Trichophyton, mainly T. rubrum complex (n = 184), T. interdigitale (n = 40), T. tonsurans (n = 26), and T. benhamiae (n = 5). Other genera included Microsporum (e.g., M. canis [n = 21], M. audouinii [n = 10], Nannizzia gypsea [n = 3], and Epidermophyton [n = 3]). Species-level identification of T. rubrum complex isolates was an issue. Overall, ITS DNA sequencing is a reliable tool to identify dermatophyte species given that a comprehensive and correctly labeled database is consulted. Since many inaccurate identification results exist in the DNA databases used for this study, reference databases must be verified frequently and amended in line with the current revisions of fungal taxonomy. Before describing a new species or adding a new DNA reference to the available databases, its position in the phylogenetic tree must be verified. Copyright © 2018 American Society for Microbiology.
Thai, Quan Ke; Chung, Dung Anh; Tran, Hoang-Dung
2017-06-26
Canine and wolf mitochondrial DNA haplotypes, which can be used for forensic or phylogenetic analyses, have been defined in various schemes depending on the region analyzed. In recent studies, the 582 bp fragment of the HV1 region is most commonly used. 317 different canine HV1 haplotypes have been reported in the rapidly growing public database GenBank. These reported haplotypes contain several inconsistencies in their haplotype information. To overcome this issue, we have developed a Canis mtDNA HV1 database. This database collects data on the HV1 582 bp region in dog mitochondrial DNA from GenBank to screen and correct the inconsistencies. It also supports users in the detection of novel mutation profiles and the assignment of new haplotypes. The Canis mtDNA HV1 database (CHD) contains 5567 nucleotide entries originating from 15 subspecies in the species Canis lupus. Of these entries, 3646 were haplotypes and grouped into 804 distinct sequences. 319 sequences were recognized as previously assigned haplotypes, while the remaining 485 sequences had new mutation profiles and were marked as new haplotype candidates awaiting further analysis for haplotype assignment. Of the 3646 nucleotide entries, only 414 were annotated with correct haplotype information, while 3232 had insufficient or no haplotype information and were corrected or modified before being stored in the CHD. The CHD can be accessed at http://chd.vnbiology.com. It provides sequences, haplotype information, and a web-based tool for mtDNA HV1 haplotyping. The CHD is updated monthly and supplies all data for download. The Canis mtDNA HV1 database contains information about canine mitochondrial DNA HV1 sequences with reconciled annotation. It serves as a tool for detecting inconsistencies in GenBank and helps identify new HV1 haplotypes. Thus, it supports the scientific community in naming new HV1 haplotypes and reconciling existing annotation of HV1 582 bp sequences.
3DNALandscapes: a database for exploring the conformational features of DNA.
Zheng, Guohui; Colasanti, Andrew V; Lu, Xiang-Jun; Olson, Wilma K
2010-01-01
3DNALandscapes, located at: http://3DNAscapes.rutgers.edu, is a new database for exploring the conformational features of DNA. In contrast to most structural databases, which archive the Cartesian coordinates and/or derived parameters and images for individual structures, 3DNALandscapes enables searches of conformational information across multiple structures. The database contains a wide variety of structural parameters and molecular images, computed with the 3DNA software package and known to be useful for characterizing and understanding the sequence-dependent spatial arrangements of the DNA sugar-phosphate backbone, sugar-base side groups, base pairs, base-pair steps, groove structure, etc. The data comprise all DNA-containing structures--both free and bound to proteins, drugs and other ligands--currently available in the Protein Data Bank. The web interface allows the user to link, report, plot and analyze this information from numerous perspectives and thereby gain insight into DNA conformation, deformability and interactions in different sequence and structural contexts. The data accumulated from known, well-resolved DNA structures can serve as useful benchmarks for the analysis and simulation of new structures. The collective data can also help to understand how DNA deforms in response to proteins and other molecules and undergoes conformational rearrangements.
The Italian Twin Project: from the personal identification number to a national twin registry.
Stazi, Maria Antonietta; Cotichini, Rodolfo; Patriarca, Valeria; Brescianini, Sonia; Fagnani, Corrado; D'Ippolito, Cristina; Cannoni, Stefania; Ristori, Giovanni; Salvetti, Marco
2002-10-01
The unique opportunity given by the "fiscal code", an alphanumeric identification with demographic information on any single person residing in Italy, introduced in 1976 by the Ministry of Finance, allowed a database of all potential Italian twins to be created. This database contains up to now name, surname, date and place of birth and home address of about 1,300,000 "possible twins". Even though we estimated an excess of 40% of pseudo-twins, this still is the world's largest twin population ever collected. The database of possible twins is currently used in population-based studies on multiple sclerosis, Alzheimer's disease, celiac disease, and type 1 diabetes. A system is currently being developed for linking the database with data from mortality and cancer registries. In 2001, the Italian Government, through the Ministry of Health, financed a broad national research program on twin studies, including the establishment of a national twin registry. Among all the possible twins, a sample of 500,000 individuals are going to be contacted and we expect to enrol around 120,000 real twin pairs in a formal Twin Registry. According to available financial resources, a sub sample of the enrolled population will be asked to donate DNA. A biological bank from twins will be then implemented, guaranteeing information on future etiological questions regarding genetic and modifiable factors for physical impairment and disability, cancers, cardiovascular diseases and other age related chronic illnesses.
Non-B DB: a database of predicted non-B DNA-forming motifs in mammalian genomes.
Cer, Regina Z; Bruce, Kevin H; Mudunuri, Uma S; Yi, Ming; Volfovsky, Natalia; Luke, Brian T; Bacolla, Albino; Collins, Jack R; Stephens, Robert M
2011-01-01
Although the capability of DNA to form a variety of non-canonical (non-B) structures has long been recognized, the overall significance of these alternate conformations in biology has only recently become accepted en masse. In order to provide access to genome-wide locations of these classes of predicted structures, we have developed non-B DB, a database integrating annotations and analysis of non-B DNA-forming sequence motifs. The database provides the most complete list of alternative DNA structure predictions available, including Z-DNA motifs, quadruplex-forming motifs, inverted repeats, mirror repeats and direct repeats and their associated subsets of cruciforms, triplex and slipped structures, respectively. The database also contains motifs predicted to form static DNA bends, short tandem repeats and homo(purine•pyrimidine) tracts that have been associated with disease. The database has been built using the latest releases of the human, chimp, dog, macaque and mouse genomes, so that the results can be compared directly with other data sources. In order to make the data interpretable in a genomic context, features such as genes, single-nucleotide polymorphisms and repetitive elements (SINE, LINE, etc.) have also been incorporated. The database is accessed through query pages that produce results with links to the UCSC browser and a GBrowse-based genomic viewer. It is freely accessible at http://nonb.abcc.ncifcrf.gov.
Brandstätter, Anita; Peterson, Christine T; Irwin, Jodi A; Mpoke, Solomon; Koech, Davy K; Parson, Walther; Parsons, Thomas J
2004-10-01
Large forensic mtDNA databases which adhere to strict guidelines for generation and maintenance, are not available for many populations outside of the United States and western Europe. We have established a high quality mtDNA control region sequence database for urban Nairobi as both a reference database for forensic investigations, and as a tool to examine the genetic variation of Kenyan sequences in the context of known African variation. The Nairobi sequences exhibited high variation and a low random match probability, indicating utility for forensic testing. Haplogroup identification and frequencies were compared with those reported from other published studies on African, or African-origin populations from Mozambique, Sierra Leone, and the United States, and suggest significant differences in the mtDNA compositions of the various populations. The quality of the sequence data in our study was investigated and supported using phylogenetic measures. Our data demonstrate the diversity and distinctiveness of African populations, and underline the importance of establishing additional forensic mtDNA databases of indigenous African populations.
DNApod: DNA polymorphism annotation database from next-generation sequence read archives.
Mochizuki, Takako; Tanizawa, Yasuhiro; Fujisawa, Takatomo; Ohta, Tazro; Nikoh, Naruo; Shimizu, Tokurou; Toyoda, Atsushi; Fujiyama, Asao; Kurata, Nori; Nagasaki, Hideki; Kaminuma, Eli; Nakamura, Yasukazu
2017-01-01
With the rapid advances in next-generation sequencing (NGS), datasets for DNA polymorphisms among various species and strains have been produced, stored, and distributed. However, reliability varies among these datasets because the experimental and analytical conditions used differ among assays. Furthermore, such datasets have been frequently distributed from the websites of individual sequencing projects. It is desirable to integrate DNA polymorphism data into one database featuring uniform quality control that is distributed from a single platform at a single place. DNA polymorphism annotation database (DNApod; http://tga.nig.ac.jp/dnapod/) is an integrated database that stores genome-wide DNA polymorphism datasets acquired under uniform analytical conditions, and this includes uniformity in the quality of the raw data, the reference genome version, and evaluation algorithms. DNApod genotypic data are re-analyzed whole-genome shotgun datasets extracted from sequence read archives, and DNApod distributes genome-wide DNA polymorphism datasets and known-gene annotations for each DNA polymorphism. This new database was developed for storing genome-wide DNA polymorphism datasets of plants, with crops being the first priority. Here, we describe our analyzed data for 679, 404, and 66 strains of rice, maize, and sorghum, respectively. The analytical methods are available as a DNApod workflow in an NGS annotation system of the DNA Data Bank of Japan and a virtual machine image. Furthermore, DNApod provides tables of links of identifiers between DNApod genotypic data and public phenotypic data. To advance the sharing of organism knowledge, DNApod offers basic and ubiquitous functions for multiple alignment and phylogenetic tree construction by using orthologous gene information. PMID:28234924
Compressing DNA sequence databases with coil.
White, W Timothy J; Hendy, Michael D
2008-05-20
Background Publicly available DNA sequence databases such as GenBank are large, and are growing at an exponential rate. The sheer volume of data being dealt with presents serious storage and data communications problems. Currently, sequence data is usually kept in large "flat files," which are then compressed using standard Lempel-Ziv (gzip) compression – an approach which rarely achieves good compression ratios. While much research has been done on compressing individual DNA sequences, surprisingly little has focused on the compression of entire databases of such sequences. In this study we introduce the sequence database compression software coil. Results We have designed and implemented a portable software package, coil, for compressing and decompressing DNA sequence databases based on the idea of edit-tree coding. coil is geared towards achieving high compression ratios at the expense of execution time and memory usage during compression – the compression time represents a "one-off investment" whose cost is quickly amortised if the resulting compressed file is transmitted many times. Decompression requires little memory and is extremely fast. We demonstrate a 5% improvement in compression ratio over state-of-the-art general-purpose compression tools for a large GenBank database file containing Expressed Sequence Tag (EST) data. Finally, coil can efficiently encode incremental additions to a sequence database. Conclusion coil presents a compelling alternative to conventional compression of flat files for the storage and distribution of DNA sequence databases having a narrow distribution of sequence lengths, such as EST data. Increasing compression levels for databases having a wide distribution of sequence lengths is a direction for future work. PMID:18489794
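For context on the gzip baseline mentioned above, the following minimal sketch measures a compression ratio for a flat file held in memory. It uses Python's standard `gzip` module and does not implement coil's edit-tree coding. The ratio convention (compressed size divided by original size) and the toy FASTA-like data are assumptions made for illustration.

```python
import gzip

def gzip_ratio(data: bytes) -> float:
    """Compressed size divided by original size (smaller means better compression)."""
    if not data:
        raise ValueError("empty input")
    return len(gzip.compress(data)) / len(data)

# Toy FASTA-like flat file with made-up, highly repetitive sequences.
flat_file = b"".join(
    b">seq%d\nACGTACGTTAGCCGATAGCTAGCTAGGCTA\n" % i for i in range(1000)
)
ratio = gzip_ratio(flat_file)
```

Real sequence databases are far less repetitive than this toy input, so ratios measured on it say nothing about the figures reported in the abstract. The sketch only fixes what is being compared when a database-level compressor is evaluated against general-purpose tools.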
Human Chromosome Y and Haplogroups; introducing YDHS Database.
Tiirikka, Timo; Moilanen, Jukka S
2015-12-01
As the high throughput sequencing efforts generate more biological information, scientists from different disciplines are interpreting the polymorphisms that make us unique. In addition, there is an increasing trend among the general public to research their own genealogy, find distant relatives and to know more about their biological background. Commercial vendors are providing analyses of mitochondrial and Y-chromosomal markers for such purposes. Clearly, an easy-to-use free interface to the existing data on the identified variants would be in the interest of the general public and professionals less familiar with the field. Here we introduce a novel metadatabase YDHS that aims to provide such an interface for Y-chromosomal DNA (Y-DNA) haplogroups and sequence variants. The database uses the ISOGG Y-DNA tree as the source of mutations and haplogroups and, by using genomic positions of the mutations, the database links them to genes and other biological entities. YDHS contains analysis tools for deeper Y-SNP analysis. YDHS addresses the shortage of Y-DNA related databases. We have tested our database using a set of different cases from literature ranging from infertility to autism. The database is at http://www.semanticgen.net/ydhs. Y-chromosomal DNA (Y-DNA) haplogroups and sequence variants have not been in the scientific limelight, excluding certain specialized fields like forensics, mainly because there is not much freely available information or it is scattered in different sources. However, as we have demonstrated, Y-SNPs do play a role in various cases on the haplogroup level and it is possible to create a free Y-DNA dedicated bioinformatics resource.
Kobayashi, Takehiko; Sasaki, Mariko
2017-01-01
The ribosomal RNA gene (rDNA) is the most abundant gene in yeast and other eukaryotic organisms. Due to its heavy transcription, repetitive structure and programmed replication fork pauses, the rDNA is one of the most unstable regions in the genome. Thus, the rDNA is the best region to study the mechanisms responsible for maintaining genome integrity. Recently, we screened a library of ∼4800 budding yeast gene knockout strains to identify mutants defective in the maintenance of rDNA stability. The results of this screen are summarized in the Yeast rDNA Stability (YRS) Database, in which the stability and copy number of rDNA in each mutant are presented. From this screen, we identified ∼700 genes that may contribute to the maintenance of rDNA stability. In addition, ∼50 mutants had abnormally high or low rDNA copy numbers. Moreover, some mutants with unstable rDNA displayed abnormalities in another chromosome. In this review, we introduce the YRS Database and discuss the roles of newly identified genes that contribute to rDNA maintenance and genome integrity. © FEMS 2017. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.
2014-01-01
Background There is an urgent need for expanding and enhancing autism spectrum disorder (ASD) samples, in order to better understand causes of ASD. Methods In a unique public-private partnership, 13 sites with extensive experience in both the assessment and diagnosis of ASD embarked on an ambitious, 2-year program to collect samples for genetic and phenotypic research and begin analyses on these samples. The program was called The Autism Simplex Collection (TASC). TASC sample collection began in 2008 and was completed in 2010, and included nine sites from North America and four sites from Western Europe, as well as a centralized Data Coordinating Center. Results Over 1,700 trios are part of this collection, with DNA from transformed cells now available through the National Institute of Mental Health (NIMH). Autism Diagnostic Interview-Revised (ADI-R) and Autism Diagnostic Observation Schedule-Generic (ADOS-G) measures are available for all probands, as are standardized IQ measures, Vineland Adaptive Behavioral Scales (VABS), the Social Responsiveness Scale (SRS), Peabody Picture Vocabulary Test (PPVT), and physical measures (height, weight, and head circumference). At almost every site, additional phenotypic measures were collected, including the Broad Autism Phenotype Questionnaire (BAPQ) and Repetitive Behavior Scale-Revised (RBS-R), as well as the non-word repetition scale, Communication Checklist (Children’s or Adult), and Aberrant Behavior Checklist (ABC). Moreover, for nearly 1,000 trios, the Autism Genome Project Consortium (AGP) has carried out Illumina 1 M SNP genotyping and called copy number variation (CNV) in the samples, with data being made available through the National Institutes of Health (NIH). Whole exome sequencing (WES) has been carried out in over 500 probands, together with ancestry matched controls, and this data is also available through the NIH. 
Additional WES is being carried out by the Autism Sequencing Consortium (ASC), where the focus is on sequencing complete trios. ASC sequencing for the first 1,000 samples (all from whole-blood DNA) is complete and data will be released in 2014. Data is being made available through NIH databases (database of Genotypes and Phenotypes (dbGaP) and National Database for Autism Research (NDAR)) with DNA released in Dist 11.0. Primary funding for the collection, genotyping, sequencing and distribution of TASC samples was provided by Autism Speaks and the NIH, including the National Institute of Mental Health (NIMH) and the National Human Genome Research Institute (NHGRI). Conclusions TASC represents an important sample set that leverages expert sites. Similar approaches, leveraging expert sites and ongoing studies, represent an important path towards further enhancing available ASD samples. PMID:25392729
NCBI GEO: mining millions of expression profiles--database and tools.
Barrett, Tanya; Suzek, Tugba O; Troup, Dennis B; Wilhite, Stephen E; Ngau, Wing-Chi; Ledoux, Pierre; Rudnev, Dmitry; Lash, Alex E; Fujibuchi, Wataru; Edgar, Ron
2005-01-01
The Gene Expression Omnibus (GEO) at the National Center for Biotechnology Information (NCBI) is the largest fully public repository for high-throughput molecular abundance data, primarily gene expression data. The database has a flexible and open design that allows the submission, storage and retrieval of many data types. These data include microarray-based experiments measuring the abundance of mRNA, genomic DNA and protein molecules, as well as non-array-based technologies such as serial analysis of gene expression (SAGE) and mass spectrometry proteomic technology. GEO currently holds over 30,000 submissions representing approximately half a billion individual molecular abundance measurements, for over 100 organisms. Here, we describe recent database developments that facilitate effective mining and visualization of these data. Features are provided to examine data from both experiment- and gene-centric perspectives using user-friendly Web-based interfaces accessible to those without computational or microarray-related analytical expertise. The GEO database is publicly accessible through the World Wide Web at http://www.ncbi.nlm.nih.gov/geo.
1992-10-16
the DNA Fingerprint Laboratory. The Los Angeles Police Department and its former Chief, Daryl Gates for permitting a secret unit, the ...authorized to change information in. Conclusions Where angels fear .... Of all the reasons for compartmentation for which the level of evaluation...database, and a security label attribute is associated with data in each tuple in a relation. The range and distribution of security levels may
Sehgal, Manika; Singh, Tiratha Raj
2014-04-01
We present DR-GAS, a unique, consolidated and comprehensive DNA repair genetic association studies database of the human DNA repair system. It presents information on repair genes, assorted mechanisms of DNA repair, linkage disequilibrium, haplotype blocks, nsSNPs, phosphorylation sites, associated diseases, and pathways involved in repair systems. DNA repair is an intricate process which plays an essential role in maintaining the integrity of the genome by eradicating the damaging effect of internal and external changes in the genome. Hence, it is crucial to extensively understand the intact process of DNA repair, the genes involved, the non-synonymous SNPs which may affect function, phosphorylated residues and other related genetic parameters. All the corresponding entries for DNA repair genes, such as proteins, OMIM IDs, literature references and pathways, are cross-referenced to their respective primary databases. DNA repair genes and their associated parameters are represented either in tabular form or in graphical form through images derived from computational and statistical analyses. It is believed that the database will help molecular biologists, biotechnologists, therapeutic developers and other members of the scientific community to find biologically meaningful information, and to assess the contribution of genetic-level information to serious diseases involving human DNA repair systems. DR-GAS is freely available for academic and research purposes at: http://www.bioinfoindia.org/drgas. Copyright © 2014 Elsevier B.V. All rights reserved.
Shackleton, David; Pagram, Jenny; Ives, Lesley; Vanhinsbergh, Des
2018-06-02
The RapidHIT™ 200 System is a fully automated sample-to-DNA profile system designed to produce high quality DNA profiles within 2 h. The use of the RapidHIT™ 200 System within the United Kingdom Criminal Justice System (UKCJS) has required extensive development and validation of methods, with a focus on the AmpFℓSTR® NGMSElect™ Express PCR kit, to comply with specific regulations for loading to the UK National DNA Database (NDNAD). These studies have been carried out using single source reference samples to simulate live reference samples taken from arrestees and victims for elimination. The studies have shown that the system is capable of generating high quality profiles and has achieved the accreditations necessary to load to the NDNAD; a first for the UK. Copyright © 2018 Elsevier B.V. All rights reserved.
Das, Raima; Ghosh, Sankar Kumar
2017-04-01
The DNA repair pathway is a primary defense system that eliminates wide varieties of DNA damage. Any deficiencies in it are likely to cause the chromosomal instability that leads to cell malfunctioning and tumorigenesis. Genetic polymorphisms in DNA repair genes have demonstrated a significant association with cancer risk. Our study attempts to give a glimpse of the overall scenario of the germline polymorphisms in the DNA repair genes by taking into account the Exome Aggregation Consortium (ExAC) database as well as the Human Gene Mutation Database (HGMD) for evaluating the disease link, particularly in cancer. It has been found that the ExAC DNA repair dataset (which consists of 228 DNA repair genes) comprises 30.4% missense, 12.5% dbSNP reported and 3.2% ClinVar significant variants. 27% of all the missense variants have the deleterious SIFT score of 0.00 and 6% carry the most damaging Polyphen-2 score of 1.00, thus affecting the protein structure and function. However, as per HGMD, only a fraction (1.2%) of ExAC DNA repair variants was found to be cancer-related, indicating that the remaining variants reported in both databases need to be analyzed further. This, in turn, may provide an increased spectrum of the reported cancer-linked variants in the DNA repair genes present in the ExAC database. Moreover, further in silico functional assays of the identified vital cancer-associated variants, which are essential to establish their actual biological significance, may shed some light on the field of targeted drug development in the near future. Copyright © 2017. Published by Elsevier B.V.
Partial DNA sequencing of Douglas-fir cDNAs used in RFLP mapping
K.D. Jermstad; D.L. Bassoni; C.S. Kinlaw; D.B. Neale
1998-01-01
DNA sequences from 87 Douglas-fir (Pseudotsuga menziesii [Mirb.] Franco) cDNA RFLP probes were determined. Sequences were submitted to the GenBank dbEST database and searched for similarity against nucleotide and protein databases using the BLASTn and BLASTx programs. Twenty-one sequences (24%) were assigned putative functions; 18 of which...
DNA algorithms of implementing biomolecular databases on a biological computer.
Chang, Weng-Long; Vasilakos, Athanasios V
2015-01-01
In this paper, DNA algorithms are proposed to perform eight operations of relational algebra (calculus), which include Cartesian product, union, set difference, selection, projection, intersection, join, and division, on biomolecular relational databases.
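The eight relational-algebra operations named in this abstract can be illustrated on ordinary in-memory data. The Python sketch below shows only the logical semantics of each operation; it is not the authors' DNA-strand procedure, and the `Relation` class, the function names, and the toy gene/organism data are all hypothetical.

```python
from itertools import product


class Relation:
    """A relation: an ordered tuple of attribute names plus a set of rows."""

    def __init__(self, attrs, rows):
        self.attrs = tuple(attrs)
        self.rows = {tuple(row) for row in rows}


def union(r, s):
    return Relation(r.attrs, r.rows | s.rows)


def set_difference(r, s):
    return Relation(r.attrs, r.rows - s.rows)


def intersection(r, s):
    return Relation(r.attrs, r.rows & s.rows)


def cartesian_product(r, s):
    return Relation(r.attrs + s.attrs, {a + b for a, b in product(r.rows, s.rows)})


def selection(r, predicate):
    """Keep rows for which predicate(row-as-dict) is true."""
    return Relation(r.attrs, {row for row in r.rows if predicate(dict(zip(r.attrs, row)))})


def projection(r, attrs):
    """Keep only the named attributes, in the given order."""
    idx = [r.attrs.index(a) for a in attrs]
    return Relation(attrs, {tuple(row[i] for i in idx) for row in r.rows})


def natural_join(r, s):
    """Combine rows that agree on every attribute the two relations share."""
    common = [a for a in r.attrs if a in s.attrs]
    s_only = [a for a in s.attrs if a not in common]
    rows = set()
    for a in r.rows:
        ra = dict(zip(r.attrs, a))
        for b in s.rows:
            sb = dict(zip(s.attrs, b))
            if all(ra[c] == sb[c] for c in common):
                rows.add(a + tuple(sb[x] for x in s_only))
    return Relation(r.attrs + tuple(s_only), rows)


def division(r, s):
    """Rows over (r.attrs minus s.attrs) that pair with every row of s in r."""
    keep = [a for a in r.attrs if a not in s.attrs]
    candidates = projection(r, keep)
    reordered = projection(r, keep + list(s.attrs))
    rows = {c for c in candidates.rows if all(c + d in reordered.rows for d in s.rows)}
    return Relation(keep, rows)


# Toy data (invented): which gene has been recorded in which organism.
found_in = Relation(("gene", "organism"),
                    [("RAD51", "human"), ("RAD51", "yeast"), ("XPA", "human")])
required = Relation(("organism",), [("human",), ("yeast",)])

# Division answers: which genes are recorded in every required organism?
in_all = division(found_in, required)
print(sorted(in_all.rows))
```

Representing rows as a set mirrors the set semantics of relational algebra, so duplicate tuples collapse automatically; a bag-based variant would need a different container.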
NPIDB: Nucleic acid-Protein Interaction DataBase.
Kirsanov, Dmitry D; Zanegina, Olga N; Aksianov, Evgeniy A; Spirin, Sergei A; Karyagina, Anna S; Alexeevski, Andrei V
2013-01-01
The Nucleic acid-Protein Interaction DataBase (http://npidb.belozersky.msu.ru/) contains information derived from structures of DNA-protein and RNA-protein complexes extracted from the Protein Data Bank (3846 complexes in October 2012). It provides a web interface and a set of tools for extracting biologically meaningful characteristics of nucleoprotein complexes. The content of the database is updated weekly. The current version of the Nucleic acid-Protein Interaction DataBase is an upgrade of the version published in 2007. The improvements include a new web interface, new tools for calculation of intermolecular interactions, a classification of SCOP families that contains DNA-binding protein domains and data on conserved water molecules on the DNA-protein interface.
Benschop, Corina C G; van der Beek, Cornelis P; Meiland, Hugo C; van Gorp, Ankie G M; Westen, Antoinette A; Sijen, Titia
2011-08-01
To analyze DNA samples with very low DNA concentrations, various methods have been developed that sensitize short tandem repeat (STR) typing. Sensitized DNA typing is accompanied by stochastic amplification effects, such as allele drop-outs and drop-ins. Therefore low template (LT) DNA profiles are interpreted with care. One can either try to infer the genotype by a consensus method that uses alleles confirmed in replicate analyses, or one can use a statistical model to evaluate the strength of the evidence in a direct comparison with a known DNA profile. In this study we focused on the first strategy and we show that the procedure by which the consensus profile is assembled will affect genotyping reliability. In order to gain insight in the roles of replicate number and requested level of reproducibility, we generated six independent amplifications of samples of known donors. The LT methods included both increased cycling and enhanced capillary electrophoresis (CE) injection [1]. Consensus profiles were assembled from two to six of the replications using four methods: composite (include all alleles), n-1 (include alleles detected in all but one replicate), n/2 (include alleles detected in at least half of the replicates) and 2× (include alleles detected twice). We compared the consensus DNA profiles with the DNA profile of the known donor, studied the stochastic amplification effects and examined the effect of the consensus procedure on DNA database search results. From all these analyses we conclude that the accuracy of LT DNA typing and the efficiency of database searching improve when the number of replicates is increased and the consensus method is n/2. The most functional number of replicates within this n/2 method is four (although a replicate number of three suffices for samples showing >25% of the alleles in standard STR typing). 
This approach was also the optimal strategy for the analysis of 2-person mixtures, although modified search strategies may be needed to retrieve the minor component in database searches. From the database searches follows the recommendation to specifically mark LT DNA profiles when entering them into the DNA database. Copyright © 2010 Elsevier Ireland Ltd. All rights reserved.
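The four consensus rules compared in this abstract reduce to different detection-count thresholds, which the following Python sketch makes explicit. It is an illustration rather than the authors' software: the reading of "n-1" and "2×" as "at least n-1" and "at least two" detections is an assumption, and the function name, loci, and allele calls are invented.

```python
from collections import Counter


def consensus(replicates, method):
    """Assemble a consensus profile from replicate allele calls.

    replicates: list of sets of (locus, allele) pairs, one set per amplification.
    method: 'composite', 'n-1', 'n/2' or '2x'.
    """
    n = len(replicates)
    # Count in how many replicates each allele was detected.
    counts = Counter(allele for rep in replicates for allele in set(rep))
    thresholds = {
        "composite": 1,          # include every allele seen at least once
        "n-1": max(1, n - 1),    # assumed: seen in all but at most one replicate
        "n/2": n / 2,            # seen in at least half of the replicates
        "2x": 2,                 # assumed: seen at least twice
    }
    minimum = thresholds[method]
    return {allele for allele, seen in counts.items() if seen >= minimum}


# Four invented replicates showing allele drop-out and one drop-in (FGA 22).
replicates = [
    {("D3S1358", "15"), ("D3S1358", "16"), ("vWA", "17")},
    {("D3S1358", "15"), ("vWA", "17"), ("vWA", "18")},
    {("D3S1358", "15"), ("D3S1358", "16"), ("vWA", "17"), ("FGA", "22")},
    {("D3S1358", "16"), ("vWA", "18")},
]

for name in ("composite", "n-1", "n/2", "2x"):
    print(name, sorted(consensus(replicates, name)))
```

With four replicates the "n/2" and "2x" thresholds coincide, so the two rules only diverge at other replicate numbers; the composite rule is the only one here that admits the single-occurrence drop-in.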
Sequencing of cDNA Clones from the Genetic Map of Tomato (Lycopersicon esculentum)
Ganal, Martin W.; Czihal, Rosemarie; Hannappel, Ulrich; Kloos, Dorothee-U.; Polley, Andreas; Ling, Hong-Qing
1998-01-01
The dense RFLP linkage map of tomato (Lycopersicon esculentum) contains >300 anonymous cDNA clones. Of those clones, 272 were partially or completely sequenced. The sequences were compared at the DNA and protein level to known genes in databases. For 57% of the clones, a significant match to previously described genes was found. The information will permit the conversion of those markers to STS markers and allow their use in PCR-based mapping experiments. Furthermore, it will facilitate the comparative mapping of genes across distantly related plant species by direct comparison of DNA sequences and map positions. [cDNA sequence data reported in this paper have been submitted to the EMBL database under accession nos. AA824695–AA825005 and the dbEST_Id database under accession nos. 1546519–1546862.] PMID:9724330
Molecular Identification and Databases in Fusarium
USDA-ARS's Scientific Manuscript database
DNA sequence-based methods for identifying pathogenic and mycotoxigenic Fusarium isolates have become the gold standard worldwide. Moreover, fusarial DNA sequence data are increasing rapidly in several web-accessible databases for comparative purposes. Unfortunately, the use of Basic Alignment Sea...
MitoBreak: the mitochondrial DNA breakpoints database.
Damas, Joana; Carneiro, João; Amorim, António; Pereira, Filipe
2014-01-01
Mitochondrial DNA (mtDNA) rearrangements are key events in the development of many diseases. Investigations of mtDNA regions affected by rearrangements (i.e. breakpoints) can lead to important discoveries about rearrangement mechanisms and can offer important clues about the causes of mitochondrial diseases. Here, we present the mitochondrial DNA breakpoints database (MitoBreak; http://mitobreak.portugene.com), a free, web-accessible comprehensive list of breakpoints from three classes of somatic mtDNA rearrangements: circular deleted (deletions), circular partially duplicated (duplications) and linear mtDNAs. Currently, MitoBreak contains >1400 mtDNA rearrangements from seven species (Homo sapiens, Mus musculus, Rattus norvegicus, Macaca mulatta, Drosophila melanogaster, Caenorhabditis elegans and Podospora anserina) and their associated phenotypic information collected from nearly 400 publications. The database allows researchers to perform multiple types of data analyses through user-friendly interfaces with full or partial datasets. It also permits the download of curated data and the submission of new mtDNA rearrangements. For each reported case, MitoBreak also documents the precise breakpoint positions, junction sequences, disease or associated symptoms and links to the related publications, providing a useful resource to study the causes and consequences of mtDNA structural alterations.
Clone and genomic repositories at the American Type Culture Collection
DOE Office of Scientific and Technical Information (OSTI.GOV)
Maglott, D.R.; Nierman, W.C.
1990-01-01
The American Type Culture Collection (ATCC) has a long history of characterizing, preserving, and distributing biological resource materials for the scientific community. Starting in 1925 as a repository for standard bacterial and fungal strains, its collections have diversified with technologic advances and in response to the requirements of its users. To serve the needs of the human genetics community, the National Institute of Child Health and Human Development (NICHD), National Institutes of Health (NIH), established an international Repository of Human DNA Probes and Libraries at the ATCC in 1985. This repository expanded the existing collections of recombinant clones and libraries at the ATCC, with the specific purposes of (1) obtaining, amplifying, and distributing probes detecting restriction fragment length polymorphisms (RFLPs); (2) obtaining, amplifying, and distributing genomic and cDNA clones from known genes independent of RFLP detection; (3) distributing the chromosome-specific libraries generated by the National Laboratory Gene Library Project at the Lawrence Livermore and Los Alamos National Laboratories; and (4) maintaining a public, online database describing the repository materials. Because it was recognized that animal models and comparative mapping can be crucial to genomic characterization, the scope of the repository was broadened in February 1989 to include probes from the mouse genome.
Gruszka, Damian; Marzec, Marek; Szarejko, Iwona
2012-06-14
The high level of conservation of genes that regulate DNA replication and repair indicates that they may serve as a source of information on the origin and evolution of the species and makes them a reliable system for the identification of cross-species homologs. Studies that had been conducted to date shed light on the processes of DNA replication and repair in bacteria, yeast and mammals. However, there is still much to be learned about the process of DNA damage repair in plants. These studies, which were conducted mainly using bioinformatics tools, enabled the list of genes that participate in various pathways of DNA repair in Arabidopsis thaliana (L.) Heynh to be outlined; however, information regarding these mechanisms in crop plants is still very limited. A similar, functional approach is particularly difficult for a species whose complete genomic sequences are still unavailable. One of the solutions is to apply ESTs (Expressed Sequence Tags) as the basis for gene identification. For the construction of the barley EST DNA Replication and Repair Database (bEST-DRRD), presented here, the Arabidopsis nucleotide and protein sequences involved in DNA replication and repair were used to browse for and retrieve the deposited sequences, derived from four barley (Hordeum vulgare L.) sequence databases, including the "Barley Genome version 0.05" database (encompassing ca. 90% of barley coding sequences) and from two databases covering the complete genomes of two monocot models: Oryza sativa L. and Brachypodium distachyon L. in order to identify homologous genes. Sequences of the categorised Arabidopsis queries are used for browsing the repositories, which are located on the ViroBLAST platform. The bEST-DRRD is currently used in our project during the identification and validation of the barley genes involved in DNA repair. 
The presented database provides information about the Arabidopsis genes involved in DNA replication and repair, their expression patterns and models of protein interactions. It was designed and established to provide an open-access tool for the identification of monocot homologs of known Arabidopsis genes that are responsible for DNA-related processes. The barley genes identified in the project are currently being analysed to validate their function.
DNAtraffic--a new database for systems biology of DNA dynamics during the cell life.
Kuchta, Krzysztof; Barszcz, Daniela; Grzesiuk, Elzbieta; Pomorski, Pawel; Krwawicz, Joanna
2012-01-01
DNAtraffic (http://dnatraffic.ibb.waw.pl/) is intended to be a unique, comprehensive and richly annotated database of genome dynamics during the cell life. It contains extensive data on the nomenclature, ontology, structure and function of proteins related to the DNA integrity mechanisms such as chromatin remodeling, histone modifications, DNA repair and damage response from eight organisms: Homo sapiens, Mus musculus, Drosophila melanogaster, Caenorhabditis elegans, Saccharomyces cerevisiae, Schizosaccharomyces pombe, Escherichia coli and Arabidopsis thaliana. DNAtraffic contains comprehensive information on the diseases related to the assembled human proteins. DNAtraffic is richly annotated with systemic information on the nomenclature, chemistry and structure of DNA damage and its sources, including environmental agents or commonly used drugs targeting nucleic acids and/or proteins involved in the maintenance of genome stability. One aim of the DNAtraffic database is to create the first platform for analysing the combinatorial complexity of the DNA network. The database includes illustrations of pathways, damage, proteins and drugs. Since DNAtraffic is designed to cover a broad spectrum of scientific disciplines, it has to be extensively linked to numerous external data sources. Our database represents the result of manual annotation work aimed at making DNAtraffic much more useful for a wide range of systems biology applications.
DNAtraffic—a new database for systems biology of DNA dynamics during the cell life
Kuchta, Krzysztof; Barszcz, Daniela; Grzesiuk, Elzbieta; Pomorski, Pawel; Krwawicz, Joanna
2012-01-01
DNAtraffic (http://dnatraffic.ibb.waw.pl/) is intended to be a unique, comprehensive and richly annotated database of genome dynamics during the cell life. It contains extensive data on the nomenclature, ontology, structure and function of proteins related to the DNA integrity mechanisms such as chromatin remodeling, histone modifications, DNA repair and damage response from eight organisms: Homo sapiens, Mus musculus, Drosophila melanogaster, Caenorhabditis elegans, Saccharomyces cerevisiae, Schizosaccharomyces pombe, Escherichia coli and Arabidopsis thaliana. DNAtraffic contains comprehensive information on the diseases related to the assembled human proteins. DNAtraffic is richly annotated with systemic information on the nomenclature, chemistry and structure of DNA damage and its sources, including environmental agents or commonly used drugs targeting nucleic acids and/or proteins involved in the maintenance of genome stability. One aim of the DNAtraffic database is to create the first platform for analysing the combinatorial complexity of the DNA network. The database includes illustrations of pathways, damage, proteins and drugs. Since DNAtraffic is designed to cover a broad spectrum of scientific disciplines, it has to be extensively linked to numerous external data sources. Our database represents the result of manual annotation work aimed at making DNAtraffic much more useful for a wide range of systems biology applications. PMID:22110027
Toward a mtDNA locus-specific mutation database using the LOVD platform.
Elson, Joanna L; Sweeney, Mary G; Procaccio, Vincent; Yarham, John W; Salas, Antonio; Kong, Qing-Peng; van der Westhuizen, Francois H; Pitceathly, Robert D S; Thorburn, David R; Lott, Marie T; Wallace, Douglas C; Taylor, Robert W; McFarland, Robert
2012-09-01
The Human Variome Project (HVP) is a global effort to collect and curate all human genetic variation affecting health. Mutations of mitochondrial DNA (mtDNA) are an important cause of neurogenetic disease in humans; however, identification of the pathogenic mutations responsible can be problematic. In this article, we provide explanations as to why and suggest how such difficulties might be overcome. We put forward a case in support of a new Locus Specific Mutation Database (LSDB) implemented using the Leiden Open-source Variation Database (LOVD) system that will not only list primary mutations, but also present the evidence supporting their role in disease. Critically, we feel that this new database should have the capacity to store information on the observed phenotypes alongside the genetic variation, thereby facilitating our understanding of the complex and variable presentation of mtDNA disease. LOVD supports fast queries of both seen and hidden data and allows storage of sequence variants from high-throughput sequence analysis. The LOVD platform will allow construction of a secure mtDNA database; one that can fully utilize currently available data, as well as that being generated by high-throughput sequencing, to link genotype with phenotype enhancing our understanding of mitochondrial disease, with a view to providing better prognostic information. © 2012 Wiley Periodicals, Inc.
Toward a mtDNA Locus-Specific Mutation Database Using the LOVD Platform
Elson, Joanna L.; Sweeney, Mary G.; Procaccio, Vincent; Yarham, John W.; Salas, Antonio; Kong, Qing-Peng; van der Westhuizen, Francois H.; Pitceathly, Robert D.S.; Thorburn, David R.; Lott, Marie T.; Wallace, Douglas C.; Taylor, Robert W.; McFarland, Robert
2015-01-01
The Human Variome Project (HVP) is a global effort to collect and curate all human genetic variation affecting health. Mutations of mitochondrial DNA (mtDNA) are an important cause of neurogenetic disease in humans; however, identification of the pathogenic mutations responsible can be problematic. In this article, we provide explanations as to why and suggest how such difficulties might be overcome. We put forward a case in support of a new Locus Specific Mutation Database (LSDB) implemented using the Leiden Open-source Variation Database (LOVD) system that will not only list primary mutations, but also present the evidence supporting their role in disease. Critically, we feel that this new database should have the capacity to store information on the observed phenotypes alongside the genetic variation, thereby facilitating our understanding of the complex and variable presentation of mtDNA disease. LOVD supports fast queries of both seen and hidden data and allows storage of sequence variants from high-throughput sequence analysis. The LOVD platform will allow construction of a secure mtDNA database; one that can fully utilize currently available data, as well as that being generated by high-throughput sequencing, to link genotype with phenotype enhancing our understanding of mitochondrial disease, with a view to providing better prognostic information. PMID:22581690
OnTheFly: a database of Drosophila melanogaster transcription factors and their binding sites.
Shazman, Shula; Lee, Hunjoong; Socol, Yakov; Mann, Richard S; Honig, Barry
2014-01-01
We present OnTheFly (http://bhapp.c2b2.columbia.edu/OnTheFly/index.php), a database comprising a systematic collection of transcription factors (TFs) of Drosophila melanogaster and their DNA-binding sites. TFs predicted in the Drosophila melanogaster genome are annotated and classified and their structures, obtained via experiment or homology models, are provided. All known preferred TF DNA-binding sites obtained from the B1H, DNase I and SELEX methodologies are presented. DNA shape parameters predicted for these sites are obtained from a high throughput server or from crystal structures of protein-DNA complexes where available. An important feature of the database is that all DNA-binding domains and their binding sites are fully annotated in a eukaryote using structural criteria and evolutionary homology. OnTheFly thus provides a comprehensive view of TFs and their binding sites that will be a valuable resource for deciphering non-coding regulatory DNA.
Schoch, Conrad L; Robbertse, Barbara; Robert, Vincent; Vu, Duong; Cardinali, Gianluigi; Irinyi, Laszlo; Meyer, Wieland; Nilsson, R Henrik; Hughes, Karen; Miller, Andrew N; Kirk, Paul M; Abarenkov, Kessy; Aime, M Catherine; Ariyawansa, Hiran A; Bidartondo, Martin; Boekhout, Teun; Buyck, Bart; Cai, Qing; Chen, Jie; Crespo, Ana; Crous, Pedro W; Damm, Ulrike; De Beer, Z Wilhelm; Dentinger, Bryn T M; Divakar, Pradeep K; Dueñas, Margarita; Feau, Nicolas; Fliegerova, Katerina; García, Miguel A; Ge, Zai-Wei; Griffith, Gareth W; Groenewald, Johannes Z; Groenewald, Marizeth; Grube, Martin; Gryzenhout, Marieka; Gueidan, Cécile; Guo, Liangdong; Hambleton, Sarah; Hamelin, Richard; Hansen, Karen; Hofstetter, Valérie; Hong, Seung-Beom; Houbraken, Jos; Hyde, Kevin D; Inderbitzin, Patrik; Johnston, Peter R; Karunarathna, Samantha C; Kõljalg, Urmas; Kovács, Gábor M; Kraichak, Ekaphan; Krizsan, Krisztina; Kurtzman, Cletus P; Larsson, Karl-Henrik; Leavitt, Steven; Letcher, Peter M; Liimatainen, Kare; Liu, Jian-Kui; Lodge, D Jean; Luangsa-ard, Janet Jennifer; Lumbsch, H Thorsten; Maharachchikumbura, Sajeewa S N; Manamgoda, Dimuthu; Martín, María P; Minnis, Andrew M; Moncalvo, Jean-Marc; Mulè, Giuseppina; Nakasone, Karen K; Niskanen, Tuula; Olariaga, Ibai; Papp, Tamás; Petkovits, Tamás; Pino-Bodas, Raquel; Powell, Martha J; Raja, Huzefa A; Redecker, Dirk; Sarmiento-Ramirez, J M; Seifert, Keith A; Shrestha, Bhushan; Stenroos, Soili; Stielow, Benjamin; Suh, Sung-Oui; Tanaka, Kazuaki; Tedersoo, Leho; Telleria, M Teresa; Udayanga, Dhanushka; Untereiner, Wendy A; Diéguez Uribeondo, Javier; Subbarao, Krishna V; Vágvölgyi, Csaba; Visagie, Cobus; Voigt, Kerstin; Walker, Donald M; Weir, Bevan S; Weiß, Michael; Wijayawardene, Nalin N; Wingfield, Michael J; Xu, J P; Yang, Zhu L; Zhang, Ning; Zhuang, Wen-Ying; Federhen, Scott
2014-01-01
DNA phylogenetic comparisons have shown that morphology-based species recognition often underestimates fungal diversity. Therefore, the need for accurate DNA sequence data, tied to both correct taxonomic names and clearly annotated specimen data, has never been greater. Furthermore, the growing number of molecular ecology and microbiome projects using high-throughput sequencing require fast and effective methods for en masse species assignments. In this article, we focus on selecting and re-annotating a set of marker reference sequences that represent each currently accepted order of Fungi. The particular focus is on sequences from the internal transcribed spacer region in the nuclear ribosomal cistron, derived from type specimens and/or ex-type cultures. Re-annotated and verified sequences were deposited in a curated public database at the National Center for Biotechnology Information (NCBI), namely the RefSeq Targeted Loci (RTL) database, and will be visible during routine sequence similarity searches with NR_prefixed accession numbers. A set of standards and protocols is proposed to improve the data quality of new sequences, and we suggest how type and other reference sequences can be used to improve identification of Fungi. Database URL: http://www.ncbi.nlm.nih.gov/bioproject/PRJNA177353. Published by Oxford University Press 2013. This work is written by US Government employees and is in the public domain in the US.
Machado, Helena; Silva, Susana
2015-01-01
The ethical aspects of biobanks and forensic DNA databases are often treated as separate issues. As a reflection of this, public participation, or the involvement of citizens in genetic databases, has been approached differently in the fields of forensics and medicine. This paper aims to cross the boundaries between medicine and forensics by exploring the flows between the ethical issues presented in the two domains and the subsequent conceptualisation of public trust and legitimisation. We propose to introduce the concept of ‘solidarity’, traditionally applied only to medical and research biobanks, into a consideration of public engagement in medicine and forensics. Inclusion of a solidarity-based framework, in both medical biobanks and forensic DNA databases, raises new questions that should be included in the ethical debate, in relation to both health services/medical research and activities associated with the criminal justice system. PMID:26139851
BLAST and FASTA similarity searching for multiple sequence alignment.
Pearson, William R
2014-01-01
BLAST, FASTA, and other similarity searching programs seek to identify homologous proteins and DNA sequences based on excess sequence similarity. If two sequences share much more similarity than expected by chance, the simplest explanation for the excess similarity is common ancestry, that is, homology. The most effective similarity searches compare protein sequences, rather than DNA sequences, for sequences that encode proteins, and use expectation values, rather than percent identity, to infer homology. The BLAST and FASTA packages of sequence comparison programs provide programs for comparing protein and DNA sequences to protein databases (the most sensitive searches). Protein and translated-DNA comparisons to protein databases routinely allow evolutionary look-back times of 1 to 2 billion years; DNA:DNA searches are 5-10-fold less sensitive. BLAST and FASTA can be run on popular web sites, but can also be downloaded and installed on local computers. With local installation, target databases can be customized for the sequence data being characterized. With today's very large protein databases, search sensitivity can also be improved by searching smaller comprehensive databases, for example, a complete protein set from an evolutionarily neighboring model organism. By default, BLAST and FASTA use scoring strategies targeted at distant evolutionary relationships; for comparisons involving short domains or queries, or searches that seek relatively close homologs (e.g. mouse-human), shallower scoring matrices will be more effective. Both BLAST and FASTA provide very accurate statistical estimates, which can be used to reliably identify protein sequences that diverged more than 2 billion years ago.
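The preference for expectation values over percent identity, and the gain from searching smaller databases, can be illustrated with the standard Karlin-Altschul relation E = m * n * 2^(-S'), where S' is the bit score, m the query length and n the database length. The following is a minimal sketch only: the function names and the example numbers are hypothetical, and real BLAST and FASTA runs apply effective-length corrections that are not modelled here.

```python
import math


def bit_score(raw_score: float, lam: float, k: float) -> float:
    """Convert a raw alignment score to a bit score using the
    Karlin-Altschul parameters lambda and K of the scoring system."""
    return (lam * raw_score - math.log(k)) / math.log(2)


def expect_value(bits: float, query_len: int, db_len: int) -> float:
    """Expected number of chance alignments with at least this bit score
    in a search space of query_len * db_len (no edge-effect correction)."""
    return query_len * db_len * 2.0 ** (-bits)


# Hypothetical example: the same 40-bit alignment against a large
# database and against a 100-fold smaller one.
e_large = expect_value(40.0, 300, 10**9)
e_small = expect_value(40.0, 300, 10**7)
print(f"large database: E = {e_large:.3g}")
print(f"small database: E = {e_small:.3g}")
```

Because E scales linearly with database size, the same alignment is 100 times more significant in the smaller search space, which is one way to read the abstract's advice about searching a single neighboring proteome.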
DNA typing in forensic medicine and in criminal investigations: a current survey.
Benecke, M
1997-05-01
Since 1985 DNA typing of biological material has become one of the most powerful tools for personal identification in forensic medicine and in criminal investigations [1-6]. Classical DNA "fingerprinting" is increasingly being replaced by polymerase chain reaction (PCR) based technology which detects very short polymorphic stretches of DNA [7-15]. DNA loci which forensic scientists study do not code for proteins, and they are spread over the whole genome [16, 17]. These loci are neutral, and few provide any information about individuals except for their identity. Minute amounts of biological material are sufficient for DNA typing. Many European countries are beginning to establish databases to store DNA profiles of crime scenes and known offenders. A brief overview is given of past and present DNA typing and the establishment of forensic DNA databases in Europe.
Alignment of high-throughput sequencing data inside in-memory databases.
Firnkorn, Daniel; Knaup-Gregori, Petra; Lorenzo Bermejo, Justo; Ganzinger, Matthias
2014-01-01
In times of high-throughput DNA sequencing techniques, high-performance analysis of DNA sequences is of great importance. Computer-supported DNA analysis is still a time-consuming task. In this paper we explore the potential of a new In-Memory database technology by using SAP's High Performance Analytic Appliance (HANA). We focus on read alignment as one of the first steps in DNA sequence analysis. In particular, we examined the widely used Burrows-Wheeler Aligner (BWA) and implemented stored procedures in both HANA and the free database system MySQL to compare execution time and memory management. To ensure that the results are comparable, MySQL was run in memory as well, utilizing its integrated memory engine for database table creation. We implemented stored procedures containing exact and inexact searching of DNA reads within the reference genome GRCh37. Due to technical restrictions in SAP HANA concerning recursion, the inexact matching problem could not be implemented on this platform. Hence, performance analysis between HANA and MySQL was made by comparing the execution time of the exact search procedures. Here, HANA was approximately 27 times faster than MySQL, which means that there is high potential in the new In-Memory concepts, leading to further developments of DNA analysis procedures in the future.
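The exact search step that BWA-style aligners perform can be sketched as a backward search over the Burrows-Wheeler transform of the reference. The Python below is an illustration under stated assumptions, not the stored procedures from the paper: the function names are hypothetical, the suffix array is built naively (workable only for toy references, not GRCh37), and inexact matching is left out.

```python
def build_index(reference):
    """Build a toy FM-index (suffix array, C table, Occ table) for a reference."""
    text = reference + "$"  # '$' is the sentinel and sorts before A, C, G, T
    suffix_array = sorted(range(len(text)), key=lambda i: text[i:])
    bwt = "".join(text[i - 1] for i in suffix_array)
    alphabet = sorted(set(text))
    # c_table[c]: number of characters in the text strictly smaller than c
    c_table, total = {}, 0
    for c in alphabet:
        c_table[c] = total
        total += text.count(c)
    # occ[c][i]: number of occurrences of c in bwt[:i]
    occ = {c: [0] * (len(bwt) + 1) for c in alphabet}
    for i, ch in enumerate(bwt):
        for c in alphabet:
            occ[c][i + 1] = occ[c][i] + (1 if ch == c else 0)
    return suffix_array, c_table, occ


def exact_search(read, index):
    """Return sorted 0-based start positions of exact matches of read."""
    suffix_array, c_table, occ = index
    if not read:
        return []
    lo, hi = 0, len(suffix_array)  # half-open suffix-array interval
    for ch in reversed(read):      # extend the match one base to the left
        if ch not in c_table:
            return []
        lo = c_table[ch] + occ[ch][lo]
        hi = c_table[ch] + occ[ch][hi]
        if lo >= hi:
            return []
    return sorted(suffix_array[i] for i in range(lo, hi))


index = build_index("ACGTACGT")
print(exact_search("ACG", index))
print(exact_search("GTA", index))
```

Each loop iteration needs only two table lookups, so the search time depends on the read length rather than the reference length; that property is what makes the exact search a natural candidate for an in-database implementation.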
DNA barcoding of medicinal plant material for identification
USDA-ARS's Scientific Manuscript database
Because of the increasing demand for herbal remedies and for authentication of the source material, it is vital to provide a single database containing information about authentic plant materials and their potential adulterants. The database should provide DNA barcodes for data retrieval and similar...
Design checkpoint kinase 2 inhibitors by pharmacophore modeling and virtual screening techniques.
Wang, Yen-Ling; Lin, Chun-Yuan; Shih, Kuei-Chung; Huang, Jui-Wen; Tang, Chuan-Yi
2013-12-01
Damage to DNA is caused by ionizing radiation, genotoxic chemicals or collapsed replication forks. When DNA is damaged or cells fail to respond, a mutation that is associated with breast or ovarian cancer may occur. Mammalian cells control and stabilize the genome using a cell cycle checkpoint to prevent damage to DNA or to repair damaged DNA. Checkpoint kinase 2 (Chk2) is one of the important kinases, which strongly affects DNA-damage and plays an important role in the response to the breakage of DNA double-strands and related lesions. Therefore, this study concerns Chk2. Its purpose is to find potential inhibitors using the pharmacophore hypotheses (PhModels) and virtual screening techniques. PhModels can identify inhibitors with high biological activities and virtual screening techniques are used to screen the database of the National Cancer Institute (NCI) to retrieve compounds that exhibit all of the pharmacophoric features of potential inhibitors with high interaction energy. Ten PhModels were generated using the HypoGen best algorithm. The established PhModel, Hypo01, was evaluated by performing a cost function analysis of its correlation coefficient (r), root mean square deviation (RMSD), cost difference, and configuration cost, with the values 0.955, 1.28, 192.51, and 16.07, respectively. The result of Fischer's cross-validation test for the Hypo01 model yielded a 95% confidence level, and the correlation coefficient of the testing set (rtest) had a best value of 0.81. The potential inhibitors were then chosen from the NCI database by Hypo01 model screening and molecular docking using the cdocker docking program. Finally, the selected compounds exhibited the identified pharmacophoric features and had a high interaction energy between the ligand and the receptor. Eighty-three potential inhibitors for Chk2 are retrieved for further study. Copyright © 2013 Elsevier Ltd. All rights reserved.
Large-Scale Concatenation cDNA Sequencing
Yu, Wei; Andersson, Björn; Worley, Kim C.; Muzny, Donna M.; Ding, Yan; Liu, Wen; Ricafrente, Jennifer Y.; Wentland, Meredith A.; Lennon, Greg; Gibbs, Richard A.
1997-01-01
A total of 100 kb of DNA derived from 69 individual human brain cDNA clones of 0.7–2.0 kb were sequenced by concatenated cDNA sequencing (CCS), whereby multiple individual DNA fragments are sequenced simultaneously in a single shotgun library. The method yielded accurate sequences and a similar efficiency compared with other shotgun libraries constructed from single DNA fragments (>20 kb). Computer analyses were carried out on 65 cDNA clone sequences and their corresponding end sequences to examine both nucleic acid and amino acid sequence similarities in the databases. Thirty-seven clones revealed no DNA database matches, 12 clones generated exact matches (≥98% identity), and 16 clones generated nonexact matches (57%–97% identity) to either known human or other species genes. Of those 28 matched clones, 8 had corresponding end sequences that failed to identify similarities. In a protein similarity search, 27 clone sequences displayed significant matches, whereas only 20 of the end sequences had matches to known protein sequences. Our data indicate that full-length cDNA insert sequences provide significantly more nucleic acid and protein sequence similarity matches than expressed sequence tags (ESTs) for database searching. [All 65 cDNA clone sequences described in this paper have been submitted to the GenBank data library under accession nos. U79240–U79304.] PMID:9110174
HUNT: launch of a full-length cDNA database from the Helix Research Institute.
Yudate, H T; Suwa, M; Irie, R; Matsui, H; Nishikawa, T; Nakamura, Y; Yamaguchi, D; Peng, Z Z; Yamamoto, T; Nagai, K; Hayashi, K; Otsuki, T; Sugiyama, T; Ota, T; Suzuki, Y; Sugano, S; Isogai, T; Masuho, Y
2001-01-01
The Helix Research Institute (HRI) in Japan is releasing 4356 HUman Novel Transcripts and related information in the newly established HUNT database. The institute is a joint research project principally funded by the Japanese Ministry of International Trade and Industry, and the clones were sequenced in the governmental New Energy and Industrial Technology Development Organization (NEDO) Human cDNA Sequencing Project. The HUNT database contains an extensive amount of annotation from advanced analysis and represents an essential bioinformatics contribution towards understanding of the gene function. The HRI human cDNA clones were obtained from full-length enriched cDNA libraries constructed with the oligo-capping method and have resulted in novel full-length cDNA sequences. A large fraction has little similarity to any proteins of known function and to obtain clues about possible function we have developed original analysis procedures. Any putative function deduced here can be validated or refuted by complementary analysis results. The user can also extract information from specific categories like PROSITE patterns, PFAM domains, PSORT localization, transmembrane helices and clones with GENIUS structure assignments. The HUNT database can be accessed at http://www.hri.co.jp/HUNT.
MeDReaders: a database for transcription factors that bind to methylated DNA.
Wang, Guohua; Luo, Ximei; Wang, Jianan; Wan, Jun; Xia, Shuli; Zhu, Heng; Qian, Jiang; Wang, Yadong
2018-01-04
Understanding the molecular principles governing interactions between transcription factors (TFs) and DNA targets is one of the main subjects in the study of transcriptional regulation. Recently, emerging evidence has demonstrated that some TFs can bind to DNA motifs containing highly methylated CpGs both in vitro and in vivo. Identification of such TFs and elucidation of their physiological roles has now become an important stepping-stone toward understanding the mechanisms underlying methylation-mediated biological processes, which have crucial implications for human disease and disease development. Hence, we constructed a database, named MeDReaders, to collect information about methylated DNA binding activities. A total of 731 TFs, which could bind to methylated DNA sequences, were manually curated from human and mouse studies reported in the literature. In silico approaches were applied to predict methylated and unmethylated motifs of 292 TFs by integrating whole genome bisulfite sequencing (WGBS) and ChIP-Seq datasets in six human cell lines and one mouse cell line extracted from the ENCODE and GEO databases. The MeDReaders database will provide a comprehensive resource for further studies and aid related experimental designs. The database implements unified access for users to most TFs involved in such methylation-associated binding activities. The website is available at http://medreader.org/. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.
Ng, Kevin Kit Siong; Lee, Soon Leong; Tnah, Lee Hong; Nurul-Farhanah, Zakaria; Ng, Chin Hong; Lee, Chai Ting; Tani, Naoki; Diway, Bibian; Lai, Pei Sing; Khoo, Eyen
2016-07-01
Illegal logging and smuggling of Gonystylus bancanus (Thymelaeaceae) pose a serious threat to this fragile, valuable peat swamp timber species. Using G. bancanus as a case study, DNA markers were used to develop identification databases at the species, population and individual level. The species level database for Gonystylus comprised one rDNA (ITS2) and two cpDNA (trnH-psbA and trnL) markers based on a 20 Gonystylus species database. When concatenated, taxonomic species recognition was achieved with a resolution of 90% (18 out of the 20 species). In addition, based on 17 natural populations of G. bancanus throughout West (Peninsular Malaysia) and East (Sabah and Sarawak) Malaysia, population and individual identification databases were developed using cpDNA and STR markers respectively. A haplotype distribution map for Malaysia was generated using six cpDNA markers, resulting in 12 unique multilocus haplotypes, from 24 informative intraspecific variable sites. These unique haplotypes suggest a clear genetic structuring of West and East regions. A simulation procedure based on the composition of the samples was used to test whether a suspected sample conformed to a given regional origin. Overall, the observed type I and II errors of the databases showed good concordance with the predicted 5% threshold, which indicates that the databases were useful in revealing provenance and establishing conformity of samples from West and East Malaysia. Sixteen STRs were used to develop the DNA profiling databases for individual identification. Bayesian clustering analyses divided the 17 populations into two main genetic clusters, corresponding to the regions of West and East Malaysia. Population substructuring (K=2) was observed within each region. After removal of bias resulting from sampling effects and population subdivision, conservativeness tests showed that the West and East Malaysia databases were conservative. 
This suggests that both databases can be used independently for random match probability estimation within respective regions. The reliability of the databases was further determined by independent self-assignment tests based on the likelihood of each individual's multilocus genotype occurring in each identified population, genetic cluster and region with an average percentage of correctly assigned individuals of 54.80%, 99.60% and 100% respectively. Thus, after appropriate validation, the genetic identification databases developed for G. bancanus in this study could support forensic applications and help safeguard this valuable species into the future. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.
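For orientation, a random match probability from an STR profiling database is commonly estimated with the product rule. The sketch below assumes Hardy-Weinberg proportions within a locus and independence between loci; it does not include the corrections for sampling effects and population subdivision that the abstract mentions, and the locus names and allele frequencies are hypothetical rather than taken from the G. bancanus databases.

```python
def genotype_frequency(p, q, homozygous):
    """Hardy-Weinberg genotype frequency: p^2 if homozygous, else 2pq."""
    return p * p if homozygous else 2.0 * p * q


def random_match_probability(profile, allele_freqs):
    """Multiply per-locus genotype frequencies (product rule).

    profile:      {locus: (allele_1, allele_2)}
    allele_freqs: {locus: {allele: frequency in the reference database}}
    """
    rmp = 1.0
    for locus, (a, b) in profile.items():
        freqs = allele_freqs[locus]
        rmp *= genotype_frequency(freqs[a], freqs[b], a == b)
    return rmp


# Hypothetical two-locus example.
freqs = {
    "locus1": {"10": 0.5},
    "locus2": {"7": 0.25, "9": 0.5},
}
profile = {"locus1": ("10", "10"), "locus2": ("7", "9")}
print(random_match_probability(profile, freqs))
```

Using separate West and East Malaysia frequency tables, as the abstract suggests, would simply mean passing a different `allele_freqs` mapping depending on the region being tested.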
Genetic identification of missing persons: DNA analysis of human remains and compromised samples.
Alvarez-Cubero, M J; Saiz, M; Martinez-Gonzalez, L J; Alvarez, J C; Eisenberg, A J; Budowle, B; Lorente, J A
2012-01-01
Human identification has made great strides over the past 2 decades due to the advent of DNA typing. Forensic DNA typing provides genetic data from a variety of materials and individuals, and is applied to many important issues that confront society. Part of the success of DNA typing is the generation of DNA databases to help identify missing persons and to develop investigative leads to assist law enforcement. DNA databases house DNA profiles from convicted felons (and in some jurisdictions arrestees), forensic evidence, human remains, and direct and family reference samples of missing persons. These databases are essential tools, which are becoming quite large (for example, the US database contains 10 million profiles). The scientific, governmental and private communities continue to work together to standardize genetic markers for more effective worldwide data sharing, to develop and validate robust DNA typing kits that contain the reagents necessary to type core identity genetic markers, to develop technologies that facilitate a number of analytical processes and to develop policies to make human identity testing more effective. Indeed, DNA typing is integral to resolving a number of serious criminal and civil concerns, such as solving missing person cases and identifying victims of mass disasters and children who may have been victims of human trafficking, and provides information for historical studies. As more refined capabilities are still required, novel approaches are being sought, such as genetic testing by next-generation sequencing, mass spectrometry, chip arrays and pyrosequencing. Single nucleotide polymorphisms offer the potential to analyze severely compromised biological samples, to determine the facial phenotype of decomposed human remains and to predict the bioancestry of individuals, a new focus in analyzing this type of marker. Copyright © 2012 S. Karger AG, Basel.
Online Mendelian Inheritance in Man (OMIM), a knowledgebase of human genes and genetic disorders.
Hamosh, Ada; Scott, Alan F; Amberger, Joanna S; Bocchini, Carol A; McKusick, Victor A
2005-01-01
Online Mendelian Inheritance in Man (OMIM) is a comprehensive, authoritative and timely knowledgebase of human genes and genetic disorders compiled to support human genetics research and education and the practice of clinical genetics. Started by Dr Victor A. McKusick as the definitive reference Mendelian Inheritance in Man, OMIM (http://www.ncbi.nlm.nih.gov/omim/) is now distributed electronically by the National Center for Biotechnology Information, where it is integrated with the Entrez suite of databases. Derived from the biomedical literature, OMIM is written and edited at Johns Hopkins University with input from scientists and physicians around the world. Each OMIM entry has a full-text summary of a genetically determined phenotype and/or gene and has numerous links to other genetic databases such as DNA and protein sequence, PubMed references, general and locus-specific mutation databases, HUGO nomenclature, MapViewer, GeneTests, patient support groups and many others. OMIM is an easy and straightforward portal to the burgeoning information in human genetics.
USDA-ARS's Scientific Manuscript database
Because less than one-third of clinically relevant fusaria can be accurately identified to species level using phenotypic data (i.e., morphological species recognition), we constructed a three-locus DNA sequence database to facilitate molecular identification of the 69 Fusarium species associated wi...
SAM: String-based sequence search algorithm for mitochondrial DNA database queries
Röck, Alexander; Irwin, Jodi; Dür, Arne; Parsons, Thomas; Parson, Walther
2011-01-01
The analysis of the haploid mitochondrial (mt) genome has numerous applications in forensic and population genetics, as well as in disease studies. Although mtDNA haplotypes are usually determined by sequencing, they are rarely reported as a nucleotide string. Traditionally they are presented in a difference-coded position-based format relative to the corrected version of the first sequenced mtDNA. This convention requires recommendations for standardized sequence alignment that is known to vary between scientific disciplines, even between laboratories. As a consequence, database searches that are vital for the interpretation of mtDNA data can suffer from biased results when query and database haplotypes are annotated differently. In the forensic context that would usually lead to underestimation of the absolute and relative frequencies. To address this issue we introduce SAM, a string-based search algorithm that converts query and database sequences to position-free nucleotide strings and thus eliminates the possibility that identical sequences will be missed in a database query. The mere application of a BLAST algorithm would not be a sufficient remedy as it uses a heuristic approach and does not address properties specific to mtDNA, such as phylogenetically stable but also rapidly evolving insertion and deletion events. The software presented here provides additional flexibility to incorporate phylogenetic data, site-specific mutation rates, and other biologically relevant information that would refine the interpretation of mitochondrial DNA data. The manuscript is accompanied by freeware and example data sets that can be used to evaluate the new software (http://stringvalidation.org). PMID:21056022
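The core idea, comparing position-free nucleotide strings instead of difference-coded annotations, can be sketched as follows. This is an illustration only and not the SAM software: the function name is hypothetical, the reference is a toy sequence rather than the actual mtDNA reference, and the notation ("5G" for a substitution, "4.1C" for an insertion after position 4, "2del" for a deletion) is assumed from common forensic usage.

```python
import re

_DIFF = re.compile(r"^(\d+)(?:\.(\d+))?([ACGT]|del)$")


def to_nucleotide_string(reference, differences):
    """Apply difference-coded variants to a reference (1-based positions)
    and return the resulting position-free nucleotide string."""
    bases = {pos: base for pos, base in enumerate(reference, start=1)}
    insertions = {}  # position -> {insertion index: base}
    for diff in differences:
        match = _DIFF.match(diff)
        if match is None:
            raise ValueError(f"unrecognised difference code: {diff!r}")
        pos, ins_index, change = int(match.group(1)), match.group(2), match.group(3)
        if pos not in bases:
            raise ValueError(f"position out of range: {diff!r}")
        if ins_index is not None:
            if change == "del":
                raise ValueError(f"an insertion cannot be a deletion: {diff!r}")
            insertions.setdefault(pos, {})[int(ins_index)] = change
        elif change == "del":
            bases[pos] = ""      # deleted base contributes nothing
        else:
            bases[pos] = change  # substitution
    parts = []
    for pos in range(1, len(reference) + 1):
        parts.append(bases[pos])
        for idx in sorted(insertions.get(pos, {})):
            parts.append(insertions[pos][idx])
    return "".join(parts)


# Two laboratories annotate the same extra C in a C-stretch differently.
toy_reference = "ACCCT"
lab_a = to_nucleotide_string(toy_reference, ["4.1C"])
lab_b = to_nucleotide_string(toy_reference, ["1.1C"])
print(lab_a, lab_b, lab_a == lab_b)
```

A comparison of the annotations `["4.1C"]` and `["1.1C"]` would report a mismatch, whereas the string comparison recognises the two haplotypes as identical, which is the bias in database queries that the abstract describes.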
Plant rDNA database: update and new features.
Garcia, Sònia; Gálvez, Francisco; Gras, Airy; Kovařík, Aleš; Garnatje, Teresa
2014-01-01
The Plant rDNA database (www.plantrdnadatabase.com) is an open access online resource providing detailed information on numbers, structures and positions of 5S and 18S-5.8S-26S (35S) ribosomal DNA loci. The data have been obtained from >600 publications on plant molecular cytogenetics, mostly based on fluorescent in situ hybridization (FISH). This edition of the database contains information on 1609 species derived from 2839 records, which means an expansion of 55.76 and 94.45%, respectively. It holds the data for angiosperms, gymnosperms, bryophytes and pteridophytes available as of June 2013. Information from publications reporting data for a single rDNA (either 5S or 35S alone) and annotation regarding transcriptional activity of 35S loci now appears in the database. Preliminary analyses suggest greater variability in the number of rDNA loci in gymnosperms than in angiosperms. New applications provide ideograms of the species showing the positions of rDNA loci as well as a visual representation of their genome sizes. We have also introduced other features to boost the usability of the Web interface, such as an application for convenient data export and a new section with rDNA-FISH-related information (mostly detailing protocols and reagents). In addition, we upgraded and/or proofread tabs and links and modified the website for a more dynamic appearance. This manuscript provides a synopsis of these changes and developments. http://www.plantrdnadatabase.com. © The Author(s) 2014. Published by Oxford University Press.
Morimoto, K; Kimura, M; Murata, T; Imai, Y; Ookami, N; Igarashi, T; Kanoh, N; Kaminuma, T; Hayashi, Y
1994-01-01
Many carcinogens react with DNA and form critical DNA adducts, such as O6-alkylguanine (O6-AG), O4-alkylthymine (O4-AT), and 8-hydroxyguanine (8-OHG). This study provides a database that can be used for molecular dosimetry of these DNA adducts. A literature survey on in vivo DNA binding was carried out by a Dialog search of the MEDLINE database. We propose a Critical Covalent Binding Index (CCBI) for the assessment of in vivo DNA binding level (expressed as µmol chemical bound per mol G or T, divided by mmol chemical administered per kg body weight). The numbers of records for O6-AG, O4-AT, and 8-OHG, with the numbers of compounds in parentheses, were 245 (13), 54 (4), and 79 (15), respectively. Since the CCBI values for N-nitrosamines were higher in target organs than in non-target organs, they may provide a useful index for estimating the target organ site and carcinogenic potency. As a case example, CCBI values for O4-AT from animal data were applied to the estimation of human exposure to diethylnitrosamine.
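The CCBI as defined above is a simple dose-normalised ratio. A minimal sketch of the arithmetic, with a hypothetical function name and made-up input values rather than figures from the database:

```python
def ccbi(adduct_umol_per_mol_base, dose_mmol_per_kg):
    """Critical Covalent Binding Index.

    adduct_umol_per_mol_base: adduct level, in umol chemical bound per mol G or T
    dose_mmol_per_kg:         administered dose, in mmol chemical per kg body weight
    """
    if dose_mmol_per_kg <= 0:
        raise ValueError("dose must be positive")
    return adduct_umol_per_mol_base / dose_mmol_per_kg


# Hypothetical measurement: 50 umol adduct per mol G after a 2 mmol/kg dose.
print(ccbi(50.0, 2.0))
```

Normalising by dose is what allows adduct levels from studies with different dosing regimens, or from target and non-target organs, to be compared on one scale.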
Osypov, Alexander A; Krutinin, Gleb G; Krutinina, Eugenia A; Kamzolova, Svetlana G
2012-04-01
Electrostatic properties of genome DNA are important to its interactions with different proteins, in particular those related to transcription. DEPPDB - DNA Electrostatic Potential (and other Physical) Properties Database - provides information on the electrostatic and other physical properties of genome DNA combined with its sequence and annotation of biological and structural properties of genomes and their elements. Genomes are organized on a taxonomical basis, supporting comparative and evolutionary studies. Currently, DEPPDB contains all completely sequenced bacterial, viral, mitochondrial, and plastid genomes according to the NCBI RefSeq, and some model eukaryotic genomes. Data for promoters, regulation sites, binding proteins, etc., are incorporated from established DBs and literature. The database is complemented by analytical tools. Calculations for user-supplied sequences are available. Case studies discovered electrostatics complementing DNA bending in E. coli plasmid BNT2 promoter functioning, possibly affecting host-environment metabolic switch. Transcription factor binding sites gravitate to high potential regions, confirming the universal importance of electrostatics in protein-DNA interactions beyond the classical promoter-RNA polymerase recognition and regulation. Other genome elements, such as terminators, also show electrostatic peculiarities. Most intriguing are gene starts, exhibiting taxonomic correlations. The necessity of studying genome electrostatic properties is discussed.
Deppdb--DNA electrostatic potential properties database: electrostatic properties of genome DNA.
Osypov, Alexander A; Krutinin, Gleb G; Kamzolova, Svetlana G
2010-06-01
The electrostatic properties of genome DNA influence its interactions with different proteins, in particular, the regulation of transcription by RNA-polymerases. DEPPDB--DNA Electrostatic Potential Properties Database--was developed to hold and provide all available information on the electrostatic properties of genome DNA combined with its sequence and annotation of biological and structural properties of genome elements and whole genomes. Genomes in DEPPDB are organized on a taxonomical basis. Currently, the database contains all the completely sequenced bacterial and viral genomes according to NCBI RefSeq. General properties of the genome DNA electrostatic potential profile and principles of its formation are revealed. This potential correlates with the GC content but does not correspond to it exactly and strongly depends on both the sequence arrangement and its context (flanking regions). Analysis of the promoter regions for bacterial and viral RNA polymerases revealed a correspondence between the scale of these proteins' physical properties and electrostatic profile patterns. We also discovered a direct correlation between the potential value and the binding frequency of RNA polymerase to DNA, supporting the idea of the role of electrostatics in these interactions. This matches a pronounced tendency of the promoter regions to possess higher values of the electrostatic potential.
Irwin, Jodi A; Saunier, Jessica L; Strouss, Katharine M; Sturk, Kimberly A; Diegoli, Toni M; Just, Rebecca S; Coble, Michael D; Parson, Walther; Parsons, Thomas J
2007-06-01
In an effort to increase the quantity, breadth and availability of mtDNA databases suitable for forensic comparisons, we have developed a high-throughput process to generate approximately 5000 control region sequences per year from regional US populations, global populations from which the current US population is derived and global populations currently under-represented in available forensic databases. The system utilizes robotic instrumentation for all laboratory steps from pre-extraction through sequence detection, and a rigorous eight-step, multi-laboratory data review process with entirely electronic data transfer. Over the past 3 years, nearly 10,000 control region sequences have been generated using this approach. These data are being made publicly available and should further address the need for consistent, high-quality mtDNA databases for forensic testing.
The Polish Genetic Database of Victims of Totalitarianisms.
Ossowski, A; Kuś, M; Kupiec, T; Bykowska, M; Zielińska, G; Jasiński, M E; March, A L
2016-01-01
This paper describes the creation of the Polish Genetic Database of Victims of Totalitarianism and the first research conducted under this project. On September 28th 2012, the Pomeranian Medical University in Szczecin and the Institute of National Remembrance-Commission for Prosecution of Crimes against the Polish Nation agreed to support the creation of the Polish Genetic Database of Victims of Totalitarianism (PBGOT, www.pbgot.pl). The purpose was to employ state-of-the-art methods of forensic genetics to identify the remains of unidentified victims of Communist and Nazi totalitarian regimes. The database was designed to serve as a central repository of genetic information of the victim's DNA and that of the victim's nearest living relatives, with the goal of making a positive identification of the victim. Along the way, PBGOT encountered several challenges. First, extracting usable DNA samples from the remains of individuals who had been buried for over half a century required forensic geneticists to create special procedures and protocols. Second, obtaining genetic reference material and historical information from the victim's closest relatives was both problematic and urgent. The victim's nearest living relatives were part of a dying generation, and the opportunity to obtain the best genetic and historical information about the victims would soon die with them. For this undertaking, PBGOT assembled a team of historians, archaeologists, forensic anthropologists, and forensic geneticists from several European research institutions. 
The field work was divided into five broad categories: (1) exhumation of victim remains and storing their biological material for later genetic testing; (2) researching archives and historical data for a more complete profile of those killed or missing and the families that lost them; (3) locating the victim's nearest relatives to obtain genetic reference samples (swabs); (4) entering the genetic data from both victims and family members into a common database; (5) making a conclusive, final identification of the victim. PBGOT's first project was to identify victims of the Communist regime buried in hidden mass graves in the Powązki Military Cemetery in Warsaw. Throughout 2012 and 2013, PBGOT carried out archaeological exhumations in the Powązki Military Cemetery that resulted in the recovery of the skeletal remains of 194 victims in several mass graves. Of the 194 sets of remains, more than 50 victims have been successfully matched and identified through genetic evidence. Copyright © 2015 Elsevier Ireland Ltd. All rights reserved.
African-American mitochondrial DNAs often match mtDNAs found in multiple African ethnic groups
Ely, Bert; Wilson, Jamie Lee; Jackson, Fatimah; Jackson, Bruce A
2006-01-01
Background Mitochondrial DNA (mtDNA) haplotypes have become popular tools for tracing maternal ancestry, and several companies offer this service to the general public. Numerous studies have demonstrated that human mtDNA haplotypes can be used with confidence to identify the continent where the haplotype originated. Ideally, mtDNA haplotypes could also be used to identify a particular country or ethnic group from which the maternal ancestor emanated. However, the geographic distribution of mtDNA haplotypes is greatly influenced by the movement of both individuals and population groups. Consequently, common mtDNA haplotypes are shared among multiple ethnic groups. We have studied the distribution of mtDNA haplotypes among West African ethnic groups to determine how often mtDNA haplotypes can be used to reconnect Americans of African descent to a country or ethnic group of a maternal African ancestor. The nucleotide sequence of the mtDNA hypervariable segment I (HVS-I) usually provides sufficient information to assign a particular mtDNA to the proper haplogroup, and it contains most of the variation that is available to distinguish a particular mtDNA haplotype from closely related haplotypes. In this study, samples of general African-American and specific Gullah/Geechee HVS-I haplotypes were compared with two databases of HVS-I haplotypes from sub-Saharan Africa, and the incidence of perfect matches recorded for each sample. Results When two independent African-American samples were analyzed, more than half of the sampled HVS-I mtDNA haplotypes exactly matched common haplotypes that were shared among multiple African ethnic groups. Another 40% did not match any sequence in the database, and fewer than 10% were an exact match to a sequence from a single African ethnic group. 
Differences in the regional distribution of haplotypes were observed in the African database, and the African-American haplotypes were more likely to match haplotypes found in ethnic groups from West or West Central Africa than those found in eastern or southern Africa. Fewer than 14% of the African-American mtDNA sequences matched sequences from only West Africa or only West Central Africa. Conclusion Our database of sub-Saharan mtDNA sequences includes the most common haplotypes that are shared among ethnic groups from multiple regions of Africa. These common haplotypes have been found in half of all sub-Saharan Africans. More than 60% of the remaining haplotypes differ from the common haplotypes at a single nucleotide position in the HVS-I region, and they are likely to occur at varying frequencies within sub-Saharan Africa. However, the finding that 40% of the African-American mtDNAs analyzed had no match in the database indicates that only a small fraction of the total number of African haplotypes has been identified. In addition, the finding that fewer than 10% of African-American mtDNAs matched mtDNA sequences from a single African region suggests that few African Americans might be able to trace their mtDNA lineages to a particular region of Africa, and even fewer will be able to trace their mtDNA to a single ethnic group. However, no firm conclusions should be made until a much larger database is available. It is clear, however, that when identical mtDNA haplotypes are shared among many ethnic groups from different parts of Africa, it is impossible to determine which single ethnic group was the source of a particular maternal ancestor based on the mtDNA sequence. PMID:17038170
Reference System of DNA and Protein Sequences on CD-ROM
NASA Astrophysics Data System (ADS)
Nasu, Hisanori; Ito, Toshiaki
DNASIS-DBREF31 is a database of DNA and protein sequences on optical compact disc (CD-ROM), developed and commercialized by Hitachi Software Engineering Co., Ltd. Both nucleic acid base sequences and protein amino acid sequences can be retrieved from a single CD-ROM. Existing databases are offered as on-line services, on floppy disks, or on magnetic tape, all of which have drawbacks in usability or storage capacity. DNASIS-DBREF31 adopts CD-ROM as the database medium to provide mass storage and personal use of the database.
USDA-ARS's Scientific Manuscript database
A database of Louisiana sugarcane molecular identity has been constructed and is being updated annually using FAM or HEX or NED fluorescence- and capillary electrophoresis (CE)-based microsatellite (SSR) fingerprinting information. The fingerprints are PCR-amplified from leaf DNA samples of current ...
Genes Downregulated in Endometriosis Are Located Near the Known Imprinting Genes
Higashiura, Yumi; Koike, Natsuki; Akasaka, Juria; Uekuri, Chiharu; Iwai, Kana; Niiro, Emiko; Morioka, Sachiko; Yamada, Yuki
2014-01-01
There is now accumulating evidence that endometriosis is a disease associated with an epigenetic disorder. Genomic imprinting is an epigenetic phenomenon known to regulate DNA methylation of either maternal or paternal alleles. We hypothesize that hypermethylated endometriosis-associated genes may be enriched at imprinted gene loci. We sought to determine whether downregulated genes associated with endometriosis susceptibility are associated with chromosomal location of the known paternally and maternally expressed imprinting genes. Gene information has been gathered from National Center for Biotechnology Information database geneimprint.com. Several researchers have identified specific loci with strong DNA methylation in eutopic endometrium and ectopic lesion with endometriosis. Of the 29 hypermethylated genes in endometriosis, 19 genes were located near 45 known imprinted foci. There may be an association of the genomic location between genes specifically downregulated in endometriosis and epigenetically imprinted genes. PMID:24615936
European securitization and biometric identification: the uses of genetic profiling.
Johnson, Paul; Williams, Robin
2007-01-01
The recent loss of confidence in textual and verbal methods for validating the identity claims of individual subjects has resulted in growing interest in the use of biometric technologies to establish corporeal uniqueness. Once established, this foundational certainty allows changing biographies and shifting category memberships to be anchored to unchanging bodily surfaces, forms or features. One significant source for this growth has been the "securitization" agendas of nation states that attempt the greater control and monitoring of population movement across geographical borders. Among the wide variety of available biometric schemes, DNA profiling is regarded as a key method for discerning and recording embodied individuality. This paper discusses the current limitations on the use of DNA profiling in civil identification practices and speculates on future uses of the technology with regard to its interoperability with other biometric databasing systems.
Diway, Bibian; Khoo, Eyen
2017-01-01
The development of timber tracking methods based on genetic markers can provide scientific evidence to verify the origin of timber products and fulfill the growing requirement for sustainable forestry practices. In this study, the origin of an important Dark Red Meranti wood, Shorea platyclados, was studied by using a combination of seven chloroplast DNA markers and 15 short tandem repeat (STR) markers. A total of 27 natural populations of S. platyclados were sampled throughout Malaysia to establish population-level and individual-level identification databases. A haplotype map was generated from chloroplast DNA sequencing for population identification, resulting in 29 multilocus haplotypes, based on 39 informative intraspecific variable sites. Subsequently, a DNA profiling database was developed from 15 STRs allowing for individual identification in Malaysia. Cluster analysis divided the 27 populations into two genetic clusters, corresponding to the regions of Eastern and Western Malaysia. The conservativeness tests showed that the Malaysia database is conservative after removal of bias from population subdivision and sampling effects. Independent self-assignment tests correctly assigned individuals to the database in an overall 60.60−94.95% of cases for identified populations, and in 98.99−99.23% of cases for identified regions. Both the chloroplast DNA database and the STRs appear to be useful for tracking timber originating in Malaysia. Hence, this DNA-based method could serve as an effective additional tool for the existing forensic timber identification system, helping to ensure the sustainable management of this species into the future. PMID:28430826
HTSFinder: Powerful Pipeline of DNA Signature Discovery by Parallel and Distributed Computing
Karimi, Ramin; Hajdu, Andras
2016-01-01
Comprehensive effort toward low-cost sequencing in the past few years has led to the growth of complete genome databases. In parallel with this effort, and in response to a strong need, fast and cost-effective methods and applications have been developed to accelerate sequence analysis. Identification is the very first step of this task. Due to the difficulties, high costs, and computational challenges of alignment-based approaches, an alternative universal identification method is highly desirable. As an alignment-free approach, DNA signatures have provided new opportunities for the rapid identification of species. In this paper, we present an effective pipeline, HTSFinder (high-throughput signature finder), with a corresponding k-mer generator, GkmerG (genome k-mers generator). Using this pipeline, we determine the frequency of k-mers from the available complete genome databases for the detection of extensive DNA signatures in a reasonably short time. Our application can detect both unique and common signatures in arbitrarily selected target and nontarget databases. Hadoop and MapReduce are used in this pipeline as parallel and distributed computing tools running on commodity hardware. This approach brings the power of high-performance computing to ordinary desktop personal computers for discovering DNA signatures in large databases such as bacterial genomes. The considerable number of unique and common DNA signatures detected in the target database offers opportunities to improve the identification process not only for polymerase chain reaction and microarray assays but also for more complex scenarios such as metagenomics and next-generation sequencing analysis. PMID:26884678
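The k-mer frequency idea in this abstract can be illustrated with a small single-machine sketch. This is not the HTSFinder or GkmerG implementation (which the abstract says runs on Hadoop and MapReduce); the function names, the toy sequences, and the working definitions of "unique" (present in exactly one target genome) and "common" (present in every target genome), both restricted to k-mers absent from all nontarget genomes, are assumptions made for illustration only.

```python
from collections import Counter


def kmers(seq, k):
    """Yield every overlapping k-mer of a sequence (upper-cased)."""
    seq = seq.upper()
    for i in range(len(seq) - k + 1):
        yield seq[i:i + k]


def kmer_genome_counts(genomes, k):
    """For each k-mer, count how many genomes contain it at least once."""
    counts = Counter()
    for seq in genomes:
        counts.update(set(kmers(seq, k)))  # set(): count each genome once
    return counts


def find_signatures(targets, nontargets, k):
    """Return (unique, common) k-mer sets under the assumed definitions.

    unique: found in exactly one target genome and in no nontarget genome.
    common: found in every target genome and in no nontarget genome.
    """
    target_counts = kmer_genome_counts(targets, k)
    excluded = set(kmer_genome_counts(nontargets, k))
    unique = {km for km, n in target_counts.items()
              if n == 1 and km not in excluded}
    common = {km for km, n in target_counts.items()
              if n == len(targets) and km not in excluded}
    return unique, common


# Hypothetical toy data, far shorter than any real genome.
unique, common = find_signatures(["ACGTAC", "ACGTTT"], ["GTACCC"], k=3)
print(sorted(unique), sorted(common))
```

In a distributed setting the per-genome counting step would be the natural "map" phase and the aggregation into `target_counts` the "reduce" phase, although how the published pipeline actually partitions the work is not stated in the abstract.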
Improvements on a privacy-protection algorithm for DNA sequences with generalization lattices.
Li, Guang; Wang, Yadong; Su, Xiaohong
2012-10-01
When developing personal DNA databases, there must be an appropriate guarantee of anonymity, which means that the data cannot be related back to individuals. DNA lattice anonymization (DNALA) is a successful method for making personal DNA sequences anonymous. However, it uses time-consuming multiple sequence alignment and a low-accuracy greedy clustering algorithm. Furthermore, DNALA is not an online algorithm, and so it cannot quickly return results when the database is updated. This study improves the DNALA method. Specifically, we replaced the multiple sequence alignment in DNALA with global pairwise sequence alignment to save time, and we designed a hybrid clustering algorithm comprised of a maximum weight matching (MWM)-based algorithm and an online algorithm. The MWM-based algorithm is more accurate than the greedy algorithm in DNALA and has the same time complexity. The online algorithm can process data quickly when the database is updated. Copyright © 2011 Elsevier Ireland Ltd. All rights reserved.
iMETHYL: an integrative database of human DNA methylation, gene expression, and genomic variation.
Komaki, Shohei; Shiwa, Yuh; Furukawa, Ryohei; Hachiya, Tsuyoshi; Ohmomo, Hideki; Otomo, Ryo; Satoh, Mamoru; Hitomi, Jiro; Sobue, Kenji; Sasaki, Makoto; Shimizu, Atsushi
2018-01-01
We launched an integrative multi-omics database, iMETHYL (http://imethyl.iwate-megabank.org). iMETHYL provides whole-DNA methylation (~24 million autosomal CpG sites), whole-genome (~9 million single-nucleotide variants), and whole-transcriptome (>14 000 genes) data for CD4+ T-lymphocytes, monocytes, and neutrophils collected from approximately 100 subjects. These data were obtained from whole-genome bisulfite sequencing, whole-genome sequencing, and whole-transcriptome sequencing, making iMETHYL a comprehensive database.
Kim, Hyun Soo
2018-01-01
The aged population is increasing worldwide owing to the inevitable process of aging. Accordingly, longevity and healthy aging have come into the spotlight as ways to promote the social contribution of the aged population. Many studies in the past few decades have reported on the process of aging and longevity, emphasizing the importance of maintaining genomic stability in exceptionally long-lived populations. The underlying reasons for longevity remain unclear because of its complexity, which involves multiple factors. With advances in sequencing technology and human genome-associated approaches, population-based genomic studies are increasing. In this review, we summarize recent studies of longevity and healthy aging in human populations, focusing on DNA repair as a major factor in maintaining genome integrity. To keep pace with recent growth in genomic research, aging- and longevity-associated genomic databases are also briefly introduced. To suggest novel approaches for investigating longevity-associated genetic variants related to DNA repair using genomic databases, gene set analysis was conducted, focusing on DNA repair- and longevity-associated genes. Their biological networks were additionally analyzed to identify major factors containing genetic variants of human longevity and healthy aging in DNA repair mechanisms. In summary, this review emphasizes DNA repair activity in human longevity and suggests an approach to conducting DNA repair-associated genomic studies of healthy human aging.
Hong, Seung Beom; Kim, Ki Cheol; Kim, Wook
2015-07-01
We generated complete mitochondrial DNA (mtDNA) control region sequences from 704 unrelated individuals residing in six major provinces in Korea. In addition to our earlier survey of the distribution of mtDNA haplogroup variation, a total of 560 different haplotypes characterized by 271 polymorphic sites were identified, of which 473 haplotypes were unique. The gene diversity and random match probability were 0.9989 and 0.0025, respectively. According to the pairwise comparison of the 704 control region sequences, the mean number of pairwise differences between individuals was 13.47±6.06. Based on the result of mtDNA control region sequences, pairwise FST genetic distances revealed genetic homogeneity of the Korean provinces on a peninsular level, except in samples from Jeju Island. This result indicates there may be a need to formulate a local mtDNA database for Jeju Island, to avoid bias in forensic parameter estimates caused by genetic heterogeneity of the population. Thus, the present data may help not only in personal identification but also in determining maternal lineages to provide an expanded and reliable Korean mtDNA database. These data will be available on the EMPOP database via accession number EMP00661. Copyright © 2015 Elsevier Ireland Ltd. All rights reserved.
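The two forensic parameters quoted in this abstract can be sketched with the estimators commonly used for haplotype data: random match probability as the sum of squared haplotype frequencies, and gene (haplotype) diversity as n/(n-1) times one minus that sum. Whether the authors used exactly these estimators is an assumption; the function name and the toy sample below are hypothetical.

```python
from collections import Counter


def forensic_parameters(haplotypes):
    """Return (gene_diversity, random_match_probability) for a sample.

    Assumed estimators (not confirmed by the abstract):
      RMP       = sum(p_i ** 2)
      diversity = n / (n - 1) * (1 - sum(p_i ** 2))
    where p_i is the sample frequency of haplotype i and n the sample size.
    """
    n = len(haplotypes)
    if n < 2:
        raise ValueError("need at least two individuals")
    counts = Counter(haplotypes)
    sum_sq = sum((c / n) ** 2 for c in counts.values())
    diversity = n / (n - 1) * (1.0 - sum_sq)
    return diversity, sum_sq


# Hypothetical toy sample of four individuals carrying three haplotypes.
diversity, rmp = forensic_parameters(["H1", "H1", "H2", "H3"])
print(round(diversity, 4), round(rmp, 4))
```

Under these assumed formulas the reported figures are mutually consistent: with n = 704 and a random match probability of 0.0025, the diversity estimate is (704/703) x (1 - 0.0025), which rounds to 0.9989.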
Application of cytochrome b DNA sequences for the authentication of endangered snake species.
Wong, Ka-Lok; Wang, Jun; But, Paul Pui-Hay; Shaw, Pang-Chui
2004-01-06
In order to enforce the conservation program and curb the illegal trading and consumption of endangered snake species, the value of cytochrome b sequences in the authentication of snake species was evaluated. As an illustration, DNA was extracted, and selected cytochrome b DNA sequences were amplified and sequenced, from six snakes commonly consumed in Hong Kong. By cataloging these together with publicly available sequences, a cytochrome b database containing 90 species of snakes was constructed. In this database, sequence homology between snakes ranged from 70.68 to 95.11%. On the other hand, intraspecific variation of three tested snakes was 0-0.98%. Using the database, we were able to determine the identity of six meat samples confiscated by the Agriculture, Fisheries and Conservation Department, HKSAR.
Database of amino acid-nucleotide contacts in DNA-homeodomain protein
NASA Astrophysics Data System (ADS)
Grokhlina, T. I.; Zrelov, P. V.; Ivanov, V. V.; Polozov, R. V.; Chirgadze, Yu. N.; Sivozhelezov, V. S.
2013-09-01
The analysis of amino acid-nucleotide contacts in interfaces of the protein-DNA complexes, intended to find consistencies in the protein-DNA recognition, is a complex problem that requires an analysis of the physicochemical characteristics of these contacts and the positions of the participating amino acids and nucleotides in the chains of the protein and the DNA, respectively, as well as conservatism of these contacts. Thus, those heterogeneous data should be systematized. For this purpose we have developed a database of amino acid-nucleotide contacts ANTPC (Amino acid Nucleotide Type Position Conservation) following the archetypal example of the proteins in the homeodomain family. We show that it can be used to compare and classify the interfaces of the protein-DNA complexes.
Evolutionary trends in animal ribosomal DNA loci: introduction to a new online database.
Sochorová, Jana; Garcia, Sònia; Gálvez, Francisco; Symonová, Radka; Kovařík, Aleš
2018-03-01
Ribosomal DNA (rDNA) loci encoding 5S and 45S (18S-5.8S-28S) rRNAs are important components of eukaryotic chromosomes. Here, we set up the animal rDNA database containing cytogenetic information about these loci in 1343 animal species (264 families) collected from 542 publications. The data are based on in situ hybridisation studies (both radioactive and fluorescent) carried out in major groups of vertebrates (fish, reptiles, amphibians, birds, and mammals) and invertebrates (mostly insects and mollusks). The database is accessible online at www.animalrdnadatabase.com. The median number of 45S and 5S sites was close to two per diploid chromosome set for both rDNAs despite large variation (1-74 for 5S and 1-54 for 45S sites). No significant correlation between the number of 5S and 45S rDNA loci was observed, suggesting that their distribution and amplification across the chromosomes follow independent evolutionary trajectories. Each group, irrespective of taxonomic classification, contained rDNA sites at any chromosome location. However, the distal and pericentromeric positions were the most prevalent (> 75% karyotypes) for 45S loci, while the position of 5S loci was more variable. We also examined potential relationships between molecular attributes of rDNA (homogenisation and expression) and cytogenetic parameters such as rDNA positions, chromosome number, and morphology.
Recent updates and developments to plant genome size databases
Garcia, Sònia; Leitch, Ilia J.; Anadon-Rosell, Alba; Canela, Miguel Á.; Gálvez, Francisco; Garnatje, Teresa; Gras, Airy; Hidalgo, Oriane; Johnston, Emmeline; Mas de Xaxars, Gemma; Pellicer, Jaume; Siljak-Yakovlev, Sonja; Vallès, Joan; Vitales, Daniel; Bennett, Michael D.
2014-01-01
Two plant genome size databases have been recently updated and/or extended: the Plant DNA C-values database (http://data.kew.org/cvalues), and GSAD, the Genome Size in Asteraceae database (http://www.asteraceaegenomesize.com). While the first provides information on nuclear DNA contents across land plants and some algal groups, the second is focused on one of the largest and most economically important angiosperm families, Asteraceae. Genome size data have numerous applications: they can be used in comparative studies on genome evolution, or as a tool to appraise the cost of whole-genome sequencing programs. The growing interest in genome size and increasing rate of data accumulation has necessitated the continued update of these databases. Currently, the Plant DNA C-values database (Release 6.0, Dec. 2012) contains data for 8510 species, while GSAD has 1219 species (Release 2.0, June 2013), representing increases of 17 and 51%, respectively, in the number of species with genome size data, compared with previous releases. Here we provide overviews of the most recent releases of each database, and outline new features of GSAD. The latter include (i) a tool to visually compare genome size data between species, (ii) the option to export data and (iii) a webpage containing information about flow cytometry protocols. PMID:24288377
Annotation, submission and screening of repetitive elements in Repbase: RepbaseSubmitter and Censor.
Kohany, Oleksiy; Gentles, Andrew J; Hankus, Lukasz; Jurka, Jerzy
2006-10-25
Repbase is a reference database of eukaryotic repetitive DNA, which includes prototypic sequences of repeats and basic information described in annotations. Updating and maintenance of the database requires specialized tools, which we have created and made available for use with Repbase, and which may be useful as a template for other curated databases. We describe the software tools RepbaseSubmitter and Censor, which are designed to facilitate updating and screening the content of Repbase. RepbaseSubmitter is a Java-based interface for formatting and annotating Repbase entries. It eliminates many common formatting errors, and automates actions such as calculation of sequence lengths and composition, thus facilitating curation of Repbase sequences. In addition, it has several features for predicting protein coding regions in sequences; searching and including PubMed references in Repbase entries; and searching the NCBI taxonomy database for correct inclusion of species information and taxonomic position. Censor is a tool to rapidly identify repetitive elements by comparison to known repeats. It uses WU-BLAST for speed and sensitivity, and can conduct DNA-DNA, DNA-protein, or translated DNA-translated DNA searches of genomic sequence. Defragmented output includes a map of repeats present in the query sequence, with the options to report masked query sequence(s), repeat sequences found in the query, and alignments. Censor and RepbaseSubmitter are available as both web-based services and downloadable versions. They can be found at http://www.girinst.org/repbase/submission.html (RepbaseSubmitter) and http://www.girinst.org/censor/index.php (Censor).
Mock jurors' use of error rates in DNA database trawls.
Scurich, Nicholas; John, Richard S
2013-12-01
Forensic science is not infallible, as data collected by the Innocence Project have revealed. The rate at which errors occur in forensic DNA testing, the so-called "gold standard" of forensic science, is not currently known. This article presents a Bayesian analysis to demonstrate the profound impact that error rates have on the probative value of a DNA match. Empirical evidence on whether jurors are sensitive to this effect is equivocal: Studies have typically found they are not, while a recent, methodologically rigorous study found that they can be. This article presents the results of an experiment that examined this issue within the context of a database trawl case in which one DNA profile was tested against a multitude of profiles. The description of the database was manipulated (i.e., "medical" or "offender" database, or not specified) as was the rate of error (i.e., one-in-10 or one-in-1,000). Jury-eligible participants were nearly twice as likely to convict in the offender database condition compared to the condition not specified. The error rates did not affect verdicts. Both factors, however, affected the perception of the defendant's guilt, in the expected direction, although the size of the effect was meager compared to Bayesian prescriptions. The results suggest that the disclosure of an offender database to jurors might constitute prejudicial evidence, and calls for proficiency testing in forensic science as well as training of jurors are echoed. (c) 2013 APA, all rights reserved
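The effect of laboratory error on the probative value of a match can be illustrated with a simplified, widely used Bayesian model; it is not necessarily the analysis presented in the article and it ignores the database-trawl aspect of the experiment. The model assumes a true source always produces a reported match, so the likelihood ratio is 1 / (RMP + FPP x (1 - RMP)), where RMP is the random match probability and FPP the false positive (error) probability. The function names, the prior of 0.5, and the RMP of one in a billion are hypothetical; only the error rates of one-in-10 and one-in-1,000 come from the abstract.

```python
def likelihood_ratio(rmp, fpp):
    """Likelihood ratio of a reported match under the simplified model.

    Assumes a true source always yields a reported match; a non-source
    yields one either by coincidence (rmp) or by laboratory error (fpp).
    """
    return 1.0 / (rmp + fpp * (1.0 - rmp))


def posterior_probability(prior, rmp, fpp):
    """Posterior probability that the defendant is the source."""
    prior_odds = prior / (1.0 - prior)
    posterior_odds = prior_odds * likelihood_ratio(rmp, fpp)
    return posterior_odds / (1.0 + posterior_odds)


RMP = 1e-9  # hypothetical random match probability
for fpp in (0.0, 1 / 1000, 1 / 10):  # the last two are the study's error rates
    print(fpp, likelihood_ratio(RMP, fpp), posterior_probability(0.5, RMP, fpp))
```

Under this model the error rate, not the random match probability, dominates once it is the larger of the two: the likelihood ratio falls from about one billion with no error to roughly 1,000 and 10 for the two error rates, which is the kind of sensitivity the abstract contrasts with jurors' observed behaviour.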
The Hawaiian Algal Database: a laboratory LIMS and online resource for biodiversity data
Wang, Norman; Sherwood, Alison R; Kurihara, Akira; Conklin, Kimberly Y; Sauvage, Thomas; Presting, Gernot G
2009-01-01
Background Organization and presentation of biodiversity data is greatly facilitated by databases that are specially designed to allow easy data entry and organized data display. Such databases also have the capacity to serve as Laboratory Information Management Systems (LIMS). The Hawaiian Algal Database was designed to showcase specimens collected from the Hawaiian Archipelago, enabling users around the world to compare their specimens with our photographs and DNA sequence data, and to provide lab personnel with an organizational tool for storing various biodiversity data types. Description We describe the Hawaiian Algal Database, a comprehensive and searchable database containing photographs and micrographs, geo-referenced collecting information, taxonomic checklists and standardized DNA sequence data. All data for individual samples are linked through unique accession numbers. Users can search online for sample information by accession number, numerous levels of taxonomy, or collection site. At the present time the database contains data representing over 2,000 samples of marine, freshwater and terrestrial algae from the Hawaiian Archipelago. These samples are primarily red algae, although other taxa are being added. Conclusion The Hawaiian Algal Database is a digital repository for Hawaiian algal samples and acts as a LIMS for the laboratory. Users can make use of the online search tool to view and download specimen photographs and micrographs, DNA sequences and relevant habitat data, including georeferenced collecting locations. It is publicly available at . PMID:19728892
Average probability that a "cold hit" in a DNA database search results in an erroneous attribution.
Song, Yun S; Patil, Anand; Murphy, Erin E; Slatkin, Montgomery
2009-01-01
We consider a hypothetical series of cases in which the DNA profile of a crime-scene sample is found to match a known profile in a DNA database (i.e., a "cold hit"), resulting in the identification of a suspect based only on genetic evidence. We show that the average probability that there is another person in the population whose profile matches the crime-scene sample but who is not in the database is approximately 2(N - d)p(A), where N is the number of individuals in the population, d is the number of profiles in the database, and p(A) is the average match probability (AMP) for the population. The AMP is estimated by computing the average of the probabilities that two individuals in the population have the same profile. We show further that if a priori each individual in the population is equally likely to have left the crime-scene sample, then the average probability that the database search attributes the crime-scene sample to a wrong person is (N - d)p(A).
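The two approximations stated in this abstract are simple enough to evaluate directly. The sketch below only transcribes them into code; the function names and the example values of N, d and p(A) are hypothetical, and the expressions are meaningful only while the resulting values are small, since they are approximations of probabilities.

```python
def prob_unsampled_match(N, d, p_a):
    """Approximate average probability that someone outside the database
    also matches the crime-scene profile: 2 * (N - d) * p_a."""
    if not 0 <= d <= N:
        raise ValueError("database size d must satisfy 0 <= d <= N")
    return 2 * (N - d) * p_a


def prob_wrong_attribution(N, d, p_a):
    """Approximate average probability that the cold hit attributes the
    sample to the wrong person, assuming equal prior for every individual:
    (N - d) * p_a."""
    if not 0 <= d <= N:
        raise ValueError("database size d must satisfy 0 <= d <= N")
    return (N - d) * p_a


# Hypothetical values: population of 1,000,000, database of 100,000,
# average match probability of one in a billion.
print(prob_unsampled_match(1_000_000, 100_000, 1e-9))
print(prob_wrong_attribution(1_000_000, 100_000, 1e-9))
```

Both quantities shrink linearly as the database covers more of the population (d approaching N), and the second is always half of the first.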
Machado, Helena; Silva, Susana
2015-10-01
The ethical aspects of biobanks and forensic DNA databases are often treated as separate issues. As a reflection of this, public participation, or the involvement of citizens in genetic databases, has been approached differently in the fields of forensics and medicine. This paper aims to cross the boundaries between medicine and forensics by exploring the flows between the ethical issues presented in the two domains and the subsequent conceptualisation of public trust and legitimisation. We propose to introduce the concept of 'solidarity', traditionally applied only to medical and research biobanks, into a consideration of public engagement in medicine and forensics. Inclusion of a solidarity-based framework, in both medical biobanks and forensic DNA databases, raises new questions that should be included in the ethical debate, in relation to both health services/medical research and activities associated with the criminal justice system. Published by the BMJ Publishing Group Limited. For permission to use (where not already granted under a licence) please go to http://group.bmj.com/group/rights-licensing/permissions.
Rice, Michael; Gladstone, William; Weir, Michael
2004-01-01
We discuss how relational databases constitute an ideal framework for representing and analyzing large-scale genomic data sets in biology. As a case study, we describe a Drosophila splice-site database that we recently developed at Wesleyan University for use in research and teaching. The database stores data about splice sites computed by a custom algorithm using Drosophila cDNA transcripts and genomic DNA and supports a set of procedures for analyzing splice-site sequence space. A generic Web interface permits the execution of the procedures with a variety of parameter settings and also supports custom structured query language queries. Moreover, new analytical procedures can be added by updating special metatables in the database without altering the Web interface. The database provides a powerful setting for students to develop informatic thinking skills. PMID:15592597
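The metatable idea described in this abstract can be illustrated with a minimal sketch. The abstract does not give the actual Wesleyan schema, so every name here (`splice_site`, `procedure_meta`, `run_procedure`, `register_procedure`) and every row of toy data is a hypothetical stand-in. The sketch assumes that each analysis procedure is stored as a row holding parameterised SQL, so a generic front end can list and run procedures without being changed when a new one is added.

```python
import sqlite3


def build_demo_db() -> sqlite3.Connection:
    """Create an in-memory database with a toy splice-site table and a metatable."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE splice_site (
            id        INTEGER PRIMARY KEY,
            gene      TEXT NOT NULL,
            site_type TEXT NOT NULL CHECK (site_type IN ('donor', 'acceptor')),
            sequence  TEXT NOT NULL
        );
        -- Hypothetical metatable: one row per registered analysis procedure.
        CREATE TABLE procedure_meta (
            name        TEXT PRIMARY KEY,
            description TEXT NOT NULL,
            sql_text    TEXT NOT NULL
        );
        """
    )
    # Toy rows for illustration only; these are not real database contents.
    conn.executemany(
        "INSERT INTO splice_site (gene, site_type, sequence) VALUES (?, ?, ?)",
        [
            ("geneA", "donor", "CAGGTAAGT"),
            ("geneA", "acceptor", "TTTCAGGTC"),
            ("geneB", "donor", "AAGGTGAGT"),
        ],
    )
    register_procedure(
        conn,
        "count_by_type",
        "Number of splice sites of a given type",
        "SELECT COUNT(*) FROM splice_site WHERE site_type = :site_type",
    )
    return conn


def register_procedure(conn, name, description, sql_text):
    """Add a procedure by inserting a metatable row; no interface code changes."""
    conn.execute(
        "INSERT INTO procedure_meta (name, description, sql_text) VALUES (?, ?, ?)",
        (name, description, sql_text),
    )


def run_procedure(conn, name, **params):
    """Generic entry point: look up the stored SQL by name and run it with parameters."""
    row = conn.execute(
        "SELECT sql_text FROM procedure_meta WHERE name = ?", (name,)
    ).fetchone()
    if row is None:
        raise KeyError(f"unknown procedure: {name}")
    return conn.execute(row[0], params).fetchall()
```

A front end built this way only ever calls `run_procedure`; adding an analysis is a single `register_procedure` call. Executing SQL stored in a table is acceptable for a teaching sketch but would need access control in a public deployment.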
Molecular detection of Bartonella coopersplainsensis and B. henselae in rats from New Zealand.
Vijayan Genitha Helan, J N; Grinberg, A; Gedye, K; Potter, M A; Harrus, S
2018-06-25
To identify Bartonella spp. in rats from New Zealand using molecular methods. DNA was extracted from the spleens of 143 black rats (Rattus rattus) captured in the Tongariro National Park, New Zealand. PCR was performed using Bartonella genus-specific primers amplifying segments of the 16S-23S rRNA internal transcribed spacer and citrate synthase (gltA) and beta subunit of the RNA polymerase (rpoB) genes. PCR products were sequenced and compared online with sequences stored in the database of the National Center for Biotechnology Information of the United States of America. DNA sequences matching Bartonella coopersplainsensis and B. henselae were detected in samples from 22/143 (15.4%) and 3/143 (2.1%) rats, respectively. Co-occurrence of B. coopersplainsensis and B. henselae sequences was observed in the sample from one rat. Gram-negative fastidious bacteria belonging to the genus Bartonella are associated with a range of human diseases. Rodents play an important role as reservoirs of a broad range of Bartonella species. To our knowledge, this is the first report of a molecular detection of Bartonella spp. DNA in rodents from New Zealand, and the first identification of B. henselae DNA in rats, worldwide. Whereas the public health significance of B. coopersplainsensis remains undefined, B. henselae is the agent of cat scratch disease, and the presence of this bacterium in rats may have public health implications. Our results are preliminary and additional analyses of larger samples, preferably by bacterial culture, would provide more information on the prevalence and diversity of Bartonella spp., in particular B. henselae, in rats.
MethHC: a database of DNA methylation and gene expression in human cancer.
Huang, Wei-Yun; Hsu, Sheng-Da; Huang, Hsi-Yuan; Sun, Yi-Ming; Chou, Chih-Hung; Weng, Shun-Long; Huang, Hsien-Da
2015-01-01
We present MethHC (http://MethHC.mbc.nctu.edu.tw), a database comprising a systematic integration of a large collection of DNA methylation data and mRNA/microRNA expression profiles in human cancer. DNA methylation is an important epigenetic regulator of gene transcription, and genes with high levels of DNA methylation in their promoter regions are transcriptionally silent. Increasing numbers of DNA methylation and mRNA/microRNA expression profiles are being published in different public repositories. These data can help researchers to identify epigenetic patterns that are important for carcinogenesis. MethHC integrates data such as DNA methylation, mRNA expression, DNA methylation of microRNA gene and microRNA expression to identify correlations between DNA methylation and mRNA/microRNA expression from TCGA (The Cancer Genome Atlas), which includes 18 human cancers in more than 6000 samples, 6548 microarrays and 12 567 RNA sequencing data. © The Author(s) 2014. Published by Oxford University Press on behalf of Nucleic Acids Research.
Hume, Maxwell A; Barrera, Luis A; Gisselbrecht, Stephen S; Bulyk, Martha L
2015-01-01
The Universal PBM Resource for Oligonucleotide Binding Evaluation (UniPROBE) serves as a convenient source of information on published data generated using universal protein-binding microarray (PBM) technology, which provides in vitro data about the relative DNA-binding preferences of transcription factors for all possible sequence variants of a length k ('k-mers'). The database displays important information about the proteins and displays their DNA-binding specificity data in terms of k-mers, position weight matrices and graphical sequence logos. This update to the database documents the growth of UniPROBE since the last update 4 years ago, and introduces a variety of new features and tools, including a new streamlined pipeline that facilitates data deposition by universal PBM data generators in the research community, a tool that generates putative nonbinding (i.e. negative control) DNA sequences for one or more proteins and novel motifs obtained by analyzing the PBM data using the BEEML-PBM algorithm for motif inference. The UniPROBE database is available at http://uniprobe.org. © The Author(s) 2014. Published by Oxford University Press on behalf of Nucleic Acids Research.
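The two representations named in this abstract, k-mers and position weight matrices, can be made concrete with a short sketch. This is a generic textbook construction and not UniPROBE's pipeline or the BEEML-PBM algorithm: it assumes a set of aligned binding sites of equal length, a uniform background of 0.25 per base, and a pseudocount of 1. The function names and the example sites are invented for illustration.

```python
from itertools import product
from math import log2

BASES = "ACGT"


def all_kmers(k):
    """Enumerate all 4**k DNA sequences of length k."""
    return ["".join(p) for p in product(BASES, repeat=k)]


def build_pwm(sites, pseudocount=1.0, background=0.25):
    """Build a log-odds position weight matrix from aligned, equal-length sites."""
    length = len(sites[0])
    if any(len(s) != length for s in sites):
        raise ValueError("all sites must have the same length")
    pwm = []
    for pos in range(length):
        # Start every base at the pseudocount so unseen bases get a finite score.
        counts = {b: pseudocount for b in BASES}
        for site in sites:
            counts[site[pos]] += 1
        total = sum(counts.values())
        pwm.append({b: log2((counts[b] / total) / background) for b in BASES})
    return pwm


def score(pwm, kmer):
    """Sum the per-position log-odds scores of a k-mer under the matrix."""
    if len(kmer) != len(pwm):
        raise ValueError("k-mer length must match the matrix width")
    return sum(column[base] for column, base in zip(pwm, kmer))
```

Scoring every entry of `all_kmers(k)` with `score` ranks all possible sequence variants of length k, which mirrors the way binding preferences are tabulated per k-mer.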
Kazusa Marker DataBase: a database for genomics, genetics, and molecular breeding in plants.
Shirasawa, Kenta; Isobe, Sachiko; Tabata, Satoshi; Hirakawa, Hideki
2014-09-01
In order to provide useful genomic information for agronomical plants, we have established a database, the Kazusa Marker DataBase (http://marker.kazusa.or.jp). This database includes information on DNA markers, e.g., SSR and SNP markers, genetic linkage maps, and physical maps, that were developed at the Kazusa DNA Research Institute. Keyword searches for the markers, sequence data used for marker development, and experimental conditions are also available through this database. Currently, 10 plant species have been targeted: tomato (Solanum lycopersicum), pepper (Capsicum annuum), strawberry (Fragaria × ananassa), radish (Raphanus sativus), Lotus japonicus, soybean (Glycine max), peanut (Arachis hypogaea), red clover (Trifolium pratense), white clover (Trifolium repens), and eucalyptus (Eucalyptus camaldulensis). In addition, the number of plant species registered in this database will be increased as our research progresses. The Kazusa Marker DataBase will be a useful tool for both basic and applied sciences, such as genomics, genetics, and molecular breeding in crops.
Kuang, Xingyan; Dhroso, Andi; Han, Jing Ginger; Shyu, Chi-Ren; Korkin, Dmitry
2016-01-01
Macromolecular interactions are formed between proteins, DNA and RNA molecules. Being a principal building block in macromolecular assemblies and pathways, the interactions underlie most cellular functions. Malfunctioning of macromolecular interactions is also linked to a number of diseases. Structural knowledge of the macromolecular interaction allows one to understand the interaction’s mechanism, determine its functional implications and characterize the effects of genetic variations, such as single nucleotide polymorphisms, on the interaction. Unfortunately, until now the interactions mediated by different types of macromolecules, e.g. protein–protein interactions or protein–DNA interactions, have been collected into individual and unrelated structural databases. This presents a significant obstacle in the analysis of macromolecular interactions. For instance, the homogeneous structural interaction databases prevent scientists from studying structural interactions of different types that occur in the same macromolecular complex. Here, we introduce DOMMINO 2.0, a structural Database Of Macro-Molecular INteractiOns. Compared to DOMMINO 1.0, a comprehensive database on protein–protein interactions, DOMMINO 2.0 includes the interactions between all three basic types of macromolecules extracted from PDB files. DOMMINO 2.0 is automatically updated on a weekly basis. It currently includes ∼1 040 000 interactions between two polypeptide subunits (e.g. domains, peptides, termini and interdomain linkers), ∼43 000 RNA-mediated interactions, and ∼12 000 DNA-mediated interactions. All protein structures in the database are annotated using SCOP and SUPERFAMILY family annotation. As a result, protein-mediated interactions involving protein domains, interdomain linkers, C- and N- termini, and peptides are identified.
Our database provides an intuitive web interface, allowing one to investigate interactions at three different resolution levels: whole subunit network, binary interaction and interaction interface. Database URL: http://dommino.org PMID:26827237
Detection of a Diverse Marine Fish Fauna Using Environmental DNA from Seawater Samples
Thomsen, Philip Francis; Kielgast, Jos; Iversen, Lars Lønsmann; Møller, Peter Rask; Rasmussen, Morten; Willerslev, Eske
2012-01-01
Marine ecosystems worldwide are under threat with many fish species and populations suffering from human over-exploitation. This is greatly impacting global biodiversity, economy and human health. Intriguingly, marine fish are largely surveyed using selective and invasive methods, which are mostly limited to commercial species, and restricted to particular areas with favourable conditions. Furthermore, misidentification of species represents a major problem. Here, we investigate the potential of using metabarcoding of environmental DNA (eDNA) obtained directly from seawater samples to account for marine fish biodiversity. This eDNA approach has recently been used successfully in freshwater environments, but never in marine settings. We isolate eDNA from ½-litre seawater samples collected in a temperate marine ecosystem in Denmark. Using next-generation DNA sequencing of PCR amplicons, we obtain eDNA from 15 different fish species, including both important consumption species, as well as species rarely or never recorded by conventional monitoring. We also detect eDNA from a rare vagrant species in the area: European pilchard (Sardina pilchardus). Additionally, we detect four bird species. Records in national databases confirmed the occurrence of all detected species. To investigate the efficiency of the eDNA approach, we compared its performance with 9 methods conventionally used in marine fish surveys. Promisingly, eDNA covered the fish diversity better than or equal to any of the applied conventional methods. Our study demonstrates that even small samples of seawater contain eDNA from a wide range of local fish species. Finally, in order to examine the potential dispersal of eDNA in oceans, we performed an experiment addressing eDNA degradation in seawater, which shows that even small (100-bp) eDNA fragments degrade beyond detectability within days.
Although further studies are needed to validate the eDNA approach in varying environmental conditions, our findings provide a strong proof-of-concept with great perspectives for future monitoring of marine biodiversity and resources. PMID:22952584
Construction of a primary DNA fingerprint database for cotton cultivars.
Zhang, Y C; Kuang, M; Yang, W H; Xu, H X; Zhou, D Y; Wang, Y Q; Feng, X A; Su, C; Wang, F
2013-06-13
Forty core primers were used to construct a DNA fingerprint database of 132 cotton species based on multiplex fluorescence detection technology. A high first-pass success rate of 99.04% was demonstrated with tetraplex polymerase chain reaction. Forty primer pairs amplified a total of 262 genotypes among 132 species, with an average of 6.55 per primer pair and values of polymorphism information content varying from 0.340 to 0.882. DNA homozygous ratios differed among the groups examined. The highest DNA homozygous ratio, 81.46%, was found in landrace standard cultivars. The lowest, 63.04%, occurred in a group of 2010 leading cultivars. Genetic diversity of the 132 species was briefly analyzed using the unweighted pair-group method with arithmetic means.
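The two summary statistics quoted in this abstract can be reproduced with a small worked sketch. The abstract does not say which definition of polymorphism information content (PIC) was used; the code below assumes the common Botstein-style definition, PIC = 1 - sum(p_i^2) - sum over i<j of 2 * p_i^2 * p_j^2, where p_i are the allele frequencies at one locus. The function name and the example frequencies are illustrative and are not data from the study.

```python
def polymorphism_information_content(freqs):
    """PIC for one marker, assuming the Botstein-style definition.

    freqs: allele frequencies at the locus; they must sum to 1.
    """
    if abs(sum(freqs) - 1.0) > 1e-9:
        raise ValueError("allele frequencies must sum to 1")
    n = len(freqs)
    # Expected homozygosity term.
    sum_squares = sum(p * p for p in freqs)
    # Correction for matings that are uninformative despite heterozygosity.
    cross_term = sum(
        2 * freqs[i] ** 2 * freqs[j] ** 2
        for i in range(n)
        for j in range(i + 1, n)
    )
    return 1.0 - sum_squares - cross_term


# Arithmetic check of the reported average: 262 genotypes over 40 primer pairs.
mean_per_primer_pair = 262 / 40
```

Under this definition a marker with two equally frequent alleles has a PIC of 0.375 and one with four equally frequent alleles has a PIC of about 0.703, so the reported range of 0.340 to 0.882 spans markers from roughly biallelic to highly multiallelic.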
Gene Therapy in Cardiac Surgery: Clinical Trials, Challenges, and Perspectives
Katz, Michael G.; Fargnoli, Anthony S.; Kendle, Andrew P.; Hajjar, Roger J.; Bridges, Charles R.
2016-01-01
The concept of gene therapy was introduced in the 1970s after the development of recombinant DNA technology. Despite the initial great expectations, this field experienced early setbacks. Recent years have seen a revival of clinical programs of gene therapy in different fields of medicine. There are many promising targets for genetic therapy as an adjunct to cardiac surgery. The first positive long-term results were published for adenoviral administration of vascular endothelial growth factor with coronary artery bypass grafting. In this review we analyze the past, present, and future of gene therapy in cardiac surgery. The articles discussed were collected through PubMed and from author experience. The clinical trials referenced were found through the Wiley clinical trial database (http://www.wiley.com/legacy/wileychi/genmed/clinical/) as well as the National Institutes of Health clinical trial database (Clinicaltrials.gov). PMID:26801060
Jérôme, Marc; Martinsohn, Jann Thorsten; Ortega, Delphine; Carreau, Philippe; Verrez-Bagnis, Véronique; Mouchel, Olivier
2008-05-28
Traceability in the fish food sector plays an increasingly important role for consumer protection and confidence building. This is reflected by the introduction of legislation and rules covering traceability on national and international levels. Although traceability through labeling is well established and supported by respective regulations, monitoring and enforcement of these rules are still hampered by the lack of efficient diagnostic tools. We describe protocols using a direct sequencing method based on 212-274-bp diagnostic sequences derived from species-specific mitochondrial DNA cytochrome b, 16S rRNA, and cytochrome oxidase subunit I sequences, which can efficiently be applied to unambiguously determine even closely related fish species in processed food products labeled "anchovy". Traceability of anchovy-labeled products is supported by the public online database AnchovyID (http://anchovyid.jrc.ec.europa.eu), which provides data obtained during our study and tools for analytical purposes.
Cloud-based adaptive exon prediction for DNA analysis.
Putluri, Srinivasareddy; Zia Ur Rahman, Md; Fathima, Shaik Yasmeen
2018-02-01
Cloud computing offers significant research and economic benefits to healthcare organisations. Cloud services provide a safe place for storing and managing large amounts of sensitive data. Under the conventional flow of gene information, gene sequence laboratories send raw and inferred information via the Internet to several sequence libraries. DNA sequencing storage costs will be minimised by the use of cloud services. In this study, the authors put forward a novel genomic informatics system using Amazon Cloud Services, where genomic sequence information is stored and accessed for processing. Accurate identification of exon regions in a DNA sequence is a key task in bioinformatics, which helps in disease identification and drug design. The three-base periodicity property of exons forms the basis of all exon identification techniques. Adaptive signal processing techniques are found to be promising in comparison with several other methods. Several adaptive exon predictors (AEPs) are developed using variable normalised least mean square and its maximum normalised variants to reduce computational complexity. Finally, performance evaluation of various AEPs is done based on measures such as sensitivity, specificity and precision using various standard genomic datasets taken from the National Center for Biotechnology Information genomic sequence database.
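The three-base periodicity mentioned in this abstract can be measured with a short sketch. This is the classical non-adaptive Fourier measure, shown only to illustrate the property; it is not the authors' normalised least mean square predictors, whose details the abstract does not give. The code assumes an upper-case sequence over A, C, G and T, and the function names and the default window of 351 bases are illustrative choices.

```python
import cmath


def period3_power(seq):
    """Spectral power at frequency 1/3, summed over the four base indicator sequences.

    Each base gets a binary indicator sequence (1 where the base occurs);
    the discrete Fourier component at one cycle per three bases is computed
    for each, and the squared magnitudes are summed and divided by the length.
    """
    n = len(seq)
    if n == 0:
        raise ValueError("sequence must not be empty")
    # exp(-2*pi*i*pos/3) only takes three distinct values, one per codon position.
    twiddle = [cmath.exp(-2j * cmath.pi * r / 3) for r in range(3)]
    total = 0.0
    for base in "ACGT":
        component = sum(twiddle[pos % 3] for pos, ch in enumerate(seq) if ch == base)
        total += abs(component) ** 2
    return total / n


def sliding_period3(seq, window=351, step=3):
    """Period-3 power in successive windows; peaks suggest candidate coding regions."""
    return [
        (start, period3_power(seq[start:start + window]))
        for start in range(0, len(seq) - window + 1, step)
    ]
```

A sequence made of one repeated codon gives the largest possible value, while a sequence whose bases fall equally on all three codon positions gives a value near zero. Adaptive predictors aim to track the same period-3 component with less computation than recomputing a windowed transform at every position.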
47 CFR 54.404 - The National Lifeline Accountability Database.
Code of Federal Regulations, 2012 CFR
2012-10-01
... 47 Telecommunication 3 2012-10-01 2012-10-01 false The National Lifeline Accountability Database... National Lifeline Accountability Database. (a) State certification. An eligible telecommunications carrier... within 90 days of filing. (b) The National Lifeline Accountability Database. In order to receive Lifeline...
47 CFR 54.404 - The National Lifeline Accountability Database.
Code of Federal Regulations, 2013 CFR
2013-10-01
... 47 Telecommunication 3 2013-10-01 2013-10-01 false The National Lifeline Accountability Database... National Lifeline Accountability Database. (a) State certification. An eligible telecommunications carrier... within 90 days of filing. (b) The National Lifeline Accountability Database. In order to receive Lifeline...
47 CFR 54.404 - The National Lifeline Accountability Database.
Code of Federal Regulations, 2014 CFR
2014-10-01
... 47 Telecommunication 3 2014-10-01 2014-10-01 false The National Lifeline Accountability Database... National Lifeline Accountability Database. (a) State certification. An eligible telecommunications carrier... within 90 days of filing. (b) The National Lifeline Accountability Database. In order to receive Lifeline...
National Transportation Atlas Databases : 1999
DOT National Transportation Integrated Search
1999-01-01
The National Transportation Atlas Databases -- 1999 (NTAD99) is a set of national geographic databases of transportation facilities. These databases include geospatial information for transportation modal networks and intermodal terminals, and re...
National Transportation Atlas Databases : 2001
DOT National Transportation Integrated Search
2001-01-01
The National Transportation Atlas Databases-2001 (NTAD-2001) is a set of national geographic databases of transportation facilities. These databases include geospatial information for transportation modal networks and intermodal terminals and related...
National Transportation Atlas Databases : 1996
DOT National Transportation Integrated Search
1996-01-01
The National Transportation Atlas Databases -- 1996 (NTAD96) is a set of national geographic databases of transportation facilities. These databases include geospatial information for transportation modal networks and intermodal terminals, and re...
National Transportation Atlas Databases : 2000
DOT National Transportation Integrated Search
2000-01-01
The National Transportation Atlas Databases-2000 (NTAD-2000) is a set of national geographic databases of transportation facilities. These databases include geospatial information for transportation modal networks and intermodal terminals and related...
National Transportation Atlas Databases : 1997
DOT National Transportation Integrated Search
1997-01-01
The National Transportation Atlas Databases -- 1997 (NTAD97) is a set of national geographic databases of transportation facilities. These databases include geospatial information for transportation modal networks and intermodal terminals, and re...
Benson, Dennis A; Karsch-Mizrachi, Ilene; Lipman, David J; Ostell, James; Wheeler, David L
2005-01-01
GenBank is a comprehensive database that contains publicly available DNA sequences for more than 165,000 named organisms, obtained primarily through submissions from individual laboratories and batch submissions from large-scale sequencing projects. Most submissions are made using the web-based BankIt or standalone Sequin programs and accession numbers are assigned by GenBank staff upon receipt. Daily data exchange with the EMBL Data Library in the UK and the DNA Data Bank of Japan helps to ensure worldwide coverage. GenBank is accessible through NCBI's retrieval system, Entrez, which integrates data from the major DNA and protein sequence databases along with taxonomy, genome, mapping, protein structure and domain information, and the biomedical journal literature via PubMed. BLAST provides sequence similarity searches of GenBank and other sequence databases. Complete bimonthly releases and daily updates of the GenBank database are available by FTP. To access GenBank and its related retrieval and analysis services, go to the NCBI Homepage at http://www.ncbi.nlm.nih.gov.
Benson, Dennis A; Karsch-Mizrachi, Ilene; Lipman, David J; Ostell, James; Wheeler, David L
2006-01-01
GenBank® is a comprehensive database that contains publicly available DNA sequences for more than 205 000 named organisms, obtained primarily through submissions from individual laboratories and batch submissions from large-scale sequencing projects. Most submissions are made using the Web-based BankIt or standalone Sequin programs and accession numbers are assigned by GenBank staff upon receipt. Daily data exchange with the EMBL Data Library in Europe and the DNA Data Bank of Japan ensures worldwide coverage. GenBank is accessible through NCBI's retrieval system, Entrez, which integrates data from the major DNA and protein sequence databases along with taxonomy, genome, mapping, protein structure and domain information, and the biomedical journal literature via PubMed. BLAST provides sequence similarity searches of GenBank and other sequence databases. Complete bimonthly releases and daily updates of the GenBank database are available by FTP. To access GenBank and its related retrieval and analysis services, go to the NCBI Homepage at www.ncbi.nlm.nih.gov.
Online Mendelian Inheritance in Man (OMIM), a knowledgebase of human genes and genetic disorders.
Hamosh, Ada; Scott, Alan F; Amberger, Joanna; Bocchini, Carol; Valle, David; McKusick, Victor A
2002-01-01
Online Mendelian Inheritance in Man (OMIM) is a comprehensive, authoritative and timely knowledgebase of human genes and genetic disorders compiled to support research and education in human genomics and the practice of clinical genetics. Started by Dr Victor A. McKusick as the definitive reference Mendelian Inheritance in Man, OMIM (www.ncbi.nlm.nih.gov/omim) is now distributed electronically by the National Center for Biotechnology Information (NCBI), where it is integrated with the Entrez suite of databases. Derived from the biomedical literature, OMIM is written and edited at Johns Hopkins University with input from scientists and physicians around the world. Each OMIM entry has a full-text summary of a genetically determined phenotype and/or gene and has numerous links to other genetic databases such as DNA and protein sequence, PubMed references, general and locus-specific mutation databases, approved gene nomenclature, and the highly detailed mapviewer, as well as patient support groups and many others. OMIM is an easy and straightforward portal to the burgeoning information in human genetics.
Establishing a database of Canadian feline mitotypes for forensic use.
Arcieri, M; Agostinelli, G; Gray, Z; Spadaro, A; Lyons, L A; Webb, K M
2016-05-01
Hair shed by pet animals is often found and collected as evidence from crime scenes. Due to limitations such as small amount and low quality, mitochondrial DNA (mtDNA) is often the only type of DNA that can be used for linking the hair to a potential contributor. mtDNA has lower discriminatory power than nuclear DNA because multiple, unrelated individuals within a population can have the same mtDNA sequence, or mitotype. Therefore, to determine the evidentiary value of a match between crime scene evidence and a suspected contributor, the frequency of the mitotype must be known within the regional population. While mitotype frequencies have been determined for the United States' cat population, the frequencies are unknown for the Canadian cat population. Given the countries' close proximity and similar human settlement patterns, these populations may be homogeneous, meaning a single, regional database may be used for estimating cat population mitotype frequencies. Here we determined the mitotype frequencies of the Canadian cat population and compared them to those of the United States' cat population. The two cat populations are statistically homogeneous; however, mitotype B6 was found in high frequency in Canada and extremely low frequency in the United States, meaning a single database would not be appropriate for North America. Furthermore, this work calls attention to these local spikes in frequency of otherwise rare mitotypes, instances of which exist around the world and have the potential to misrepresent the evidentiary value of matches compared to a regional database. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.
Toren, Dmitri; Barzilay, Thomer; Tacutu, Robi; Lehmann, Gilad; Muradian, Khachik K; Fraifeld, Vadim E
2016-01-04
Mitochondria are the only organelles in the animal cells that have their own genome. Due to a key role in energy production, generation of damaging factors (ROS, heat), and apoptosis, mitochondria and mtDNA in particular have long been considered one of the major players in the mechanisms of aging, longevity and age-related diseases. The rapidly increasing number of species with fully sequenced mtDNA, together with accumulated data on longevity records, provides a new fascinating basis for comparative analysis of the links between mtDNA features and animal longevity. To facilitate such analyses and to support the scientific community in carrying these out, we developed the MitoAge database containing calculated mtDNA compositional features of the entire mitochondrial genome, mtDNA coding (tRNA, rRNA, protein-coding genes) and non-coding (D-loop) regions, and codon usage/amino acids frequency for each protein-coding gene. MitoAge includes 922 species with fully sequenced mtDNA and maximum lifespan records. The database is available through the MitoAge website (www.mitoage.org or www.mitoage.info), which provides the necessary tools for searching, browsing, comparing and downloading the data sets of interest for selected taxonomic groups across the Kingdom Animalia. The MitoAge website assists in statistical analysis of different features of the mtDNA and their correlative links to longevity. © The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research.
Goodwin, William H
2017-09-01
DNA analysis was first applied to the identification of victims of armed conflicts and other situations of violence (ACOSV) in the mid-1990s, starting in South America and the Balkans. Argentina was the first country to establish a genetic database specifically developed to identify disappeared children. Following on from these programs the early 2000s marked major programs, using a largely DNA-led approach, identifying missing persons in the Balkans and following the attack on the World Trade Center in New York. These two identification programs significantly expanded the magnitude of events to which DNA analysis was used to help provide the identity of missing persons. Guidelines developed by Interpol (2014) [1] related to best practice for identification of human remains following DVI type scenarios have been widely disseminated around the forensic community; in numerous cases these guidelines have been adopted or incorporated into national guidelines/standards/practice. However, given the complexity of many humanitarian contexts in which forensic science is employed there is a lack of internationally accepted guidelines, related to these contexts, for authorities to reference. In response the Argentine government's Human Rights Division in the Ministry of Foreign Affairs and Worship (MREC) proposed that the United Nations (UN) should promote best practice in the use of forensic genetics in humanitarian forensic action: this was adopted by the UN in Resolutions A/HRC/RES/10/26 and A/HRC/RES/15/5. Following on from the adoption of the resolutions MREC has coordinated, with the support of the International Committee of the Red Cross (ICRC), the drafting of a set of guidelines (MREC, ICRC, 2014) [2], with input from national and international agencies. To date the guidelines have been presented to South America's MERCOSUR and the UN and have been disseminated to interested parties. Copyright © 2017 Elsevier B.V. All rights reserved.
Typing DNA profiles from previously enhanced fingerprints using direct PCR.
Templeton, Jennifer E L; Taylor, Duncan; Handt, Oliva; Linacre, Adrian
2017-07-01
Fingermarks are a source of human identification both through the ridge patterns and DNA profiling. Typing nuclear STR DNA markers from previously enhanced fingermarks provides an alternative method of utilising the limited fingermark deposit that can be left behind during a criminal act. Dusting with fingerprint powders is a standard method used in classical fingermark enhancement and can affect DNA data. The ability to generate informative DNA profiles from powdered fingerprints using direct PCR swabs was investigated. Direct PCR was used because the chance of generating usable DNA profiles after any of the standard DNA extraction processes is minimal. Omitting the extraction step will, for many samples, be the key to success if there is limited sample DNA. DNA profiles were generated by direct PCR from 160 fingermarks after treatment with one of the following dactyloscopic fingerprint powders: white hadonite; silver aluminium; HiFi Volcano silk black; or black magnetic fingerprint powder. This was achieved by a combination of an optimised double-swabbing technique and swab media, omission of the extraction step to minimise loss of critical low-template DNA, and additional AmpliTaq Gold® DNA polymerase to boost the PCR. Ninety-eight out of 160 samples (61%) were considered 'up-loadable' to the Australian National Criminal Investigation DNA Database (NCIDD). The method described required a minimum of working steps, equipment and reagents, and was completed within 4 h. Direct PCR allows the generation of DNA profiles from enhanced prints without the need to increase PCR cycle numbers beyond manufacturer's recommendations. Particular emphasis was placed on preventing contamination by applying strict protocols and avoiding the use of previously used fingerprint brushes. Based on this extensive survey, the data provided indicate minimal effects of any of these four powders on the chance of obtaining DNA profiles from enhanced fingermarks.
Copyright © 2017 Elsevier B.V. All rights reserved.
Zou, Dong; Sun, Shixiang; Li, Rujiao; Liu, Jiang; Zhang, Jing; Zhang, Zhang
2015-01-01
DNA methylation plays crucial roles during embryonic development. Here we present MethBank (http://dnamethylome.org), a DNA methylome programming database that integrates the genome-wide single-base nucleotide methylomes of gametes and early embryos in different model organisms. Unlike extant relevant databases, MethBank incorporates the whole-genome single-base-resolution methylomes of gametes and early embryos at multiple different developmental stages in zebrafish and mouse. MethBank allows users to retrieve methylation levels, differentially methylated regions, CpG islands, gene expression profiles and genetic polymorphisms for a specific gene or genomic region. Moreover, it offers a methylome browser that is capable of visualizing high-resolution DNA methylation profiles as well as other related data in an interactive manner, and thus helps users investigate methylation patterns and changes of gametes and early embryos at different developmental stages. Ongoing efforts are focused on incorporation of methylomes and related data from other organisms. Together, MethBank features integration and visualization of high-resolution DNA methylation data as well as other related data, enabling identification of potential DNA methylation signatures in different developmental stages and accordingly providing an important resource for epigenetic and developmental studies. © The Author(s) 2014. Published by Oxford University Press on behalf of Nucleic Acids Research.
D.J. Glass; N. Takebayashi; L. Olson; D.L. Taylor
2013-01-01
The number of sequences from both formally described taxa and uncultured environmental DNA deposited in the International Nucleotide Sequence Databases has increased substantially over the last two decades. Although the majority of these sequences represent authentic gene copies, there is evidence of DNA artifacts in these databases as well. These include lab artifacts...
Forensic DNA Profiling and Database
Panneerchelvam, S.; Norazmi, M.N.
2003-01-01
The incredible power of DNA technology as an identification tool has brought a tremendous change in criminal justice. A DNA database is an information resource for the forensic DNA typing community with details on commonly used short tandem repeat (STR) DNA markers. This article discusses the essential steps in compilation of the Combined DNA Index System (CODIS) on validated polymerase chain reaction amplified STRs and their use in crime detection. PMID:23386793
Toward unification of taxonomy databases in a distributed computer environment
DOE Office of Scientific and Technical Information (OSTI.GOV)
Kitakami, Hajime; Tateno, Yoshio; Gojobori, Takashi
1994-12-31
All the taxonomy databases constructed with the DNA databases of the international DNA data banks are powerful electronic dictionaries which aid in biological research by computer. The taxonomy databases are, however, not consistently unified with a relational format. If we can achieve consistent unification of the taxonomy databases, it will be useful in comparing many research results, and investigating future research directions from existent research results. In particular, it will be useful in comparing relationships between phylogenetic trees inferred from molecular data and those constructed from morphological data. The goal of the present study is to unify the existent taxonomy databases and eliminate inconsistencies (errors) that are present in them. Inconsistencies occur particularly in the restructuring of the existent taxonomy databases, since classification rules for constructing the taxonomy have rapidly changed with biological advancements. A repair system is needed to remove inconsistencies in each data bank and mismatches among data banks. This paper describes a new methodology for removing both inconsistencies and mismatches from the databases in a distributed computer environment. The methodology is implemented in a relational database management system, SYBASE.
JASPAR 2010: the greatly expanded open-access database of transcription factor binding profiles
Portales-Casamar, Elodie; Thongjuea, Supat; Kwon, Andrew T.; Arenillas, David; Zhao, Xiaobei; Valen, Eivind; Yusuf, Dimas; Lenhard, Boris; Wasserman, Wyeth W.; Sandelin, Albin
2010-01-01
JASPAR (http://jaspar.genereg.net) is the leading open-access database of matrix profiles describing the DNA-binding patterns of transcription factors (TFs) and other proteins interacting with DNA in a sequence-specific manner. Its fourth major release is the largest expansion of the core database to date: the database now holds 457 non-redundant, curated profiles. The new entries include the first batch of profiles derived from ChIP-seq and ChIP-chip whole-genome binding experiments, and 177 yeast TF binding profiles. The introduction of a yeast division brings the convenience of JASPAR to an active research community. As binding models are refined by newer data, the JASPAR database now uses versioning of matrices: in this release, 12% of the older models were updated to improved versions. Classification of TF families has been improved by adopting a new DNA-binding domain nomenclature. A curated catalog of mammalian TFs is provided, extending the use of the JASPAR profiles to additional TFs belonging to the same structural family. The changes in the database set the system ready for more rapid acquisition of new high-throughput data sources. Additionally, three new special collections provide matrix profile data produced by recent alternative high-throughput approaches. PMID:19906716
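As a rough illustration of what a "matrix profile" is, the sketch below turns a count matrix into a log-odds position weight matrix and scores candidate sites with it. The count matrix, the pseudocount of 0.8, the uniform background of 0.25 and the function names are all assumptions made for this example; none of them come from JASPAR itself.

```python
import math

BASES = "ACGT"

def pfm_to_pwm(pfm, pseudocount=0.8, background=0.25):
    """Convert per-position base counts into log2-odds weights.

    pfm maps each base to a list of counts, one entry per motif column.
    The pseudocount and uniform background are illustrative assumptions.
    """
    length = len(pfm["A"])
    pwm = {base: [] for base in BASES}
    for i in range(length):
        total = sum(pfm[base][i] for base in BASES)
        for base in BASES:
            # Smoothed frequency, so zero counts never give log(0).
            freq = (pfm[base][i] + pseudocount * background) / (total + pseudocount)
            pwm[base].append(math.log2(freq / background))
    return pwm

def score(pwm, site):
    """Sum the column weights for a site of the same length as the motif."""
    return sum(pwm[base][i] for i, base in enumerate(site))

# Hypothetical 4-column count matrix (not a real JASPAR profile).
toy_pfm = {
    "A": [8, 0, 0, 2],
    "C": [0, 0, 10, 3],
    "G": [1, 10, 0, 3],
    "T": [1, 0, 0, 2],
}
toy_pwm = pfm_to_pwm(toy_pfm)
print(score(toy_pwm, "AGCC"), score(toy_pwm, "TTTT"))
```

A site that matches the most frequent base in each column scores well above zero, while a site built from rare bases scores below zero.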
DNA barcoding of vouchered xylarium wood specimens of nine endangered Dalbergia species
Min Yu; Lichao Jiao; Juan Guo; Alex C. Wiedenhoeft; Tuo He; Xiaomei Jiang; Yafang Yin
2017-01-01
ITS2+trnH-psbA was the best combination of DNA barcode to resolve the Dalbergia wood species studied. We demonstrate the feasibility of building a DNA barcode reference database using xylarium wood specimens.
Jakobsen, Tanja Roien; Clausen, Frederik Banch; Rode, Line; Dziegiel, Morten Hanefeld; Tabor, Ann
2013-09-01
The objective was to investigate whether women who develop preeclampsia can be identified in a routine analysis when determining fetal RHD status at 25 weeks' gestation in combination with PAPP-A levels at the first-trimester combined risk assessment for Trisomy 21. D- women participating in the routine antenatal RHD screening program in the capital region of Denmark were retrospectively studied. We used a standard dilution curve to quantify the amounts of cell-free fetal DNA (cffDNA) and divided women into groups according to cffDNA levels. PAPP-A was measured at 11 to 14 weeks. Information about pregnancy outcome and complications was obtained from the National Fetal Medicine Database, medical charts, and discharge letters. The odds ratio (OR) of developing severe preeclampsia given a cffDNA level above the 90th percentile compared to cffDNA below the 90th percentile was 8.1 (95% confidence interval [CI], 2.6-25.5). The OR of developing mild preeclampsia given a cffDNA level below the 5th percentile compared to cffDNA levels above the 5th percentile was 3.6 (95% CI, 1.1-11.7). PAPP-A levels below the 5th percentile were associated with mild preeclampsia, but adding it to the analysis did not increase the detection rate (DR). Women with cffDNA levels below the 5th percentile and above the 90th percentile quantified at 25 weeks' gestation are at increased risk of developing preeclampsia. Adding PAPP-A levels to the analysis did not increase the DR of preeclampsia. © 2013 American Association of Blood Banks.
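For readers unfamiliar with how an odds ratio and its confidence interval are obtained from a 2x2 table, here is a minimal sketch. It assumes the common log-scale (Woolf) interval; the abstract does not say which method the authors used, and the counts below are invented purely for illustration and are not the study's data.

```python
import math

def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio with an approximate 95% CI from a 2x2 table.

    a: exposed with outcome      b: exposed without outcome
    c: unexposed with outcome    d: unexposed without outcome
    Uses the log-scale (Woolf) standard error; all cells must be > 0.
    """
    odds_ratio = (a * d) / (b * c)
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log)
    upper = math.exp(math.log(odds_ratio) + z * se_log)
    return odds_ratio, lower, upper

# Hypothetical counts, NOT taken from the study above.
print(odds_ratio_ci(6, 94, 8, 892))
```

With small cell counts, as is typical for rare outcomes such as severe preeclampsia, the interval becomes wide, which matches the broad intervals reported in the abstract.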
FBIS: A regional DNA barcode archival & analysis system for Indian fishes.
Nagpure, Naresh Sahebrao; Rashid, Iliyas; Pathak, Ajey Kumar; Singh, Mahender; Singh, Shri Prakash; Sarkar, Uttam Kumar
2012-01-01
DNA barcoding is a new tool for taxon recognition and classification of biological organisms based on the sequence of a fragment of the mitochondrial gene cytochrome c oxidase I (COI). In view of the growing importance of fish DNA barcoding for species identification, molecular taxonomy and fish diversity conservation, we developed a Fish Barcode Information System (FBIS) for Indian fishes, which will serve as a regional DNA barcode archival and analysis system. The database presently contains 2334 sequence records of the COI gene for 472 aquatic species belonging to 39 orders and 136 families, collected from available published data sources. Additionally, it contains information on phenotype, distribution and IUCN Red List status of fishes. The web version of FBIS was designed using MySQL, Perl and PHP on the Linux platform to (a) store and manage the acquisition, (b) analyze and explore DNA barcode records, and (c) identify species and estimate genetic divergence. FBIS has also been integrated with appropriate tools for retrieving and viewing information about the database statistics and taxonomy. It is expected that FBIS will be useful as a potent information system in fish molecular taxonomy, phylogeny and genomics. The database is available for free at http://mail.nbfgr.res.in/fbis/
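The abstract does not say how FBIS estimates genetic divergence. A measure often used in barcoding work is the Kimura 2-parameter (K2P) distance, so the sketch below assumes that choice; the function name and the toy sequences are hypothetical.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}

def k2p_distance(seq1, seq2):
    """Kimura 2-parameter distance between two aligned sequences.

    Sites where either sequence has a gap or ambiguity code are skipped.
    Raises ValueError (math domain error) for saturated sequence pairs.
    """
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    transitions = transversions = compared = 0
    for x, y in zip(seq1.upper(), seq2.upper()):
        if x not in "ACGT" or y not in "ACGT":
            continue
        compared += 1
        if x == y:
            continue
        # A<->G and C<->T are transitions; everything else is a transversion.
        if {x, y} <= PURINES or {x, y} <= PYRIMIDINES:
            transitions += 1
        else:
            transversions += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    p = transitions / compared
    q = transversions / compared
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))

# Toy 10-bp fragments differing by one transition (illustrative only).
print(k2p_distance("AAAAAAAAAA", "GAAAAAAAAA"))
```

The corrected distance is slightly larger than the raw proportion of differing sites, because the model allows for multiple substitutions at the same site.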
Jacewicz, Renata; Ossowski, Andrzej; Ławrynowicz, Olgierd; Jędrzejczyk, Maciej; Prośniak, Adam; Bąbol-Pokora, Katarzyna; Diepenbroek, Marta; Szargut, Maria; Zielińska, Grażyna; Berent, Jarosław
2017-01-01
It can be reasonably assumed that remains exhumed in 2012 and 2013 during archaeological explorations conducted in the Lućmierz Forest, an important area on the map of the German Nazi terror in the region of Lodz (Poland), are in fact the remains of a hundred Poles murdered by the Nazis in Zgierz on March 20, 1942. By virtue of a decision of the Polish Institute of National Remembrance's Commission for the Prosecution of Crimes Against the Polish Nation, the verification of this research hypothesis was entrusted to SIGO (Network for Genetic Identification of Victims) Consortium appointed by virtue of an agreement of December 11, 2015. The Consortium is an extension of the PBGOT (Polish Genetic Database of Totalitarianisms Victims). So far, the researchers have retrieved 14 DNA profiles from among the examined remains, including 12 male and 2 female profiles. Furthermore, 12 DNA profiles of the victims' family members have been collected. Due to the fact that next-of-kin relatives of the victims of the Zgierz massacre are of advanced age, it is of key importance to collect genetic material as soon as possible from the other surviving family members, identified on the basis of a list of victims that has been nearly completely compiled by the Polish Institute of National Remembrance (IPN) and is presented in this paper.
Fleet DNA Phase 1 Refinement & Phase 2 Implementation; NREL (National Renewable Energy Laboratory)
DOE Office of Scientific and Technical Information (OSTI.GOV)
Kelly, Kenneth; Duran, Adam
2015-06-11
Fleet DNA acts as a secure data warehouse for medium- and heavy-duty vehicle data. It demonstrates that vehicle drive cycle data can be collected and stored for large-scale analysis and modeling applications. The data serve as a real-world data source for model development and validation. Storage of the results of past/present/future data collection efforts improves analysis efficiency through pooling of shared data and provides the opportunity for 'big data' type analyses. Fleet DNA shows it is possible to develop a common database structure that can store/analyze/report on data sourced from multiple parties, each with unique data formats/types. Data filtration and normalization algorithms developed for the project allow for a wide range of data types and inputs, expanding the project’s potential. Fleet DNA demonstrates the power of integrating Big Data with existing and future tools and analyses: it provides an enhanced understanding and education of users, users can explore greenhouse gases and economic opportunities via AFLEET and ADOPT modeling, drive cycles can be characterized and visualized using DRIVE, high-level vehicle modeling can be performed using real-world drive cycles via FASTSim, and data reporting through Fleet DNA Phase 1 and 2 websites provides external users access to analysis results and gives the opportunity to explore on their own.
DNA Profiling of Convicted Offender Samples for the Combined DNA Index System
ERIC Educational Resources Information Center
Millard, Julie T
2011-01-01
The cornerstone of forensic chemistry is that a perpetrator inevitably leaves trace evidence at a crime scene. One important type of evidence is DNA, which has been instrumental in both the implication and exoneration of thousands of suspects in a wide range of crimes. The Combined DNA Index System (CODIS), a network of DNA databases, provides…
Criminal genomic pragmatism: prisoners' representations of DNA technology and biosecurity.
Machado, Helena; Silva, Susana
2012-01-01
Within the context of the use of DNA technology in crime investigation, biosecurity is perceived by different stakeholders according to their particular rationalities and interests. Very little is known about prisoners' perceptions and assessments of the uses of DNA technology in solving crime. This study aims to propose a conceptual model that serves to analyse and interpret prisoners' representations of DNA technology and biosecurity. A qualitative study using an interpretative approach based on 31 semi-structured tape-recorded interviews was carried out between May and September 2009, involving male inmates in three prisons located in the north of Portugal. The content analysis focused on the following topics: the meanings attributed to DNA and assessments of the risks and benefits of the uses of DNA technology and databasing in forensic applications. DNA was described as a record of identity, an exceptional material, and a powerful biometric identifier. The interviewees believed that DNA can be planted to incriminate suspects. Convicted offenders argued for the need to extend the criteria for the inclusion of DNA profiles in forensic databases and to restrict the removal of profiles. The conceptual model entitled criminal genomic pragmatism allows for an understanding of the views of prison inmates regarding DNA technology and biosecurity.
5S ribosomal RNA database Y2K.
Szymanski, Maciej; Barciszewska, Miroslawa Z.; Barciszewski, Jan; Erdmann, Volker A.
2000-01-01
This paper presents the updated version (Y2K) of the database of ribosomal 5S ribonucleic acids (5S rRNA) and their genes (5S rDNA), http://rose.man/poznan.pl/5SData/index.html. This edition of the database contains 1985 primary structures of 5S rRNA and 5S rDNA. They include 60 archaebacterial, 470 eubacterial, 63 plastid, nine mitochondrial and 1383 eukaryotic sequences. The nucleotide sequences of the 5S rRNAs or 5S rDNAs are divided according to the taxonomic position of the source organisms. PMID:10592212
Integration of DNA sample collection into a multi-site birth defects case-control study.
Rasmussen, Sonja A; Lammer, Edward J; Shaw, Gary M; Finnell, Richard H; McGehee, Robert E; Gallagher, Margaret; Romitti, Paul A; Murray, Jeffrey C
2002-10-01
Advances in quantitative analysis and molecular genotyping have provided unprecedented opportunities to add biological sampling and genetic information to epidemiologic studies. The purpose of this article is to describe the incorporation of DNA sample collection into the National Birth Defects Prevention Study (NBDPS), an ongoing case-control study in an eight-state consortium with a primary goal to identify risk factors for birth defects. Babies with birth defects are identified through birth defects surveillance systems in the eight participating centers. Cases are infants with one or more of over 30 major birth defects. Controls are infants without defects from the same geographic area. Epidemiologic information is collected through an hour-long interview with mothers of both cases and controls. We added the collection of buccal cytobrush DNA samples for case-infants, control-infants, and their parents to this study. We describe here the methods by which the samples have been collected and processed, establishment of a centralized resource for DNA banking, and quality control, database management, access, informed consent, and confidentiality issues. Biological sampling and genetic analyses are important components to epidemiologic studies of birth defects aimed at identifying risk factors. The DNA specimens collected in this study can be used for detection of mutations, study of polymorphic variants that confer differential susceptibility to teratogens, and examination of interactions among genetic risk factors. Information on the methods used and issues faced by the NBDPS may be of value to others considering the addition of DNA sampling to epidemiologic studies.
García-Sancho, Miguel
2011-01-01
This paper explores the introduction of professional systems engineers and information management practices into the first centralized DNA sequence database, developed at the European Molecular Biology Laboratory (EMBL) during the 1980s. In so doing, it complements the literature on the emergence of an information discourse after World War II and its subsequent influence in biological research. By analyzing the careers of the database creators and the computer algorithms they designed, I show that from the mid-1960s onwards information in biology gradually shifted from being a pervasive metaphor to being embodied in practices and professionals such as those incorporated at the EMBL. I then investigate the reception of these database professionals by the EMBL biological staff, which evolved from initial disregard to necessary collaboration as the relationship between DNA, genes, and proteins turned out to be more complex than expected. The trajectories of the database professionals at the EMBL suggest that the initial subject matter of the historiography of genomics should be the long-standing practices that emerged after World War II and to a large extent originated outside biomedicine and academia. Only after addressing these practices may historians turn to their further disciplinary assemblage in fields such as bioinformatics or biotechnology.
Robasky, Kimberly; Bulyk, Martha L
2011-01-01
The Universal PBM Resource for Oligonucleotide-Binding Evaluation (UniPROBE) database is a centralized repository of information on the DNA-binding preferences of proteins as determined by universal protein-binding microarray (PBM) technology. Each entry for a protein (or protein complex) in UniPROBE provides the quantitative preferences for all possible nucleotide sequence variants ('words') of length k ('k-mers'), as well as position weight matrix (PWM) and graphical sequence logo representations of the k-mer data. In this update, we describe >130% expansion of the database content, incorporation of a protein BLAST (blastp) tool for finding protein sequence matches in UniPROBE, the introduction of UniPROBE accession numbers and additional database enhancements. The UniPROBE database is available at http://uniprobe.org.
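To make the idea of "all possible nucleotide sequence variants of length k" concrete, the sketch below enumerates k-mers and merges each word with its reverse complement, since a double-stranded probe presents both. Treating reverse complements as one entry and using k = 8 in the example are assumptions for illustration; the abstract only speaks of length k.

```python
from itertools import product

COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(word):
    """Return the reverse complement of a DNA word."""
    return word.translate(COMPLEMENT)[::-1]

def canonical_kmers(k):
    """List every k-mer once, merging each word with its reverse complement.

    The lexicographically smaller of the two strands is kept as the
    representative (a convention chosen for this sketch).
    """
    seen = set()
    for letters in product("ACGT", repeat=k):
        word = "".join(letters)
        seen.add(min(word, reverse_complement(word)))
    return sorted(seen)

print(len(canonical_kmers(8)))  # 32896
```

For even k the count is (4^k + 4^(k/2)) / 2, because the 4^(k/2) self-complementary words are not paired with a distinct partner.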
Tay, Wee Tek; Walsh, Thomas K.; Downes, Sharon; Anderson, Craig; Jermiin, Lars S.; Wong, Thomas K. F.; Piper, Melissa C.; Chang, Ester Silva; Macedo, Isabella Barony; Czepak, Cecilia; Behere, Gajanan T.; Silvie, Pierre; Soria, Miguel F.; Frayssinet, Marie; Gordon, Karl H. J.
2017-01-01
The Old World bollworm Helicoverpa armigera is now established in Brazil but efforts to identify incursion origin(s) and pathway(s) have met with limited success due to the patchiness of available data. Using international agricultural/horticultural commodity trade data and mitochondrial DNA (mtDNA) cytochrome oxidase I (COI) and cytochrome b (Cyt b) gene markers, we inferred the origins and incursion pathways into Brazil. We detected 20 mtDNA haplotypes from six Brazilian states, eight of which were new to our 97 global COI-Cyt b haplotype database. Direct sequence matches indicated five Brazilian haplotypes had Asian, African, and European origins. We identified 45 parsimoniously informative sites and multiple substitutions per site within the concatenated (945 bp) nucleotide dataset, implying that probabilistic phylogenetic analysis methods are needed. High diversity and signatures of uniquely shared haplotypes with diverse localities combined with the trade data suggested multiple incursions and introduction origins in Brazil. Increasing agricultural/horticultural trade activities between the Old and New Worlds represents a significant biosecurity risk factor. Identifying pest origins will enable resistance profiling that reflects countries of origin to be included when developing a resistance management strategy, while identifying incursion pathways will improve biosecurity protocols and risk analysis at biosecurity hotspots including national ports. PMID:28350004
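The count of parsimony-informative sites mentioned above follows a simple rule: a column of the alignment is informative when at least two different bases each occur in at least two sequences. The sketch below applies that standard definition to a toy alignment; the sequences are invented and are not the COI-Cyt b data.

```python
from collections import Counter

def parsimony_informative_sites(alignment):
    """Return the 0-based column indices that are parsimony-informative.

    A column qualifies when at least two distinct bases each appear in
    at least two sequences. Gaps and ambiguity codes are ignored.
    """
    if len({len(seq) for seq in alignment}) != 1:
        raise ValueError("all sequences must have the same aligned length")
    informative = []
    for index, column in enumerate(zip(*alignment)):
        counts = Counter(base for base in column if base in "ACGT")
        if sum(1 for n in counts.values() if n >= 2) >= 2:
            informative.append(index)
    return informative

# Hypothetical four-sequence alignment.
toy_alignment = ["ACGTA", "ACGTC", "ATGAC", "ATGAA"]
print(parsimony_informative_sites(toy_alignment))  # [1, 3, 4]
```

Columns where only one sequence differs (singletons) are variable but carry no grouping information under parsimony, so they are excluded.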
Turchi, Chiara; Stanciu, Florin; Paselli, Giorgia; Buscemi, Loredana; Parson, Walther; Tagliabracci, Adriano
2016-09-01
To evaluate the pattern of the Romanian population from a mitochondrial perspective and to establish an appropriate mtDNA forensic database, we generated a high-quality mtDNA control region dataset from 407 Romanian subjects belonging to four major historical regions: Moldavia, Transylvania, Wallachia and Dobruja. The entire control region (CR) was analyzed by Sanger-type sequencing assays and the resulting 306 different haplotypes were classified into haplogroups according to the most updated mtDNA phylogeny. The Romanian gene pool is mainly composed of West Eurasian lineages H (31.7%), U (12.8%), J (10.8%), R (10.1%), T (9.1%), N (8.1%), HV (5.4%), K (3.7%), HV0 (4.2%), with exceptions of East Asian haplogroup M (3.4%) and African haplogroup L (0.7%). The pattern of mtDNA variation observed in this study indicates that the mitochondrial DNA pool is geographically homogeneous across Romania and that the haplogroup composition reveals signals of admixture of populations of different origin. The PCA scatterplot supported this scenario, with Romania located in the southeastern Europe area, close to Bulgaria and Hungary, and as a borderland with respect to east Mediterranean and other eastern European countries. High haplotype diversity (0.993) and nucleotide diversity indices (0.00838±0.00426), together with low random match probability (0.0087), suggest the usefulness of this control region dataset as a forensic database in routine forensic mtDNA analysis and in the investigation of maternal genetic lineages in the Romanian population. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.
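Haplotype diversity and random match probability, as quoted above, are both functions of the haplotype frequencies in the sample. The sketch below uses the usual definitions (random match probability as the sum of squared frequencies, and the sample-size-corrected diversity estimator); the abstract does not state which estimators were applied, so this is an assumption, and the haplotype labels are made up.

```python
from collections import Counter

def match_statistics(haplotypes):
    """Return (haplotype diversity, random match probability).

    rmp       = sum of squared haplotype frequencies
    diversity = n / (n - 1) * (1 - rmp), the sample-size-corrected form
    """
    n = len(haplotypes)
    if n < 2:
        raise ValueError("need at least two samples")
    counts = Counter(haplotypes)
    rmp = sum((count / n) ** 2 for count in counts.values())
    diversity = n / (n - 1) * (1 - rmp)
    return diversity, rmp

# Hypothetical sample: one haplotype seen twice, two seen once.
print(match_statistics(["hapA", "hapA", "hapB", "hapC"]))
```

When every sampled individual carries a different haplotype, the random match probability falls to 1/n and the corrected diversity reaches 1.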
CRITICA: coding region identification tool invoking comparative analysis
NASA Technical Reports Server (NTRS)
Badger, J. H.; Olsen, G. J.; Woese, C. R. (Principal Investigator)
1999-01-01
Gene recognition is essential to understanding existing and future DNA sequence data. CRITICA (Coding Region Identification Tool Invoking Comparative Analysis) is a suite of programs for identifying likely protein-coding sequences in DNA by combining comparative analysis of DNA sequences with more common noncomparative methods. In the comparative component of the analysis, regions of DNA are aligned with related sequences from the DNA databases; if the translation of the aligned sequences has greater amino acid identity than expected for the observed percentage nucleotide identity, this is interpreted as evidence for coding. CRITICA also incorporates noncomparative information derived from the relative frequencies of hexanucleotides in coding frames versus other contexts (i.e., dicodon bias). The dicodon usage information is derived by iterative analysis of the data, such that CRITICA is not dependent on the existence or accuracy of coding sequence annotations in the databases. This independence makes the method particularly well suited for the analysis of novel genomes. CRITICA was tested by analyzing the available Salmonella typhimurium DNA sequences. Its predictions were compared with the DNA sequence annotations and with the predictions of GenMark. CRITICA proved to be more accurate than GenMark, and moreover, many of its predictions that would seem to be errors instead reflect problems in the sequence databases. The source code of CRITICA is freely available by anonymous FTP (rdp.life.uiuc.edu in /pub/critica) and on the World Wide Web (http://rdpwww.life.uiuc.edu).
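The dicodon statistic described above rests on counting hexanucleotides that start on codon boundaries. The following sketch shows only that counting step for a single reading frame; it is not CRITICA's code, and the function name and example sequence are hypothetical.

```python
from collections import Counter

def dicodon_counts(coding_seq):
    """Count in-frame hexanucleotides (dicodons) in one reading frame.

    The window advances by three bases so that each hexamer starts on a
    codon boundary. Hexamers containing non-ACGT symbols are skipped.
    """
    seq = coding_seq.upper()
    counts = Counter()
    for start in range(0, len(seq) - 5, 3):
        hexamer = seq[start:start + 6]
        if set(hexamer) <= set("ACGT"):
            counts[hexamer] += 1
    return counts

# Toy open reading frame of four codons (illustrative only).
print(dicodon_counts("ATGGCTAAATAA"))
```

Comparing such in-frame counts with counts taken from other frames or from non-coding sequence gives the relative frequencies on which a dicodon bias score can be built.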
USDA-ARS's Scientific Manuscript database
Welcome to the Morchella MLST database. This dedicated database was set up at the CBS-KNAW Biodiversity Center by Vincent Robert in February 2012, using BioloMICS software (Robert et al., 2011), to facilitate DNA sequence-based identifications of Morchella species via the Internet. The current datab...
Tsybovskii, I S; Veremeichik, V M; Kotova, S A; Kritskaya, S V; Evmenenko, S A; Udina, I G
2017-02-01
For the Republic of Belarus, the development of a forensic reference database based on 18 autosomal microsatellites (STR) is described. It draws on a population dataset (N = 1040), a “familial” genotypic dataset (N = 2550) obtained from paternity testing casework, and a dataset of genotypes from a criminal registration database (N = 8756). The population samples studied consist of 80% ethnic Belarusians and 20% individuals of other nationality or of mixed origin (by questionnaire data). Genotypes of 12346 inhabitants of the Republic of Belarus from 118 regional samples studied by 18 autosomal microsatellites are included in the sample: 16 tetranucleotide STR (D2S1338, TPOX, D3S1358, CSF1PO, D5S818, D8S1179, D7S820, TH01, vWA, D13S317, D16S539, D18S51, D19S433, D21S11, F13B, and FGA) and two pentanucleotide STR (Penta D and Penta E). The samples studied are in Hardy–Weinberg equilibrium according to the distribution of genotypes at the 18 STR. Significant differences were not detected between discrete populations or between samples from various historical ethnographic regions of the Republic of Belarus (Western and Eastern Polesie, Podneprovye, Ponemanye, Poozerye, and Center), which indicates the absence of prominent genetic differentiation. Statistically significant differences between the studied genotypic datasets also were not detected, which made it possible to combine the datasets and consider the total sample as a unified forensic reference database for 18 “criminalistic” STR loci. Differences between the reference database of the Republic of Belarus and Russians and Ukrainians in the distribution of the range of autosomal STR also were not detected, corresponding to a close genetic relationship of the three Eastern Slavic nations mediated by common origin and intense mutual migrations. Significant differences at separate STR loci between the reference database of the Republic of Belarus and populations of Southern and Western Slavs were observed. The necessity of using an original reference database to support forensic expertise practice in the Republic of Belarus was demonstrated.
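A Hardy-Weinberg check at one locus compares observed genotype counts with the counts expected from the allele frequencies. The sketch below computes a chi-square statistic for a multi-allelic locus. The abstract does not say which test the authors used, so the chi-square form is an assumption (exact tests are often preferred for sparse STR data), and the genotype counts are invented.

```python
from collections import Counter
from itertools import combinations_with_replacement

def hwe_chi_square(genotypes):
    """Chi-square statistic for Hardy-Weinberg proportions at one locus.

    genotypes is a list of (allele, allele) pairs, one per individual.
    Expected counts: n * p_i**2 for homozygotes, 2 * n * p_i * p_j otherwise.
    """
    n = len(genotypes)
    observed = Counter(tuple(sorted(g)) for g in genotypes)
    allele_counts = Counter(allele for g in genotypes for allele in g)
    freqs = {allele: count / (2 * n) for allele, count in allele_counts.items()}
    chi2 = 0.0
    for a, b in combinations_with_replacement(sorted(freqs), 2):
        if a == b:
            expected = n * freqs[a] ** 2
        else:
            expected = 2 * n * freqs[a] * freqs[b]
        chi2 += (observed.get((a, b), 0) - expected) ** 2 / expected
    return chi2

# Hypothetical genotypes at a two-allele locus, in exact HW proportions.
sample = [(9, 9)] * 25 + [(9, 10)] * 50 + [(10, 10)] * 25
print(hwe_chi_square(sample))  # 0.0
```

For a locus with k alleles the statistic is usually compared against a chi-square distribution with k(k - 1)/2 degrees of freedom, although the approximation is poor when many genotype classes have small expected counts.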
Code of Federal Regulations, 2012 CFR
2012-10-01
... TRANSPORTATION NATIONAL TRANSIT DATABASE § 630.4 Requirements. (a) National Transit Database Reporting System... from the National Transit Database Web site located at http://www.ntdprogram.gov. These reference... Transit Database Web site and a notice of any significant changes to the reporting requirements specified...
Code of Federal Regulations, 2011 CFR
2011-10-01
... TRANSPORTATION NATIONAL TRANSIT DATABASE § 630.4 Requirements. (a) National Transit Database Reporting System... from the National Transit Database Web site located at http://www.ntdprogram.gov. These reference... Transit Database Web site and a notice of any significant changes to the reporting requirements specified...
Code of Federal Regulations, 2010 CFR
2010-10-01
... TRANSPORTATION NATIONAL TRANSIT DATABASE § 630.4 Requirements. (a) National Transit Database Reporting System... from the National Transit Database Web site located at http://www.ntdprogram.gov. These reference... Transit Database Web site and a notice of any significant changes to the reporting requirements specified...
Code of Federal Regulations, 2014 CFR
2014-10-01
... TRANSPORTATION NATIONAL TRANSIT DATABASE § 630.4 Requirements. (a) National Transit Database Reporting System... from the National Transit Database Web site located at http://www.ntdprogram.gov. These reference... Transit Database Web site and a notice of any significant changes to the reporting requirements specified...
Code of Federal Regulations, 2013 CFR
2013-10-01
... TRANSPORTATION NATIONAL TRANSIT DATABASE § 630.4 Requirements. (a) National Transit Database Reporting System... from the National Transit Database Web site located at http://www.ntdprogram.gov. These reference... Transit Database Web site and a notice of any significant changes to the reporting requirements specified...
mtDNA sequence diversity of Hazara ethnic group from Pakistan.
Rakha, Allah; Fatima; Peng, Min-Sheng; Adan, Atif; Bi, Rui; Yasmin, Memona; Yao, Yong-Gang
2017-09-01
The present study was undertaken to investigate mitochondrial DNA (mtDNA) control region sequences of Hazaras from Pakistan, so as to generate mtDNA reference database for forensic casework in Pakistan and to analyze phylogenetic relationship of this particular ethnic group with geographically proximal populations. Complete mtDNA control region (nt 16024-576) sequences were generated through Sanger Sequencing for 319 Hazara individuals from Quetta, Baluchistan. The population sample set showed a total of 189 distinct haplotypes, belonging mainly to West Eurasian (51.72%), East & Southeast Asian (29.78%) and South Asian (18.50%) haplogroups. Compared with other populations from Pakistan, the Hazara population had a relatively high haplotype diversity (0.9945) and a lower random match probability (0.0085). The dataset has been incorporated into EMPOP database under accession number EMP00680. The data herein comprises the largest, and likely most thoroughly examined, control region mtDNA dataset from Hazaras of Pakistan. Copyright © 2017 Elsevier B.V. All rights reserved.
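The two summary statistics quoted in this abstract can be sketched as follows, assuming the usual definitions: random match probability as the sum of squared haplotype frequencies, and haplotype diversity as Nei's unbiased estimator. The abstract does not state the formulas, so this is an assumption, and `haplotype_stats` is a hypothetical helper name.

```python
from collections import Counter


def haplotype_stats(haplotypes):
    """Return (haplotype_diversity, random_match_probability).

    haplotypes is a list with one haplotype label per sampled individual.
    """
    n = len(haplotypes)
    counts = Counter(haplotypes)
    # Random match probability: chance that two random draws share a haplotype.
    rmp = sum((c / n) ** 2 for c in counts.values())
    # Nei's unbiased haplotype diversity.
    diversity = n / (n - 1) * (1 - rmp)
    return diversity, rmp
```

Under these definitions the reported figures are mutually consistent: with n = 319 and a diversity of 0.9945, the implied sum of squared frequencies is 1 − 0.9945 × 318/319 ≈ 0.0086, close to the reported 0.0085.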
Surgical research using national databases
Leland, Hyuma; Heckmann, Nathanael
2016-01-01
Recent changes in healthcare and advances in technology have increased the use of large-volume national databases in surgical research. These databases have been used to develop perioperative risk stratification tools, assess postoperative complications, calculate costs, and investigate numerous other topics across multiple surgical specialties. The results of these studies contain variable information but are subject to unique limitations. The use of large-volume national databases is increasing in popularity, and thorough understanding of these databases will allow for a more sophisticated and better educated interpretation of studies that utilize such databases. This review will highlight the composition, strengths, and weaknesses of commonly used national databases in surgical research. PMID:27867945
Surgical research using national databases.
Alluri, Ram K; Leland, Hyuma; Heckmann, Nathanael
2016-10-01
Recent changes in healthcare and advances in technology have increased the use of large-volume national databases in surgical research. These databases have been used to develop perioperative risk stratification tools, assess postoperative complications, calculate costs, and investigate numerous other topics across multiple surgical specialties. The results of these studies contain variable information but are subject to unique limitations. The use of large-volume national databases is increasing in popularity, and thorough understanding of these databases will allow for a more sophisticated and better educated interpretation of studies that utilize such databases. This review will highlight the composition, strengths, and weaknesses of commonly used national databases in surgical research.
The Design and Product of National 1:1000000 Cartographic Data of Topographic Map
NASA Astrophysics Data System (ADS)
Wang, Guizhi
2016-06-01
The National Administration of Surveying, Mapping and Geoinformation launched the project for dynamic updating of the national fundamental geographic information database in 2012. Within this project, the 1:50000 database is updated once a year, and the 1:250000 database is downscaled and linkage-updated on that basis. In 2014, the latest results of the 1:250000 database were used to comprehensively update the 1:1000000 digital line graph database. At the same time, cartographic data of topographic map and digital elevation model data were generated. This article mainly introduces the national 1:1000000 cartographic data of topographic map, including feature content, database structure, database-driven mapping technology, workflow and so on.
45 CFR 1356.80 - Scope of the National Youth in Transition Database.
Code of Federal Regulations, 2011 CFR
2011-10-01
... 45 Public Welfare 4 2011-10-01 2011-10-01 false Scope of the National Youth in Transition Database... REQUIREMENTS APPLICABLE TO TITLE IV-E § 1356.80 Scope of the National Youth in Transition Database. The requirements of the National Youth in Transition Database (NYTD) §§ 1356.81 through 1356.86 of this part apply...
45 CFR 1356.80 - Scope of the National Youth in Transition Database.
Code of Federal Regulations, 2013 CFR
2013-10-01
... 45 Public Welfare 4 2013-10-01 2013-10-01 false Scope of the National Youth in Transition Database... REQUIREMENTS APPLICABLE TO TITLE IV-E § 1356.80 Scope of the National Youth in Transition Database. The requirements of the National Youth in Transition Database (NYTD) §§ 1356.81 through 1356.86 of this part apply...
45 CFR 1356.80 - Scope of the National Youth in Transition Database.
Code of Federal Regulations, 2010 CFR
2010-10-01
... 45 Public Welfare 4 2010-10-01 2010-10-01 false Scope of the National Youth in Transition Database... REQUIREMENTS APPLICABLE TO TITLE IV-E § 1356.80 Scope of the National Youth in Transition Database. The requirements of the National Youth in Transition Database (NYTD) §§ 1356.81 through 1356.86 of this part apply...
45 CFR 1356.80 - Scope of the National Youth in Transition Database.
Code of Federal Regulations, 2012 CFR
2012-10-01
... 45 Public Welfare 4 2012-10-01 2012-10-01 false Scope of the National Youth in Transition Database... REQUIREMENTS APPLICABLE TO TITLE IV-E § 1356.80 Scope of the National Youth in Transition Database. The requirements of the National Youth in Transition Database (NYTD) §§ 1356.81 through 1356.86 of this part apply...
45 CFR 1356.80 - Scope of the National Youth in Transition Database.
Code of Federal Regulations, 2014 CFR
2014-10-01
... 45 Public Welfare 4 2014-10-01 2014-10-01 false Scope of the National Youth in Transition Database... REQUIREMENTS APPLICABLE TO TITLE IV-E § 1356.80 Scope of the National Youth in Transition Database. The requirements of the National Youth in Transition Database (NYTD) §§ 1356.81 through 1356.86 of this part apply...
Cloud, Joann L; Conville, Patricia S; Croft, Ann; Harmsen, Dag; Witebsky, Frank G; Carroll, Karen C
2004-02-01
Identification of clinically significant nocardiae to the species level is important in patient diagnosis and treatment. A study was performed to evaluate Nocardia species identification obtained by partial 16S ribosomal DNA (rDNA) sequencing by the MicroSeq 500 system with an expanded database. The expanded portion of the database was developed from partial 5' 16S rDNA sequences derived from 28 reference strains (from the American Type Culture Collection and the Japanese Collection of Microorganisms). The expanded MicroSeq 500 system was compared to (i) conventional identification obtained from a combination of growth characteristics with biochemical and drug susceptibility tests; (ii) molecular techniques involving restriction enzyme analysis (REA) of portions of the 16S rRNA and 65-kDa heat shock protein genes; and (iii) when necessary, sequencing of a 999-bp fragment of the 16S rRNA gene. An unknown isolate was identified as a particular species if the sequence obtained by partial 16S rDNA sequencing by the expanded MicroSeq 500 system was 99.0% similar to that of the reference strain. Ninety-four nocardiae representing 10 separate species were isolated from patient specimens and examined by using the three different methods. Sequencing of partial 16S rDNA by the expanded MicroSeq 500 system resulted in only 72% agreement with conventional methods for species identification and 90% agreement with the alternative molecular methods. Molecular methods for identification of Nocardia species provide more accurate and rapid results than the conventional methods using biochemical and susceptibility testing. With an expanded database, the MicroSeq 500 system for partial 16S rDNA was able to correctly identify the human pathogens N. brasiliensis, N. cyriacigeorgica, N. farcinica, N. nova, N. otitidiscaviarum, and N. veterana.
Sequence verification as quality-control step for production of cDNA microarrays.
Taylor, E; Cogdell, D; Coombes, K; Hu, L; Ramdas, L; Tabor, A; Hamilton, S; Zhang, W
2001-07-01
To generate cDNA arrays in our core laboratory, we amplified about 2300 PCR products from a human, sequence-verified cDNA clone library. As a quality-control step, we sequenced the PCR products immediately before printing. The sequence information was used to search the GenBank database to confirm the identities. Although these clones were previously sequence verified by the company, we found that only 79% of the clones matched the original database after handling. Our experience strongly indicates the necessity to sequence verify the clones at the final stage before printing on microarray slides and to modify the gene list accordingly.
Choice of population database for forensic DNA profile analysis.
Steele, Christopher D; Balding, David J
2014-12-01
When evaluating the weight of evidence (WoE) for an individual to be a contributor to a DNA sample, an allele frequency database is required. The allele frequencies are needed to inform about genotype probabilities for unknown contributors of DNA to the sample. Typically databases are available from several populations, and a common practice is to evaluate the WoE using each available database for each unknown contributor. Often the most conservative WoE (most favourable to the defence) is the one reported to the court. However the number of human populations that could be considered is essentially unlimited and the number of contributors to a sample can be large, making it impractical to perform every possible WoE calculation, particularly for complex crime scene profiles. We propose instead the use of only the database that best matches the ancestry of the queried contributor, together with a substantial FST adjustment. To investigate the degree of conservativeness of this approach, we performed extensive simulations of one- and two-contributor crime scene profiles, in the latter case with, and without, the profile of the second contributor available for the analysis. The genotypes were simulated using five population databases, which were also available for the analysis, and evaluations of WoE using our heuristic rule were compared with several alternative calculations using different databases. Using FST=0.03, we found that our heuristic gave WoE more favourable to the defence than alternative calculations in well over 99% of the comparisons we considered; on average the difference in WoE was just under 0.2 bans (orders of magnitude) per locus. The degree of conservativeness of the heuristic rule can be adjusted through the FST value. We propose the use of this heuristic for DNA profile WoE calculations, due to its ease of implementation, and efficient use of the evidence while allowing a flexible degree of conservativeness. Copyright © 2014. 
Published by Elsevier Ireland Ltd.
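To make the role of the FST adjustment concrete, the following is a minimal sketch of the widely used Balding–Nichols match probabilities for a single-contributor profile, together with a conversion to bans (log10 units). It is an illustration under that assumption only: the paper's own evaluations cover two-contributor profiles and use its own software, which the abstract does not name, and the function names here are hypothetical.

```python
import math


def match_probability(p_a, p_b=None, fst=0.03):
    """FST-adjusted genotype match probability at one locus.

    p_a, p_b are database allele frequencies; pass only p_a for a
    homozygous genotype. fst=0.03 mirrors the value quoted in the abstract.
    """
    denom = (1 + fst) * (1 + 2 * fst)
    if p_b is None:
        # Homozygote: the allele has already been observed twice.
        return (2 * fst + (1 - fst) * p_a) * (3 * fst + (1 - fst) * p_a) / denom
    # Heterozygote: each allele has already been observed once.
    return 2 * (fst + (1 - fst) * p_a) * (fst + (1 - fst) * p_b) / denom


def woe_bans(locus_match_probabilities):
    """Weight of evidence in bans, assuming independence across loci."""
    return -sum(math.log10(p) for p in locus_match_probabilities)
```

With `fst=0` the expressions reduce to the familiar p² and 2pq. Raising `fst` increases the match probability for rare alleles and therefore lowers the weight of evidence, which is the sense in which the adjustment favours the defence.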
The Porcelain Crab Transcriptome and PCAD, the Porcelain Crab Microarray and Sequence Database
DOE Office of Scientific and Technical Information (OSTI.GOV)
Tagmount, Abderrahmane; Wang, Mei; Lindquist, Erika
2010-01-27
Background: With the emergence of a completed genome sequence of the freshwater crustacean Daphnia pulex, construction of genomic-scale sequence databases for additional crustacean sequences are important for comparative genomics and annotation. Porcelain crabs, genus Petrolisthes, have been powerful crustacean models for environmental and evolutionary physiology with respect to thermal adaptation and understanding responses of marine organisms to climate change. Here, we present a large-scale EST sequencing and cDNA microarray database project for the porcelain crab Petrolisthes cinctipes. Methodology/Principal Findings: A set of ~30K unique sequences (UniSeqs) representing ~19K clusters were generated from ~98K high quality ESTs from a set of tissue specific non-normalized and mixed-tissue normalized cDNA libraries from the porcelain crab Petrolisthes cinctipes. Homology for each UniSeq was assessed using BLAST, InterProScan, GO and KEGG database searches. Approximately 66 percent of the UniSeqs had homology in at least one of the databases. All EST and UniSeq sequences along with annotation results and coordinated cDNA microarray datasets have been made publicly accessible at the Porcelain Crab Array Database (PCAD), a feature-enriched version of the Stanford and Longhorn Array Databases. Conclusions/Significance: The EST project presented here represents the third largest sequencing effort for any crustacean, and the largest effort for any crab species. Our assembly and clustering results suggest that our porcelain crab EST data set is equally diverse to the much larger EST set generated in the Daphnia pulex genome sequencing project, and thus will be an important resource to the Daphnia research community. Our homology results support the pancrustacea hypothesis and suggest that Malacostraca may be ancestral to Branchiopoda and Hexapoda. 
Our results also suggest that our cDNA microarrays cover as much of the transcriptome as can reasonably be captured in EST library sequencing approaches, and thus represent a rich resource for studies of environmental genomics.
Gene Expression Differences in Infected and Noninfected Middle Ear Complementary DNA Libraries
Kerschner, Joseph E.; Horsey, Edward; Ahmed, Azad; Erbe, Christy; Khampang, Pawjai; Cioffi, Joseph; Hu, Fen Ze; Post, James Christopher; Ehrlich, Garth D.
2010-01-01
Objectives To investigate genetic differences in middle ear mucosa (MEM) with nontypeable Haemophilus influenzae (NTHi) infection. Genetic upregulation and downregulation occurs in MEM during otitis media (OM) pathogenesis. A comprehensive assessment of these genetic differences using the techniques of complementary DNA (cDNA) library creation has not been performed. Design The cDNA libraries were constructed from NTHi-infected and noninfected chinchilla MEM. Random clones were picked, sequenced bidirectionally, and submitted to the National Center for Biotechnology Information (NCBI) Expressed Sequence Tags database, where they were assigned accession numbers. These numbers were used with the basic local alignment search tool (BLAST) to align clones against the nonredundant nucleotide database at NCBI. Results Analysis with the Web-based statistical program FatiGO identified several biological processes with significant differences in numbers of represented genes. Processes involved in immune, stress, and wound responses were more prevalent in the NTHi-infected library. S100 calcium-binding protein A9 (S100A9); secretory leukoprotease inhibitor (SLPI); β2-microglobulin (B2M); ferritin, heavy-chain polypeptide 1 (FTH1); and S100 calcium-binding protein A8 (S100A8) were expressed at significantly higher levels in the NTHi-infected library. Calcium-binding proteins S100A9 and S100A8 serve as markers for inflammation and have antibacterial effects. Secretory leukoprotease inhibitor is an antibacterial protein that inhibits stimuli-induced MUC1, MUC2, and MUC5AC production. Conclusions A number of genes demonstrate changes during the pathogenesis of OM, including SLPI, which has an impact on mucin gene expression; this expression is known to be an important regulator in OM. 
The techniques described herein provide a framework for future investigations to more thoroughly understand molecular changes in the middle ear, which will likely be important in developing new therapeutic and intervention strategies. PMID:19153305
76 FR 30997 - National Transit Database: Amendments to Urbanized Area Annual Reporting Manual
Federal Register 2010, 2011, 2012, 2013, 2014
2011-05-27
... Transit Database: Amendments to Urbanized Area Annual Reporting Manual AGENCY: Federal Transit Administration (FTA), DOT. ACTION: Notice of Amendments to 2011 National Transit Database Urbanized Area Annual... Administration's (FTA) 2011 National Transit Database (NTD) Urbanized Area Annual Reporting Manual (Annual Manual...
Ebbie: automated analysis and storage of small RNA cloning data using a dynamic web server
Ebhardt, H Alexander; Wiese, Kay C; Unrau, Peter J
2006-01-01
Background DNA sequencing is used ubiquitously: from deciphering genomes [1] to determining the primary sequence of small RNAs (smRNAs) [2-5]. The cloning of smRNAs is currently the most conventional method to determine the actual sequence of these important regulators of gene expression. Typical smRNA cloning projects involve the sequencing of hundreds to thousands of smRNA clones that are delimited at their 5' and 3' ends by fixed sequence regions. These primers result from the biochemical protocol used to isolate and convert the smRNA into clonable PCR products. Recently we completed a smRNA cloning project involving tobacco plants, where analysis was required for ~700 smRNA sequences [6]. Finding no easily accessible research tool to enter and analyze smRNA sequences we developed Ebbie to assist us with our study. Results Ebbie is a semi-automated smRNA cloning data processing algorithm, which initially searches for any substring within a DNA sequencing text file, which is flanked by two constant strings. The substring, also termed smRNA or insert, is stored in a MySQL and BlastN database. These inserts are then compared using BlastN to locally installed databases allowing the rapid comparison of the insert to both the growing smRNA database and to other static sequence databases. Our laboratory used Ebbie to analyze scores of DNA sequencing data originating from an smRNA cloning project [6]. Through its built-in instant analysis of all inserts using BlastN, we were able to quickly identify 33 groups of smRNAs from ~700 database entries. This clustering allowed the easy identification of novel and highly expressed clusters of smRNAs. Ebbie is available under GNU GPL and currently implemented on Conclusion Ebbie was designed for medium sized smRNA cloning projects with about 1,000 database entries [6-8]. Ebbie can be used for any type of sequence analysis where two constant primer regions flank a sequence of interest. 
The reliable storage of inserts, and their annotation in a MySQL database, BlastN[9] comparison of new inserts to dynamic and static databases make it a powerful new tool in any laboratory using DNA sequencing. Ebbie also prevents manual mistakes during the excision process and speeds up annotation and data-entry. Once the server is installed locally, its access can be restricted to protect sensitive new DNA sequencing data. Ebbie was primarily designed for smRNA cloning projects, but can be applied to a variety of RNA and DNA cloning projects[2,3,10,11]. PMID:16584563
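The excision step described in this abstract (finding the substring flanked by two constant strings) can be sketched in a few lines. This is not Ebbie's own code: the function name `extract_inserts`, the use of a regular expression, and the flank sequences in the usage note are assumptions made for illustration.

```python
import re


def extract_inserts(read, five_prime, three_prime):
    """Return every insert lying between the two constant flanking strings.

    Matching is case-insensitive with respect to the read; the shortest
    insert between each pair of flanks is returned (non-greedy match).
    """
    pattern = (
        re.escape(five_prime.upper())
        + r"([ACGTN]+?)"
        + re.escape(three_prime.upper())
    )
    return re.findall(pattern, read.upper())
```

For example, `extract_inserts("ttGGATCCacgtacgtGAATTCaa", "GGATCC", "GAATTC")` returns `["ACGTACGT"]`, where both flank sequences are placeholders rather than the primers used in the study. A production tool would also need to handle reads on the reverse strand and sequencing errors within the flanks.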
Generating equilateral random polygons in confinement
NASA Astrophysics Data System (ADS)
Diao, Y.; Ernst, C.; Montemayor, A.; Ziegler, U.
2011-10-01
One challenging problem in biology is to understand the mechanism of DNA packing in a confined volume such as a cell. It is known that confined circular DNA is often knotted and hence the topology of the extracted (and relaxed) circular DNA can be used as a probe of the DNA packing mechanism. However, in order to properly estimate the topological properties of the confined circular DNA structures using mathematical models, it is necessary to generate large ensembles of simulated closed chains (i.e. polygons) of equal edge lengths that are confined in a volume such as a sphere of certain fixed radius. Finding efficient algorithms that properly sample the space of such confined equilateral random polygons is a difficult problem. In this paper, we propose a method that generates confined equilateral random polygons based on their probability distribution. This method requires the creation of a large database initially. However, once the database has been created, a confined equilateral random polygon of length n can be generated in linear time in terms of n. The errors introduced by the method can be controlled and reduced by the refinement of the database. Furthermore, our numerical simulations indicate that these errors are unbiased and tend to cancel each other in a long polygon.
FBIS: A regional DNA barcode archival & analysis system for Indian fishes
Nagpure, Naresh Sahebrao; Rashid, Iliyas; Pathak, Ajey Kumar; Singh, Mahender; Singh, Shri Prakash; Sarkar, Uttam Kumar
2012-01-01
DNA barcode is a new tool for taxon recognition and classification of biological organisms based on sequence of a fragment of mitochondrial gene, cytochrome c oxidase I (COI). In view of the growing importance of the fish DNA barcoding for species identification, molecular taxonomy and fish diversity conservation, we developed a Fish Barcode Information System (FBIS) for Indian fishes, which will serve as a regional DNA barcode archival and analysis system. The database presently contains 2334 sequence records of COI gene for 472 aquatic species belonging to 39 orders and 136 families, collected from available published data sources. Additionally, it contains information on phenotype, distribution and IUCN Red List status of fishes. The web version of FBIS was designed using MySQL, Perl and PHP under Linux operating platform to (a) store and manage the acquisition (b) analyze and explore DNA barcode records (c) identify species and estimate genetic divergence. FBIS has also been integrated with appropriate tools for retrieving and viewing information about the database statistics and taxonomy. It is expected that FBIS would be useful as a potent information system in fish molecular taxonomy, phylogeny and genomics. Availability The database is available for free at http://mail.nbfgr.res.in/fbis/ PMID:22715304
Singular over-representation of an octameric palindrome, HIP1, in DNA from many cyanobacteria.
Robinson, N J; Robinson, P J; Gupta, A; Bleasby, A J; Whitton, B A; Morby, A P
1995-03-11
An octameric palindrome (5'-GCGATCGC-3') is abundant in cyanobacterial sequences within databases (GenBank/EMBL) and was designated HIP1 (highly iterated palindrome). The frequency of occurrence of all 256 octameric palindromes has now been determined in sub-databases revealing large and unique over-representation of HIP1 in cyanobacterial entries. DNA sequences from other bacteria were searched for any over-represented octameric palindromes analogous to HIP1. Only two sequences were identified, in the genomes of a thermophile and halophilic archaebacteria, although these were less abundant than HIP1 in cyanobacteria and relate to codon usage. To test the proposed widespread distribution of HIP1 in DNA from the cyanobacterium Synechococcus PCC 6301, randomly selected genomic clones were partly sequenced. HIP1 constituted 2.5% of the novel sequences, equivalent to a site on average once every 320 nucleotides. An oligonucleotide including HIP1 was also tested in PCR. Multiple products were obtained using template DNA from cyanobacterial strains in which HIP1 is abundant in known sequences, and some strains generated characteristic HIP-PCR banding patterns. However, analysis of DNA from one strain (not previously represented in databases) by random sequencing, HIP-PCR and PvuI digestion, confirms that not all cyanobacterial genomes are rich in HIP1.
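The count of 256 follows from the structure of a DNA palindrome: an 8-mer equals its own reverse complement exactly when its last four bases are the reverse complement of its first four, so there are 4^4 = 256 such sequences. A minimal sketch of enumerating them and tallying their occurrences in a sequence is given below; the function names are hypothetical and this is not the authors' search procedure.

```python
from itertools import product

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def octameric_palindromes():
    """Enumerate all 8-mers that equal their own reverse complement."""
    halves = ("".join(bases) for bases in product("ACGT", repeat=4))
    return [half + reverse_complement(half) for half in halves]


def count_palindromes(sequence):
    """Count occurrences of each octameric palindrome in a sequence."""
    sequence = sequence.upper()
    counts = {palindrome: 0 for palindrome in octameric_palindromes()}
    # Slide an 8-base window along the sequence, overlapping hits included.
    for i in range(len(sequence) - 7):
        window = sequence[i:i + 8]
        if window in counts:
            counts[window] += 1
    return counts
```

The density figures in the abstract agree with each other: one 8-base site every 320 nucleotides occupies 8/320 = 2.5% of the sequence.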
Analysis of DNA Sequences by an Optical Time-Integrating Correlator: Proof-of-Concept Experiments.
1992-05-01
DNA ANALYSIS STRATEGY 4 2.1 Representation of DNA Bases 4 2.2 DNA Analysis Strategy 6 3.0 CUSTOM GENERATORS FOR DNA SEQUENCES 10 3.1 Hardware Design 10...of the DNA bases where each base is represented by a 7-bits long pseudorandom sequence. 5 Figure 4: Coarse analysis of a DNA sequence. 7 Figure 5: Fine...a 20-bases long database. 32 xiii LIST OF TABLES PAGE Table 1: Short representations of the DNA bases where each base is represented by 7-bits long
Federal Register 2010, 2011, 2012, 2013, 2014
2010-10-05
... Transit Database: Amendments to the Urbanized Area Annual Reporting Manual and to the Safety and Security... the 2011 National Transit Database Urbanized Area Annual Reporting Manual and Announcement of... Transit Administration's (FTA) National Transit Database (NTD) reporting requirements, including...
The construction of an EST database for Bombyx mori and its application
Mita, Kazuei; Morimyo, Mitsuoki; Okano, Kazuhiro; Koike, Yoshiko; Nohata, Junko; Kawasaki, Hideki; Kadono-Okuda, Keiko; Yamamoto, Kimiko; Suzuki, Masataka G.; Shimada, Toru; Goldsmith, Marian R.; Maeda, Susumu
2003-01-01
To build a foundation for the complete genome analysis of Bombyx mori, we have constructed an EST database. Because gene expression patterns deeply depend on tissues as well as developmental stages, we analyzed many cDNA libraries prepared from various tissues and different developmental stages to cover the entire set of Bombyx genes. So far, the Bombyx EST database contains 35,000 ESTs from 36 cDNA libraries, which are grouped into ≈11,000 nonredundant ESTs with the average length of 1.25 kb. The comparison with FlyBase suggests that the present EST database, SilkBase, covers >55% of all genes of Bombyx. The fraction of library-specific ESTs in each cDNA library indicates that we have not yet reached saturation, showing the validity of our strategy for constructing an EST database to cover all genes. To tackle the coming saturation problem, we have checked two methods, subtraction and normalization, to increase coverage and decrease the number of housekeeping genes, resulting in a 5–11% increase of library-specific ESTs. The identification of a number of genes and comprehensive cloning of gene families have already emerged from the SilkBase search. Direct links of SilkBase with FlyBase and WormBase provide ready identification of candidate Lepidoptera-specific genes. PMID:14614147
Unique DNA database has helped advance scientific discoveries worldwide Since its origin 25 years ago, the database of nucleic acid sequences known as GenBank has ...
Assessing the utility of eDNA as a tool to survey reef-fish communities in the Red Sea
NASA Astrophysics Data System (ADS)
DiBattista, Joseph D.; Coker, Darren J.; Sinclair-Taylor, Tane H.; Stat, Michael; Berumen, Michael L.; Bunce, Michael
2017-12-01
Relatively small volumes of water may contain sufficient environmental DNA (eDNA) to detect target aquatic organisms via genetic sequencing. We therefore assessed the utility of eDNA to document the diversity of coral reef fishes in the central Red Sea. DNA from seawater samples was extracted, amplified using fish-specific 16S mitochondrial DNA primers, and sequenced using a metabarcoding workflow. DNA sequences were assigned to taxa using available genetic repositories or custom genetic databases generated from reference fishes. Our approach revealed a diversity of conspicuous, cryptobenthic, and commercially relevant reef fish at the genus level, with select genera in the family Labridae over-represented. Our approach, however, failed to capture a significant fraction of the fish fauna known to inhabit the Red Sea, which we attribute to limited spatial sampling, amplification stochasticity, and an apparent lack of sequencing depth. Given an increase in fish species descriptions, completeness of taxonomic checklists, and improvement in species-level assignment with custom genetic databases as shown here, we suggest that the Red Sea region may be ideal for further testing of the eDNA approach.
footprintDB: a database of transcription factors with annotated cis elements and binding interfaces.
Sebastian, Alvaro; Contreras-Moreira, Bruno
2014-01-15
Traditional and high-throughput techniques for determining transcription factor (TF) binding specificities are generating large volumes of data of uneven quality, which are scattered across individual databases. FootprintDB integrates some of the most comprehensive freely available libraries of curated DNA binding sites and systematically annotates the binding interfaces of the corresponding TFs. The first release contains 2422 unique TF sequences, 10 112 DNA binding sites and 3662 DNA motifs. A survey of the included data sources, organisms and TF families was performed together with proprietary database TRANSFAC, finding that footprintDB has a similar coverage of multicellular organisms, while also containing bacterial regulatory data. A search engine has been designed that drives the prediction of DNA motifs for input TFs, or conversely of TF sequences that might recognize input regulatory sequences, by comparison with database entries. Such predictions can also be extended to a single proteome chosen by the user, and results are ranked in terms of interface similarity. Benchmark experiments with bacterial, plant and human data were performed to measure the predictive power of footprintDB searches, which were able to correctly recover 10, 55 and 90% of the tested sequences, respectively. Correctly predicted TFs had a higher interface similarity than the average, confirming its diagnostic value. Web site implemented in PHP, Perl, MySQL and Apache. Freely available from http://floresta.eead.csic.es/footprintdb.
Rojas-Cartagena, Carmencita; Ortíz-Pineda, Pablo; Ramírez-Gómez, Francisco; Suárez-Castillo, Edna C.; Matos-Cruz, Vanessa; Rodríguez, Carlos; Ortíz-Zuazaga, Humberto; García-Arrarás, José E.
2010-01-01
Repair and regeneration are key processes for tissue maintenance, and their disruption may lead to disease states. Little is known about the molecular mechanisms that underlie the repair and regeneration of the digestive tract. The sea cucumber Holothuria glaberrima represents an excellent model to dissect and characterize the molecular events during intestinal regeneration. To study the gene expression profile, cDNA libraries were constructed from normal, 3-day, and 7-day regenerating intestines of H. glaberrima. Clones were randomly sequenced and queried against the nonredundant protein database at the National Center for Biotechnology Information. RT-PCR analyses were made of several genes to determine their expression profile during intestinal regeneration. A total of 5,173 sequences from three cDNA libraries were obtained. About 46.2, 35.6, and 26.2% of the sequences for the normal, 3-day, and 7-day cDNA libraries, respectively, shared significant similarity with known sequences in the protein database of GenBank, but present only 10% similarity among them. Analysis of the libraries in terms of functional processes, protein domains, and most common sequences suggests that a differential expression profile is taking place during the regeneration process. Further examination of the expressed sequence tag dataset revealed that 12 putative genes are differentially expressed at a significant level (R > 6). Experimental validation by RT-PCR analysis reveals that at least three genes (unknown C-4677-1, melanotransferrin, and centaurin) present a differential expression during regeneration. These findings strongly suggest that the gene expression profile varies among regeneration stages and provide evidence for the existence of differential gene expression. PMID:17579180
Abugessaisa, Imad; Gomez-Cabrero, David; Snir, Omri; Lindblad, Staffan; Klareskog, Lars; Malmström, Vivianne; Tegnér, Jesper
2013-04-02
Sequencing of the human genome and the subsequent analyses have produced immense volumes of data. The technological advances have opened new windows into genomics beyond the DNA sequence. In parallel, clinical practice generates large amounts of data. This represents an underused data source that has much greater potential in translational research than is currently realized. This research aims at implementing a translational medicine informatics platform to integrate clinical data (disease diagnosis, disease activity and treatment) of Rheumatoid Arthritis (RA) patients from Karolinska University Hospital and their research database (biobanks, genotype variants and serology) at the Center for Molecular Medicine, Karolinska Institutet. Requirements engineering methods were utilized to identify user requirements. Unified Modeling Language and data modeling methods were used to model the universe of discourse and data sources. Oracle 11g was used as the database management system, and the clinical development center (CDC) was used as the application interface. Patient data were anonymized, and we employed authorization and security methods to protect the system. We developed a user requirement matrix, which provided a framework for evaluating three translation informatics systems. The implementation of the CDC successfully integrated a biological research database (15172 DNA, serum and synovial samples, 1436 cell samples and 65 SNPs per patient) and a clinical database (5652 clinical visits) for a cohort of 379 patients presenting three profiles. Basic functionalities provided by the translational medicine platform are research data management, development of bioinformatics workflow and analysis, sub-cohort selection, and re-use of clinical data in research settings. Finally, the system allowed researchers to extract subsets of attributes from cohorts according to specific biological, clinical, or statistical features.
Research and clinical database integration is a real challenge and a road-block in translational research. Through this research we addressed the challenges and demonstrated the usefulness of CDC. We adhered to ethical regulations pertaining to patient data, and we determined that the existing software solutions cannot meet the translational research needs at hand. We used RA as a test case since we have ample data on an active, longitudinal cohort.
2013-01-01
Background Sequencing of the human genome and the subsequent analyses have produced immense volumes of data. The technological advances have opened new windows into genomics beyond the DNA sequence. In parallel, clinical practice generates large amounts of data. This represents an underused data source that has much greater potential in translational research than is currently realized. This research aims at implementing a translational medicine informatics platform to integrate clinical data (disease diagnosis, disease activity and treatment) of Rheumatoid Arthritis (RA) patients from Karolinska University Hospital and their research database (biobanks, genotype variants and serology) at the Center for Molecular Medicine, Karolinska Institutet. Methods Requirements engineering methods were utilized to identify user requirements. Unified Modeling Language and data modeling methods were used to model the universe of discourse and data sources. Oracle 11g was used as the database management system, and the clinical development center (CDC) was used as the application interface. Patient data were anonymized, and we employed authorization and security methods to protect the system. Results We developed a user requirement matrix, which provided a framework for evaluating three translation informatics systems. The implementation of the CDC successfully integrated a biological research database (15172 DNA, serum and synovial samples, 1436 cell samples and 65 SNPs per patient) and a clinical database (5652 clinical visits) for a cohort of 379 patients presenting three profiles. Basic functionalities provided by the translational medicine platform are research data management, development of bioinformatics workflow and analysis, sub-cohort selection, and re-use of clinical data in research settings. Finally, the system allowed researchers to extract subsets of attributes from cohorts according to specific biological, clinical, or statistical features.
Conclusions Research and clinical database integration is a real challenge and a road-block in translational research. Through this research we addressed the challenges and demonstrated the usefulness of CDC. We adhered to ethical regulations pertaining to patient data, and we determined that the existing software solutions cannot meet the translational research needs at hand. We used RA as a test case since we have ample data on an active, longitudinal cohort. PMID:23548156
Smith, Steven M.; Neilson, Ryan T.; Giles, Stuart A.
2015-01-01
Government-sponsored, national-scale, soil and sediment geochemical databases are used to estimate regional and local background concentrations for environmental issues, identify possible anthropogenic contamination, estimate mineral endowment, explore for new mineral deposits, evaluate nutrient levels for agriculture, and establish concentration relationships with human or animal health. Because of these different uses, it is difficult for any single database to accommodate all the needs of each client. Smith et al. (2013, p. 168) reviewed six national-scale soil and sediment geochemical databases for the United States (U.S.) and, for each, evaluated “its appropriateness as a national-scale geochemical database and its usefulness for national-scale geochemical mapping.” Each of the evaluated databases has strengths and weaknesses that were listed in that review. Two of these U.S. national-scale geochemical databases are similar in their sample media and collection protocols but have different strengths, primarily sampling density and analytical consistency. This project was implemented to determine whether those databases could be merged to produce a combined dataset that could be used for mineral resource assessments. The utility of the merged database was tested to see whether mapped distributions could identify metalliferous black shales at a national scale.
National Databases for Neurosurgical Outcomes Research: Options, Strengths, and Limitations.
Karhade, Aditya V; Larsen, Alexandra M G; Cote, David J; Dubois, Heloise M; Smith, Timothy R
2017-08-05
Quality improvement, value-based care delivery, and personalized patient care depend on robust clinical, financial, and demographic data streams of neurosurgical outcomes. The neurosurgical literature lacks a comprehensive review of large national databases. To assess the strengths and limitations of various resources for outcomes research in neurosurgery. A review of the literature was conducted to identify surgical outcomes studies using national data sets. The databases were assessed for the availability of patient demographics and clinical variables, longitudinal follow-up of patients, strengths, and limitations. The number of unique patients contained within each data set ranged from thousands (Quality Outcomes Database [QOD]) to hundreds of millions (MarketScan). Databases with both clinical and financial data included PearlDiver, Premier Healthcare Database, Vizient Clinical Data Base and Resource Manager, and the National Inpatient Sample. Outcomes collected by databases included patient-reported outcomes (QOD); 30-day morbidity, readmissions, and reoperations (National Surgical Quality Improvement Program); and disease incidence and disease-specific survival (Surveillance, Epidemiology, and End Results-Medicare). The strengths of large databases included large numbers of rare pathologies and multi-institutional nationally representative sampling; the limitations of these databases included variable data veracity, variable data completeness, and missing disease-specific variables. The improvement of existing large national databases and the establishment of new registries will be crucial to the future of neurosurgical outcomes research. Copyright © 2017 by the Congress of Neurological Surgeons
Cloud-based adaptive exon prediction for DNA analysis
Putluri, Srinivasareddy; Fathima, Shaik Yasmeen
2018-01-01
Cloud computing offers significant research and economic benefits to healthcare organisations. Cloud services provide a safe place for storing and managing large amounts of such sensitive data. Under the conventional flow of gene information, gene sequence laboratories send out raw and inferred information via the Internet to several sequence libraries. DNA sequencing storage costs will be minimised by use of a cloud service. In this study, the authors put forward a novel genomic informatics system using Amazon Cloud Services, where genomic sequence information is stored and accessed for processing. True identification of exon regions in a DNA sequence is a key task in bioinformatics, which helps in disease identification and drug design. The three-base periodicity property of exons forms the basis of all exon identification techniques. Adaptive signal processing techniques have been found to be promising in comparison with several other methods. Several adaptive exon predictors (AEPs) are developed using variable normalised least mean square and its maximum normalised variants to reduce computational complexity. Finally, performance evaluation of various AEPs is done based on measures such as sensitivity, specificity and precision using various standard genomic datasets taken from the National Center for Biotechnology Information genomic sequence database. PMID:29515813
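The three-base periodicity mentioned in the abstract above can be illustrated with a small sketch. This is not the authors' adaptive exon predictor (which uses normalised least mean square filters); it is a plain Fourier measurement, swapped in here because it shows the underlying property most directly. The function name `period3_power` and the two toy sequences are hypothetical.

```python
import cmath

BASES = "ACGT"

def period3_power(seq):
    """Spectral power at frequency 1/3, summed over the four base
    indicator sequences and divided by the sequence length."""
    seq = seq.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    total = 0.0
    for base in BASES:
        acc = 0j
        for i, ch in enumerate(seq):
            if ch == base:
                # One DFT term at frequency 1/3 for this base's 0/1 indicator.
                acc += cmath.exp(-2j * cmath.pi * i / 3)
        total += abs(acc) ** 2
    return total / len(seq)

# Toy inputs (assumptions, not real genomic data):
codon_like = "ATG" * 100   # strict period-3 pattern, length 300
period_four = "ACGT" * 75  # period-4 pattern, length 300

print(period3_power(codon_like))   # large value, close to 100
print(period3_power(period_four))  # essentially zero
```

Real coding regions show a much weaker peak than the strict repeat used here, which is one reason windowed or adaptive estimators are used in practice.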
Bradford, Laurie; Heal, Jennifer; Anderson, Jeff; Faragher, Nichole; Duval, Kristin; Lalonde, Sylvain
2011-08-01
Members of the National DNA Data Bank (NDDB) of Canada designed and searched two simulated mass disaster (MD) scenarios for User Acceptance Testing (UAT) of the Combined DNA Index System (CODIS) 6.0, developed by the Federal Bureau of Investigation (FBI) and the US Department of Justice. A simulated airplane MD and inland Tsunami MD were designed representing a closed and open environment respectively. An in-house software program was written to randomly generate DNA profiles from a mock Caucasian population database. As part of the UAT, these two MDs were searched separately using CODIS 6.0. The new options available for identity and pedigree searching in addition to the inclusion of mitochondrial DNA (mtDNA) and Y-STR (short tandem repeat) information in CODIS 6.0, led to rapid identification of all victims. A Joint Pedigree Likelihood Ratio (JPLR) was calculated from the pedigree searches and ranks were stored in Rank Manager providing confidence to the user in assigning an Unidentified Human Remain (UHR) to a pedigree tree. Analyses of the results indicated that primary relatives were more useful in Disaster Victim Identification (DVI) compared to secondary or tertiary relatives and that inclusion of mtDNA and/or Y-STR technologies helped to link family units together as shown by the software searches. It is recommended that UHRs have as many informative loci as possible to assist with their identification. CODIS 6.0 is a valuable technological tool for rapidly and confidently identifying victims of mass disasters. Crown Copyright © 2010. Published by Elsevier Ireland Ltd. All rights reserved.
9 CFR 55.25 - Animal identification.
Code of Federal Regulations, 2014 CFR
2014-01-01
... CWD National Database or in an approved State database. The second animal identification must be... CWD National Database or in an approved State database. The means of animal identification must be...
9 CFR 55.25 - Animal identification.
Code of Federal Regulations, 2013 CFR
2013-01-01
... CWD National Database or in an approved State database. The second animal identification must be... CWD National Database or in an approved State database. The means of animal identification must be...
NCBI GEO: archive for functional genomics data sets--10 years on.
Barrett, Tanya; Troup, Dennis B; Wilhite, Stephen E; Ledoux, Pierre; Evangelista, Carlos; Kim, Irene F; Tomashevsky, Maxim; Marshall, Kimberly A; Phillippy, Katherine H; Sherman, Patti M; Muertter, Rolf N; Holko, Michelle; Ayanbule, Oluwabukunmi; Yefanov, Andrey; Soboleva, Alexandra
2011-01-01
A decade ago, the Gene Expression Omnibus (GEO) database was established at the National Center for Biotechnology Information (NCBI). The original objective of GEO was to serve as a public repository for high-throughput gene expression data generated mostly by microarray technology. However, the research community quickly applied microarrays to non-gene-expression studies, including examination of genome copy number variation and genome-wide profiling of DNA-binding proteins. Because the GEO database was designed with a flexible structure, it was possible to quickly adapt the repository to store these data types. More recently, as the microarray community switches to next-generation sequencing technologies, GEO has again adapted to host these data sets. Today, GEO stores over 20,000 microarray- and sequence-based functional genomics studies, and continues to handle the majority of direct high-throughput data submissions from the research community. Multiple mechanisms are provided to help users effectively search, browse, download and visualize the data at the level of individual genes or entire studies. This paper describes recent database enhancements, including new search and data representation tools, as well as a brief review of how the community uses GEO data. GEO is freely accessible at http://www.ncbi.nlm.nih.gov/geo/.
Federal Register 2010, 2011, 2012, 2013, 2014
2012-11-06
... DEPARTMENT OF HEALTH AND HUMAN SERVICES National Institutes of Health Submission for OMB Review; Comment Request: National Database for Autism Research (NDAR) Data Access Request SUMMARY: Under the... currently valid OMB control number. Proposed Collection: Title: National Database for Autism Research (NDAR...
Identification of Bacterial Species in Kuwaiti Waters Through DNA Sequencing
NASA Astrophysics Data System (ADS)
Chen, K.
2017-01-01
With the objective of identifying the bacterial diversity associated with the ecosystem of various Kuwaiti seas, bacteria were cultured and isolated from 3 water samples. Owing to difficulties in culturing and isolating fecal coliforms on the selective agar plates, bacterial isolates from marine agar plates were selected for molecular identification. 16S rRNA genes were successfully amplified from the genomes of the selected isolates using universal eubacterial 16S rRNA primers. The resulting amplification products were subjected to automated DNA sequencing. The partial 16S rDNA sequences obtained were compared directly with sequences in the NCBI database using BLAST, as well as with the sequences available from the Ribosomal Database Project (RDP).
Current projects in Pre-analytics: where to go?
Sapino, Anna; Annaratone, Laura; Marchiò, Caterina
2015-01-01
The current clinical practice of tissue handling and sample preparation is multifaceted and lacks strict standardisation: this scenario leads to significant variability in the quality of clinical samples. Poor tissue preservation has a detrimental effect, leading to morphological artefacts, hampering the reproducibility of immunocytochemical and molecular diagnostic results (protein expression, DNA gene mutations, RNA gene expression) and affecting research outcomes with irreproducible gene expression and post-transcriptional data. Altogether, this limits the opportunity to share and pool national databases into European common databases. At the European level, standardization of pre-analytical steps is just at the beginning and issues regarding bio-specimen collection and management are still debated. A joint (public-private) project on the standardization of tissue handling in pre-analytical procedures has recently been funded in Italy with the aim of proposing novel approaches to the neglected issue of pre-analytical procedures. In this chapter, we will show how investing in pre-analytics may impact both public health problems and practical innovation in solid tumour processing.
Online Mendelian Inheritance in Man (OMIM).
Hamosh, A; Scott, A F; Amberger, J; Valle, D; McKusick, V A
2000-01-01
Online Mendelian Inheritance In Man (OMIM) is a public database of bibliographic information about human genes and genetic disorders. Begun by Dr. Victor McKusick as the authoritative reference Mendelian Inheritance in Man, it is now distributed electronically by the National Center for Biotechnology Information (NCBI). Material in OMIM is derived from the biomedical literature and is written by Dr. McKusick and his colleagues at Johns Hopkins University and elsewhere. Each OMIM entry has a full text summary of a genetic phenotype and/or gene and has copious links to other genetic resources such as DNA and protein sequence, PubMed references, mutation databases, approved gene nomenclature, and more. In addition, NCBI's neighboring feature allows users to identify related articles from PubMed selected on the basis of key words in the OMIM entry. Through its many features, OMIM is increasingly becoming a major gateway for clinicians, students, and basic researchers to the ever-growing literature and resources of human genetics. Copyright 2000 Wiley-Liss, Inc.
Progress towards a Spacecraft-Associated Microbial Meta-database (SAMM)
NASA Astrophysics Data System (ADS)
Mogul, Rakesh; Keagy, Laura; Nava, Argelia; Zerehi, Farah
The microbial inventories within the assembly facilities for spacecraft represent the primary pool of forward contaminants that may compromise life-detection missions. Accordingly, we are constructing a meta-database of these microorganisms for the purpose of building a bioinformatic resource for planetary protection and astrobiology-related endeavors. Using student-led efforts, the meta-database is being constructed from literature reports and is inclusive of both isolated microorganisms and those solely detected through DNA-based techniques. The Spacecraft-Associated Microbial Meta-database (SAMM) currently includes over 800 entries that are organized using 32 meta-tags involving taxonomy, location of isolation (facility and component), category of characterization (culture and/or genetic), types of characterizations (e.g., culture, 16S rDNA, phylochip, FAME, and DNA hybridization), growth conditions, Gram stain, and general physiological traits (e.g., sporulation, extremotolerance, and respiration properties). Interrogations of the database show that the cleanrooms at Kennedy Space Center (KSC) have ~2-fold greater diversity in bacterial genera than those at the Jet Propulsion Laboratory (JPL), and that bacteria related to water, plant, and human environments are more often associated with the KSC-specific genera. These results parallel those reported in the literature, and hence serve as benchmarks demonstrating the bioinformatic potential of this meta-database. The ultimate plans for SAMM include public availability, expansion through crowdsourcing efforts, and potential use as a companion resource to the culture collections assembled by DSMZ and JPL.
Estimating haplotype frequencies by combining data from large DNA pools with database information.
Gasbarra, Dario; Kulathinal, Sangita; Pirinen, Matti; Sillanpää, Mikko J
2011-01-01
We assume that allele frequency data have been extracted from several large DNA pools, each containing genetic material of up to hundreds of sampled individuals. Our goal is to estimate the haplotype frequencies among the sampled individuals by combining the pooled allele frequency data with prior knowledge about the set of possible haplotypes. Such prior information can be obtained, for example, from a database such as HapMap. We present a Bayesian haplotyping method for pooled DNA based on a continuous approximation of the multinomial distribution. The proposed method is applicable when the sizes of the DNA pools and/or the number of considered loci exceed the limits of several earlier methods. In the example analyses, the proposed model clearly outperforms a deterministic greedy algorithm on real data from the HapMap database. With a small number of loci, the performance of the proposed method is similar to that of an EM-algorithm, which uses a multinormal approximation for the pooled allele frequencies, but which does not utilize prior information about the haplotypes. The method has been implemented using Matlab and the code is available upon request from the authors.
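As a rough illustration of how pooled allele frequencies relate to haplotype frequencies, the sketch below computes only the forward model: the allele frequency expected in a pool, given a known set of haplotypes and their frequencies. It does not implement the Bayesian estimation described in the abstract. The function name, the three haplotypes, and their frequencies are all hypothetical.

```python
def expected_allele_freqs(haplotypes, hap_freqs):
    """Expected frequency of allele '1' at each locus, given haplotypes
    coded as tuples of 0/1 and their population frequencies."""
    if len(haplotypes) != len(hap_freqs):
        raise ValueError("one frequency is needed per haplotype")
    if abs(sum(hap_freqs) - 1.0) > 1e-9:
        raise ValueError("haplotype frequencies must sum to 1")
    n_loci = len(haplotypes[0])
    # At each locus, add up the frequencies of haplotypes carrying allele 1.
    return [
        sum(f for h, f in zip(haplotypes, hap_freqs) if h[j] == 1)
        for j in range(n_loci)
    ]

# Hypothetical haplotype set over three biallelic loci.
haplotypes = [(0, 0, 0), (1, 0, 1), (1, 1, 0)]
hap_freqs = [0.5, 0.3, 0.2]

print([round(x, 3) for x in expected_allele_freqs(haplotypes, hap_freqs)])
```

Estimating haplotype frequencies means inverting this mapping from noisy pooled measurements, which is where prior information about the set of possible haplotypes becomes useful.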
Sauerbrei, Andreas; Bohn-Wippert, Kathrin; Kaspar, Marisa; Krumbholz, Andi; Karrasch, Matthias; Zell, Roland
2016-01-01
The use of genotypic resistance testing of herpes simplex virus types 1 and 2 (HSV-1 and HSV-2) is increasing because the rapid availability of results significantly improves the treatment of severe infections, especially in immunocompromised patients. However, an essential precondition is a broad knowledge of natural polymorphisms and resistance-associated mutations in the thymidine kinase (TK) and DNA polymerase (pol) genes, of which the DNA polymerase (Pol) enzyme is targeted by the highly effective antiviral drugs in clinical use. Thus, this review presents a database of all non-synonymous mutations of TK and DNA pol genes of HSV-1 and HSV-2 whose association with resistance or natural gene polymorphism has been clarified by phenotypic and/or functional assays. In addition, the laboratory methods for verifying natural polymorphisms or resistance mutations are summarized. This database can help considerably to facilitate the interpretation of genotypic resistance findings in clinical HSV-1 and HSV-2 strains. © The Author 2015. Published by Oxford University Press on behalf of the British Society for Antimicrobial Chemotherapy. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
Finding Protein and Nucleotide Similarities with FASTA
Pearson, William R.
2016-01-01
The FASTA programs provide a comprehensive set of rapid similarity searching tools (fasta36, fastx36, tfastx36, fasty36, tfasty36), similar to those provided by the BLAST package, as well as programs for slower, optimal, local and global similarity searches (ssearch36, ggsearch36) and for searching with short peptides and oligonucleotides (fasts36, fastm36). The FASTA programs use an empirical strategy for estimating statistical significance that accommodates a range of similarity scoring matrices and gap penalties, improving alignment boundary accuracy and search sensitivity (Unit 3.5). The FASTA programs can produce “BLAST-like” alignment and tabular output, for ease of integration into existing analysis pipelines, and can search small, representative databases, and then report results for a larger set of sequences, using links from the smaller dataset. The FASTA programs work with a wide variety of database formats, including mySQL and postgreSQL databases (Unit 9.4). The programs also provide a strategy for integrating domain and active site annotations into alignments and highlighting the mutational state of functionally critical residues. These protocols describe how to use the FASTA programs to characterize protein and DNA sequences, using protein:protein, protein:DNA, and DNA:DNA comparisons. PMID:27010337
Finding Protein and Nucleotide Similarities with FASTA.
Pearson, William R
2016-03-24
The FASTA programs provide a comprehensive set of rapid similarity searching tools (fasta36, fastx36, tfastx36, fasty36, tfasty36), similar to those provided by the BLAST package, as well as programs for slower, optimal, local, and global similarity searches (ssearch36, ggsearch36), and for searching with short peptides and oligonucleotides (fasts36, fastm36). The FASTA programs use an empirical strategy for estimating statistical significance that accommodates a range of similarity scoring matrices and gap penalties, improving alignment boundary accuracy and search sensitivity. The FASTA programs can produce "BLAST-like" alignment and tabular output, for ease of integration into existing analysis pipelines, and can search small, representative databases, and then report results for a larger set of sequences, using links from the smaller dataset. The FASTA programs work with a wide variety of database formats, including mySQL and postgreSQL databases. The programs also provide a strategy for integrating domain and active site annotations into alignments and highlighting the mutational state of functionally critical residues. These protocols describe how to use the FASTA programs to characterize protein and DNA sequences, using protein:protein, protein:DNA, and DNA:DNA comparisons. Copyright © 2016 John Wiley & Sons, Inc.
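The "BLAST-like" tabular output mentioned above is what makes pipeline integration straightforward. The sketch below parses it, assuming the standard twelve tab-separated BLAST tabular columns (query, subject, percent identity, alignment length, mismatches, gap openings, query start, query end, subject start, subject end, E-value, bit score). The `Hit` type, the function name, and the sample lines are hypothetical and are not output from a real search.

```python
from typing import List, NamedTuple

class Hit(NamedTuple):
    query: str
    subject: str
    percent_identity: float
    alignment_length: int
    evalue: float
    bit_score: float

def parse_tabular(text: str) -> List[Hit]:
    """Parse BLAST-style tabular lines, skipping comments and blank lines."""
    hits = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        if len(fields) < 12:
            continue  # not a complete 12-column record
        hits.append(Hit(fields[0], fields[1], float(fields[2]),
                        int(fields[3]), float(fields[10]), float(fields[11])))
    return hits

# Made-up example records for illustration only.
sample = (
    "# hypothetical search output\n"
    "query1\tsubjA\t98.5\t200\t3\t0\t1\t200\t5\t204\t1e-100\t350.0\n"
    "query1\tsubjB\t45.2\t180\t90\t4\t10\t189\t20\t199\t2e-05\t48.1\n"
)

significant = [h for h in parse_tabular(sample) if h.evalue < 1e-10]
print([h.subject for h in significant])  # ['subjA']
```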
Hamilton, John P; Neeno-Eckwall, Eric C; Adhikari, Bishwo N; Perna, Nicole T; Tisserat, Ned; Leach, Jan E; Lévesque, C André; Buell, C Robin
2011-01-01
The Comprehensive Phytopathogen Genomics Resource (CPGR) provides a web-based portal for plant pathologists and diagnosticians to view the genome and transcriptome sequence status of 806 bacterial, fungal, oomycete, nematode, viral and viroid plant pathogens. Tools are available to search and analyze annotated genome sequences of 74 bacterial, fungal and oomycete pathogens. Oomycete and fungal genomes are obtained directly from GenBank, whereas bacterial genome sequences are downloaded from the A Systematic Annotation Package (ASAP) database that provides curation of genomes using comparative approaches. Curated lists of bacterial genes relevant to pathogenicity and avirulence are also provided. The Plant Pathogen Transcript Assemblies Database provides annotated assemblies of the transcribed regions of 82 eukaryotic genomes from publicly available single pass Expressed Sequence Tags. Data-mining tools are provided along with tools to create candidate diagnostic markers, an emerging use for genomic sequence data in plant pathology. The Plant Pathogen Ribosomal DNA (rDNA) database is a resource for pathogens that lack genome or transcriptome data sets and contains 131 755 rDNA sequences from GenBank for 17 613 species identified as plant pathogens and related genera. Database URL: http://cpgr.plantbiology.msu.edu.
DNA barcoding commercially important aquatic invertebrates of Turkey.
Keskin, Emre; Atar, Hasan Hüseyin
2013-08-01
DNA barcoding was used to identify aquatic invertebrates sampled from fisheries bycatch and discards. A total of 440 unique cytochrome c oxidase subunit I (COI) barcodes were generated for 22 species from three important phyla (Arthropoda, Cnidaria, and Mollusca). All the species were sequenced and submitted to the GenBank and Barcode of Life Database (BOLD) databases using a 654 bp-long fragment of the mitochondrial COI gene. Two of them (Pontastacus leptodactylus and Rapana bezoar) were first records of the species for the BOLD database and six of them (Carcinus aestuarii, Loligo vulgaris, Melicertus kerathurus, Nephrops norvegicus, Scyllarides latus, and Scyllarus arctus) were first standard (>648 bp) COI barcode records for the GenBank database. COI barcodes were analyzed for nucleotide composition, nucleotide pair frequencies, and Kimura's two-parameter genetic distance. Mean genetic distance among species was found to increase at higher taxonomic levels. Neighbor-joining trees generated were congruent with morphometric-based taxonomic classification. Findings of this study clearly demonstrate that DNA barcodes could be used as an efficient molecular tool in identification of not only target species from fisheries but also bycatch and discard species, and so could provide leverage for a better understanding in monitoring and management of fisheries and biodiversity.
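Kimura's two-parameter distance, used in the abstract above, can be computed from the proportions of transitions (P) and transversions (Q) between two aligned sequences as d = -(1/2) ln[(1 - 2P - Q) * sqrt(1 - 2Q)]. The following is a minimal sketch assuming pre-aligned sequences of equal length; sites containing gaps or ambiguous bases are skipped, which is an assumption of this example and not necessarily what the study's software did. The function name and test sequences are hypothetical.

```python
import math

PURINES = {"A", "G"}
VALID = set("ACGT")

def k2p_distance(seq1, seq2):
    """Kimura two-parameter distance between two aligned DNA sequences."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    sites = transitions = transversions = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a not in VALID or b not in VALID:
            continue  # skip gaps and ambiguity codes
        sites += 1
        if a == b:
            continue
        if (a in PURINES) == (b in PURINES):
            transitions += 1    # purine<->purine or pyrimidine<->pyrimidine
        else:
            transversions += 1  # purine<->pyrimidine
    if sites == 0:
        raise ValueError("no comparable sites")
    p = transitions / sites
    q = transversions / sites
    # Raises ValueError (log domain) if the sequences are too divergent.
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))

# One transition (A->G) in ten sites: P = 0.1, Q = 0.
print(round(k2p_distance("AACCGGTTAC", "GACCGGTTAC"), 4))  # 0.1116
```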
McClelland, M; Nelson, M; Raschke, E
1994-01-01
Restriction endonucleases have site-specific interactions with DNA that can often be inhibited by site-specific DNA methylation and other site-specific DNA modifications. However, such inhibition cannot generally be predicted. The empirically acquired data on these effects are tabulated for over 320 restriction endonucleases. In addition, a table of known site-specific DNA modification methyltransferases and their specificities is presented along with EMBL database accession numbers for cloned genes. PMID:7937074
Lee, Hwan Young; Song, Injee; Ha, Eunho; Cho, Sung-Bae; Yang, Woo Ick; Shin, Kyoung-Jin
2008-01-01
Background For the past few years, scientific controversy has surrounded the large number of errors in forensic and literature mitochondrial DNA (mtDNA) data. However, recent research has shown that using mtDNA phylogeny and referring to known mtDNA haplotypes can be useful for checking the quality of sequence data. Results We developed a Web-based bioinformatics resource "mtDNAmanager" that offers a convenient interface supporting the management and quality analysis of mtDNA sequence data. The mtDNAmanager performs computations on mtDNA control-region sequences to estimate the most-probable mtDNA haplogroups and retrieves similar sequences from a selected database. By the phased designation of the most-probable haplogroups (both expected and estimated haplogroups), mtDNAmanager enables users to systematically detect errors whilst allowing for confirmation of the presence of clear key diagnostic mutations and accompanying mutations. The query tools of mtDNAmanager also facilitate database screening with two options of "match" and "include the queried nucleotide polymorphism". In addition, mtDNAmanager provides Web interfaces for users to manage and analyse their own data in batch mode. Conclusion The mtDNAmanager will provide systematic routines for mtDNA sequence data management and analysis via easily accessible Web interfaces, and thus should be very useful for population, medical and forensic studies that employ mtDNA analysis. mtDNAmanager can be accessed at . PMID:19014619
Content based information retrieval in forensic image databases.
Geradts, Zeno; Bijhold, Jurrien
2002-03-01
This paper gives an overview of the various available image databases and ways of searching these databases on image contents. Developments by research groups in searching image databases are evaluated and compared with the forensic databases that exist. Forensic image databases of fingerprints, faces, shoeprints, handwriting, cartridge cases, drug tablets, and tool marks are described. The developments in these fields appear to be valuable for forensic databases, especially the MPEG-7 framework, in which searching in image databases is standardized. In the future, combining these databases (including DNA databases) could result in stronger forensic evidence.
DNAVaxDB: the first web-based DNA vaccine database and its data analysis
2014-01-01
Since the first DNA vaccine studies were done in the 1990s, thousands more studies have followed. Here we report the development and analysis of DNAVaxDB (http://www.violinet.org/dnavaxdb), the first publicly available web-based DNA vaccine database that curates, stores, and analyzes experimentally verified DNA vaccines, DNA vaccine plasmid vectors, and protective antigens used in DNA vaccines. All data in DNAVaxDB are annotated from reliable resources, particularly peer-reviewed articles. Among over 140 DNA vaccine plasmids, some plasmids were more frequently used in one type of pathogen than others; for example, pCMVi-UB for Gram-negative bacterial DNA vaccines, and pCAGGS for viral DNA vaccines. Presently, over 400 DNA vaccines containing over 370 protective antigens from over 90 infectious and non-infectious diseases have been curated in DNAVaxDB. While extracellular and bacterial cell surface proteins and adhesin proteins were frequently used for DNA vaccine development, the majority of protective antigens used in Chlamydophila DNA vaccines are localized to the inner portion of the cell. The vaccination regimen of DNA vaccine priming followed by boosting with another vaccine has been widely used to induce protection against infection by different pathogens such as HIV. Parasitic and cancer DNA vaccines were also systematically analyzed. User-friendly web query and visualization interfaces are available in DNAVaxDB for interactive data search. To support data exchange, the information on DNA vaccines, plasmids, and protective antigens is stored in the Vaccine Ontology (VO). DNAVaxDB aims to become a timely and vital source of DNA vaccines and related data and to facilitate advanced DNA vaccine research and development. PMID:25104313
The development of miniplex primer sets for the analysis of degraded DNA
NASA Astrophysics Data System (ADS)
McCord, Bruce; Opel, Kerry; Chung, Denise; Drabek, Jiri; Tatarek, Nancy; Meadows Jantz, Lee; Butler, John
2005-05-01
In this project, a new set of multiplexed PCR reactions has been developed for the analysis of degraded DNA. These DNA markers, known as Miniplexes, utilize primers that have shorter amplicons for use in short tandem repeat (STR) analysis of degraded DNA. In our work we have defined six of these new STR multiplexes, each of which consists of 3 to 4 reduced size STR loci, and each labeled with a different fluorescent dye. When compared to commercially available STR systems, reductions in size of up to 300 base pairs are possible. In addition, these newly designed amplicons consist of loci that are fully compatible with the national computer DNA database known as CODIS. To demonstrate compatibility with commercial STR kits, a concordance study of 532 DNA samples of Caucasian, African American, and Hispanic origin was undertaken. There was 99.77% concordance between allele calls with the two methods. Of these 532 samples, only 15 samples showed discrepancies at one of 12 loci. These occurred predominantly at 2 loci, vWA and D13S317. DNA sequencing revealed that these locations had deletions between the two primer binding sites. Uncommon deletions like these can be expected in certain samples and will not affect the utility of the Miniplexes as tools for degraded DNA analysis. The Miniplexes were also applied to enzymatically digested DNA to assess their potential in degraded DNA analysis. The results demonstrated a greatly improved efficiency in the analysis of degraded DNA when compared to commercial STR genotyping kits. A series of human skeletal remains that had been exposed to a variety of environmental conditions were also examined. Sixty-four percent of the samples generated full profiles when amplified with the Miniplexes, while only sixteen percent of the samples tested generated full profiles with a commercial kit. In addition, complete profiles were obtained for eleven of the twelve Miniplex loci which had amplicon size ranges less than 200 base pairs. 
These data clearly demonstrate that smaller PCR amplicons provide an attractive alternative to mitochondrial DNA for forensic analysis of degraded DNA.
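The 99.77% concordance figure can be reproduced from the counts quoted in the abstract. The sketch below is only an illustration: it assumes that concordance is counted per locus comparison (532 samples × 12 loci) and that each of the 15 discrepant samples contributes exactly one discordant locus, which is one reading of the abstract and not a method stated in it. The function name is hypothetical.

```python
def concordance(n_samples: int, n_loci: int, n_discordant: int) -> float:
    """Fraction of locus-level comparisons on which two typing methods agree.

    Assumption: every sample is compared at every locus, and each
    discordant sample differs at exactly one locus.
    """
    total_comparisons = n_samples * n_loci
    return 1.0 - n_discordant / total_comparisons


# Counts taken from the abstract: 532 samples, 12 loci, 15 discrepancies.
pct = 100 * concordance(532, 12, 15)
print(f"{pct:.2f}% concordance over {532 * 12} locus comparisons")
```

Under these assumptions the denominator is 6384 locus comparisons, so 15 discrepancies correspond to roughly 0.23% discordance.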
dbAMEPNI: a database of alanine mutagenic effects for protein–nucleic acid interactions
DOE Office of Scientific and Technical Information (OSTI.GOV)
Liu, Ling; Xiong, Yi; Gao, Hongyun
Protein–nucleic acid interactions play essential roles in various biological activities such as gene regulation, transcription, DNA repair and DNA packaging. Understanding the effects of amino acid substitutions on protein–nucleic acid binding affinities can help elucidate the molecular mechanism of protein–nucleic acid recognition. Until now, no comprehensive and updated database of quantitative binding data on alanine mutagenic effects for protein–nucleic acid interactions is publicly accessible. Thus, we developed a new database of Alanine Mutagenic Effects for Protein-Nucleic Acid Interactions (dbAMEPNI). dbAMEPNI is a manually curated, literature-derived database, comprising over 577 alanine mutagenic data with experimentally determined binding affinities for protein–nucleic acid complexes. Here, it contains several important parameters, such as dissociation constant (Kd), Gibbs free energy change (ΔΔG), experimental conditions and structural parameters of mutant residues. In addition, the database provides an extended dataset of 282 single alanine mutations with only qualitative data (or descriptive effects) of thermodynamic information.
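The two quantities named in the abstract, Kd and ΔΔG, are linked by the standard thermodynamic relation ΔG = RT ln(Kd), so that ΔΔG = RT ln(Kd_mutant / Kd_wild-type). The sketch below illustrates that relation only. It is not taken from dbAMEPNI: the sign convention (positive ΔΔG meaning weaker binding of the alanine mutant), the temperature of 298.15 K, the function name and the Kd values are all assumptions chosen for illustration.

```python
import math

R_KCAL = 1.98720e-3  # gas constant in kcal/(mol*K)


def ddg_from_kd(kd_mutant: float, kd_wild_type: float,
                temperature_k: float = 298.15) -> float:
    """Binding free energy change (kcal/mol) on mutation.

    Uses ddG = R * T * ln(Kd_mutant / Kd_wild_type). Both Kd values must
    be in the same units. Positive values mean the mutant binds more
    weakly (an assumed sign convention).
    """
    return R_KCAL * temperature_k * math.log(kd_mutant / kd_wild_type)


# Hypothetical example: an alanine mutation weakens binding tenfold,
# from Kd = 1 nM to Kd = 10 nM.
print(f"{ddg_from_kd(10e-9, 1e-9):.2f} kcal/mol")
```

A tenfold loss of affinity at room temperature corresponds to about 1.4 kcal/mol, which is a convenient yardstick when reading ΔΔG values.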
dbAMEPNI: a database of alanine mutagenic effects for protein–nucleic acid interactions
Liu, Ling; Xiong, Yi; Gao, Hongyun; ...
2018-04-02
Protein–nucleic acid interactions play essential roles in various biological activities such as gene regulation, transcription, DNA repair and DNA packaging. Understanding the effects of amino acid substitutions on protein–nucleic acid binding affinities can help elucidate the molecular mechanism of protein–nucleic acid recognition. Until now, no comprehensive and updated database of quantitative binding data on alanine mutagenic effects for protein–nucleic acid interactions is publicly accessible. Thus, we developed a new database of Alanine Mutagenic Effects for Protein-Nucleic Acid Interactions (dbAMEPNI). dbAMEPNI is a manually curated, literature-derived database, comprising over 577 alanine mutagenic data with experimentally determined binding affinities for protein–nucleic acid complexes. Here, it contains several important parameters, such as dissociation constant (Kd), Gibbs free energy change (ΔΔG), experimental conditions and structural parameters of mutant residues. In addition, the database provides an extended dataset of 282 single alanine mutations with only qualitative data (or descriptive effects) of thermodynamic information.
Benson, Dennis A; Karsch-Mizrachi, Ilene; Lipman, David J; Ostell, James; Wheeler, David L
2008-01-01
GenBank (R) is a comprehensive database that contains publicly available nucleotide sequences for more than 260 000 named organisms, obtained primarily through submissions from individual laboratories and batch submissions from large-scale sequencing projects. Most submissions are made using the web-based BankIt or standalone Sequin programs and accession numbers are assigned by GenBank staff upon receipt. Daily data exchange with the European Molecular Biology Laboratory Nucleotide Sequence Database in Europe and the DNA Data Bank of Japan ensures worldwide coverage. GenBank is accessible through NCBI's retrieval system, Entrez, which integrates data from the major DNA and protein sequence databases along with taxonomy, genome, mapping, protein structure and domain information, and the biomedical journal literature via PubMed. BLAST provides sequence similarity searches of GenBank and other sequence databases. Complete bimonthly releases and daily updates of the GenBank database are available by FTP. To access GenBank and its related retrieval and analysis services, begin at the NCBI Homepage: www.ncbi.nlm.nih.gov.
Benson, Dennis A.; Karsch-Mizrachi, Ilene; Lipman, David J.; Ostell, James; Wheeler, David L.
2008-01-01
GenBank (R) is a comprehensive database that contains publicly available nucleotide sequences for more than 260 000 named organisms, obtained primarily through submissions from individual laboratories and batch submissions from large-scale sequencing projects. Most submissions are made using the web-based BankIt or standalone Sequin programs and accession numbers are assigned by GenBank staff upon receipt. Daily data exchange with the European Molecular Biology Laboratory Nucleotide Sequence Database in Europe and the DNA Data Bank of Japan ensures worldwide coverage. GenBank is accessible through NCBI's retrieval system, Entrez, which integrates data from the major DNA and protein sequence databases along with taxonomy, genome, mapping, protein structure and domain information, and the biomedical journal literature via PubMed. BLAST provides sequence similarity searches of GenBank and other sequence databases. Complete bimonthly releases and daily updates of the GenBank database are available by FTP. To access GenBank and its related retrieval and analysis services, begin at the NCBI Homepage: www.ncbi.nlm.nih.gov. PMID:18073190
Bohl, Daniel D; Russo, Glenn S; Basques, Bryce A; Golinvaux, Nicholas S; Fu, Michael C; Long, William D; Grauer, Jonathan N
2014-12-03
There has been an increasing use of national databases to conduct orthopaedic research. Questions regarding the validity and consistency of these studies have not been fully addressed. The purpose of this study was to test for similarity in reported measures between two national databases commonly used for orthopaedic research. A retrospective cohort study of patients undergoing lumbar spinal fusion procedures during 2009 to 2011 was performed in two national databases: the Nationwide Inpatient Sample and the National Surgical Quality Improvement Program. Demographic characteristics, comorbidities, and inpatient adverse events were directly compared between databases. The total numbers of patients included were 144,098 from the Nationwide Inpatient Sample and 8434 from the National Surgical Quality Improvement Program. There were only small differences in demographic characteristics between the two databases. There were large differences between databases in the rates at which specific comorbidities were documented. Non-morbid obesity was documented at rates of 9.33% in the Nationwide Inpatient Sample and 36.93% in the National Surgical Quality Improvement Program (relative risk, 0.25; p < 0.05). Peripheral vascular disease was documented at rates of 2.35% in the Nationwide Inpatient Sample and 0.60% in the National Surgical Quality Improvement Program (relative risk, 3.89; p < 0.05). Similarly, there were large differences between databases in the rates at which specific inpatient adverse events were documented. Sepsis was documented at rates of 0.38% in the Nationwide Inpatient Sample and 0.81% in the National Surgical Quality Improvement Program (relative risk, 0.47; p < 0.05). Acute kidney injury was documented at rates of 1.79% in the Nationwide Inpatient Sample and 0.21% in the National Surgical Quality Improvement Program (relative risk, 8.54; p < 0.05). 
As database studies become more prevalent in orthopaedic surgery, authors, reviewers, and readers should view these studies with caution. This study shows that two commonly used databases can identify demographically similar patients undergoing a common orthopaedic procedure; however, the databases document markedly different rates of comorbidities and inpatient adverse events. The differences are likely the result of the very different mechanisms through which the databases collect their comorbidity and adverse event data. Findings highlight concerns regarding the validity of orthopaedic database research. Copyright © 2014 by The Journal of Bone and Joint Surgery, Incorporated.
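The relative risks quoted above are ratios of the documentation rate in the Nationwide Inpatient Sample to the rate in the National Surgical Quality Improvement Program. The sketch below recomputes them from the rounded percentages given in the abstract. Because those percentages are rounded to two decimals, the recomputed ratios only approximate the published values, which were presumably derived from raw counts that the abstract does not give. The function and variable names are hypothetical.

```python
def relative_risk(rate_a: float, rate_b: float) -> float:
    """Ratio of two event rates expressed in the same units (here, percent)."""
    return rate_a / rate_b


# (NIS rate %, NSQIP rate %, relative risk reported in the abstract)
reported = {
    "non-morbid obesity": (9.33, 36.93, 0.25),
    "peripheral vascular disease": (2.35, 0.60, 3.89),
    "sepsis": (0.38, 0.81, 0.47),
    "acute kidney injury": (1.79, 0.21, 8.54),
}

for condition, (nis, nsqip, rr_reported) in reported.items():
    rr = relative_risk(nis, nsqip)
    print(f"{condition}: recomputed {rr:.2f}, reported {rr_reported:.2f}")
```

A ratio well below 1 (obesity, sepsis) means the condition is documented less often in the Nationwide Inpatient Sample, and a ratio well above 1 (peripheral vascular disease, acute kidney injury) means it is documented more often there.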
The Histone Database: an integrated resource for histones and histone fold-containing proteins
Mariño-Ramírez, Leonardo; Levine, Kevin M.; Morales, Mario; Zhang, Suiyuan; Moreland, R. Travis; Baxevanis, Andreas D.; Landsman, David
2011-01-01
Eukaryotic chromatin is composed of DNA and protein components—core histones—that act to compactly pack the DNA into nucleosomes, the fundamental building blocks of chromatin. These nucleosomes are connected to adjacent nucleosomes by linker histones. Nucleosomes are highly dynamic and, through various core histone post-translational modifications and incorporation of diverse histone variants, can serve as epigenetic marks to control processes such as gene expression and recombination. The Histone Sequence Database is a curated collection of sequences and structures of histones and non-histone proteins containing histone folds, assembled from major public databases. Here, we report a substantial increase in the number of sequences and taxonomic coverage for histone and histone fold-containing proteins available in the database. Additionally, the database now contains an expanded dataset that includes archaeal histone sequences. The database also provides comprehensive multiple sequence alignments for each of the four core histones (H2A, H2B, H3 and H4), the linker histones (H1/H5) and the archaeal histones. The database also includes current information on solved histone fold-containing structures. The Histone Sequence Database is an inclusive resource for the analysis of chromatin structure and function focused on histones and histone fold-containing proteins. Database URL: The Histone Sequence Database is freely available and can be accessed at http://research.nhgri.nih.gov/histones/. PMID:22025671
Internet-accessible DNA sequence database for identifying fusaria from human and animal infections.
O'Donnell, Kerry; Sutton, Deanna A; Rinaldi, Michael G; Sarver, Brice A J; Balajee, S Arunmozhi; Schroers, Hans-Josef; Summerbell, Richard C; Robert, Vincent A R G; Crous, Pedro W; Zhang, Ning; Aoki, Takayuki; Jung, Kyongyong; Park, Jongsun; Lee, Yong-Hwan; Kang, Seogchan; Park, Bongsoo; Geiser, David M
2010-10-01
Because less than one-third of clinically relevant fusaria can be accurately identified to species level using phenotypic data (i.e., morphological species recognition), we constructed a three-locus DNA sequence database to facilitate molecular identification of the 69 Fusarium species associated with human or animal mycoses encountered in clinical microbiology laboratories. The database comprises partial sequences from three nuclear genes: translation elongation factor 1α (EF-1α), the largest subunit of RNA polymerase (RPB1), and the second largest subunit of RNA polymerase (RPB2). These three gene fragments can be amplified by PCR and sequenced using primers that are conserved across the phylogenetic breadth of Fusarium. Phylogenetic analyses of the combined data set reveal that, with the exception of two monotypic lineages, all clinically relevant fusaria are nested in one of eight variously sized and strongly supported species complexes. The monophyletic lineages have been named informally to facilitate communication of an isolate's clade membership and genetic diversity. To identify isolates to the species included within the database, partial DNA sequence data from one or more of the three genes can be used as a BLAST query against the database which is Web accessible at FUSARIUM-ID (http://isolate.fusariumdb.org) and the Centraalbureau voor Schimmelcultures (CBS-KNAW) Fungal Biodiversity Center (http://www.cbs.knaw.nl/fusarium). Alternatively, isolates can be identified via phylogenetic analysis by adding sequences of unknowns to the DNA sequence alignment, which can be downloaded from the two aforementioned websites. 
The utility of this database should increase significantly as members of the clinical microbiology community deposit in internationally accessible culture collections (e.g., CBS-KNAW or the Fusarium Research Center) cultures of novel mycosis-associated fusaria, along with associated, corrected sequence chromatograms and data, so that the sequence results can be verified and isolates are made available for future study.
A blue carbon soil database: Tidal wetland stocks for the US National Greenhouse Gas Inventory
NASA Astrophysics Data System (ADS)
Feagin, R. A.; Eriksson, M.; Hinson, A.; Najjar, R. G.; Kroeger, K. D.; Herrmann, M.; Holmquist, J. R.; Windham-Myers, L.; MacDonald, G. M.; Brown, L. N.; Bianchi, T. S.
2015-12-01
Coastal wetlands contain large reservoirs of carbon, and in 2015 the US National Greenhouse Gas Inventory began the work of placing blue carbon within the national regulatory context. The potential value of a wetland carbon stock, in relation to its location, soon could be influential in determining governmental policy and management activities, or in stimulating market-based CO2 sequestration projects. To meet the national need for high-resolution maps, a blue carbon stock database was developed linking National Wetlands Inventory datasets with the USDA Soil Survey Geographic Database. Users of the database can identify the economic potential for carbon conservation or restoration projects within specific estuarine basins, states, wetland types, physical parameters, and land management activities. The database is geared towards both national-level assessments and local-level inquiries. Spatial analysis of the stocks shows high variance within individual estuarine basins, largely dependent on geomorphic position on the landscape, though there are continental-scale trends in the carbon distribution as well. Future plans include linking this database with a sedimentary accretion database to predict carbon flux in US tidal wetlands.
ERIC Educational Resources Information Center
McGrew, Kevin; And Others
This research analyzes similarities and differences in how students with disabilities are identified in national databases, through examination of 19 national data collection programs in the U.S. Departments of Education, Commerce, Justice, and Health and Human Services, as well as databases from the National Science Foundation. The study found…
DNA fingerprinting in forensics: past, present, future
2013-01-01
DNA fingerprinting, one of the great discoveries of the late 20th century, has revolutionized forensic investigations. This review briefly recapitulates 30 years of progress in forensic DNA analysis which helps to convict criminals, exonerate the wrongly accused, and identify victims of crime, disasters, and war. Current standard methods based on short tandem repeats (STRs) as well as lineage markers (Y chromosome, mitochondrial DNA) are covered and applications are illustrated by casework examples. Benefits and risks of expanding forensic DNA databases are discussed and we ask what the future holds for forensic DNA fingerprinting. PMID:24245688
Benson, Dennis A.; Karsch-Mizrachi, Ilene; Lipman, David J.; Ostell, James; Wheeler, David L.
2007-01-01
GenBank (R) is a comprehensive database that contains publicly available nucleotide sequences for more than 240 000 named organisms, obtained primarily through submissions from individual laboratories and batch submissions from large-scale sequencing projects. Most submissions are made using the web-based BankIt or standalone Sequin programs and accession numbers are assigned by GenBank staff upon receipt. Daily data exchange with the EMBL Data Library in Europe and the DNA Data Bank of Japan ensures worldwide coverage. GenBank is accessible through NCBI's retrieval system, Entrez, which integrates data from the major DNA and protein sequence databases along with taxonomy, genome, mapping, protein structure and domain information, and the biomedical journal literature via PubMed. BLAST provides sequence similarity searches of GenBank and other sequence databases. Complete bimonthly releases and daily updates of the GenBank database are available by FTP. To access GenBank and its related retrieval and analysis services, begin at the NCBI Homepage. PMID:17202161
PRODORIC2: the bacterial gene regulation database in 2018
Dudek, Christian-Alexander; Hartlich, Juliane; Brötje, David; Jahn, Dieter
2018-01-01
Bacteria adapt to changes in their environment via differential gene expression mediated by DNA binding transcriptional regulators. The PRODORIC2 database hosts one of the largest collections of DNA binding sites for prokaryotic transcription factors. It is the result of the thoroughly redesigned PRODORIC database. PRODORIC2 is more intuitive and user-friendly. Besides significant technical improvements, the new update offers more than 1000 new transcription factor binding sites and 110 new position weight matrices for genome-wide pattern searches with the Virtual Footprint tool. Moreover, binding sites deduced from high-throughput experiments were included. Data for 6 new bacterial species including bacteria of the Rhodobacteraceae family were added. Finally, a comprehensive collection of sigma- and transcription factor data for the nosocomial pathogen Clostridium difficile is now part of the database. PRODORIC2 is publicly available at http://www.prodoric2.de. PMID:29136200
Data tables for the 1993 National Transit Database section 15 report year
DOT National Transportation Integrated Search
1994-12-01
The Data Tables For the 1993 National Transit Database Section 15 Report Year is one of three publications comprising the 1993 Annual Report. Also referred to as the National Transit Database Reporting System, it is administered by the Federal Transi...
Réfega, Susana; Girard-Misguich, Fabienne; Bourdieu, Christiane; Péry, Pierre; Labbé, Marie
2003-04-02
Specific antibodies were produced ex vivo from intestinal culture of Eimeria tenella infected chickens. The specificity of these intestinal antibodies was tested against different parasite stages. These antibodies were used to immunoscreen first generation schizont and sporozoite cDNA libraries permitting the identification of new E. tenella antigens. We obtained a total of 119 cDNA clones which were subjected to sequence analysis. The sequences coding for the proteins inducing local immune responses were compared with nucleotide or protein databases and with expressed sequence tags (ESTs) databases. We identified new Eimeria genes coding for heat shock proteins, a ribosomal protein, a pyruvate kinase and a pyridoxine kinase. Specific features of other sequences are discussed.
9 CFR 81.2 - Identification of deer, elk, and moose in interstate commerce.
Code of Federal Regulations, 2013 CFR
2013-01-01
... is linked to that animal in the CWD National Database or in an approved State database. The second... that animal and herd in the CWD National Database or in an approved State database. (Approved by the...
9 CFR 81.2 - Identification of deer, elk, and moose in interstate commerce.
Code of Federal Regulations, 2014 CFR
2014-01-01
... is linked to that animal in the CWD National Database or in an approved State database. The second... that animal and herd in the CWD National Database or in an approved State database. (Approved by the...
Code of Federal Regulations, 2010 CFR
2010-10-01
... TRANSPORTATION NATIONAL TRANSIT DATABASE § 630.3 Definitions. (a) Except as otherwise provided, terms defined in... current editions of the National Transit Database Reporting Manuals and the NTD Uniform System of Accounts... benefits from assistance under 49 U.S.C. 5307 or 5311. Current edition of the National Transit Database...
Code of Federal Regulations, 2014 CFR
2014-10-01
... TRANSPORTATION NATIONAL TRANSIT DATABASE § 630.3 Definitions. (a) Except as otherwise provided, terms defined in... current editions of the National Transit Database Reporting Manuals and the NTD Uniform System of Accounts... benefits from assistance under 49 U.S.C. 5307 or 5311. Current edition of the National Transit Database...
Code of Federal Regulations, 2013 CFR
2013-10-01
... TRANSPORTATION NATIONAL TRANSIT DATABASE § 630.3 Definitions. (a) Except as otherwise provided, terms defined in... current editions of the National Transit Database Reporting Manuals and the NTD Uniform System of Accounts... benefits from assistance under 49 U.S.C. 5307 or 5311. Current edition of the National Transit Database...
Code of Federal Regulations, 2011 CFR
2011-10-01
... TRANSPORTATION NATIONAL TRANSIT DATABASE § 630.3 Definitions. (a) Except as otherwise provided, terms defined in... current editions of the National Transit Database Reporting Manuals and the NTD Uniform System of Accounts... benefits from assistance under 49 U.S.C. 5307 or 5311. Current edition of the National Transit Database...
Code of Federal Regulations, 2012 CFR
2012-10-01
... TRANSPORTATION NATIONAL TRANSIT DATABASE § 630.3 Definitions. (a) Except as otherwise provided, terms defined in... current editions of the National Transit Database Reporting Manuals and the NTD Uniform System of Accounts... benefits from assistance under 49 U.S.C. 5307 or 5311. Current edition of the National Transit Database...
Kleinboelting, Nils; Huep, Gunnar; Weisshaar, Bernd
2017-01-01
SimpleSearch provides access to a database containing information about T-DNA insertion lines of the GABI-Kat collection of Arabidopsis thaliana mutants. These mutants are an important tool for reverse genetics, and GABI-Kat is the second largest collection of such T-DNA insertion mutants. Insertion sites were deduced from flanking sequence tags (FSTs), and the database contains information about mutant plant lines as well as insertion alleles. Here, we describe improvements within the interface (available at http://www.gabi-kat.de/db/genehits.php) and with regard to the database content that have been realized in the last five years. These improvements include the integration of the Araport11 genome sequence annotation data containing the recently updated A. thaliana structural gene descriptions, an updated visualization component that displays groups of insertions with very similar insertion positions, mapped confirmation sequences, and primers. The visualization component provides a quick way to identify insertions of interest, and access to improved data about the exact structure of confirmed insertion alleles. In addition, the database content has been extended by incorporating additional insertion alleles that were detected during the confirmation process, as well as by adding new FSTs that have been produced during continued efforts to complement gaps in FST availability. Finally, the current database content regarding predicted and confirmed insertion alleles as well as primer sequences has been made available as downloadable flat files. © The Author 2016. Published by Oxford University Press on behalf of Japanese Society of Plant Physiologists.
Decelle, Johan; Romac, Sarah; Stern, Rowena F; Bendif, El Mahdi; Zingone, Adriana; Audic, Stéphane; Guiry, Michael D; Guillou, Laure; Tessier, Désiré; Le Gall, Florence; Gourvil, Priscillia; Dos Santos, Adriana L; Probert, Ian; Vaulot, Daniel; de Vargas, Colomban; Christen, Richard
2015-11-01
Photosynthetic eukaryotes have a critical role as the main producers in most ecosystems of the biosphere. The ongoing environmental metabarcoding revolution opens the perspective for holistic ecosystems biological studies of these organisms, in particular the unicellular microalgae that often lack distinctive morphological characters and have complex life cycles. To interpret environmental sequences, metabarcoding necessarily relies on taxonomically curated databases containing reference sequences of the targeted gene (or barcode) from identified organisms. To date, no such reference framework exists for photosynthetic eukaryotes. In this study, we built the PhytoREF database that contains 6490 plastidial 16S rDNA reference sequences that originate from a large diversity of eukaryotes representing all known major photosynthetic lineages. We compiled 3333 amplicon sequences available from public databases and 879 sequences extracted from plastidial genomes, and generated 411 novel sequences from cultured marine microalgal strains belonging to different eukaryotic lineages. A total of 1867 environmental Sanger 16S rDNA sequences were also included in the database. Stringent quality filtering and a phylogeny-based taxonomic classification were applied for each 16S rDNA sequence. The database mainly focuses on marine microalgae, but sequences from land plants (representing half of the PhytoREF sequences) and freshwater taxa were also included to broaden the applicability of PhytoREF to different aquatic and terrestrial habitats. PhytoREF, accessible via a web interface (http://phytoref.fr), is a new resource in molecular ecology to foster the discovery, assessment and monitoring of the diversity of photosynthetic eukaryotes using high-throughput sequencing. © 2015 John Wiley & Sons Ltd.
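The four sequence sources listed in the abstract add up to the stated database size of 6490 reference sequences. The sketch below checks that sum and shows each source's share of the total. The counts come from the abstract; the dictionary keys are shortened descriptions introduced here for illustration.

```python
# Number of plastidial 16S rDNA sequences per source, as quoted in the abstract.
sources = {
    "amplicon sequences from public databases": 3333,
    "sequences extracted from plastidial genomes": 879,
    "novel sequences from cultured microalgal strains": 411,
    "environmental Sanger sequences": 1867,
}

total = sum(sources.values())
shares = {name: 100 * count / total for name, count in sources.items()}

for name, count in sources.items():
    print(f"{name}: {count} ({shares[name]:.1f}%)")
print(f"total: {total}")
```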
National Transportation Atlas Databases : 2002
DOT National Transportation Integrated Search
2002-01-01
The National Transportation Atlas Databases 2002 (NTAD2002) is a set of nationwide geographic databases of transportation facilities, transportation networks, and associated infrastructure. These datasets include spatial information for transportatio...
National Transportation Atlas Databases : 2010
DOT National Transportation Integrated Search
2010-01-01
The National Transportation Atlas Databases 2010 (NTAD2010) is a set of nationwide geographic databases of transportation facilities, transportation networks, and associated infrastructure. These datasets include spatial information for transportatio...
National Transportation Atlas Databases : 2006
DOT National Transportation Integrated Search
2006-01-01
The National Transportation Atlas Databases 2006 (NTAD2006) is a set of nationwide geographic databases of transportation facilities, transportation networks, and associated infrastructure. These datasets include spatial information for transportatio...
National Transportation Atlas Databases : 2005
DOT National Transportation Integrated Search
2005-01-01
The National Transportation Atlas Databases 2005 (NTAD2005) is a set of nationwide geographic databases of transportation facilities, transportation networks, and associated infrastructure. These datasets include spatial information for transportatio...
National Transportation Atlas Databases : 2008
DOT National Transportation Integrated Search
2008-01-01
The National Transportation Atlas Databases 2008 (NTAD2008) is a set of nationwide geographic databases of transportation facilities, transportation networks, and associated infrastructure. These datasets include spatial information for transportatio...
National Transportation Atlas Databases : 2003
DOT National Transportation Integrated Search
2003-01-01
The National Transportation Atlas Databases 2003 (NTAD2003) is a set of nationwide geographic databases of transportation facilities, transportation networks, and associated infrastructure. These datasets include spatial information for transportatio...
National Transportation Atlas Databases : 2004
DOT National Transportation Integrated Search
2004-01-01
The National Transportation Atlas Databases 2004 (NTAD2004) is a set of nationwide geographic databases of transportation facilities, transportation networks, and associated infrastructure. These datasets include spatial information for transportatio...
National Transportation Atlas Databases : 2009
DOT National Transportation Integrated Search
2009-01-01
The National Transportation Atlas Databases 2009 (NTAD2009) is a set of nationwide geographic databases of transportation facilities, transportation networks, and associated infrastructure. These datasets include spatial information for transportatio...
National Transportation Atlas Databases : 2007
DOT National Transportation Integrated Search
2007-01-01
The National Transportation Atlas Databases 2007 (NTAD2007) is a set of nationwide geographic databases of transportation facilities, transportation networks, and associated infrastructure. These datasets include spatial information for transportatio...
National Transportation Atlas Databases : 2012
DOT National Transportation Integrated Search
2012-01-01
The National Transportation Atlas Databases 2012 (NTAD2012) is a set of nationwide geographic databases of transportation facilities, transportation networks, and associated infrastructure. These datasets include spatial information for transportatio...
National Transportation Atlas Databases : 2011
DOT National Transportation Integrated Search
2011-01-01
The National Transportation Atlas Databases 2011 (NTAD2011) is a set of nationwide geographic databases of transportation facilities, transportation networks, and associated infrastructure. These datasets include spatial information for transportatio...
Analysis of national and regional landslide inventories in Europe
NASA Astrophysics Data System (ADS)
Hervás, J.; Van Den Eeckhaut, M.
2012-04-01
A landslide inventory can be defined as a detailed register of the distribution and characteristics of past landslides in an area. Today most landslide inventories have the form of digital databases including landslide distribution maps and associated alphanumeric information for each landslide. While landslide inventories are of the utmost importance for land use planning and risk management through the generation of landslide zonation (susceptibility, hazard and risk) maps, landslide databases are thought to differ greatly from one country to another and often also within the same country. This hampers the generation of comparable, harmonised landslide zonation maps at national and continental scales, which is needed for policy and decision making at EU level as envisaged, for instance, in the INSPIRE Directive and the Thematic Strategy for Soil Protection. In order to have a clear understanding of the landslide inventories available in Europe and their potential to produce landslide zonation maps as well as to draw recommendations to improve harmonisation and interoperability between landslide databases, we have surveyed 37 countries. In total, information has been collected and analysed for 24 national databases in 22 countries (Albania, Andorra, Austria, Bosnia and Herzegovina, Bulgaria, Czech Republic, Former Yugoslav Republic of Macedonia, France, Greece, Hungary, Iceland, Ireland, Italy, Norway, Poland, Portugal, Slovakia, Slovenia, Spain, Sweden, Switzerland and UK) and 22 regional databases in 10 countries. At the moment, over 633,000 landslides are recorded in national databases, representing on average less than 50% of the landslides estimated to have occurred in these countries. The sample of regional databases included over 103,000 landslides, with an estimated completeness substantially higher than that of national databases, as more attention can be paid to data collection over smaller regions. 
Yet, both for national and regional coverage, the data collection methods only occasionally included advanced technologies such as remote sensing. With regard to the inventory maps of most databases, the analysis illustrates the high variability of scales (between 1:10 000 and 1:1 M for national inventories, and from 1:10 000 to 1:25 000 for regional inventories), landslide classification systems and representation symbology. It also shows the difficulty of precisely locating landslides that are referred to only in historical documents. In addition, information on landslide magnitude, geometrical characteristics and age reported in national and regional databases differs greatly, even within the same database, as it strongly depends on the objectives of the database, the data collection methods used, the resources employed and the remaining landslide expression. In particular, landslide initiation and/or reactivation dates are generally estimated in less than 25% of records, thus making hazard and hence risk assessment difficult. In most databases, scarce information on landslide impact (damage and casualties) further hinders risk assessment at regional and national scales. Estimated landslide activity, which is very relevant to early warning and emergency management, is only included in half of the national databases and restricted to part of the landslides registered. Moreover, the availability of this information is not substantially higher in regional databases than in national ones. Most landslide databases further included information on geo-environmental characteristics at the landslide site, which is very important for modelling landslide zoning. Although a number of national and regional agencies provide free web-GIS visualisation services, the potential of existing landslide databases is often not fully exploited as, in many cases, access by the general public and external researchers is restricted. 
Additionally, the availability of information only in the national or local language is common to most national and regional databases, thus hampering consultation for most foreigners. Finally, some suggestions for a minimum set of attributes to be collected and made available by European countries for building up a continental landslide database in support of EU policies are presented. This study has been conducted in the framework of the EU-FP7 project SafeLand (Grant Agreement 22647).
NASA Astrophysics Data System (ADS)
Pennington, Catherine; Dashwood, Claire; Freeborough, Katy
2014-05-01
The National Landslide Database has been developed by the British Geological Survey (BGS) and is the focus for national geohazard research for landslides in Great Britain. The history and structure of the geospatial database and associated Geographical Information System (GIS) are explained, along with the future developments of the database and its applications. The database is the most extensive source of information on landslides in Great Britain with over 16,500 records of landslide events, each documented as fully as possible. Data are gathered through a range of procedures, including: incorporation of other databases; automated trawling of current and historical scientific literature and media reports; new field- and desk-based mapping technologies with digital data capture, and crowd-sourcing information through social media and other online resources. This information is invaluable for the investigation, prevention and mitigation of areas of unstable ground in accordance with Government planning policy guidelines. The national landslide susceptibility map (GeoSure) and a national landslide domain map currently under development rely heavily on the information contained within the landslide database. Assessing susceptibility to landsliding requires knowledge of the distribution of failures and an understanding of causative factors and their spatial distribution, whilst understanding the frequency and types of landsliding present is integral to modelling how rainfall will influence the stability of a region. Communication of landslide data through the Natural Hazard Partnership (NHP) contributes to national hazard mitigation and disaster risk reduction with respect to weather and climate. Daily reports of landslide potential are published by BGS through the NHP and data collected for the National Landslide Database is used widely for the creation of these assessments. 
The National Landslide Database is freely available via an online GIS and is used by a variety of stakeholders for research purposes.
77 FR 66617 - HIT Policy and Standards Committees; Workgroup Application Database
Federal Register 2010, 2011, 2012, 2013, 2014
2012-11-06
... Database AGENCY: Office of the National Coordinator for Health Information Technology, HHS. ACTION: Notice of New ONC HIT FACA Workgroup Application Database. The Office of the National Coordinator (ONC) has launched a new Health Information Technology Federal Advisory Committee Workgroup Application Database...
78 FR 39290 - Proposed Collection; Comment Request
Federal Register 2010, 2011, 2012, 2013, 2014
2013-07-01
... project, known as the ``National Mortgage Database,'' which is a joint effort of FHFA and the Consumer... a database of timely and otherwise unavailable residential mortgage market information to be made... Mortgage Database. The key purpose of the National Mortgage Database is to make accessible accurate...
Bodner, Martin; Perego, Ugo A.; Huber, Gabriela; Fendt, Liane; Röck, Alexander W.; Zimmermann, Bettina; Olivieri, Anna; Gómez-Carballa, Alberto; Lancioni, Hovirag; Angerhofer, Norman; Bobillo, Maria Cecilia; Corach, Daniel; Woodward, Scott R.; Salas, Antonio; Achilli, Alessandro; Torroni, Antonio; Bandelt, Hans-Jürgen; Parson, Walther
2012-01-01
It is now widely agreed that the Native American founders originated from a Beringian source population ∼15–18 thousand years ago (kya) and rapidly populated all of the New World, probably mainly following the Pacific coastal route. However, details about the migration into the Americas and the routes pursued on the continent still remain unresolved, despite numerous genetic, archaeological, and linguistic investigations. To examine the pioneering peopling phase of the South American continent, we screened literature and mtDNA databases and identified two novel mitochondrial DNA (mtDNA) clades, here named D1g and D1j, within the pan-American haplogroup D1. They both show overall rare occurrences but local high frequencies, and are essentially restricted to populations from the Southern Cone of South America (Chile and Argentina). We selected and completely sequenced 43 D1g and D1j mtDNA genomes applying highest quality standards. Molecular and phylogeographic analyses revealed extensive variation within each of the two clades and possibly distinct dispersal patterns. Their age estimates agree with the dating of the earliest archaeological sites in South America and indicate that the Paleo-Indian spread along the entire longitude of the American double continent might have taken even <2000 yr. This study confirms that major sampling and sequencing efforts are mandatory for uncovering all of the most basal variation in the Native American mtDNA haplogroups and for clarification of Paleo-Indian migrations, by targeting, if possible, both the general mixed population of national states and autochthonous Native American groups, especially in South America. PMID:22333566
NASA Astrophysics Data System (ADS)
Woolford, Alison; Holden, Marcia; Salit, Marc; Burns, Malcolm; Ellison, Stephen L. R.
2009-01-01
Key comparison CCQM-K61 was performed to demonstrate and document the capability of interested national metrology institutes in the determination of the quantity of a specific DNA target in an aqueous solution. The study provides support for the following measurement claim: "Quantitation of a linearised plasmid DNA, based on a matched standard in a matrix of non-target DNA". The comparison was an activity of the Bioanalysis Working Group (BAWG) of the Comité Consultatif pour la Quantité de Matière and was coordinated by NIST (Gaithersburg, USA) and LGC (Teddington, UK). The following laboratories (in alphabetical order) participated in this key comparison: DMSC (Thailand); IRMM (European Union); KRISS (Republic of Korea); LGC (UK); NIM (China); NIST (USA); NMIA (Australia); NMIJ (Japan); VNIIM (Russian Federation). Good agreement was observed between the reported results of all nine participants. Uncertainty estimates did not account fully for the dispersion of results even after allowance for possible inhomogeneity in calibration materials. Preliminary studies suggest that the effects of fluorescence threshold setting might contribute to the excess dispersion, and further study of this topic is suggested. The main text of this paper is the Final Report, which appears in Appendix B of the BIPM key comparison database kcdb.bipm.org/. The final report has been peer-reviewed and approved for publication by the CCQM, according to the provisions of the CIPM Mutual Recognition Arrangement (MRA).
Collecting, archiving and processing DNA from wildlife samples using FTA® databasing paper
Smith, LM; Burgoyne, LA
2004-01-01
Background: Methods involving the analysis of nucleic acids have become widespread in the fields of traditional biology and ecology. However, storing samples collected in the field and transporting them to the laboratory in a manner that allows purification of intact nucleic acids can prove problematic. Results: FTA® databasing paper is widely used in human forensic analysis for the storage of biological samples and for purification of nucleic acids. The possible uses of FTA® databasing paper in the purification of DNA from samples of wildlife origin were examined, with particular reference to problems expected due to the nature of such samples. The processing of blood and tissue samples, the possibility of excess DNA in blood samples due to nucleated erythrocytes, and the analysis of degraded samples were all examined, as was the question of long-term storage of blood samples on FTA® paper. Examples of the end use of the purified DNA are given for all protocols, and the rationale behind the processing procedures is also explained to allow the end user to adjust the protocols as required. Conclusions: FTA® paper is eminently suitable for collection of, and purification of nucleic acids from, biological samples from a wide range of wildlife species. This technology makes the collection and storage of such samples much simpler. PMID:15072582
Feline Non-repetitive Mitochondrial DNA Control Region Database for Forensic Evidence
Grahn, R. A.; Kurushima, J. D.; Billings, N. C.; Grahn, J.C.; Halverson, J. L.; Hammer, E.; Ho, C.K.; Kun, T. J.; Levy, J.K.; Lipinski, M. J.; Mwenda, J.M.; Ozpinar, H.; Schuster, R.K; Shoorijeh, S.J.; Tarditi, C. R.; Waly, N.E.; Wictum, E. J.; Lyons, L. A.
2010-01-01
The domestic cat is one of the most popular pets throughout the world. A by-product of owning, interacting with, or being in a household with a cat is the transfer of shed fur to clothing or personal objects. As trace evidence, transferred cat fur is a relatively untapped resource for forensic scientists. Both phenotypic and genotypic characteristics can be obtained from cat fur, but databases exist for neither aspect. Because cats groom incessantly, cat fur may carry nucleated cells not only in the hair bulb but also as epithelial cells deposited on the hair shaft during grooming, thereby generally providing material for DNA profiling. To effectively exploit cat hair as a resource, representative databases must be established. This study evaluates 402 bp of the mtDNA control region (CR) from 1,394 cats, including cats from 25 distinct worldwide populations and 26 breeds. Eighty-three percent of the cats are represented by 12 major mitotypes. An additional 8.0% are clearly derived from the major mitotypes. Unique sequences were found in 7.5% of the cats. The overall genetic diversity for this data set was 0.8813 ± 0.0046 with a random match probability of 11.8%. This region of the cat mtDNA has discriminatory power suitable for forensic application worldwide. PMID:20457082
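The abstract above reports a genetic diversity and a random match probability without giving the formulas. As a minimal sketch, assuming the conventional definitions (random match probability as the sum of squared mitotype frequencies, and diversity as Nei's unbiased estimator), the two statistics could be computed from a list of observed mitotypes as follows. The function name and the toy sample are hypothetical and are not taken from the study.

```python
from collections import Counter


def match_statistics(mitotypes):
    """Return (genetic diversity, random match probability) for a sample.

    Assumed definitions (not stated in the abstract):
      random match probability = sum of squared mitotype frequencies
      genetic diversity        = n / (n - 1) * (1 - sum of squared frequencies)
    """
    counts = Counter(mitotypes)
    n = sum(counts.values())
    if n < 2:
        raise ValueError("at least two samples are needed")
    sum_sq = sum((c / n) ** 2 for c in counts.values())
    diversity = n / (n - 1) * (1.0 - sum_sq)
    return diversity, sum_sq


# Toy sample of ten cats with three hypothetical mitotypes.
sample = ["A"] * 5 + ["B"] * 3 + ["C"] * 2
diversity, rmp = match_statistics(sample)
print(f"diversity={diversity:.4f}, random match probability={rmp:.2%}")
```

Under these assumed definitions the two reported figures are nearly complementary for a sample as large as 1,394 cats, which is consistent with a diversity of about 0.88 alongside a match probability of about 12%.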
Colony-PCR Is a Rapid Method for DNA Amplification of Hyphomycetes
Walch, Georg; Knapp, Maria; Rainer, Georg; Peintner, Ursula
2016-01-01
Fungal pure cultures identified with both classical morphological methods and through barcoding sequences are a basic requirement for reliable reference sequences in public databases. Improved techniques for an accelerated DNA barcode reference library construction will result in considerably improved sequence databases covering a wider taxonomic range. Fast, cheap, and reliable methods for obtaining DNA sequences from fungal isolates are, therefore, a valuable tool for the scientific community. Direct colony PCR was already successfully established for yeasts, but has not been evaluated for a wide range of anamorphic soil fungi up to now, and a direct amplification protocol for hyphomycetes without tissue pre-treatment has not been published so far. Here, we present a colony PCR technique directly from fungal hyphae without previous DNA extraction or other prior manipulation. Seven hundred eighty-eight fungal strains from 48 genera were tested with a success rate of 86%. PCR success varied considerably: DNA of fungi belonging to the genera Cladosporium, Geomyces, Fusarium, and Mortierella could be amplified with high success. DNA of soil-borne yeasts was always successfully amplified. Absidia, Mucor, Trichoderma, and Penicillium isolates had noticeably lower PCR success. PMID:29376929
The problems and promise of DNA barcodes for species diagnosis of primate biomaterials
Lorenz, Joseph G; Jackson, Whitney E; Beck, Jeanne C; Hanner, Robert
2005-01-01
The Integrated Primate Biomaterials and Information Resource (www.IPBIR.org) provides essential research reagents to the scientific community by establishing, verifying, maintaining, and distributing DNA and RNA derived from primate cell cultures. The IPBIR uses mitochondrial cytochrome c oxidase subunit I sequences to verify the identity of samples for quality control purposes in the accession, cell culture, DNA extraction processes and prior to shipping to end users. As a result, IPBIR is accumulating a database of ‘DNA barcodes’ for many species of primates. However, this quality control process is complicated by taxon specific patterns of ‘universal primer’ failure, as well as the amplification or co-amplification of nuclear pseudogenes of mitochondrial origins. To overcome these difficulties, taxon specific primers have been developed, and reverse transcriptase PCR is utilized to exclude these extraneous sequences from amplification. DNA barcoding of primates has applications to conservation and law enforcement. Depositing barcode sequences in a public database, along with primer sequences, trace files and associated quality scores, makes this species identification technique widely accessible. Reference DNA barcode sequences should be derived from, and linked to, specimens of known provenance in web-accessible collections in order to validate this system of molecular diagnostics. PMID:16214744
Rise, Matthew L.; von Schalburg, Kristian R.; Brown, Gordon D.; Mawer, Melanie A.; Devlin, Robert H.; Kuipers, Nathanael; Busby, Maura; Beetz-Sargent, Marianne; Alberto, Roberto; Gibbs, A. Ross; Hunt, Peter; Shukin, Robert; Zeznik, Jeffrey A.; Nelson, Colleen; Jones, Simon R.M.; Smailus, Duane E.; Jones, Steven J.M.; Schein, Jacqueline E.; Marra, Marco A.; Butterfield, Yaron S.N.; Stott, Jeff M.; Ng, Siemon H.S.; Davidson, William S.; Koop, Ben F.
2004-01-01
We report 80,388 ESTs from 23 Atlantic salmon (Salmo salar) cDNA libraries (61,819 ESTs), 6 rainbow trout (Oncorhynchus mykiss) cDNA libraries (14,544 ESTs), 2 chinook salmon (Oncorhynchus tshawytscha) cDNA libraries (1317 ESTs), 2 sockeye salmon (Oncorhynchus nerka) cDNA libraries (1243 ESTs), and 2 lake whitefish (Coregonus clupeaformis) cDNA libraries (1465 ESTs). The majority of these are 3′ sequences, allowing discrimination between paralogs arising from a recent genome duplication in the salmonid lineage. Sequence assembly reveals 28,710 different S. salar, 8981 O. mykiss, 1085 O. tshawytscha, 520 O. nerka, and 1176 C. clupeaformis putative transcripts. We annotate the submitted portion of our EST database by molecular function. Higher- and lower-molecular-weight fractions of libraries are shown to contain distinct gene sets, and higher rates of gene discovery are associated with higher-molecular weight libraries. Pyloric caecum library group annotations indicate this organ may function in redox control and as a barrier against systemic uptake of xenobiotics. A microarray is described, containing 7356 salmonid elements representing 3557 different cDNAs. Analyses of cross-species hybridizations to this cDNA microarray indicate that this resource may be used for studies involving all salmonids. PMID:14962987
REDIdb: the RNA editing database.
Picardi, Ernesto; Regina, Teresa Maria Rosaria; Brennicke, Axel; Quagliariello, Carla
2007-01-01
The RNA Editing Database (REDIdb) is an interactive, web-based database created and designed with the aim to allocate RNA editing events such as substitutions, insertions and deletions occurring in a wide range of organisms. The database contains both fully and partially sequenced DNA molecules for which editing information is available either by experimental inspection (in vitro) or by computational detection (in silico). Each record of REDIdb is organized in a specific flat-file containing a description of the main characteristics of the entry, a feature table with the editing events and related details and a sequence zone with both the genomic sequence and the corresponding edited transcript. REDIdb is a relational database in which the browsing and identification of editing sites has been simplified by means of two facilities to either graphically display genomic or cDNA sequences or to show the corresponding alignment. In both cases, all editing sites are highlighted in colour and their relative positions are detailed by mousing over. New editing positions can be directly submitted to REDIdb after a user-specific registration to obtain authorized secure access. This first version of REDIdb database stores 9964 editing events and can be freely queried at http://biologia.unical.it/py_script/search.html.
Dirks, Wilhelm Gerhard; Faehnrich, Silke; Estella, Isabelle Annick Janine; Drexler, Hans Guenter
2005-01-01
Cell lines have wide applications as model systems in the medical and pharmaceutical industry. Much drug and chemical testing is now first carried out exhaustively on in vitro systems, reducing the need for complicated and invasive animal experiments. The basis for any research, development or production program involving cell lines is the choice of an authentic cell line. Microsatellites in the human genome that harbour short tandem repeat (STR) DNA markers allow individualisation of established cell lines at the DNA level. Fluorescence polymerase chain reaction amplification of eight highly polymorphic microsatellite STR loci plus gender determination was found to be the best tool to screen the uniqueness of DNA profiles in a fingerprint database. Our results demonstrate that cross-contamination and misidentification remain chronic problems in the use of human continuous cell lines. The combination of rapidly generated DNA types based on single-locus STR and their authentication or individualisation by screening the fingerprint database constitutes a highly reliable and robust method for the identification and verification of cell lines.
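Screening a fingerprint database for non-unique STR profiles, as described above, can be illustrated with a short sketch. It assumes a simplified exact-match criterion and a tiny profile of three markers; the locus names are used only as examples, the cell line names are hypothetical, and the abstract does not specify the eight loci or the matching rule actually used.

```python
from collections import defaultdict


def find_shared_profiles(database):
    """Group cell lines whose STR profiles are identical at every locus.

    `database` maps a cell line name to a profile, and a profile maps a
    locus name to the pair of alleles observed there. Allele order within
    a locus is ignored.
    """
    groups = defaultdict(list)
    for name, profile in database.items():
        # Canonical, hashable form of the profile.
        key = tuple(sorted(
            (locus, tuple(sorted(alleles))) for locus, alleles in profile.items()
        ))
        groups[key].append(name)
    return [sorted(names) for names in groups.values() if len(names) > 1]


# Hypothetical entries: line_A and line_B share a profile, which would flag
# a possible cross-contamination or misidentification.
database = {
    "line_A": {"TH01": (6, 9.3), "D5S818": (11, 12), "AMEL": ("X", "X")},
    "line_B": {"TH01": (9.3, 6), "D5S818": (12, 11), "AMEL": ("X", "X")},
    "line_C": {"TH01": (7, 8), "D5S818": (11, 13), "AMEL": ("X", "Y")},
}
print(find_shared_profiles(database))
```

A production screen would likely tolerate partial matches, since cell lines can drift at individual loci; exact matching is used here only to keep the example small.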
Weissensteiner, Hansi; Schönherr, Sebastian; Specht, Günther; Kronenberg, Florian; Brandstätter, Anita
2010-03-09
Mitochondrial DNA (mtDNA) is widely being used for population genetics, forensic DNA fingerprinting and clinical disease association studies. The recent past has uncovered severe problems with mtDNA genotyping, not only due to the genotyping method itself, but mainly to the post-lab transcription, storage and report of mtDNA genotypes. eCOMPAGT, a system to store, administer and connect phenotype data to all kinds of genotype data is now enhanced by the possibility of storing mtDNA profiles and allowing their validation, linking to phenotypes and export as numerous formats. mtDNA profiles can be imported from different sequence evaluation programs, compared between evaluations and their haplogroup affiliations stored. Furthermore, eCOMPAGT has been improved in its sophisticated transparency (support of MySQL and Oracle), security aspects (by using database technology) and the option to import, manage and store genotypes derived from various genotyping methods (SNPlex, TaqMan, and STRs). It is a software solution designed for project management, laboratory work and the evaluation process all-in-one. The extended mtDNA version of eCOMPAGT was designed to enable error-free post-laboratory data handling of human mtDNA profiles. This software is suited for small to medium-sized human genetic, forensic and clinical genetic laboratories. The direct support of MySQL and the improved database security options render eCOMPAGT a powerful tool to build an automated workflow architecture for several genotyping methods. eCOMPAGT is freely available at http://dbis-informatik.uibk.ac.at/ecompagt.
2010-01-01
Background: Mitochondrial DNA (mtDNA) is widely being used for population genetics, forensic DNA fingerprinting and clinical disease association studies. The recent past has uncovered severe problems with mtDNA genotyping, not only due to the genotyping method itself, but mainly to the post-lab transcription, storage and report of mtDNA genotypes. Description: eCOMPAGT, a system to store, administer and connect phenotype data to all kinds of genotype data is now enhanced by the possibility of storing mtDNA profiles and allowing their validation, linking to phenotypes and export as numerous formats. mtDNA profiles can be imported from different sequence evaluation programs, compared between evaluations and their haplogroup affiliations stored. Furthermore, eCOMPAGT has been improved in its sophisticated transparency (support of MySQL and Oracle), security aspects (by using database technology) and the option to import, manage and store genotypes derived from various genotyping methods (SNPlex, TaqMan, and STRs). It is a software solution designed for project management, laboratory work and the evaluation process all-in-one. Conclusions: The extended mtDNA version of eCOMPAGT was designed to enable error-free post-laboratory data handling of human mtDNA profiles. This software is suited for small to medium-sized human genetic, forensic and clinical genetic laboratories. The direct support of MySQL and the improved database security options render eCOMPAGT a powerful tool to build an automated workflow architecture for several genotyping methods. eCOMPAGT is freely available at http://dbis-informatik.uibk.ac.at/ecompagt. PMID:20214782
FARME DB: a functional antibiotic resistance element database
Wallace, James C.; Port, Jesse A.; Smith, Marissa N.; Faustman, Elaine M.
2017-01-01
Antibiotic resistance (AR) is a major global public health threat, but few resources exist that catalog AR genes outside of a clinical context. Current AR sequence databases are assembled almost exclusively from genomic sequences derived from clinical bacterial isolates and thus do not include many microbial sequences derived from environmental samples that confer resistance in functional metagenomic studies. These environmental metagenomic sequences often show little or no similarity to AR sequences from clinical isolates using standard classification criteria. In addition, existing AR databases provide no information about flanking sequences containing regulatory or mobile genetic elements. To help address this issue, we created an annotated database of DNA and protein sequences derived exclusively from environmental metagenomic sequences showing AR in laboratory experiments. Our Functional Antibiotic Resistant Metagenomic Element (FARME) database is a compilation of publicly available DNA sequences and predicted protein sequences conferring AR as well as regulatory elements, mobile genetic elements and predicted proteins flanking antibiotic resistant genes. FARME is the first database to focus on functional metagenomic AR gene elements and provides a resource to better understand AR in the 99% of bacteria which cannot be cultured and the relationship between environmental AR sequences and antibiotic resistant genes derived from cultured isolates. Database URL: http://staff.washington.edu/jwallace/farme PMID:28077567
Genomics dataset on unclassified published organism (patent US 7547531).
Khan Shawan, Mohammad Mahfuz Ali; Hasan, Md Ashraful; Hossain, Md Mozammel; Hasan, Md Mahmudul; Parvin, Afroza; Akter, Salina; Uddin, Kazi Rasel; Banik, Subrata; Morshed, Mahbubul; Rahman, Md Nazibur; Rahman, S M Badier
2016-12-01
Nucleotide (DNA) sequence analysis provides important clues regarding the characteristics and taxonomic position of an organism. DNA sequence analysis is therefore crucial for learning about the hierarchical classification of that organism. This dataset (patent US 7547531) was chosen to simplify the complex raw data buried in undisclosed DNA sequences, which helps to open doors for new collaborations. A total of 48 unidentified DNA sequences from patent US 7547531 were selected and their complete sequences were retrieved from the NCBI BioSample database. A quick response (QR) code for each of those DNA sequences was constructed with the DNA BarID tool. The QR code is useful for the identification and comparison of isolates with other organisms. The AT/GC content of the DNA sequences, which indicates their stability at different temperatures, was determined using the ENDMEMO GC Content Calculator. The highest GC content was observed in GP445188 (62.5%), followed by GP445198 (61.8%) and GP445189 (59.44%), while the lowest was in GP445178 (24.39%). In addition, the New England BioLabs (NEB) database was used to identify the cleavage code, indicating the 5′, 3′ and blunt ends, and the enzyme code, indicating the methylation site of the DNA sequences. These data will be helpful for the construction of the organisms' hierarchical classification, determination of their phylogenetic and taxonomic position and revelation of their molecular characteristics.
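The GC content figures quoted above are simple percentages of G and C bases. A minimal sketch of that arithmetic follows; it is not the ENDMEMO calculator itself, and the choice to ignore ambiguous symbols such as N is an assumption made here for illustration.

```python
def gc_content(sequence):
    """Return the GC content of a DNA sequence as a percentage.

    Only unambiguous bases (A, C, G, T) are counted; any other symbol,
    such as N, is ignored. That handling is an assumption of this sketch.
    """
    bases = [b for b in sequence.upper() if b in "ACGT"]
    if not bases:
        raise ValueError("sequence contains no A, C, G or T bases")
    gc = sum(1 for b in bases if b in "GC")
    return 100.0 * gc / len(bases)


# Toy sequence, not one of the patent sequences.
print(f"{gc_content('GGCCATGC'):.2f}%")
```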
Biological sequence compression algorithms.
Matsumoto, T; Sadakane, K; Imai, H
2000-01-01
Today, more and more DNA sequences are becoming available. Information about DNA sequences is stored in molecular biology databases. The size and importance of these databases will keep growing, so this information must be stored and communicated efficiently. Furthermore, sequence compression can be used to define similarities between biological sequences. Standard compression algorithms such as gzip or compress cannot compress DNA sequences; they only expand them in size. On the other hand, CTW (the Context Tree Weighting method) can compress DNA sequences to less than two bits per symbol. These algorithms do not use the special structures of biological sequences. Two characteristic structures of DNA sequences are known: one is palindromes, or reverse complements, and the other is approximate repeats. Several DNA-specific algorithms that use these structures can compress sequences to less than two bits per symbol. In this paper, we improve CTW so that it can exploit these characteristic structures of DNA sequences. Before encoding the next symbol, the algorithm searches for an approximate repeat or palindrome using hashing and dynamic programming. If there is a palindrome or an approximate repeat of sufficient length, our algorithm represents it by its length and distance. With this preprocessing, the new program achieves a slightly higher compression ratio than existing DNA-oriented compression algorithms. We also describe a new compression algorithm for protein sequences.
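The "two bits per symbol" figure above is the cost of a fixed-length code over the four bases, which is the baseline any DNA compressor has to beat. The sketch below shows that baseline and the reverse-complement operation behind the "palindrome" structure. It does not implement CTW or the authors' algorithm, and all names in it are illustrative.

```python
CODES = {"A": 0, "C": 1, "G": 2, "T": 3}
BASES = "ACGT"
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def pack(seq):
    """Encode a DNA string at exactly two bits per base.

    Returns (number of bases, packed bytes); the length is needed to
    undo the padding of the last byte.
    """
    value = 0
    for base in seq:
        value = (value << 2) | CODES[base]
    n_bytes = (2 * len(seq) + 7) // 8
    return len(seq), value.to_bytes(n_bytes, "big")


def unpack(length, data):
    """Invert pack(): recover the DNA string from its two-bit encoding."""
    value = int.from_bytes(data, "big")
    bases = []
    for _ in range(length):
        bases.append(BASES[value & 3])  # lowest two bits hold the last base
        value >>= 2
    return "".join(reversed(bases))


def reverse_complement(seq):
    """Return the reverse complement, e.g. AAGC -> GCTT."""
    return seq.translate(COMPLEMENT)[::-1]


n, packed = pack("GATTACA")
print(n, len(packed), unpack(n, packed))
print(reverse_complement("AAGC"))
```

A sequence such as GAATTC equals its own reverse complement, which is the kind of structure the abstract says can be encoded as a length and a distance instead of symbol by symbol.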
Critical Infrastructure: The National Asset Database
2007-07-16
...upon which federal resources, including infrastructure protection grants, are allocated. According to DHS, both of those assumptions are wrong. DHS...assets that it has determined are critical to the nation. Also, while the National Asset Database has been used to support federal grant-making
NASA Astrophysics Data System (ADS)
Chen, Shuo-Bin; Liu, Guo-Cai; Gu, Lian-Quan; Huang, Zhi-Shu; Tan, Jia-Heng
2018-02-01
Design of small molecules targeted at human telomeric G-quadruplex DNA is an extremely active research area. Interestingly, the telomeric G-quadruplex is a highly polymorphic structure. Changes in its conformation upon small molecule binding may be a powerful method to achieve a desired biological effect. However, the rational development of small molecules capable of regulating conformational change of telomeric G-quadruplex structures is still challenging. In this study, we developed a reliable ligand-based pharmacophore model based on isaindigotone derivatives with conformational change activity toward telomeric G-quadruplex DNA. Furthermore, virtual screening of a database was conducted using this pharmacophore model, and benzopyranopyrimidine derivatives in the database were identified as strong inducers of the telomeric G-quadruplex DNA conformation, transforming it from a hybrid-type structure to a parallel structure.
Rapid in silico cloning of genes using expressed sequence tags (ESTs).
Gill, R W; Sanseau, P
2000-01-01
Expressed sequence tags (ESTs) are short single-pass DNA sequences obtained from either end of cDNA clones. These ESTs are derived from a vast number of cDNA libraries obtained from different species. Human ESTs make up the bulk of the data and have been widely used to identify new members of gene families, as markers on the human chromosomes, to discover polymorphism sites and to compare expression patterns in different tissues or pathological states. Information strategies have been devised to query EST databases. Since most of the analysis is performed with a computer, the term "in silico" strategy has been coined. In this chapter we review the current status of EST databases and the pros and cons of EST-type data, and describe possible strategies to retrieve meaningful information.
Kodama, Yuichi; Mashima, Jun; Kaminuma, Eli; Gojobori, Takashi; Ogasawara, Osamu; Takagi, Toshihisa; Okubo, Kousaku; Nakamura, Yasukazu
2012-01-01
The DNA Data Bank of Japan (DDBJ; http://www.ddbj.nig.ac.jp) maintains and provides archival, retrieval and analytical resources for biological information. The central DDBJ resource consists of public, open-access nucleotide sequence databases including raw sequence reads, assembly information and functional annotation. Database content is exchanged with EBI and NCBI within the framework of the International Nucleotide Sequence Database Collaboration (INSDC). In 2011, DDBJ launched two new resources: the 'DDBJ Omics Archive' (DOR; http://trace.ddbj.nig.ac.jp/dor) and BioProject (http://trace.ddbj.nig.ac.jp/bioproject). DOR is an archival database of functional genomics data generated by microarray and highly parallel new generation sequencers. Data are exchanged between the ArrayExpress at EBI and DOR in the common MAGE-TAB format. BioProject provides an organizational framework to access metadata about research projects and the data from the projects that are deposited into different databases. In this article, we describe major changes and improvements introduced to the DDBJ services, and the launch of two new resources: DOR and BioProject.
Benson, Dennis A; Karsch-Mizrachi, Ilene; Lipman, David J; Ostell, James; Sayers, Eric W
2010-01-01
GenBank is a comprehensive database that contains publicly available nucleotide sequences for more than 300,000 organisms named at the genus level or lower, obtained primarily through submissions from individual laboratories and batch submissions from large-scale sequencing projects, including whole genome shotgun (WGS) and environmental sampling projects. Most submissions are made using the web-based BankIt or standalone Sequin programs, and accession numbers are assigned by GenBank staff upon receipt. Daily data exchange with the European Molecular Biology Laboratory Nucleotide Sequence Database in Europe and the DNA Data Bank of Japan ensures worldwide coverage. GenBank is accessible through the NCBI Entrez retrieval system, which integrates data from the major DNA and protein sequence databases along with taxonomy, genome, mapping, protein structure and domain information, and the biomedical journal literature via PubMed. BLAST provides sequence similarity searches of GenBank and other sequence databases. Complete bi-monthly releases and daily updates of the GenBank database are available by FTP. To access GenBank and its related retrieval and analysis services, begin at the NCBI homepage: www.ncbi.nlm.nih.gov.
NCBI GEO: archive for functional genomics data sets—10 years on
Barrett, Tanya; Troup, Dennis B.; Wilhite, Stephen E.; Ledoux, Pierre; Evangelista, Carlos; Kim, Irene F.; Tomashevsky, Maxim; Marshall, Kimberly A.; Phillippy, Katherine H.; Sherman, Patti M.; Muertter, Rolf N.; Holko, Michelle; Ayanbule, Oluwabukunmi; Yefanov, Andrey; Soboleva, Alexandra
2011-01-01
A decade ago, the Gene Expression Omnibus (GEO) database was established at the National Center for Biotechnology Information (NCBI). The original objective of GEO was to serve as a public repository for high-throughput gene expression data generated mostly by microarray technology. However, the research community quickly applied microarrays to non-gene-expression studies, including examination of genome copy number variation and genome-wide profiling of DNA-binding proteins. Because the GEO database was designed with a flexible structure, it was possible to quickly adapt the repository to store these data types. More recently, as the microarray community switches to next-generation sequencing technologies, GEO has again adapted to host these data sets. Today, GEO stores over 20 000 microarray- and sequence-based functional genomics studies, and continues to handle the majority of direct high-throughput data submissions from the research community. Multiple mechanisms are provided to help users effectively search, browse, download and visualize the data at the level of individual genes or entire studies. This paper describes recent database enhancements, including new search and data representation tools, as well as a brief review of how the community uses GEO data. GEO is freely accessible at http://www.ncbi.nlm.nih.gov/geo/. PMID:21097893
PubDNA Finder: a web database linking full-text articles to sequences of nucleic acids.
García-Remesal, Miguel; Cuevas, Alejandro; Pérez-Rey, David; Martín, Luis; Anguita, Alberto; de la Iglesia, Diana; de la Calle, Guillermo; Crespo, José; Maojo, Víctor
2010-11-01
PubDNA Finder is an online repository that we have created to link PubMed Central manuscripts to the sequences of nucleic acids appearing in them. It extends the search capabilities provided by PubMed Central by enabling researchers to perform advanced searches involving sequences of nucleic acids. This includes, among other features (i) searching for papers mentioning one or more specific sequences of nucleic acids and (ii) retrieving the genetic sequences appearing in different articles. These additional query capabilities are provided by a searchable index that we created by using the full text of the 176 672 papers available at PubMed Central at the time of writing and the sequences of nucleic acids appearing in them. To automatically extract the genetic sequences occurring in each paper, we used an original method we have developed. The database is updated monthly by automatically connecting to the PubMed Central FTP site to retrieve and index new manuscripts. Users can query the database via the web interface provided. PubDNA Finder can be freely accessed at http://servet.dia.fi.upm.es:8080/pubdnafinder
The Dfam database of repetitive DNA families.
Hubley, Robert; Finn, Robert D; Clements, Jody; Eddy, Sean R; Jones, Thomas A; Bao, Weidong; Smit, Arian F A; Wheeler, Travis J
2016-01-04
Repetitive DNA, especially that due to transposable elements (TEs), makes up a large fraction of many genomes. Dfam is an open access database of families of repetitive DNA elements, in which each family is represented by a multiple sequence alignment and a profile hidden Markov model (HMM). The initial release of Dfam, featured in the 2013 NAR Database Issue, contained 1143 families of repetitive elements found in humans, and was used to produce more than 100 Mb of additional annotation of TE-derived regions in the human genome, with improved speed. Here, we describe recent advances, most notably expansion to 4150 total families including a comprehensive set of known repeat families from four new organisms (mouse, zebrafish, fly and nematode). We describe improvements to coverage, and to our methods for identifying and reducing false annotation. We also describe updates to the website interface. The Dfam website has moved to http://dfam.org. Seed alignments, profile HMMs, hit lists and other underlying data are available for download. © The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research.
Storage and utilization of HLA genomic data--new approaches to HLA typing.
Helmberg, W
2000-01-01
Currently available DNA-based HLA typing assays can provide detailed information about sequence motifs of a tested sample. It is still a common practice, however, for information acquired by high-resolution sequence specific oligonucleotide probe (SSOP) typing or sequence specific priming (SSP) to be presented in a low-resolution serological format. Unfortunately, this representation can lead to significant loss of useful data in many cases. An alternative to assigning allele equivalents to such DNA typing results is simply to store the observed typing pattern and utilize the information with the help of Virtual DNA Analysis (VDA). Interpretation of the stored typing patterns can then be updated based on newly defined alleles, assuming the sequence motifs detected by the typing reagents are known. Rather than updating reagent specificities in individual laboratories, such updates should be performed in a central, publicly available sequence database. By referring to this database, HLA genomic data can then be stored and transferred between laboratories without loss of information. The 13th International Histocompatibility Workshop offers an ideal opportunity to begin building this common database for the entire human MHC.
Characterizing the genetic structure of a forensic DNA database using a latent variable approach.
Kruijver, Maarten
2016-07-01
Several problems in forensic genetics require a representative model of a forensic DNA database. Obtaining an accurate representation of the offender database can be difficult, since databases typically contain groups of persons with unregistered ethnic origins in unknown proportions. We propose to estimate the allele frequencies of the subpopulations comprising the offender database and their proportions from the database itself using a latent variable approach. We present a model for which parameters can be estimated using the expectation maximization (EM) algorithm. This approach does not rely on relatively small and possibly unrepresentative population surveys, but is driven by the actual genetic composition of the database only. We fit the model to a snapshot of the Dutch offender database (2014), which contains close to 180,000 profiles, and find that three subpopulations suffice to describe a large fraction of the heterogeneity in the database. We demonstrate the utility and reliability of the approach with three applications. First, we use the model to predict the number of false leads obtained in database searches. We assess how well the model predicts the number of false leads obtained in mock searches in the Dutch offender database, both for the case of familial searching for first degree relatives of a donor and searching for contributors to three-person mixtures. Second, we study the degree of partial matching between all pairs of profiles in the Dutch database and compare this to what is predicted using the latent variable approach. Third, we use the model to provide evidence to support that the Dutch practice of estimating match probabilities using the Balding-Nichols formula with a native Dutch reference database and θ=0.03 is conservative. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.
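The Balding-Nichols correction mentioned in the abstract above can be illustrated with a short sketch. It uses the standard single-locus conditional match probabilities with co-ancestry coefficient θ. The function names and the example allele frequencies are hypothetical and are not taken from the paper or from the Dutch database.

```python
def balding_nichols_match_probability(p_a, p_b=None, theta=0.03):
    """Single-locus match probability with the Balding-Nichols theta correction.

    p_a, p_b: population allele frequencies (illustrative inputs).
    If p_b is None the genotype is treated as homozygous (A, A);
    otherwise it is treated as heterozygous (A, B).
    """
    if not 0.0 <= theta < 1.0:
        raise ValueError("theta must lie in [0, 1)")
    denom = (1 + theta) * (1 + 2 * theta)
    if p_b is None:
        # Homozygote: [2θ + (1-θ)p][3θ + (1-θ)p] / [(1+θ)(1+2θ)]
        return (2 * theta + (1 - theta) * p_a) * (3 * theta + (1 - theta) * p_a) / denom
    # Heterozygote: 2[θ + (1-θ)p_a][θ + (1-θ)p_b] / [(1+θ)(1+2θ)]
    return 2 * (theta + (1 - theta) * p_a) * (theta + (1 - theta) * p_b) / denom


def profile_match_probability(genotypes, theta=0.03):
    """Multiply single-locus probabilities, assuming independent loci.

    genotypes: iterable of (p_a, p_b) pairs, with p_b=None for homozygotes.
    """
    prob = 1.0
    for p_a, p_b in genotypes:
        prob *= balding_nichols_match_probability(p_a, p_b, theta)
    return prob


if __name__ == "__main__":
    # Hypothetical two-locus profile: one homozygote, one heterozygote.
    profile = [(0.10, None), (0.10, 0.20)]
    print(profile_match_probability(profile, theta=0.0))
    print(profile_match_probability(profile, theta=0.03))
```

With θ = 0 the expressions reduce to the Hardy-Weinberg values p² and 2pq. For rare alleles, a positive θ raises the match probability, which is the sense in which a larger θ is described as conservative.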
Ye, Pohao; Luan, Yizhao; Chen, Kaining; Liu, Yizhi; Xiao, Chuanle; Xie, Zhi
2017-01-04
DNA methylation is an important type of epigenetic modification, of which 5-methylcytosine (5mC), 6-methyladenine (6mA) and 4-methylcytosine (4mC) are the most common types. Previous efforts have been largely focused on 5mC, providing invaluable insights into epigenetic regulation through DNA methylation. Recently developed single-molecule real-time (SMRT) sequencing technology provides a unique opportunity to detect the less studied DNA 6mA and 4mC modifications at single-nucleotide resolution. With a rapidly increasing amount of SMRT sequencing data being generated, there is an emerging demand to systematically explore DNA 6mA and 4mC modifications from these data sets. MethSMRT is the first resource hosting DNA 6mA and 4mC methylomes. All the data sets were processed using the same analysis pipeline with the same quality control. The current version of the database provides a platform to store, browse, search and download epigenome-wide methylation profiles of 156 species, including seven eukaryotes such as Arabidopsis, C. elegans, Drosophila, mouse and yeast, as well as 149 prokaryotes. It also offers a genome browser to visualize the methylation sites and related information such as single nucleotide polymorphisms (SNP) and genomic annotation. Furthermore, the database provides a quick summary of statistics of the 6mA and 4mC methylomes and predicted methylation motifs for each species. MethSMRT is publicly available at http://sysbio.sysu.edu.cn/methsmrt/ without use restriction. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.
78 FR 31947 - National Institutes of Health
Federal Register 2010, 2011, 2012, 2013, 2014
2013-05-28
... Certification (previously National Database for Autism Research Data Access Request), 0925-0667, Revision... approval for use of the National Database for Autism Research (NDAR) Data Use Certification (DUC) Form...
Genome signature analysis of thermal virus metagenomes reveals Archaea and thermophilic signatures
Pride, David T; Schoenfeld, Thomas
2008-01-01
Background: Metagenomic analysis provides a rich source of biological information for otherwise intractable viral communities. However, study of viral metagenomes has been hampered by its nearly complete reliance on BLAST algorithms for identification of DNA sequences. We sought to develop algorithms for examination of viral metagenomes to identify the origin of sequences independent of BLAST algorithms. We chose viral metagenomes obtained from two hot springs, Bear Paw and Octopus, in Yellowstone National Park, as they represent simple microbial populations where comparatively large contigs were obtained. Thermal spring metagenomes have high proportions of sequences without significant Genbank homology, which has hampered identification of viruses and their linkage with hosts. To analyze each metagenome, we developed a method to classify DNA fragments using genome signature-based phylogenetic classification (GSPC), where metagenomic fragments are compared to a database of oligonucleotide signatures for all previously sequenced Bacteria, Archaea, and viruses. Results: From both Bear Paw and Octopus hot springs, each assembled contig had more similarity to other metagenome contigs than to any sequenced microbial genome based on GSPC analysis, suggesting a genome signature common to each of these extreme environments. While viral metagenomes from Bear Paw and Octopus share some similarity, the genome signatures from each locale are largely unique. GSPC using a microbial database predicts most of the Octopus metagenome has archaeal signatures, while bacterial signatures predominate in Bear Paw; a finding consistent with those of Genbank BLAST. When using a viral database, the majority of the Octopus metagenome is predicted to belong to archaeal virus Families Globuloviridae and Fuselloviridae, while none of the Bear Paw metagenome is predicted to belong to archaeal viruses. 
As expected, when microbial and viral databases are combined, each of the Octopus and Bear Paw metagenomic contigs are predicted to belong to viruses rather than to any Bacteria or Archaea, consistent with the apparent viral origin of both metagenomes. Conclusion: That BLAST searches identify no significant homologs for most metagenome contigs, while GSPC suggests their origin as archaeal viruses or bacteriophages, indicates GSPC provides a complementary approach in viral metagenomic analysis. PMID:18798991
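The general idea of comparing a fragment against a database of oligonucleotide signatures can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not give the oligonucleotide length, the normalization, or the distance measure used by GSPC, so the k-mer frequency vector, the Manhattan distance, and all function names and toy sequences below are hypothetical choices.

```python
from collections import Counter
from itertools import product


def signature(seq, k=4):
    """Relative frequency of every k-mer over ACGT (assumed signature form)."""
    seq = seq.upper()
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = sum(counts[m] for m in kmers)  # ignores k-mers with ambiguous bases
    if total == 0:
        raise ValueError("sequence too short or contains no unambiguous k-mers")
    return [counts[m] / total for m in kmers]


def distance(sig_a, sig_b):
    """Manhattan distance between two signatures (an assumed metric)."""
    return sum(abs(x - y) for x, y in zip(sig_a, sig_b))


def classify(fragment, references, k=4):
    """Return the name of the reference whose signature is closest."""
    frag_sig = signature(fragment, k)
    return min(
        references,
        key=lambda name: distance(frag_sig, signature(references[name], k)),
    )


if __name__ == "__main__":
    # Toy reference "genomes"; real use would load sequenced genomes.
    refs = {
        "at_rich": "ATATATTTAAATATAT" * 5,
        "gc_rich": "GCGCGGCCGCGCCGGC" * 5,
    }
    print(classify("ATTTATAAATATTA", refs, k=2))
```

Because the comparison relies on composition rather than alignment, a fragment can be assigned to its nearest reference even when a similarity search finds no significant homolog, which is the complementary role the abstract describes.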
Genome signature analysis of thermal virus metagenomes reveals Archaea and thermophilic signatures.
Pride, David T; Schoenfeld, Thomas
2008-09-17
Metagenomic analysis provides a rich source of biological information for otherwise intractable viral communities. However, study of viral metagenomes has been hampered by its nearly complete reliance on BLAST algorithms for identification of DNA sequences. We sought to develop algorithms for examination of viral metagenomes to identify the origin of sequences independent of BLAST algorithms. We chose viral metagenomes obtained from two hot springs, Bear Paw and Octopus, in Yellowstone National Park, as they represent simple microbial populations where comparatively large contigs were obtained. Thermal spring metagenomes have high proportions of sequences without significant Genbank homology, which has hampered identification of viruses and their linkage with hosts. To analyze each metagenome, we developed a method to classify DNA fragments using genome signature-based phylogenetic classification (GSPC), where metagenomic fragments are compared to a database of oligonucleotide signatures for all previously sequenced Bacteria, Archaea, and viruses. From both Bear Paw and Octopus hot springs, each assembled contig had more similarity to other metagenome contigs than to any sequenced microbial genome based on GSPC analysis, suggesting a genome signature common to each of these extreme environments. While viral metagenomes from Bear Paw and Octopus share some similarity, the genome signatures from each locale are largely unique. GSPC using a microbial database predicts most of the Octopus metagenome has archaeal signatures, while bacterial signatures predominate in Bear Paw; a finding consistent with those of Genbank BLAST. When using a viral database, the majority of the Octopus metagenome is predicted to belong to archaeal virus Families Globuloviridae and Fuselloviridae, while none of the Bear Paw metagenome is predicted to belong to archaeal viruses. 
As expected, when microbial and viral databases are combined, each of the Octopus and Bear Paw metagenomic contigs are predicted to belong to viruses rather than to any Bacteria or Archaea, consistent with the apparent viral origin of both metagenomes. That BLAST searches identify no significant homologs for most metagenome contigs, while GSPC suggests their origin as archaeal viruses or bacteriophages, indicates GSPC provides a complementary approach in viral metagenomic analysis.
DNA repair in Chromobacterium violaceum.
Duarte, Fábio Teixeira; Carvalho, Fabíola Marques de; Bezerra e Silva, Uaska; Scortecci, Kátia Castanho; Blaha, Carlos Alfredo Galindo; Agnez-Lima, Lucymara Fassarella; Batistuzzo de Medeiros, Silvia Regina
2004-03-31
Chromobacterium violaceum is a Gram-negative beta-proteobacterium that inhabits a variety of ecosystems in tropical and subtropical regions, including the water and banks of the Negro River in the Brazilian Amazon. This bacterium has been the subject of extensive study over the last three decades, due to its biotechnological properties, including the characteristic violacein pigment, which has antimicrobial and anti-tumoral activities. C. violaceum promotes the solubilization of gold in a mercury-free process, and has been used in the synthesis of homopolyesters suitable for the production of biodegradable polymers. The complete genome sequence of this organism has been determined by the Brazilian National Genome Project Consortium. The aim of our group was to study the DNA repair genes in this organism, due to their importance in the maintenance of genomic integrity. We identified DNA repair genes involved in different pathways in C. violaceum through a similarity search against known sequences deposited in databases. The phylogenetic analyses were done using programs of the PHYLIP package. This analysis revealed various metabolic pathways, including photoreactivation, base excision repair, nucleotide excision repair, mismatch repair, recombinational repair, and the SOS system. The similarity between the C. violaceum sequences and those of Neisseria meningitidis and Ralstonia solanacearum was greater than that between the C. violaceum and Escherichia coli sequences. The peculiarities found in the C. violaceum genome were the absence of LexA, some horizontal transfer events and a large number of repair genes involved with alkyl and oxidative DNA damage.
Estimating Diversity of Florida Keys Zooplankton Using New Environmental DNA Methods
NASA Astrophysics Data System (ADS)
Djurhuus, A.; Goldsmith, D. B.; Sawaya, N. A.; Breitbart, M.
2016-02-01
Zooplankton are of great importance in marine food webs, where they serve to link the phytoplankton and bacteria with higher trophic levels. Zooplankton are a diverse group containing molluscs, crustaceans, fish larvae and many other taxa. The sheer number of species and often minor morphological distinctions between species make it challenging and exceptionally time consuming to identify the species composition of marine zooplankton samples. As a part of the Marine Biodiversity Observation Network (MBON) project, we have developed and groundtruthed an alternative, relatively time-efficient method for zooplankton identification using environmental DNA (eDNA). Samples were collected from Molasses reef, Looe Key, and Western Sambo along the Florida Keys from five bi-monthly cruises on board the RV Walton Smith. Samples were collected for eDNA by filtering 1 L of water onto a 0.22 μm filter, and zooplankton samples were collected using nets with three mesh sizes (64 μm, 200 μm, and 500 μm) to catch different size fractions. Half of the zooplankton samples were fixed in 70% ethanol and half in 10% formalin, for DNA extraction and morphological identification, respectively. Individuals representing visually abundant taxa were picked into individual wells for PCR with universal 18S rRNA gene primers and subsequent sequencing to build a reference barcode database for zooplankton species commonly found in the study region. PCR and Illumina MiSeq next generation sequencing were applied to the eDNA extracted from the 0.22 μm filters, and sequences were compared to our local custom database as well as publicly available databases to determine zooplankton community composition. Finally, composition and diversity analyses were performed to compare results obtained with the new eDNA approach to standard morphological classification of zooplankton communities. 
Results show that the eDNA approach can enable the determination of zooplankton diversity through collection of a single water sample, which, when combined with bacterial and archaeal diversity analyses, will help us understand the coupling between different trophic levels and the drivers of plankton dynamics in the sub-tropical Florida Keys.
Drobniewski, F. A.; Gibson, A.; Ruddy, M.; Yates, M. D.
2003-01-01
The aim of this study was to develop a national model and analyze the value of a molecular epidemiological Mycobacterium tuberculosis DNA fingerprint-outbreak database. Incidents were investigated by the United Kingdom PHLS Mycobacterium Reference Unit (MRU) from June 1997 to December 2001, inclusive. A total of 124 incidents involving 972 tuberculosis cases, including 520 patient cultures from referred incidents and 452 patient cultures related to two population studies, were examined by using restriction fragment length polymorphism IS6110 fingerprinting and rapid epidemiological typing. Investigations were divided into the following three categories, reflecting different operational strategies: retrospective passive analysis, retrospective active analysis, and retrospective prospective analysis. The majority of incidents were in the retrospective passive analysis category, i.e., the individual submitting isolates has a suspicion they may be linked. Outbreaks were examined in schools, hospitals, farms, prisons, and public houses, and laboratory cross-contamination events and unusual clinical presentations were investigated. Retrospective active analysis involved a major outbreak centered on a high school. Contact tracing of a teenager with smear-positive pulmonary tuberculosis matched 14 individuals, including members of his class, and another 60 cases were identified in schools clinically and radiologically and by skin testing. Retrospective prospective analysis involved an outbreak of 94 isoniazid-resistant tuberculosis cases in London, United Kingdom, that began after cases were identified at one hospital in January 2000. Contact tracing and comparison with MRU databases indicated that the earliest matched case had occurred in 1995. Subsequently, the MRU changed to an active prospective analysis targeting linked isoniazid-monoresistant isolates for follow up. 
The patients were multiethnic, born mainly in the United Kingdom, and included professionals, individuals from the music industry, intravenous drug abusers, and prisoners. PMID:12734218
Drobniewski, F A; Gibson, A; Ruddy, M; Yates, M D
2003-05-01
The aim of this study was to develop a national model and analyze the value of a molecular epidemiological Mycobacterium tuberculosis DNA fingerprint-outbreak database. Incidents were investigated by the United Kingdom PHLS Mycobacterium Reference Unit (MRU) from June 1997 to December 2001, inclusive. A total of 124 incidents involving 972 tuberculosis cases, including 520 patient cultures from referred incidents and 452 patient cultures related to two population studies, were examined by using restriction fragment length polymorphism IS6110 fingerprinting and rapid epidemiological typing. Investigations were divided into the following three categories, reflecting different operational strategies: retrospective passive analysis, retrospective active analysis, and retrospective prospective analysis. The majority of incidents were in the retrospective passive analysis category, i.e., the individual submitting isolates has a suspicion they may be linked. Outbreaks were examined in schools, hospitals, farms, prisons, and public houses, and laboratory cross-contamination events and unusual clinical presentations were investigated. Retrospective active analysis involved a major outbreak centered on a high school. Contact tracing of a teenager with smear-positive pulmonary tuberculosis matched 14 individuals, including members of his class, and another 60 cases were identified in schools clinically and radiologically and by skin testing. Retrospective prospective analysis involved an outbreak of 94 isoniazid-resistant tuberculosis cases in London, United Kingdom, that began after cases were identified at one hospital in January 2000. Contact tracing and comparison with MRU databases indicated that the earliest matched case had occurred in 1995. Subsequently, the MRU changed to an active prospective analysis targeting linked isoniazid-monoresistant isolates for follow up. 
The patients were multiethnic, born mainly in the United Kingdom, and included professionals, individuals from the music industry, intravenous drug abusers, and prisoners.
Application of a mitochondrial DNA control region frequency database for UK domestic cats.
Ottolini, Barbara; Lall, Gurdeep Matharu; Sacchini, Federico; Jobling, Mark A; Wetton, Jon H
2017-03-01
DNA variation in 402 bp of the mitochondrial control region flanked by repeat sequences RS2 and RS3 was evaluated by Sanger sequencing in 152 English domestic cats, in order to determine the significance of matching DNA sequences between hairs found with a victim's body and the suspect's pet cat. Whilst 95% of English cats possessed one of the twelve globally widespread mitotypes, four new variants were observed, the most common of which (2% frequency) was shared with the evidential samples. No significant difference in mitotype frequency was seen between 32 individuals from the locality of the crime and 120 additional cats from the rest of England, suggesting a lack of local population structure. However, significant differences were observed in comparison with frequencies in other countries, including the closely neighbouring Netherlands, highlighting the importance of appropriate genetic databases when determining the evidential significance of mitochondrial DNA evidence. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.
DOE Office of Scientific and Technical Information (OSTI.GOV)
D.H. Kaye
2006-10-19
Federal and state law enforcement authorities have amassed large collections of DNA samples and the identifying profiles derived from them. These databases help to identify the guilty and to exonerate the innocent, but as the databanks grow, so do fears about civil liberties. The research reported here discusses three legal and social policy issues that have been raised in regard to these biobanks: the choice of loci to type for identifying individuals, the indefinite retention of DNA samples, and the use of the DNA samples or the identifying profiles for research purposes. It also considers the possible value of the databases for research into the genetics of human behavior and the ethics of using them for this purpose. It rejects the broad claim that such research is inherently unethical but proposes procedures for ensuring that the value of the proposed research justifies any psychosocial or other risks to the subjects of the research.
Non-B DB v2.0: a database of predicted non-B DNA-forming motifs and its associated tools.
Cer, Regina Z; Donohue, Duncan E; Mudunuri, Uma S; Temiz, Nuri A; Loss, Michael A; Starner, Nathan J; Halusa, Goran N; Volfovsky, Natalia; Yi, Ming; Luke, Brian T; Bacolla, Albino; Collins, Jack R; Stephens, Robert M
2013-01-01
The non-B DB, available at http://nonb.abcc.ncifcrf.gov, catalogs predicted non-B DNA-forming sequence motifs, including Z-DNA, G-quadruplex, A-phased repeats, inverted repeats, mirror repeats, direct repeats and their corresponding subsets: cruciforms, triplexes and slipped structures, in several genomes. Version 2.0 of the database revises and re-implements the motif discovery algorithms to better align with accepted definitions and thresholds for motifs, expands the non-B DNA-forming motifs coverage by including short tandem repeats and adds key visualization tools to compare motif locations relative to other genomic annotations. Non-B DB v2.0 extends the ability for comparative genomics by including re-annotation of the five organisms reported in non-B DB v1.0, human, chimpanzee, dog, macaque and mouse, and adds seven additional organisms: orangutan, rat, cow, pig, horse, platypus and Arabidopsis thaliana. Additionally, the non-B DB v2.0 provides an overall improved graphical user interface and faster query performance.
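To make the notion of a sequence-motif search concrete, the sketch below scans a sequence for one widely used G-quadruplex pattern: four runs of at least three guanines separated by spacers of 1 to 7 nucleotides. This pattern is an assumption chosen for illustration. It is not claimed to reproduce the thresholds or algorithms implemented in non-B DB v2.0, and the function name is hypothetical.

```python
import re

# Assumed pattern: four G-runs (>= 3 G each) joined by 1-7 nt spacers.
G4_PATTERN = re.compile(
    r"G{3,}[ACGT]{1,7}G{3,}[ACGT]{1,7}G{3,}[ACGT]{1,7}G{3,}"
)


def find_g_quadruplex_motifs(sequence):
    """Return (start, end, motif) for non-overlapping matches on the given strand.

    Coordinates are 0-based and end-exclusive. The complementary strand
    (C-runs) is not searched in this sketch.
    """
    sequence = sequence.upper()
    return [(m.start(), m.end(), m.group()) for m in G4_PATTERN.finditer(sequence)]


if __name__ == "__main__":
    for hit in find_g_quadruplex_motifs("TTGGGAGGGTGGGAGGGTT"):
        print(hit)
```

A genome-scale catalog would additionally need reverse-strand scanning, handling of overlapping candidates, and per-motif thresholds, which is where published definitions differ.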
National Online Meeting Proceedings (15th, New York, New York, May 10-12, 1994).
ERIC Educational Resources Information Center
1994
This proceedings contains 58 papers that were reviewed and selected for presentation at the 1994 National Online Meeting. The introduction, "Highlights of the Online/CD-ROM Database Industry: Implications of the Internet for Database Producers" by Martha E. Williams, provides statistics regarding databases, database records, database…
DIMA.Tools: An R package for working with the database for inventory, monitoring, and assessment
USDA-ARS's Scientific Manuscript database
The Database for Inventory, Monitoring, and Assessment (DIMA) is a Microsoft Access database used to collect, store and summarize monitoring data. This database is used by both local and national monitoring efforts within the National Park Service, the Forest Service, the Bureau of Land Management, ...
National Transportation Atlas Databases : 1995
DOT National Transportation Integrated Search
1995-01-01
BTS has compiled the initial version of a geographic atlas database to support research, analysis, and decision making across all modes of transportation. The atlas databases are designed primarily to meet the needs of DOT at the national lev...
Weinreb, Jeffrey H; Yoshida, Ryu; Cote, Mark P; O'Sullivan, Michael B; Mazzocca, Augustus D
2017-01-01
The purpose of this study was to evaluate how database use has changed over time in Arthroscopy: The Journal of Arthroscopic and Related Surgery and to inform readers about available databases used in orthopaedic literature. An extensive literature search was conducted to identify databases used in Arthroscopy and other orthopaedic literature. All articles published in Arthroscopy between January 1, 2006, and December 31, 2015, were reviewed. A database was defined as a national, widely available set of individual patient encounters, applicable to multiple patient populations, used in orthopaedic research in a peer-reviewed journal, not restricted by encounter setting or visit duration, and with information available in English. Databases used in Arthroscopy included PearlDiver, the American College of Surgeons National Surgical Quality Improvement Program, the Danish Common Orthopaedic Database, the Swedish National Knee Ligament Register, the Hospital Episodes Statistics database, and the National Inpatient Sample. Database use increased significantly from 4 articles in 2013 to 11 articles in 2015 (P = .012), with no database use between January 1, 2006, and December 31, 2012. Database use increased significantly between January 1, 2006, and December 31, 2015, in Arthroscopy. Level IV, systematic review of Level II through IV studies. Copyright © 2016 Arthroscopy Association of North America. Published by Elsevier Inc. All rights reserved.
Looking to a future of improved diabetes management: interview with Professor Steve Bain.
Bain, Steve
2016-12-01
Steve Bain talks to Francesca Lake, Managing Editor: Steve is currently a Professor at Swansea University Medical School (Wales), Assistant Medical Director for Research & Development for ABM University Health Board and Clinical Lead for the Diabetes Research Unit, Wales. His clinical training included research into the genetics of Type 1 diabetes, with his current clinical interests surrounding exercise in Type 1 diabetes, new therapies and the provision of diabetes services. His background has led him to be Principal Investigator for several multicenter trials, and to be involved in various ethical committees concerning genetics. He led the UK Human Genetics Commission's report on DNA testing in 2009, and in 2007 was invited to sit on the National DNA Database Ethics Group, established by the Secretary of State for the Home Department. Steve is also a member of the Wales Diabetes & Endocrine Society executive committee and chairs the Specialist Training Committee for Diabetes & Endocrinology for Wales. He also chairs the Board that oversees the Institute of Life Science Joint Clinical Research Facility, the premier clinical research institute in Wales.
ORF157 from the Archaeal Virus Acidianus Filamentous Virus 1 Defines a New Class of Nuclease
Goulet, Adeline; Pina, Mery; Redder, Peter; Prangishvili, David; Vera, Laura; Lichière, Julie; Leulliot, Nicolas; van Tilbeurgh, Herman; Ortiz-Lombardia, Miguel; Campanacci, Valérie; Cambillau, Christian
2010-01-01
Acidianus filamentous virus 1 (AFV1) (Lipothrixviridae) is an enveloped filamentous virus that was characterized from a crenarchaeal host. It infects Acidianus species that thrive in the acidic hot springs (>85°C and pH <3) of Yellowstone National Park, WY. The AFV1 20.8-kb, linear, double-stranded DNA genome encodes 40 putative open reading frames whose sequences generally show little similarity to other genes in the sequence databases. Because three-dimensional structures are more conserved than sequences and hence are more effective at revealing function, we set out to determine protein structures from putative AFV1 open reading frames (ORF). The crystal structure of ORF157 reveals an α+β protein with a novel fold that remotely resembles the nucleotidyltransferase topology. In vitro, AFV1-157 displays a nuclease activity on linear double-stranded DNA. Alanine substitution mutations demonstrated that E86 is essential to catalysis. AFV1-157 represents a novel class of nuclease, but its exact role in vivo remains to be determined. PMID:20200253
[National Database of Genotypes--ethical and legal issues].
Franková, Vera; Tesínová, Jolana; Brdicka, Radim
2011-01-01
The aim of the National Database of Genotypes project is to outline the structure and operating rules of a database that collects information about the genotypes of individual persons. The database should be used exclusively for health care. Its purpose is to give physicians quick and easy access to information about persons who require specialized care because of their genetic constitution. As further genetic tests are expected to enter clinical practice, the database should also yield substantial financial savings by preventing duplication of expensive genetic testing. The ethical questions connected with creating and operating such a database mainly concern privacy protection, confidentiality of sensitive personal data, protection of the database from misuse, consent to participation, and public interest. Because the data must be interpreted correctly by a qualified professional (a clinical geneticist), a particular categorization of genetic data within the database is discussed. The operation of the proposed database has to comply with Czech legislation while also resolving the ethical problems.
DDRprot: a database of DNA damage response-related proteins.
Andrés-León, Eduardo; Cases, Ildefonso; Arcas, Aida; Rojas, Ana M
2016-01-01
The DNA Damage Response (DDR) signalling network is an essential system that protects the genome's integrity. The DDRprot database presented here is a resource that integrates manually curated information on the human DDR network and its sub-pathways. For each DDR protein, we present detailed information about its function. Where proteins are involved in post-translational modifications (PTMs) with each other, we depict the position of the modified residue(s) in the three-dimensional structures, when resolved structures are available for the proteins. All this information is linked to the original publication from which it was obtained. Phylogenetic information is also shown, including time of emergence and conservation across 47 selected species, family trees and sequence alignments of homologues. The DDRprot database can be queried by different criteria: pathways, species, evolutionary age or involvement in PTMs. Sequence searches using hidden Markov models can also be used. Database URL: http://ddr.cbbio.es. © The Author(s) 2016. Published by Oxford University Press.
Benson, Dennis A; Karsch-Mizrachi, Ilene; Lipman, David J; Ostell, James; Wheeler, David L
2007-01-01
GenBank® is a comprehensive database that contains publicly available nucleotide sequences for more than 240 000 named organisms, obtained primarily through submissions from individual laboratories and batch submissions from large-scale sequencing projects. Most submissions are made using the web-based BankIt or standalone Sequin programs, and accession numbers are assigned by GenBank staff upon receipt. Daily data exchange with the EMBL Data Library in Europe and the DNA Data Bank of Japan ensures worldwide coverage. GenBank is accessible through NCBI's retrieval system, Entrez, which integrates data from the major DNA and protein sequence databases along with taxonomy, genome, mapping, protein structure and domain information, and the biomedical journal literature via PubMed. BLAST provides sequence similarity searches of GenBank and other sequence databases. Complete bimonthly releases and daily updates of the GenBank database are available by FTP. To access GenBank and its related retrieval and analysis services, begin at the NCBI Homepage (www.ncbi.nlm.nih.gov).
How effective are DNA barcodes in the identification of African rainforest trees?
Parmentier, Ingrid; Duminil, Jérôme; Kuzmina, Maria; Philippe, Morgane; Thomas, Duncan W; Kenfack, David; Chuyong, George B; Cruaud, Corinne; Hardy, Olivier J
2013-01-01
DNA barcoding of rain forest trees could potentially help biologists identify species and discover new ones. However, DNA barcodes cannot always distinguish between closely related species, and the size and completeness of barcode databases are key parameters for their successful application. We test the ability of rbcL, matK and trnH-psbA plastid DNA markers to identify rain forest trees at two sites in Atlantic central Africa under the assumption that a database is exhaustive in terms of species content, but not necessarily in terms of haplotype diversity within species. We assess the accuracy of identification to species or genus using a genetic distance matrix between samples either based on a global multiple sequence alignment (GD) or on a basic local alignment search tool (BLAST). Where a local database is available (within a 50 ha plot), barcoding was generally reliable for genus identification (95-100% success), but less for species identification (71-88%). Using a single marker, best results for species identification were obtained with trnH-psbA. There was a significant decrease of barcoding success in species-rich clades. When the local database was used to identify the genus of trees from another region and did include all genera from the query individuals but not all species, genus identification success decreased to 84-90%. The GD method performed best but a global multiple sequence alignment is not applicable on trnH-psbA. Barcoding is a useful tool to assign unidentified African rain forest trees to a genus, but identification to a species is less reliable, especially in species-rich clades, even using an exhaustive local database. Combining two markers improves the accuracy of species identification but it would only marginally improve genus identification. Finally, we highlight some limitations of the BLAST algorithm as currently implemented and suggest possible improvements for barcoding applications.
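The distance-based assignment described in this abstract can be illustrated with a minimal sketch. It assumes the sequences are already aligned to equal length and uses a plain p-distance with best-match assignment; the function names, taxon labels and sequences are hypothetical and are not taken from the study, whose GD and BLAST procedures are more involved.

```python
from typing import Dict, List, Tuple


def p_distance(a: str, b: str) -> float:
    """Proportion of differing sites between two aligned sequences.

    Sites where either sequence has a gap ('-') or an 'N' are skipped.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differing = 0
    for x, y in zip(a.upper(), b.upper()):
        if x in "-N" or y in "-N":
            continue
        compared += 1
        if x != y:
            differing += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differing / compared


def identify(query: str, reference: Dict[str, List[str]]) -> Tuple[List[str], float]:
    """Return every reference taxon tied for the smallest distance to the query."""
    if not reference:
        raise ValueError("reference database is empty")
    best = None
    taxa = set()
    for label, seqs in reference.items():
        for seq in seqs:
            d = p_distance(query, seq)
            if best is None or d < best:
                best, taxa = d, {label}
            elif d == best:
                taxa.add(label)
    return sorted(taxa), best


# Hypothetical toy reference database (labels and sequences are made up).
reference = {
    "Genus_a sp1": ["ACGTACGT"],
    "Genus_a sp2": ["ACGTACGA"],
    "Genus_b sp1": ["TCGAACGA"],
}
matches, distance = identify("ACGTACGC", reference)
genera = {label.split()[0] for label in matches}
```

In this toy case the query ties between two species of the same genus, so the genus is unambiguous while the species is not, which mirrors the pattern the abstract reports for species-rich clades.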
How Effective Are DNA Barcodes in the Identification of African Rainforest Trees?
Parmentier, Ingrid; Duminil, Jérôme; Kuzmina, Maria; Philippe, Morgane; Thomas, Duncan W.; Kenfack, David; Chuyong, George B.; Cruaud, Corinne; Hardy, Olivier J.
2013-01-01
Background: DNA barcoding of rain forest trees could potentially help biologists identify species and discover new ones. However, DNA barcodes cannot always distinguish between closely related species, and the size and completeness of barcode databases are key parameters for their successful application. We test the ability of rbcL, matK and trnH-psbA plastid DNA markers to identify rain forest trees at two sites in Atlantic central Africa under the assumption that a database is exhaustive in terms of species content, but not necessarily in terms of haplotype diversity within species. Methodology/Principal Findings: We assess the accuracy of identification to species or genus using a genetic distance matrix between samples either based on a global multiple sequence alignment (GD) or on a basic local alignment search tool (BLAST). Where a local database is available (within a 50 ha plot), barcoding was generally reliable for genus identification (95–100% success), but less for species identification (71–88%). Using a single marker, best results for species identification were obtained with trnH-psbA. There was a significant decrease of barcoding success in species-rich clades. When the local database was used to identify the genus of trees from another region and did include all genera from the query individuals but not all species, genus identification success decreased to 84–90%. The GD method performed best but a global multiple sequence alignment is not applicable on trnH-psbA. Conclusions/Significance: Barcoding is a useful tool to assign unidentified African rain forest trees to a genus, but identification to a species is less reliable, especially in species-rich clades, even using an exhaustive local database. Combining two markers improves the accuracy of species identification but it would only marginally improve genus identification. Finally, we highlight some limitations of the BLAST algorithm as currently implemented and suggest possible improvements for barcoding applications. PMID:23565134
National Administrative Databases in Adult Spinal Deformity Surgery: A Cautionary Tale.
Buckland, Aaron J; Poorman, Gregory; Freitag, Robert; Jalai, Cyrus; Klineberg, Eric O; Kelly, Michael; Passias, Peter G
2017-08-15
Comparison between national administrative databases and a prospective multicenter physician-managed database. This study aims to assess the applicability of National Administrative Databases (NADs) in adult spinal deformity (ASD). Our hypothesis is that NADs do not include patients comparable to those in a physician-managed database (PMD) for surgical outcomes in adult spinal deformity. NADs such as the National Inpatient Sample (NIS) and the National Surgical Quality Improvement Program (NSQIP) provide large numbers of publications owing to ease of data access and lack of IRB approval requirement. These databases utilize billing codes, not clinical inclusion criteria, and have not been validated against PMDs in ASD surgery. The NIS was searched for years 2002 to 2012 and NSQIP for years 2006 to 2013 using validated spinal deformity diagnostic codes. Procedural codes (ICD-9 and CPT) were then applied to each database. A multicenter PMD including years 2008 to 2015 was used for comparison. Databases were assessed for levels fused, osteotomies, decompressed levels, and invasiveness. Database comparisons for surgical details were made in all patients, and also for patients with ≥5 level spinal fusions. Approximately 37,368 NIS, 1291 NSQIP, and 737 PMD patients were identified. NADs showed an increased use of deformity billing codes over the study period (NIS doubled, 68x NSQIP, P < 0.001), but ASD remained stable in the PMD. Surgical invasiveness, levels fused and use of 3-column osteotomy (3-CO) were significantly lower for all patients in the NIS (11.4-13.7) and NSQIP databases (6.4-12.7) compared with PMD (27.5-32.3). When limited to patients with ≥5 levels, invasiveness, levels fused, and use of 3-CO remained significantly higher in the PMD compared with NADs (P < 0.001). National databases NIS and NSQIP do not capture the same patient population as is captured in PMDs in ASD. Physicians should remain cautious in interpreting conclusions drawn from these databases. Level of evidence: 4.
The National State Policy Database. Quick Turn Around (QTA).
ERIC Educational Resources Information Center
Ahearn, Eileen; Jackson, Terry
This paper describes the National State Policy Database (NSPD), a full-text searchable database of state and federal education regulations for special education. It summarizes the history of the NSPD and reports on a survey of state directors or their designees as to their use of the database and their suggestions for its future expansion. The…
Paula, Débora P.; Linard, Benjamin; Crampton-Platt, Alex; Srivathsan, Amrita; Timmermans, Martijn J. T. N.; Sujii, Edison R.; Pires, Carmen S. S.; Souza, Lucas M.; Andow, David A.; Vogler, Alfried P.
2016-01-01
Characterizing trophic networks is fundamental to many questions in ecology, but this typically requires painstaking efforts, especially to identify the diet of small generalist predators. Several attempts have been devoted to develop suitable molecular tools to determine predatory trophic interactions through gut content analysis, and the challenge has been to achieve simultaneously high taxonomic breadth and resolution. General and practical methods are still needed, preferably independent of PCR amplification of barcodes, to recover a broader range of interactions. Here we applied shotgun-sequencing of the DNA from arthropod predator gut contents, extracted from four common coccinellid and dermapteran predators co-occurring in an agroecosystem in Brazil. By matching unassembled reads against six DNA reference databases obtained from public databases and newly assembled mitogenomes, and filtering for high overlap length and identity, we identified prey and other foreign DNA in the predator guts. Good taxonomic breadth and resolution was achieved (93% of prey identified to species or genus), but with low recovery of matching reads. Two to nine trophic interactions were found for these predators, some of which were only inferred by the presence of parasitoids and components of the microbiome known to be associated with aphid prey. Intraguild predation was also found, including among closely related ladybird species. Uncertainty arises from the lack of comprehensive reference databases and reliance on low numbers of matching reads accentuating the risk of false positives. We discuss caveats and some future prospects that could improve the use of direct DNA shotgun-sequencing to characterize arthropod trophic networks. PMID:27622637
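The filtering step mentioned in this abstract (keeping only read matches with high overlap length and identity) might look roughly like the sketch below. The `Hit` record, the function names and the threshold values are assumptions made for illustration; the abstract does not state the cut-offs or the software used.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class Hit:
    """One read-to-reference match (hypothetical record layout)."""
    read_id: str
    taxon: str
    align_len: int   # overlap length in bases
    identity: float  # percent identity, 0-100


def filter_hits(hits: Iterable[Hit], min_len: int = 100,
                min_identity: float = 98.0) -> List[Hit]:
    """Keep hits meeting both thresholds (default values are illustrative only)."""
    return [h for h in hits
            if h.align_len >= min_len and h.identity >= min_identity]


def reads_per_taxon(hits: Iterable[Hit]) -> Counter:
    """Count distinct reads supporting each taxon."""
    seen = set()
    counts: Counter = Counter()
    for h in hits:
        key = (h.read_id, h.taxon)
        if key not in seen:
            seen.add(key)
            counts[h.taxon] += 1
    return counts


# Made-up example hits.
hits = [
    Hit("r1", "Aphid_sp", 120, 99.2),
    Hit("r1", "Aphid_sp", 110, 98.5),   # same read, same taxon: counted once
    Hit("r2", "Aphid_sp", 60, 100.0),   # too short
    Hit("r3", "Ladybird_sp", 150, 95.0),  # identity too low
    Hit("r4", "Ladybird_sp", 140, 99.9),
]
support = reads_per_taxon(filter_hits(hits))
```

Counting distinct reads per taxon makes the low-support cases visible, which matters because the abstract flags low numbers of matching reads as a source of false positives.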
Exploring the ancestry differentiation and inference capacity of the 28-plex AISNPs.
Hao, Wei-Qi; Liu, Jing; Jiang, Li; Han, Jun-Ping; Wang, Ling; Li, Jiu-Ling; Ma, Quan; Liu, Chao; Wang, Hui-Jun; Li, Cai-Xia
2018-06-07
Inferring an unknown DNA's ancestry using a set of ancestry-informative single nucleotide polymorphisms (SNPs) in forensic science is useful to provide investigative leads. This is especially true when there is no DNA database match or specified suspect. Thus, a set of SNPs with highly robust and balanced differential power is strongly demanded in forensic science. In addition, it is also necessary to build a genotyping database for estimating the ancestry of an individual or an unknown DNA. For the differentiation of Africans, Europeans, East Asians, Native Americans, and Oceanians, the Global Nano set that includes just 31 SNPs was developed by de la Puente et al. Its ability for differentiation and balance was evaluated using the genotype data of the 1000 Genomes Phase III project and the Stanford University HGDP-CEPH. Just 402 samples were genotyped and analyzed as a reference set based on statistical methods. To validate the differentiating capacity using more samples, we developed a single-tube 28-plex SNP assay in which the SNPs were chosen from the 31 allelic loci of the Global AIMs Nano set. Three tri-allelic SNPs used to differentiate mixed-source DNA contribute little to population differentiation and were excluded here. Then, 998 individuals from 21 populations were typed, and these genotypes were combined with the genotype data obtained from 1000 Genomes Phase III and the Stanford University HGDP-CEPH (3090 total samples,43 populations) to estimate the power of this multiplex assay and build a database for the further inference of an individual or an unknown DNA sample in forensic practice.
BEAUTY-X: enhanced BLAST searches for DNA queries.
Worley, K C; Culpepper, P; Wiese, B A; Smith, R F
1998-01-01
BEAUTY (BLAST Enhanced Alignment Utility) is an enhanced version of the BLAST database search tool that facilitates identification of the functions of matched sequences. Three recent improvements to the BEAUTY program described here make the enhanced output (1) available for DNA queries, (2) available for searches of any protein database, and (3) more up-to-date, with periodic updates of the domain information. BEAUTY searches of the NCBI and EMBL non-redundant protein sequence databases are available from the BCM Search Launcher Web pages (http://gc.bcm.tmc.edu:8088/search-launcher/launcher.html). BEAUTY Post-Processing of submitted search results is available using the BCM Search Launcher Batch Client (version 2.6) (ftp://gc.bcm.tmc.edu/pub/software/search-launcher/). Example figures are available at http://dot.bcm.tmc.edu:9331/papers/beautypp.html (kworley,culpep)@bcm.tmc.edu
Ryan, K; Williams, D Gareth; Balding, David J
2016-11-01
Many DNA profiles recovered from crime scene samples are of a quality that does not allow them to be searched against, nor entered into, databases. We propose a method for the comparison of profiles arising from two DNA samples, one or both of which can have multiple donors and be affected by low DNA template or degraded DNA. We compute likelihood ratios to evaluate the hypothesis that the two samples have a common DNA donor, and hypotheses specifying the relatedness of two donors. Our method uses a probability distribution for the genotype of the donor of interest in each sample. This distribution can be obtained from a statistical model, or we can exploit the ability of trained human experts to assess genotype probabilities, thus extracting much information that would be discarded by standard interpretation rules. Our method is compatible with established methods in simple settings, but is more widely applicable and can make better use of information than many current methods for the analysis of mixed-source, low-template DNA profiles. It can accommodate uncertainty arising from relatedness instead of or in addition to uncertainty arising from noisy genotyping. We describe a computer program GPMDNA, available under an open source licence, to calculate LRs using the method presented in this paper. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.
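A likelihood ratio for a common donor can be computed from per-sample genotype probability distributions under some simplifying assumptions. The sketch below assumes a single locus, unrelated alternative donors, and that each sample's distribution is a posterior obtained with the population genotype frequencies as prior; in that case the ratio reduces to the sum over genotypes of p1(g) * p2(g) / prior(g). This is an illustration of the general idea, not the GPMDNA implementation, and all names and numbers are hypothetical.

```python
from math import isclose
from typing import Dict

Genotype = str


def lr_common_donor(p1: Dict[Genotype, float],
                    p2: Dict[Genotype, float],
                    prior: Dict[Genotype, float]) -> float:
    """LR for 'same donor' versus 'two unrelated donors' at one locus.

    p1, p2: genotype probabilities for the donor of interest in each sample.
    prior:  population genotype probabilities used as the prior for p1 and p2.
    """
    total = 0.0
    for g, q1 in p1.items():
        q2 = p2.get(g, 0.0)
        if q1 == 0.0 or q2 == 0.0:
            continue
        if prior.get(g, 0.0) <= 0.0:
            raise ValueError(f"genotype {g!r} has no positive prior probability")
        total += q1 * q2 / prior[g]
    return total


# Illustrative allele frequencies A = 0.1, B = 0.9 under Hardy-Weinberg.
prior = {"AA": 0.01, "AB": 0.18, "BB": 0.81}

certain = lr_common_donor({"AA": 1.0}, {"AA": 1.0}, prior)
uncertain = lr_common_donor({"AA": 1.0}, {"AA": 0.5, "AB": 0.5}, prior)
uninformative = lr_common_donor(prior, prior, prior)
```

With both genotypes known with certainty the ratio is the familiar 1 / frequency; uncertainty in either sample pulls it down, and distributions equal to the prior give a ratio of 1. Per-locus ratios would be multiplied across loci only if the loci are treated as independent.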
The GHEP–EMPOP collaboration on mtDNA population data—A new resource for forensic casework
Prieto, L.; Zimmermann, B.; Goios, A.; Rodriguez-Monge, A.; Paneto, G.G.; Alves, C.; Alonso, A.; Fridman, C.; Cardoso, S.; Lima, G.; Anjos, M.J.; Whittle, M.R.; Montesino, M.; Cicarelli, R.M.B.; Rocha, A.M.; Albarrán, C.; de Pancorbo, M.M.; Pinheiro, M.F.; Carvalho, M.; Sumita, D.R.; Parson, W.
2011-01-01
Mitochondrial DNA (mtDNA) population data for forensic purposes are still scarce for some populations, which may limit the evaluation of forensic evidence especially when the rarity of a haplotype needs to be determined in a database search. In order to improve the collection of mtDNA lineages from the Iberian and South American subcontinents, we here report the results of a collaborative study involving nine laboratories from the Spanish and Portuguese Speaking Working Group of the International Society for Forensic Genetics (GHEP-ISFG) and EMPOP. The individual laboratories contributed population data that were generated throughout the past 10 years, but in the majority of cases have not been made available to the scientific community. A total of 1019 haplotypes from Iberia (Basque Country, 2 general Spanish populations, 2 North and 1 Central Portugal populations), and Latin America (3 populations from São Paulo) were collected, reviewed and harmonized according to defined EMPOP criteria. The majority of data ambiguities that were found during the reviewing process (41 in total) were transcription errors, confirming that the documentation process is still the most error-prone stage in reporting mtDNA population data, especially when performed manually. This GHEP–EMPOP collaboration has significantly improved the quality of the individual mtDNA datasets and adds mtDNA population data as a valuable resource to the EMPOP database (www.empop.org). PMID:21075696
Kim, Mi Ae; Rhee, Jae-Sung; Kim, Tae Ha; Lee, Jung Sick; Choi, Ah-Young; Choi, Beom-Soon; Choi, Ik-Young; Sohn, Young Chang
2017-03-09
In order to characterize the female or male transcriptome of the Pacific abalone and further increase genomic resources, we sequenced the mRNA of full-length complementary DNA (cDNA) libraries derived from pooled tissues of female and male Haliotis discus hannai by employing the Iso-Seq protocol of the PacBio RSII platform. We successfully assembled whole full-length cDNA sequences and constructed a transcriptome database that included isoform information. After clustering, a total of 15,110 and 12,145 genes that coded for proteins were identified in female and male abalones, respectively. A total of 13,057 putative orthologs were retained from each transcriptome in abalones. Overall Gene Ontology terms and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways analyzed in each database showed a similar composition between sexes. In addition, a total of 519 and 391 isoforms were genome-widely identified with at least two isoforms from female and male transcriptome databases. We found that the number of isoforms and their alternatively spliced patterns are variable and sex-dependent. This information represents the first significant contribution to sex-preferential genomic resources of the Pacific abalone. The availability of whole female and male transcriptome database and their isoform information will be useful to improve our understanding of molecular responses and also for the analysis of population dynamics in the Pacific abalone.
Kim, Mi Ae; Rhee, Jae-Sung; Kim, Tae Ha; Lee, Jung Sick; Choi, Ah-Young; Choi, Beom-Soon; Choi, Ik-Young; Sohn, Young Chang
2017-01-01
In order to characterize the female or male transcriptome of the Pacific abalone and further increase genomic resources, we sequenced the mRNA of full-length complementary DNA (cDNA) libraries derived from pooled tissues of female and male Haliotis discus hannai by employing the Iso-Seq protocol of the PacBio RSII platform. We successfully assembled whole full-length cDNA sequences and constructed a transcriptome database that included isoform information. After clustering, a total of 15,110 and 12,145 genes that coded for proteins were identified in female and male abalones, respectively. A total of 13,057 putative orthologs were retained from each transcriptome in abalones. Overall Gene Ontology terms and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways analyzed in each database showed a similar composition between sexes. In addition, a total of 519 and 391 isoforms were genome-widely identified with at least two isoforms from female and male transcriptome databases. We found that the number of isoforms and their alternatively spliced patterns are variable and sex-dependent. This information represents the first significant contribution to sex-preferential genomic resources of the Pacific abalone. The availability of whole female and male transcriptome database and their isoform information will be useful to improve our understanding of molecular responses and also for the analysis of population dynamics in the Pacific abalone. PMID:28282934
Lapointe, Martine; Rogic, Anita; Bourgoin, Sarah; Jolicoeur, Christine; Séguin, Diane
2015-11-01
In recent years, sophisticated technology has significantly increased the sensitivity and analytical power of genetic analyses so that very little starting material may now produce viable genetic profiles. This sensitivity, however, has also increased the risk of detecting unknown genetic profiles that are assumed to be those of the perpetrator yet originate from extraneous sources such as crime scene workers. These contaminants may mislead investigations, keeping criminal cases active and unresolved for long spans of time. Voluntary submission of DNA samples from crime scene workers is fairly low; we have therefore created a promotional method for our staff elimination database that has resulted in a significant increase in voluntary samples since 2011. Our database enforces privacy safeguards and allows for optional anonymity to all staff members. We also offer information sessions at various police precincts to advise crime scene workers of the importance and success of our staff elimination database. This study, a pioneer in its field, has obtained 327 voluntary submissions from crime scene workers to date, of which 46 individual profiles (14%) have been matched to 58 criminal cases. By implementing our methods and respect for individual privacy, forensic laboratories everywhere may see similar growth and success in explaining unidentified genetic profiles in stagnant criminal cases. Crown Copyright © 2015. Published by Elsevier Ireland Ltd. All rights reserved.
National Vulnerability Database (NVD)
National Institute of Standards and Technology Data Gateway
National Vulnerability Database (NVD) (Web, free access) NVD is a comprehensive cyber security vulnerability database that integrates all publicly available U.S. Government vulnerability resources and provides references to industry resources. It is based on and synchronized with the CVE vulnerability naming standard.
Valdez-Moreno, Martha; Quintal-Lizama, Carolina; Gómez-Lozano, Ricardo; García-Rivas, María Del Carmen
2012-01-01
In the Mexican Caribbean, the exotic lionfish Pterois volitans has become a species of great concern because of its predatory habits and rapid expansion onto the Mesoamerican coral reef, the second largest continuous reef system in the world. This is the first report of DNA identification of stomach contents of lionfish using the barcode of life reference database (BOLD). We confirm with barcoding that only Pterois volitans is apparently present in the Mexican Caribbean. We analyzed the stomach contents of 157 specimens of P. volitans from various locations in the region. Based on DNA matches in the Barcode of Life Database (BOLD) and GenBank, we identified fishes from five orders, 14 families, 22 genera and 34 species in the stomach contents. The families with the most species represented were Gobiidae and Apogonidae. Some prey taxa are commercially important species. Seven species were new records for the Mexican Caribbean: Apogon mosavi, Coryphopterus venezuelae, C. thrix, C. tortugae, Lythrypnus minimus, Starksia langi and S. ocellata. DNA matches, as well as the presence of intact lionfish in the stomach contents, indicate some degree of cannibalism, a behavior confirmed in this species for the first time. We obtained 45 distinct crustacean prey sequences, from which only 20 taxa could be identified from the BOLD and GenBank databases. The matches were primarily to Decapoda but only a single taxon could be identified to the species level, Euphausia americana. This technique proved to be an efficient and useful method, especially since prey species could be identified from partially digested remains. The primary limitation is the lack of comprehensive coverage of potential prey species in the region in the BOLD and GenBank databases, especially among invertebrates.
Valdez-Moreno, Martha; Quintal-Lizama, Carolina; Gómez-Lozano, Ricardo; García-Rivas, María del Carmen
2012-01-01
Background: In the Mexican Caribbean, the exotic lionfish Pterois volitans has become a species of great concern because of its predatory habits and rapid expansion onto the Mesoamerican coral reef, the second largest continuous reef system in the world. This is the first report of DNA identification of stomach contents of lionfish using the barcode of life reference database (BOLD). Methodology/Principal Findings: We confirm with barcoding that only Pterois volitans is apparently present in the Mexican Caribbean. We analyzed the stomach contents of 157 specimens of P. volitans from various locations in the region. Based on DNA matches in the Barcode of Life Database (BOLD) and GenBank, we identified fishes from five orders, 14 families, 22 genera and 34 species in the stomach contents. The families with the most species represented were Gobiidae and Apogonidae. Some prey taxa are commercially important species. Seven species were new records for the Mexican Caribbean: Apogon mosavi, Coryphopterus venezuelae, C. thrix, C. tortugae, Lythrypnus minimus, Starksia langi and S. ocellata. DNA matches, as well as the presence of intact lionfish in the stomach contents, indicate some degree of cannibalism, a behavior confirmed in this species for the first time. We obtained 45 distinct crustacean prey sequences, from which only 20 taxa could be identified from the BOLD and GenBank databases. The matches were primarily to Decapoda but only a single taxon could be identified to the species level, Euphausia americana. Conclusions/Significance: This technique proved to be an efficient and useful method, especially since prey species could be identified from partially digested remains. The primary limitation is the lack of comprehensive coverage of potential prey species in the region in the BOLD and GenBank databases, especially among invertebrates. PMID:22675470
Bodner, Martin; Bastisch, Ingo; Butler, John M; Fimmers, Rolf; Gill, Peter; Gusmão, Leonor; Morling, Niels; Phillips, Christopher; Prinz, Mechthild; Schneider, Peter M; Parson, Walther
2016-09-01
The statistical evaluation of autosomal Short Tandem Repeat (STR) genotypes is based on allele frequencies. These are empirically determined from sets of randomly selected human samples, compiled into STR databases that have been established in the course of population genetic studies. There is currently no agreed procedure of performing quality control of STR allele frequency databases, and the reliability and accuracy of the data are largely based on the responsibility of the individual contributing research groups. It has been demonstrated with databases of haploid markers (EMPOP for mitochondrial mtDNA, and YHRD for Y-chromosomal loci) that centralized quality control and data curation is essential to minimize error. The concepts employed for quality control involve software-aided likelihood-of-genotype, phylogenetic, and population genetic checks that allow the researchers to compare novel data to established datasets and, thus, maintain the high quality required in forensic genetics. Here, we present STRidER (http://strider.online), a publicly available, centrally curated online allele frequency database and quality control platform for autosomal STRs. STRidER expands on the previously established ENFSI DNA WG STRbASE and applies standard concepts established for haploid and autosomal markers as well as novel tools to reduce error and increase the quality of autosomal STR data. The platform constitutes a significant improvement and innovation for the scientific community, offering autosomal STR data quality control and reliable STR genotype estimates. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.
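To show why the quality of allele frequency data matters for the statistical evaluation mentioned above, here is a minimal sketch of how a genotype frequency is commonly derived from allele frequencies. It assumes Hardy-Weinberg proportions (p² for homozygotes, 2pq for heterozygotes), independence between loci, and no subpopulation correction. The locus names, alleles and frequencies are invented for illustration and do not come from STRidER.

```python
from math import isclose, prod
from typing import Dict, Tuple

AlleleFreqs = Dict[str, float]


def genotype_frequency(genotype: Tuple[str, str], freqs: AlleleFreqs) -> float:
    """Expected genotype frequency at one locus under Hardy-Weinberg proportions."""
    a, b = genotype
    p, q = freqs[a], freqs[b]
    return p * p if a == b else 2 * p * q


def profile_frequency(profile: Dict[str, Tuple[str, str]],
                      database: Dict[str, AlleleFreqs]) -> float:
    """Multiply per-locus genotype frequencies, treating loci as independent."""
    return prod(genotype_frequency(profile[locus], database[locus])
                for locus in profile)


# Hypothetical two-locus frequency table and profile.
database = {
    "LOCUS1": {"12": 0.2, "13": 0.3},
    "LOCUS2": {"9": 0.1, "10": 0.4},
}
profile = {"LOCUS1": ("12", "13"), "LOCUS2": ("9", "9")}
freq = profile_frequency(profile, database)
```

Because the per-locus values are multiplied, an error in a single allele frequency propagates through the whole profile estimate, which is one reason centralized curation of the underlying tables is valuable.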
Brody, Thomas; Yavatkar, Amarendra S; Kuzin, Alexander; Kundu, Mukta; Tyson, Leonard J; Ross, Jermaine; Lin, Tzu-Yang; Lee, Chi-Hon; Awasaki, Takeshi; Lee, Tzumin; Odenwald, Ward F
2012-01-01
Background: Phylogenetic footprinting has revealed that cis-regulatory enhancers consist of conserved DNA sequence clusters (CSCs). Currently, there is no systematic approach for enhancer discovery and analysis that takes full-advantage of the sequence information within enhancer CSCs. Results: We have generated a Drosophila genome-wide database of conserved DNA consisting of >100,000 CSCs derived from EvoPrints spanning over 90% of the genome. cis-Decoder database search and alignment algorithms enable the discovery of functionally related enhancers. The program first identifies conserved repeat elements within an input enhancer and then searches the database for CSCs that score highly against the input CSC. Scoring is based on shared repeats as well as uniquely shared matches, and includes measures of the balance of shared elements, a diagnostic that has proven to be useful in predicting cis-regulatory function. To demonstrate the utility of these tools, a temporally-restricted CNS neuroblast enhancer was used to identify other functionally related enhancers and analyze their structural organization. Conclusions: cis-Decoder reveals that co-regulating enhancers consist of combinations of overlapping shared sequence elements, providing insights into the mode of integration of multiple regulating transcription factors. The database and accompanying algorithms should prove useful in the discovery and analysis of enhancers involved in any developmental process. Developmental Dynamics 241:169–189, 2012. © 2011 Wiley Periodicals, Inc. Key findings A genome-wide catalog of Drosophila conserved DNA sequence clusters. cis-Decoder discovers functionally related enhancers. Functionally related enhancers share balanced sequence element copy numbers. Many enhancers function during multiple phases of development. PMID:22174086
Fortin, Connor H; Schulze, Katharina V; Babbitt, Gregory A
2015-01-01
It is now widely accepted that DNA sequences defining DNA-protein interactions functionally depend upon local biophysical features of the DNA backbone that are important in defining sites of binding interaction in the genome (e.g. DNA shape, charge and intrinsic dynamics). However, these physical features of the DNA polymer are not directly apparent when analyzing and viewing Shannon information content calculated at single nucleobases in a traditional sequence logo plot. Thus, sequence logo plots are severely limited in that they convey no explicit information regarding the structural dynamics of the DNA backbone, a feature often critical to binding specificity. We present TRX-LOGOS, an R software package and Perl wrapper code that interfaces the JASPAR database for computational regulatory genomics. TRX-LOGOS extends the traditional sequence logo plot to include Shannon information content calculated with regard to the dinucleotide-based BI-BII conformation shifts in phosphate linkages on the DNA backbone, thereby adding a visual measure of intrinsic DNA flexibility that can be critical for many DNA-protein interactions. TRX-LOGOS is available as an R graphics module offered both at SourceForge and as a download supplement at this journal. To demonstrate the general utility of TRX logo plots, we first calculated the information content for 416 Saccharomyces cerevisiae transcription factor binding sites functionally confirmed in the Yeastract database and matched to previously published yeast genomic alignments. We discovered that flanking regions contain significantly higher information content at phosphate linkages than at nucleobases. We also examined broader transcription factor classifications defined by the JASPAR database, and discovered that many general signatures of transcription factor binding are locally more information rich at the level of DNA backbone dynamics than nucleobase sequence.
We used TRX-LOGOS in combination with MEGA 6.0 software for molecular evolutionary genetics analysis to visually compare the human Forkhead box/FOX protein evolution to its binding site evolution. We also compared the DNA binding signatures of human TP53 tumor suppressor determined by two different laboratory methods (SELEX and ChIP-seq). Further analysis of the entire yeast genome, center aligned at the start codon, also revealed a distinct sequence-independent 3 bp periodic pattern in information content, present only in coding regions, and perhaps indicative of the non-random organization of the genetic code. TRX-LOGOS is useful in any situation in which important information content in DNA can be better visualized at the positions of phosphate linkages (i.e. dinucleotides) where the dynamic properties of the DNA backbone function to facilitate DNA-protein interaction.
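The contrast between information content at single nucleobases and at dinucleotide steps can be illustrated with a minimal sketch. It is not the TRX-LOGOS calculation: the package scores BI-BII conformational classes of phosphate linkages, whereas this example simply uses raw dinucleotide identity, applies no small-sample correction, and runs on a hypothetical four-site alignment. The function names are illustrative only.

```python
import math
from collections import Counter

def information_content(columns, alphabet_size):
    """Per-column information in bits: log2(alphabet_size) minus Shannon entropy."""
    max_bits = math.log2(alphabet_size)
    result = []
    for column in columns:
        counts = Counter(column)
        total = sum(counts.values())
        entropy = -sum((n / total) * math.log2(n / total) for n in counts.values())
        result.append(max_bits - entropy)
    return result

def base_columns(sites):
    """One column per nucleobase position of the alignment."""
    return [[s[i] for s in sites] for i in range(len(sites[0]))]

def step_columns(sites):
    """One column per dinucleotide step (i.e. per phosphate linkage)."""
    return [[s[i:i + 2] for s in sites] for i in range(len(sites[0]) - 1)]

# Hypothetical aligned binding sites, for illustration only.
sites = ["ACGT", "ACGA", "ACTT", "ACGT"]

base_ic = information_content(base_columns(sites), alphabet_size=4)
step_ic = information_content(step_columns(sites), alphabet_size=16)

for i, bits in enumerate(base_ic):
    print(f"base position {i}: {bits:.3f} bits")
for i, bits in enumerate(step_ic):
    print(f"step {i}-{i + 1}: {bits:.3f} bits")
```

A fully conserved base column carries 2 bits and a fully conserved dinucleotide step carries 4 bits, so the two scales are not directly comparable without normalization; how TRX-LOGOS handles this is not stated in the abstract.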
Widespread recombination in published animal mtDNA sequences.
Tsaousis, A D; Martin, D P; Ladoukakis, E D; Posada, D; Zouros, E
2005-04-01
Mitochondrial DNA (mtDNA) recombination has been observed in several animal species, but there are doubts as to whether it is common or only occurs under special circumstances. Animal mtDNA sequences retrieved from public databases were unambiguously aligned and rigorously tested for evidence of recombination. At least 30 recombination events were detected among 186 alignments examined. Recombinant sequences were found in invertebrates and vertebrates, including primates. It appears that mtDNA recombination may occur regularly in the animal cell but rarely produces new haplotypes because of homoplasmy. Common animal mtDNA recombination would necessitate a reexamination of phylogenetic and biohistorical inference based on the assumption of clonal mtDNA transmission. Recombination may also have an important role in producing and purging mtDNA mutations and thus in mtDNA-based diseases and senescence.
NASA Astrophysics Data System (ADS)
Pennington, Catherine; Freeborough, Katy; Dashwood, Claire; Dijkstra, Tom; Lawrie, Kenneth
2015-11-01
The British Geological Survey (BGS) is the national geological agency for Great Britain that provides geoscientific information to government, other institutions and the public. The National Landslide Database has been developed by the BGS and is the focus for national geohazard research for landslides in Great Britain. The history and structure of the geospatial database and associated Geographical Information System (GIS) are explained, along with the future developments of the database and its applications. The database is the most extensive source of information on landslides in Great Britain with over 17,000 records of landslide events to date, each documented as fully as possible for inland, coastal and artificial slopes. Data are gathered through a range of procedures, including: incorporation of other databases; automated trawling of current and historical scientific literature and media reports; new field- and desk-based mapping technologies with digital data capture, and using citizen science through social media and other online resources. This information is invaluable for directing the investigation, prevention and mitigation of areas of unstable ground in accordance with Government planning policy guidelines. The national landslide susceptibility map (GeoSure) and a national landslide domains map currently under development, as well as regional mapping campaigns, rely heavily on the information contained within the landslide database. Assessing susceptibility to landsliding requires knowledge of the distribution of failures, an understanding of causative factors, their spatial distribution and likely impacts, whilst understanding the frequency and types of landsliding present is integral to modelling how rainfall will influence the stability of a region. 
Communication of landslide data through the Natural Hazard Partnership (NHP) and Hazard Impact Model contributes to national hazard mitigation and disaster risk reduction with respect to weather and climate. Daily reports of landslide potential are published by BGS through the NHP partnership and data collected for the National Landslide Database are used widely for the creation of these assessments. The National Landslide Database is freely available via an online GIS and is used by a variety of stakeholders for research purposes.
Aviation Safety Issues Database
NASA Technical Reports Server (NTRS)
Morello, Samuel A.; Ricks, Wendell R.
2009-01-01
The aviation safety issues database was instrumental in the refinement and substantiation of the National Aviation Safety Strategic Plan (NASSP). The issues database is a comprehensive set of issues from an extremely broad base of aviation functions, personnel, and vehicle categories, both nationally and internationally. Several aviation safety stakeholders such as the Commercial Aviation Safety Team (CAST) have already used the database. This broader interest was the genesis of making the database publicly accessible and writing this report.
Burnett, Leslie; Barlow-Stewart, Kris; Proos, Anné L; Aizenberg, Harry
2003-05-01
This article describes a generic model for access to samples and information in human genetic databases. The model utilises a "GeneTrustee", a third-party intermediary independent of the subjects and of the investigators or database custodians. The GeneTrustee model has been implemented successfully in various community genetics screening programs and has facilitated research access to genetic databases while protecting the privacy and confidentiality of research subjects. The GeneTrustee model could also be applied to various types of non-conventional genetic databases, including neonatal screening Guthrie card collections, and to forensic DNA samples.
Identification of food and beverage spoilage yeasts from DNA sequence analyses
USDA-ARS's Scientific Manuscript database
Detection, identification, and classification of yeasts has undergone a major transformation in the last decade and a half following application of gene sequence analyses and genome comparisons. Development of a database (barcode) of easily determined DNA sequences from domains 1 and 2 (D1/D2) of th...
Identification of DNA Methyltransferase Genes in Human Pathogenic Bacteria by Comparative Genomics.
Brambila-Tapia, Aniel Jessica Leticia; Poot-Hernández, Augusto Cesar; Perez-Rueda, Ernesto; Rodríguez-Vázquez, Katya
2016-06-01
DNA methylation plays an important role in gene expression and virulence in some pathogenic bacteria. In this report, we describe DNA methyltransferases (MTases) present in human pathogenic bacteria and compare them with related species that are non-pathogenic or less pathogenic, using comparative genomics. We searched the KEGG database for the KEGG orthology groups associated with adenine and cytosine DNA MTase activities (EC: 2.1.1.37, EC: 2.1.1.113 and EC: 2.1.1.72) in 37 human pathogenic species and 18 non/less pathogenic relatives and compared the number of these MTase sequences according to genome size, DNA MTase type and their non/less pathogenic relatives. We observed that Helicobacter pylori and Neisseria spp. presented the highest number of MTases while ten different species did not present a predicted DNA MTase. We also detected a significant increase of adenine MTases over cytosine MTases (2.19 vs. 1.06, respectively, p < 0.001). Adenine MTases were the only MTases associated with restriction modification systems and DNA MTases associated with type I restriction modification systems were more numerous than those associated with type III restriction modification systems (0.84 vs. 0.17, p < 0.001); additionally, there was no correlation between the genome size and the total number of DNA MTases, indicating that the number of DNA MTases is related to the particular evolution and lifestyle of specific species, regulating the expression of virulence genes in some pathogenic bacteria.
Completion of the National Land Cover Database (NLCD) 1992-2001 Land Cover Change Retrofit Product
The Multi-Resolution Land Characteristics Consortium has supported the development of two national digital land cover products: the National Land Cover Dataset (NLCD) 1992 and National Land Cover Database (NLCD) 2001. Substantial differences in imagery, legends, and methods betwe...
NBIC: National Ballast Information Clearinghouse
Cite NBIC Database as: National Ballast Information Clearinghouse 2016. NBIC Online Database. Electronic publication, Smithsonian Environmental Research Center &
Image Encryption Algorithm Based on Hyperchaotic Maps and Nucleotide Sequences Database
2017-01-01
Image encryption technology is one of the main means to ensure the safety of image information. Using the characteristics of chaos, such as randomness, regularity, ergodicity, and sensitivity to initial values, combined with the unique space conformation of DNA molecules and their unique information storage and processing ability, an efficient method for image encryption based on chaos theory and a DNA sequence database is proposed. In this paper, digital image encryption employs a process of transforming the image pixel gray value by using chaotic sequence scrambling of image pixel locations and establishing superchaotic mapping, which maps quaternary sequences and DNA sequences, and by combining with the logic of the transformation between DNA sequences. The bases are replaced under the displaced rules by using DNA coding in a certain number of iterations that are based on the enhanced quaternary hyperchaotic sequence; the sequence is generated by Chen chaos. The cipher feedback mode and chaos iteration are employed in the encryption process to enhance the confusion and diffusion properties of the algorithm. Theoretical analysis and experimental results show that the proposed scheme not only demonstrates excellent encryption but also effectively resists chosen-plaintext attack, statistical attack, and differential attack. PMID:28392799
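The general idea of combining DNA coding with a chaotic keystream can be sketched in a few lines. This is a toy illustration, not the scheme described in the abstract: a one-dimensional logistic map stands in for the Chen hyperchaotic system, the coding rule (A=00, C=01, G=10, T=11) is one assumed choice among several possible rules, and pixel scrambling, the sequence database and cipher feedback are all omitted. It offers no real security.

```python
BASES = "ACGT"  # assumed coding rule: A=00, C=01, G=10, T=11

def bytes_to_dna(data):
    """Encode each byte as four bases, most significant bit pair first."""
    return "".join(BASES[(b >> shift) & 3] for b in data for shift in (6, 4, 2, 0))

def dna_to_bytes(dna):
    """Inverse of bytes_to_dna."""
    out = bytearray()
    for i in range(0, len(dna), 4):
        value = 0
        for base in dna[i:i + 4]:
            value = (value << 2) | BASES.index(base)
        out.append(value)
    return bytes(out)

def dna_xor(a, b):
    """Base-wise XOR of two equal-length DNA strings under the coding rule."""
    return "".join(BASES[BASES.index(x) ^ BASES.index(y)] for x, y in zip(a, b))

def logistic_keystream(x0, n, r=3.99):
    """Keystream bytes from the logistic map x -> r*x*(1-x); a stand-in for Chen chaos."""
    x = x0
    out = bytearray()
    for _ in range(n):
        x = r * x * (1.0 - x)
        out.append(int(x * 256) % 256)
    return bytes(out)

# Hypothetical gray values for four pixels; x0 acts as the secret key.
pixels = bytes([12, 200, 37, 255])
key_dna = bytes_to_dna(logistic_keystream(0.3141, len(pixels)))

cipher_dna = dna_xor(bytes_to_dna(pixels), key_dna)
recovered = dna_to_bytes(dna_xor(cipher_dna, key_dna))

print(cipher_dna)
print(recovered == pixels)
```

Because XOR is its own inverse, applying the same keystream twice restores the original pixels, which is why decryption reuses `dna_xor` with the same key.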
DNA methylation biomarkers for head and neck squamous cell carcinoma.
Zhou, Chongchang; Ye, Meng; Ni, Shumin; Li, Qun; Ye, Dong; Li, Jinyun; Shen, Zhishen; Deng, Hongxia
2018-06-21
DNA methylation plays an important role in the etiology and pathogenesis of head and neck squamous cell carcinoma (HNSCC). The current study aimed to identify aberrantly methylated-differentially expressed genes (DEGs) by a comprehensive bioinformatics analysis. In addition, we screened for DEGs affected by DNA methylation modification and further investigated their prognostic values for HNSCC. We included microarray data of DNA methylation (GSE25093 and GSE33202) and gene expression (GSE23036 and GSE58911) from Gene Expression Omnibus. Aberrantly methylated-DEGs were analyzed with R software. The Cancer Genome Atlas (TCGA) RNA sequencing and DNA methylation (Illumina HumanMethylation450) databases were utilized for validation. In total, 27 aberrantly methylated genes accompanied by altered expression were identified. After confirmation by The Cancer Genome Atlas (TCGA) database, 2 hypermethylated-low-expression genes (FAM135B and ZNF610) and 2 hypomethylated-high-expression genes (HOXA9 and DCC) were identified. A receiver operating characteristic (ROC) curve confirmed the diagnostic value of these four methylated genes for HNSCC. Multivariate Cox proportional hazards analysis showed that FAM135B methylation was a favorable independent prognostic biomarker for overall survival of HNSCC patients.
Kao, Wei-Heng; Hong, Ji-Hong; See, Lai-Chu; Yu, Huang-Ping; Hsu, Jun-Te; Chou, I-Jun; Chou, Wen-Chi; Chiou, Meng-Jiun; Wang, Chun-Chieh; Kuo, Chang-Fu
2017-08-16
We aimed to evaluate the validity of cancer diagnosis in the National Health Insurance (NHI) database, which has routinely collected the health information of almost the entire Taiwanese population since 1995, compared with the Taiwan National Cancer Registry (NCR). There were 26,542,445 active participants registered in the NHI database between 2001 and 2012. National Cancer Registry and NHI database records were compared for cancer diagnosis; date of cancer diagnosis; and 1, 2, and 5 year survival. In addition, the 10 leading causes of cancer deaths in Taiwan were analyzed. There were 908,986 cancer diagnoses in NCR and NHI database and 782,775 (86.1%) in both, with 53,192 (5.9%) in the NHI database only and 73,019 (8.0%) in the NCR only. The positive predictive value of the NHI database cancer diagnoses was 94% for all cancers; the positive predictive value of the 10 specific cancers ranged from 95% (lung cancer) to 82% (cervical cancer). The date of diagnosis in the NHI database was generally delayed by a median of 15 days (interquartile range 8-18) compared with the NCR. The 1, 2, and 5 year survival rates were 71.21%, 60.85%, and 47.44% using the NHI database and were 71.18%, 60.17%, and 46.09% using NCR data. Recording of cancer diagnoses and survival estimates based on these diagnosis codes in the NHI database are generally consistent with the NCR. Studies using NHI database data must pay careful attention to eligibility and record linkage; use of both sources is recommended. Copyright © 2017 John Wiley & Sons, Ltd.
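The overlap counts reported above are enough to rework the headline figures. The sketch below treats the NCR as the reference standard, which is an assumption about how the authors defined positive predictive value; the sensitivity figure it derives is not reported in the abstract and is included only as an illustration.

```python
# Counts taken from the abstract.
both = 782_775      # diagnoses present in both NHI and NCR
nhi_only = 53_192   # present in NHI only
ncr_only = 73_019   # present in NCR only

total = both + nhi_only + ncr_only

# Share of all diagnoses falling in each category.
share_both = both / total
share_nhi_only = nhi_only / total
share_ncr_only = ncr_only / total

# Positive predictive value of an NHI diagnosis, with NCR as the assumed reference.
ppv = both / (both + nhi_only)

# Derived here, not reported in the abstract: share of NCR cases also found in NHI.
sensitivity = both / (both + ncr_only)

print(f"total diagnoses: {total:,}")
print(f"both: {share_both:.1%}, NHI only: {share_nhi_only:.1%}, NCR only: {share_ncr_only:.1%}")
print(f"PPV: {ppv:.1%}")
print(f"sensitivity (derived): {sensitivity:.1%}")
```

Under this assumption the computed PPV rounds to the 94% stated in the abstract, and the three category shares match the reported 86.1%, 5.9% and 8.0%.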
The Danish Inguinal Hernia database.
Friis-Andersen, Hans; Bisgaard, Thue
2016-01-01
To monitor and improve nation-wide surgical outcome after groin hernia repair based on scientific evidence-based surgical strategies for the national and international surgical community. Patients ≥18 years operated for groin hernia. Type and size of hernia, primary or recurrent, type of surgical repair procedure, mesh and mesh fixation methods. According to the Danish National Health Act, surgeons are obliged to register all hernia repairs immediately after surgery (3 minute registration time). All institutions have continuous access to their own data stratified on individual surgeons. Registrations are based on a closed, protected Internet system requiring personal codes also identifying the operating institution. A national steering committee consisting of 13 voluntary and dedicated surgeons, 11 of whom are unpaid, handles the medical management of the database. The Danish Inguinal Hernia Database comprises intraoperative data from >130,000 repairs (May 2015). A total of 49 peer-reviewed national and international publications have been published from the database (June 2015). The Danish Inguinal Hernia Database is fully active monitoring surgical quality and contributes to the national and international surgical society to improve outcome after groin hernia repair.
Reyes-Aldasoro, Constantino Carlos
2017-01-01
In this work, the public database of biomedical literature PubMed was mined using queries with combinations of keywords and year restrictions. It was found that the proportion of Cancer-related entries per year in PubMed has risen from around 6% in 1950 to more than 16% in 2016. This increase is not shared by other conditions such as AIDS, Malaria, Tuberculosis, Diabetes, Cardiovascular, Stroke and Infection some of which have, on the contrary, decreased as a proportion of the total entries per year. Organ-related queries were performed to analyse the variation of some specific cancers. A series of queries related to incidence, funding, and relationship with DNA, Computing and Mathematics, were performed to test correlation between the keywords, with the hope of elucidating the cause behind the rise of Cancer in PubMed. Interestingly, the proportion of Cancer-related entries that contain "DNA", "Computational" or "Mathematical" have increased, which suggests that the impact of these scientific advances on Cancer has been stronger than in other conditions. It is important to highlight that the results obtained with the data mining approach here presented are limited to the presence or absence of the keywords on a single, yet extensive, database. Therefore, results should be observed with caution. All the data used for this work is publicly available through PubMed and the UK's Office for National Statistics. All queries and figures were generated with the software platform Matlab and the files are available as supplementary material.
Chaves, Camila L; Degen, Bernd; Pakull, Birte; Mader, Malte; Honorio, Euridice; Ruas, Paulo; Tysklind, Niklas; Sebbenn, Alexandre M
2018-06-27
Deforestation, reinforced by illegal logging, is a serious problem in many tropical regions and causes pervasive environmental and economic damage. Existing laws that intend to reduce illegal logging need efficient, fraud-resistant control methods. We developed a genetic reference database for Jatoba (Hymenaea courbaril), an important, high-value timber species from the Neotropics. The data set can be used for controls on declarations of wood origin. Samples from 308 Hymenaea trees from 12 locations in Brazil, Bolivia, Peru, and French Guiana have been collected and genotyped on 10 nuclear microsatellites (nSSRs), 13 chloroplast SNPs (cpSNP), and 1 chloroplast indel marker. The chloroplast gene markers have been developed using Illumina DNA sequencing. Bayesian cluster analysis divided the individuals based on the nSSRs into 8 genetic groups. Using self-assignment tests, the power of the genetic reference database to verify declarations of location was tested for 3 different assignment methods. We observed a strong genetic differentiation among locations, leading to high and reliable self-assignment rates for the locations of between 50% and 100% (average of 88%). Although all 3 assignment methods came up with similar mean self-assignment rates, there were differences for some locations linked to the level of genetic diversity, differentiation, and heterozygosity. Our results show that the nuclear and chloroplast gene markers are effective for use in a genetic certification system and can provide national and international authorities with a robust tool to confirm the legality of timber.
Kröber, Magdalena; Bekel, Thomas; Diaz, Naryttza N; Goesmann, Alexander; Jaenicke, Sebastian; Krause, Lutz; Miller, Dimitri; Runte, Kai J; Viehöver, Prisca; Pühler, Alfred; Schlüter, Andreas
2009-06-01
The phylogenetic structure of the microbial community residing in a fermentation sample from a production-scale biogas plant fed with maize silage, green rye and liquid manure was analysed by an integrated approach using clone library sequences and metagenome sequence data obtained by 454-pyrosequencing. Sequencing of 109 clones from a bacterial and an archaeal 16S-rDNA amplicon library revealed that the obtained nucleotide sequences are similar but not identical to 16S-rDNA database sequences derived from different anaerobic environments including digestors and bioreactors. Most of the bacterial 16S-rDNA sequences could be assigned to the phylum Firmicutes with the most abundant class Clostridia and to the class Bacteroidetes, whereas most archaeal 16S-rDNA sequences cluster close to the methanogen Methanoculleus bourgensis. Further sequences of the archaeal library most probably represent so far non-characterised species within the genus Methanoculleus. A similar result was derived from phylogenetic analysis of mcrA clone sequences. The mcrA gene encodes the alpha-subunit of methyl-coenzyme-M reductase involved in the final step of methanogenesis. BLASTn analysis applying stringent settings resulted in assignment of 16S-rDNA metagenome sequence reads to 62 16S-rDNA amplicon sequences, thus enabling frequency of abundance estimations for 16S-rDNA clone library sequences. Ribosomal Database Project (RDP) Classifier processing of metagenome 16S-rDNA reads revealed abundance of the phyla Firmicutes, Bacteroidetes and Euryarchaeota and the orders Clostridiales, Bacteroidales and Methanomicrobiales. Moreover, a large fraction of 16S-rDNA metagenome reads could not be assigned to lower taxonomic ranks, demonstrating that numerous microorganisms in the analysed fermentation sample of the biogas plant are still unclassified or unknown.
78 FR 24420 - Proposed Collection; Comment Request
Federal Register 2010, 2011, 2012, 2013, 2014
2013-04-25
... academics and other interested parties outside of the government. Generally, the National Mortgage Database... project, known as the ``National Mortgage Database,'' which is a joint effort of FHFA and the Consumer... a database of timely and otherwise unavailable residential mortgage market information to be made...
Asamizu, E; Nakamura, Y; Sato, S; Tabata, S
2000-06-30
For comprehensive analysis of genes expressed in the model dicotyledonous plant, Arabidopsis thaliana, expressed sequence tags (ESTs) were accumulated. Normalized and size-selected cDNA libraries were constructed from aboveground organs, flower buds, roots, green siliques and liquid-cultured seedlings, respectively, and a total of 14,026 5'-end ESTs and 39,207 3'-end ESTs were obtained. The 3'-end ESTs could be clustered into 12,028 non-redundant groups. Similarity search of the non-redundant ESTs against the public non-redundant protein database indicated that 4816 groups show similarity to genes of known function, 1864 to hypothetical genes, and the remaining 5348 are novel sequences. Gene coverage by the non-redundant ESTs was analyzed using the annotated genomic sequences of approximately 10 Mb on chromosomes 3 and 5. A total of 923 regions were hit by at least one EST, among which only 499 regions were hit by the ESTs deposited in the public database. The result indicates that the EST source generated in this project complements the EST data in the public database and facilitates new gene discovery.
Federal Register 2010, 2011, 2012, 2013, 2014
2013-08-07
... Access Request and Use Certification (previously National Database for Autism Research Data Access... approval for use of the National Database for Autism Research (NDAR) Data Use Certification (DUC) Form...
NATIONAL NOSOCOMIAL INFECTIONS SURVEILLANCE SYSTEM (NNIS)
The National Nosocomial Infections Surveillance (NNIS) System is a cooperative effort that began in 1970 between the Centers for Disease Control and Prevention (CDC) and participating hospitals to create a national nosocomial infections database. The database is used to describe ...
DNA barcoding of vouchered xylarium wood specimens of nine endangered Dalbergia species.
Yu, Min; Jiao, Lichao; Guo, Juan; Wiedenhoeft, Alex C; He, Tuo; Jiang, Xiaomei; Yin, Yafang
2017-12-01
ITS2+trnH-psbA was the best combination of DNA barcode to resolve the Dalbergia wood species studied. We demonstrate the feasibility of building a DNA barcode reference database using xylarium wood specimens. The increase in illegal logging and timber trade of CITES-listed tropical species necessitates the development of unambiguous identification methods at the species level. For these methods to be fully functional and deployable for law enforcement, they must work using wood or wood products. DNA barcoding of wood has been promoted as a promising tool for species identification; however, the main barrier to extensive application of DNA barcoding to wood is the lack of a comprehensive and reliable DNA reference library of barcodes from wood. In this study, xylarium wood specimens of nine Dalbergia species were selected from the Wood Collection of the Chinese Academy of Forestry and DNA was then extracted from them for further PCR amplification of eight potential DNA barcode sequences (ITS2, matK, trnL, trnH-psbA, trnV-trnM1, trnV-trnM2, trnC-petN, and trnS-trnG). The barcodes were tested singly and in combination for species-level discrimination ability by tree-based [neighbor-joining (NJ)] and distance-based (TaxonDNA) methods. We found that the discrimination ability of DNA barcodes in combination was higher than any single DNA marker among the Dalbergia species studied, with the best two-marker combination of ITS2+trnH-psbA analyzed with NJ trees performing the best (100% accuracy). These barcodes are relatively short regions (<350 bp) and amplification reactions were performed with high success (≥90%) using wood as the source material, a necessary factor to apply DNA barcoding to timber trade. The present results demonstrate the feasibility of using vouchered xylarium specimens to build DNA barcoding reference databases.
Standard atomic volumes in double-stranded DNA and packing in protein–DNA interfaces
Nadassy, Katalin; Tomás-Oliveira, Isabel; Alberts, Ian; Janin, Joël; Wodak, Shoshana J.
2001-01-01
Standard volumes for atoms in double-stranded B-DNA are derived using high resolution crystal structures from the Nucleic Acid Database (NDB) and compared with corresponding values derived from crystal structures of small organic compounds in the Cambridge Structural Database (CSD). Two different methods are used to compute these volumes: the classical Voronoi method, which does not depend on the size of atoms, and the related Radical Planes method, which does. Results show that atomic groups buried in the interior of double-stranded DNA are, on average, more tightly packed than in related small molecules in the CSD. The packing efficiency of DNA atoms at the interfaces of 25 high resolution protein–DNA complexes is determined by computing the ratios between the volumes of interfacial DNA atoms and the corresponding standard volumes. These ratios are found to be close to unity, indicating that the DNA atoms at protein–DNA interfaces are as closely packed as in crystals of B-DNA. Analogous volume ratios, computed for buried protein atoms, are also near unity, confirming our earlier conclusions that the packing efficiency of these atoms is similar to that in the protein interior. In addition, we examine the number, volume and solvent occupation of cavities located at the protein–DNA interfaces and compare them with those in the protein interior. Cavities are found to be ubiquitous in the interfaces as well as inside the protein moieties. The frequency of solvent occupation of cavities is however higher in the interfaces, indicating that those are more hydrated than protein interiors. Lastly, we compare our results with those obtained using two different measures of shape complementarity of the analysed interfaces, and find that the correlation between our volume ratios and these measures, as well as between the measures themselves, is weak.
Our results indicate that a tightly packed environment made up of DNA, protein and solvent atoms plays a significant role in protein–DNA recognition. PMID:11504874
77 FR 12234 - Changes in Hydric Soils Database Selection Criteria
Federal Register 2010, 2011, 2012, 2013, 2014
2012-02-29
... Conservation Service [Docket No. NRCS-2011-0026] Changes in Hydric Soils Database Selection Criteria AGENCY... Changes to the National Soil Information System (NASIS) Database Selection Criteria for Hydric Soils of the United States. SUMMARY: The National Technical Committee for Hydric Soils (NTCHS) has updated the...
EPA U.S. NATIONAL MARKAL DATABASE: DATABASE DOCUMENTATION
This document describes in detail the U.S. Energy System database developed by EPA's Integrated Strategic Assessment Work Group for use with the MARKAL model. The group is part of the Office of Research and Development and is located in the National Risk Management Research Labor...
Code of Federal Regulations, 2010 CFR
2010-10-01
... the CCR database prior to award of a contract or agreement, except for— (1) Purchases that use a... database, or use of CCR data, could compromise the safeguarding of classified information or national...'s written notification of its intention to change the name in the CCR database; comply with the...
Code of Federal Regulations, 2011 CFR
2011-10-01
... the CCR database prior to award of a contract or agreement, except for— (1) Purchases that use a... database, or use of CCR data, could compromise the safeguarding of classified information or national...'s written notification of its intention to change the name in the CCR database; comply with the...
Access to DNA and protein databases on the Internet.
Harper, R
1994-02-01
During the past year, the number of biological databases that can be queried via Internet has dramatically increased. This increase has resulted from the introduction of networking tools, such as Gopher and WAIS, that make it easy for research workers to index databases and make them available for on-line browsing. Biocomputing in the nineties will see the advent of more client/server options for the solution of problems in bioinformatics.
REBASE--a database for DNA restriction and modification: enzymes, genes and genomes.
Roberts, Richard J; Vincze, Tamas; Posfai, Janos; Macelis, Dana
2015-01-01
REBASE is a comprehensive and fully curated database of information about the components of restriction-modification (RM) systems. It contains fully referenced information about recognition and cleavage sites for both restriction enzymes and methyltransferases as well as commercial availability, methylation sensitivity, crystal and sequence data. All genomes that are completely sequenced are analyzed for RM system components, and with the advent of PacBio sequencing, the recognition sequences of DNA methyltransferases (MTases) are appearing rapidly. Thus, Type I and Type III systems can now be characterized in terms of recognition specificity merely by DNA sequencing. The contents of REBASE may be browsed from the web http://rebase.neb.com and selected compilations can be downloaded by FTP (ftp.neb.com). Monthly updates are also available via email. © The Author(s) 2014. Published by Oxford University Press on behalf of Nucleic Acids Research.
Gene regulation knowledge commons: community action takes care of DNA binding transcription factors
Tripathi, Sushil; Vercruysse, Steven; Chawla, Konika; Christie, Karen R.; Blake, Judith A.; Huntley, Rachael P.; Orchard, Sandra; Hermjakob, Henning; Thommesen, Liv; Lægreid, Astrid; Kuiper, Martin
2016-01-01
A large gap remains between the amount of knowledge in scientific literature and the fraction that gets curated into standardized databases, despite many curation initiatives. Yet the availability of comprehensive knowledge in databases is crucial for exploiting existing background knowledge, both for designing follow-up experiments and for interpreting new experimental data. Structured resources also underpin the computational integration and modeling of regulatory pathways, which further aids our understanding of regulatory dynamics. We argue how cooperation between the scientific community and professional curators can increase the capacity of capturing precise knowledge from literature. We demonstrate this with a project in which we mobilize biological domain experts who curate large amounts of DNA binding transcription factors, and show that they, although new to the field of curation, can make valuable contributions by harvesting reported knowledge from scientific papers. Such community curation can enhance the scientific epistemic process. Database URL: http://www.tfcheckpoint.org PMID:27270715
Williamson, J; Maurin, O; Shiba, S N S; van der Bank, H; Pfab, M; Pilusa, M; Kabongo, R M; van der Bank, M
2016-09-01
Species in the cycad genus Encephalartos are listed in CITES Appendix I and as Threatened or Protected Species in terms of South Africa's National Environmental Management: Biodiversity Act (NEM:BA) of 2004. Despite regulations, illegal plant harvesting for medicinal trade has continued in South Africa and resulted in declines in cycad populations and even complete loss of sub-populations. Encephalartos is traded at traditional medicine markets in South Africa in the form of bark strips and stem sections; thus, determining the species traded presents a major challenge due to a lack of characteristic plant parts. Here, a case study is presented on the use of DNA barcoding to identify cycads sold at the Faraday and Warwick traditional medicine markets in Johannesburg and Durban, respectively. Market samples were sequenced for the core DNA barcodes (rbcLa and matK) as well as two additional regions: nrITS and trnH-psbA. The barcoding database for cycads at the University of Johannesburg was utilized to assign query samples to known species. Three approaches were followed: tree-based, similarity-based, and character-based (BRONX) methods. Market samples identified were Encephalartos ferox (Near Threatened), Encephalartos lebomboensis (Endangered), Encephalartos natalensis (Near Threatened), Encephalartos senticosus (Vulnerable), and Encephalartos villosus (Least Concern). Results from this study are crucial for making appropriate assessments and decisions on how to manage these markets.
Using DNA Barcodes to Identify Road-Killed Animals in Two Atlantic Forest Nature Reserves, Brazil
Klippel, Angélica H.; Oliveira, Pablo V.; Britto, Karollini B.; Freire, Bárbara F.; Moreno, Marcel R.; dos Santos, Alexandre R.; Banhos, Aureo; Paneto, Greiciane G.
2015-01-01
Road mortality is the leading source of biodiversity loss in the world, especially due to fragmentation of natural habitats and loss of wildlife. Surveying the main species killed on roads is fundamental to understanding the problem, and this requires correct species identification. The aim of this study was to verify whether DNA barcodes can be applied to identify road-killed samples that often cannot be determined morphologically. For this purpose, 222 vertebrate samples were collected in a stretch of the BR-101 highway that crosses two Discovery Coast Atlantic Forest Natural Reserves, the Sooretama Biological Reserve and the Vale Natural Reserve, in Espírito Santo, Brazil. The mitochondrial COI gene was amplified, sequenced and compared against the BOLD database. It was possible to identify 62.16% of samples, totaling 62 different species, including Pyrrhura cruentata, Chaetomys subspinosus, Puma yagouaroundi and Leopardus wiedii, considered Vulnerable in the National Official List of Species of Endangered Wildlife. The most commonly identified animals were a bat (Molossus molossus), an opossum (Didelphis aurita) and a frog (Trachycephalus mesophaeus) species. Only one reptile was identified using the technique, probably due to lack of reference sequences in BOLD. These data may contribute to a better understanding of the impact of roads on species biodiversity loss and to introduce the DNA barcode technique to road ecology scenarios. PMID:26244644
[Cosmid libraries containing DNA from human chromosome 13].
Kapanadze, B I; Brodianskiĭ, V M; Baranova, A V; Sevat'ianov, S Iu; Fedorova, N D; Kurskov, M M; Kostina, M A; Mironov, A A; Sineokiĭ, S P; Zakhar'ev, V M; Grafodatskiĭ, A S; Modianov, N N; Iankovskiĭ, N K
1996-03-01
We characterized two cosmid libraries constructed from flow-sorted chromosome 13 at the Imperial Cancer Research Fund (ICRF), UK (13,000 clones) and Los Alamos National Laboratory (LANL), USA (17,000 clones). After storage for two years, clones showed high viability (95%) and structural stability. EcoR I and Hind III restriction patterns were studied in more than 500 ICRF and 200 LANL cosmids. The average size of inserts was shown to be 35-37 kb in both the libraries. Most cosmids (83% and 93% of ICRF and LANL libraries, respectively) exceed the lower size limit of DNA fragments that can be packaged and represent a good source for physical mapping of chromosome 13. Total length of inserts is four and five genome equivalents in the ICRF and LANL libraries, respectively. ICRF cosmids showed hybridization to 22 of 24 unique probes tested, which corresponds to a 90% probability of having any DNA fragment represented in the library. More than 1 Mb of chromosome 13 is overlapped by 90 cosmids of 22 groups revealed. A chromosomal region of more than 150 kb, containing the ATP1AL1 gene for alpha-1 peptide of Na+, K(+)-ATPase, is covered by 12 cosmids forming a contig. The results of restriction and hybridization analyses are stored in a CLONE database. These data and all the cosmids described are publicly available.
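The representation figure quoted in this abstract can be checked with simple arithmetic. The sketch below compares the empirical probe hit rate (22 of 24) with an idealized expectation, P = 1 - exp(-c) for a library of c genome equivalents. That formula is the Poisson approximation to the Clarke-Carbon estimate and is brought in here as an assumption for comparison; the abstract does not state it, and it presumes random, unbiased cloning. The function names are illustrative.

```python
import math


def empirical_representation(hits: int, probes: int) -> float:
    """Fraction of tested probes that hybridized to at least one clone."""
    return hits / probes


def idealized_representation(genome_equivalents: float) -> float:
    """Poisson approximation: chance a given fragment is in the library,
    assuming clones sample the chromosome uniformly at random."""
    return 1.0 - math.exp(-genome_equivalents)


observed = empirical_representation(22, 24)   # ICRF probe test from the abstract
expected = idealized_representation(4.0)      # ICRF library: four genome equivalents

print(f"observed: {observed:.3f}")
print(f"idealized: {expected:.3f}")
```

The observed rate sits a little below the idealized one, which is what one would expect if cloning is not perfectly uniform; with only 24 probes the difference should not be over-interpreted.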
Data tables for the 1994 National Transit Database report year
DOT National Transportation Integrated Search
1995-12-01
The Data Tables For the 1994 National Transit Database Report Year is one of three publications also referred to as the National Transit Database Reporting System. The report provides detailed summaries of financial and operating data submitted to FTA...
Landscape features, standards, and semantics in U.S. national topographic mapping databases
Varanka, Dalia
2009-01-01
The objective of this paper is to examine the contrast between local, field-surveyed topographical representation and feature representation in digital, centralized databases and to clarify their ontological implications. The semantics of these two approaches are contrasted by examining the categorization of features by subject domains inherent to national topographic mapping. When comparing five USGS topographic mapping domain and feature lists, results indicate that multiple semantic meanings and ontology rules were applied to the initial digital database, but were lost as databases became more centralized at national scales, and common semantics were replaced by technological terms.
An Audit of the Irish National Intellectual Disability Database
ERIC Educational Resources Information Center
Dodd, Philip; Craig, Sarah; Kelly, Fionnola; Guerin, Suzanne
2010-01-01
This study describes a national data audit of the National Intellectual Disability Database (NIDD). The NIDD is a national information system for intellectual disability (ID) for Ireland. The purpose of this audit was to assess the overall accuracy of information contained on the NIDD, as well as collecting qualitative information to support the…
Risk and Resiliency for Dementia: Comparison of Male and Female Veterans
2017-09-01
from the Veterans Health Administration (VHA) National Patient Care Database (NPCD) 2. Obtain data from the Veterans Health Administration (VHA...National Patient Care Database (NPCD): Months 6-12 In the second quarter, we submitted and received approval to receive data from the VHA NPCD In...injury. We plan to capitalize on our prior experience working with the Veterans Health Administration National Patient Care Database. We will use data
Database Software Selection for the Egyptian National STI Network.
ERIC Educational Resources Information Center
Slamecka, Vladimir
The evaluation and selection of information/data management system software for the Egyptian National Scientific and Technical (STI) Network are described. An overview of the state-of-the-art of database technology elaborates on the differences between information retrieval and database management systems (DBMS). The desirable characteristics of…
Selection and Management of DNA Markers for Use in Genomic Evaluation
USDA-ARS's Scientific Manuscript database
A database was constructed to store genotypes for 50,972 single-nucleotide polymorphisms (SNP) from the Illumina BovineSNP50 BeadChip for over 30,000 animals. The database allows storage of multiple samples per animal and stores all SNP genotypes for a sample in a single row. An indicator specifies ...
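One way to picture the "all SNP genotypes for a sample in a single row" layout is sketched below with an in-memory SQLite table. Everything specific here is an assumption for illustration: the table and column names, the one-character-per-SNP encoding (0, 1, 2 as allele counts, 5 as a missing call), and the helper functions are hypothetical and are not taken from the described database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE genotype (
        animal_id TEXT    NOT NULL,
        sample_no INTEGER NOT NULL,   -- allows several samples per animal
        calls     TEXT    NOT NULL,   -- one character per SNP, fixed SNP order
        PRIMARY KEY (animal_id, sample_no)
    )
    """
)


def store_sample(animal_id: str, sample_no: int, calls: list) -> None:
    """Pack the per-SNP codes into one string and store it as a single row."""
    encoded = "".join(str(code) for code in calls)
    conn.execute(
        "INSERT INTO genotype VALUES (?, ?, ?)", (animal_id, sample_no, encoded)
    )


def genotype_at(animal_id: str, sample_no: int, snp_index: int) -> int:
    """Read one SNP call back by its position in the packed string."""
    row = conn.execute(
        "SELECT calls FROM genotype WHERE animal_id = ? AND sample_no = ?",
        (animal_id, sample_no),
    ).fetchone()
    return int(row[0][snp_index])


# Toy data: two samples of the same (made-up) animal, four SNPs each.
store_sample("A001", 1, [0, 1, 2, 5])
store_sample("A001", 2, [0, 1, 2, 2])
print(genotype_at("A001", 1, 3), genotype_at("A001", 2, 3))
```

Packing a sample into one row keeps the row count proportional to samples rather than to samples times SNPs, at the cost of needing a fixed, versioned SNP order to interpret each position.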
Benson, Dennis A; Karsch-Mizrachi, Ilene; Lipman, David J; Ostell, James; Sayers, Eric W
2011-01-01
GenBank® is a comprehensive database that contains publicly available nucleotide sequences for more than 380,000 organisms named at the genus level or lower, obtained primarily through submissions from individual laboratories and batch submissions from large-scale sequencing projects, including whole genome shotgun (WGS) and environmental sampling projects. Most submissions are made using the web-based BankIt or standalone Sequin programs, and accession numbers are assigned by GenBank staff upon receipt. Daily data exchange with the European Nucleotide Archive (ENA) and the DNA Data Bank of Japan (DDBJ) ensures worldwide coverage. GenBank is accessible through the NCBI Entrez retrieval system that integrates data from the major DNA and protein sequence databases along with taxonomy, genome, mapping, protein structure and domain information, and the biomedical journal literature via PubMed. BLAST provides sequence similarity searches of GenBank and other sequence databases. Complete bimonthly releases and daily updates of the GenBank database are available by FTP. To access GenBank and its related retrieval and analysis services, begin at the NCBI Homepage: www.ncbi.nlm.nih.gov.
NASA Astrophysics Data System (ADS)
Ferdiani, Defika I.; Devi, Fera L.; Koentjana, Johan P.; Milasari, Asri F.; Nur'aini, Indah; Semiarti, Endang
2015-09-01
Natural orchids are among the most important components of tropical biodiversity. Indonesia hosts ± 6000 of the world's 30000 orchid species, of which ± 60 species occur at Mount Merapi. Repeated eruptions of Merapi have wiped out orchid biodiversity; therefore, efforts to conserve the orchids and to establish a database of the natural orchids of Mount Merapi are needed. The orchid database can be built on DNA analysis and DNA barcoding. DNA barcodes can be used as molecular markers. Different morphological characters usually correspond to different patterns in DNA fragments. This research aims to characterize the phenotype and genotype of natural orchids of Mt. Merapi based on morphology and the structure of DNA in the trnL-F intergenic region of orchid chloroplast DNA. The Amplified Fragment Length Polymorphism (AFLP) technique was used to characterize the molecular types of orchids in silico in the intergenic spacer region of the orchid chloroplast. In this study, 11 species of orchids were characterized based on morphological and molecular characters. The molecular characters were obtained from the trnL-F intergenic region of leaf chloroplasts. The data indicate that there is a conserved DNA pattern in all orchids, along with distinctive characters in some orchids. Based on the trnL-F intergenic region of the chloroplast genome, the phylogenetic tree revealed that the 11 species of orchids at Mt. Merapi can be grouped into 2 clades, which matched the morphological characters.
Zhi, Hui; Li, Xin; Wang, Peng; Gao, Yue; Gao, Baoqing; Zhou, Dianshuang; Zhang, Yan; Guo, Maoni; Yue, Ming; Shen, Weitao
2018-01-01
Lnc2Meth (http://www.bio-bigdata.com/Lnc2Meth/), an interactive resource to identify regulatory relationships between human long non-coding RNAs (lncRNAs) and DNA methylation, is not only a manually curated collection and annotation of experimentally supported lncRNAs-DNA methylation associations but also a platform that effectively integrates tools for calculating and identifying the differentially methylated lncRNAs and protein-coding genes (PCGs) in diverse human diseases. The resource provides: (i) advanced search possibilities, e.g. retrieval of the database by searching the lncRNA symbol of interest, DNA methylation patterns, regulatory mechanisms and disease types; (ii) abundant computationally calculated DNA methylation array profiles for the lncRNAs and PCGs; (iii) the prognostic values for each hit transcript calculated from the patients' clinical data; (iv) a genome browser to display the DNA methylation landscape of the lncRNA transcripts for a specific type of disease; (v) tools to re-annotate probes to lncRNA loci and identify the differential methylation patterns for lncRNAs and PCGs with user-supplied external datasets; (vi) an R package (LncDM) to complete the differentially methylated lncRNAs identification and visualization with local computers. Lnc2Meth provides a timely and valuable resource that can be applied to significantly expand our understanding of the regulatory relationships between lncRNAs and DNA methylation in various human diseases. PMID:29069510
Genotype and Phenotype of Echinococcus granulosus Derived from Wild Sheep (Ovis orientalis) in Iran.
Eslami, Ali; Meshgi, Behnam; Jalousian, Fatemeh; Rahmani, Shima; Salari, Mohammad Ali
2016-02-01
The aim of the present study is to determine the characteristics of genotype and phenotype of Echinococcus granulosus derived from wild sheep and to compare them with the strains of E. granulosus sensu stricto (sheep-dog) and E. granulosus camel strain (camel-dog) in Iran. In Khojir National Park, near Tehran, Iran, a fertile hydatid cyst was recently found in the liver of a dead wild sheep (Ovis orientalis). The number of protoscolices (n=6,000) proved enough for an experimental infection in a dog. The characteristics of large and small hooks of metacestode were statistically determined as the sensu stricto strain but not the camel strain (P=0.5). To determine E. granulosus genotype, 20 adult worms of this type were collected from the infected dog. The second internal transcribed spacer (ITS2) of the nuclear ribosomal DNA (rDNA) and cytochrome c oxidase 1 subunit (COX1) of the mitochondrial DNA were amplified from individual adult worm by PCR. Subsequently, the PCR product was sequenced by Sanger method. The lengths of ITS2 and COX1 sequences were 378 and 857 bp, respectively, for all the sequenced samples. The amplified DNA sequences from both ribosomal and mitochondrial genes were highly similar (99% and 98%, respectively) to that of the ovine strain in the GenBank database. The results of the present study indicate that the morpho-molecular features and characteristics of E. granulosus in the Iranian wild sheep are the same as those of the sheep-dog E. granulosus sensu stricto strain.
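The similarity percentages reported in this abstract are the kind of figure a percent-identity calculation produces. A minimal sketch follows, assuming two sequences that are already aligned to equal length with "-" marking gaps; the toy sequences are invented and are not E. granulosus data. Gap columns are skipped here, which is only one convention; database search tools may use the full alignment length as the denominator.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of non-gap alignment columns where both sequences agree."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == "-" or y == "-":
            continue  # skip gap columns (a convention, see lead-in)
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * matches / compared


# Invented 14-base toy sequences differing at the last position.
query = "ATGCGTACGTTAGC"
reference = "ATGCGTACGTTAGT"
identity = percent_identity(query, reference)
print(f"{identity:.1f}% identity")
```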
Effects of soil water holding capacity on evapotranspiration and irrigation scheduling
USDA-ARS's Scientific Manuscript database
The USDA Natural Resources Conservation Service (NRCS), through the National Cooperative Soil Survey, developed three soil geographic databases that are appropriate for acquiring soil information at the national, regional, and local scales. These relational databases include the National Soil Geogra...
Designs on a National Research Network.
ERIC Educational Resources Information Center
Walsh, John
1988-01-01
Discusses the addition of the National Aeronautics and Space Administration database to the National Science Foundation's NSFnet data communication network. Outlines the history of databases in the United States and enumerates proposed upgrades from a new Office of Science and Technology policy report. (TW)
2011-09-30
DNA profiles. Referred to as geneGIS, the program will provide the ability to display, browse, select, filter and summarize spatial or temporal...of the SPLASH photo-identification records and available DNA profiles is underway through integration and crosschecking by Cascadia and MMI . An...Darwin Core standards where possible and can accommodate the current databases developed for telemetry data at MMI and SPLASH collection records at
Campbell, Rebecca; Fehler-Cabral, Giannina; Bybee, Deborah; Shaw, Jessica
2017-10-01
Throughout the United States, hundreds of thousands of sexual assault kits (SAKs) (also termed "rape kits") have not been submitted by the police for forensic DNA testing. DNA evidence can help sexual assault investigations and prosecutions by identifying offenders, revealing serial offenders through DNA matches across cases, and exonerating those who have been wrongly accused. In this article, we describe a 5-year action research project conducted with 1 city that had large numbers of untested SAKs-Detroit, Michigan-and our examination into why thousands of rape kits in this city were never submitted for forensic DNA testing. This mixed methods study combined ethnographic observations and qualitative interviews to identify stakeholders' perspectives as to why rape kits were not routinely submitted for testing. Then, we quantitatively examined whether these factors may have affected police practices regarding SAK testing, as evidenced by predictable changes in SAK submission rates over time. Chronic resource scarcity only partially explained why the organizations that serve rape victims-the police, crime lab, prosecution, and victim advocacy-could not test all rape kits, investigate all reported sexual assaults, and support all rape survivors. SAK submission rates significantly increased once criminal justice professionals in this city had full access to the FBI DNA forensic database Combined DNA Index System (CODIS), but even then, most SAKs were still not submitted for DNA testing. Building crime laboratories' capacities for DNA testing and training police on the utility of forensic evidence and best practices in sexual assault investigations can help remedy, and possibly prevent, the problem of untested rape kits. (PsycINFO Database Record (c) 2017 APA, all rights reserved).
Creating a model to detect dairy cattle farms with poor welfare using a national database.
Krug, C; Haskell, M J; Nunes, T; Stilwell, G
2015-12-01
The objective of this study was to determine whether dairy farms with poor cow welfare could be identified using a national database for bovine identification and registration that monitors cattle deaths and movements. The welfare of dairy cattle was assessed using the Welfare Quality(®) protocol (WQ) on 24 Portuguese dairy farms and on 1930 animals. Five farms were classified as having poor welfare and the other 19 were classified as having good welfare. Fourteen million records from the national cattle database were analysed to identify potential welfare indicators for dairy farms. Fifteen potential national welfare indicators were calculated based on that database, and the link between the results on the WQ evaluation and the national cattle database was made using the identification code of each farm. Within the potential national welfare indicators, only two were significantly different between farms with good welfare and poor welfare, 'proportion of on-farm deaths' (p<0.01) and 'female/male birth ratio' (p<0.05). To determine whether the database welfare indicators could be used to distinguish farms with good welfare from farms with poor welfare, we created a model using the classifier J48 of Waikato Environment for Knowledge Analysis. The model was a decision tree based on two variables, 'proportion of on-farm deaths' and 'calving-to-calving interval', and it was able to correctly identify 70% and 79% of the farms classified as having poor and good welfare, respectively. The national cattle database analysis could be useful in helping official veterinary services in detecting farms that have poor welfare and also in determining which welfare indicators are poor on each particular farm. Copyright © 2015 Elsevier B.V. All rights reserved.
Pyramid Servings Database (PSDB) for NHANES III
The National Cancer Institute developed a database to examine dietary data from the National Center for Health Statistics' Third National Health and Nutrition Examination Survey in terms of servings from each of United States Department of Agriculture's The Food Guide Pyramid's major and minor food groups.
USDA-ARS's Scientific Manuscript database
USDA National Nutrient Database for Standard Reference Dataset for What We Eat In America, NHANES (Survey-SR) provides the nutrient data for assessing dietary intakes from the national survey What We Eat In America, National Health and Nutrition Examination Survey (WWEIA, NHANES). The current versi...
National Transportation Atlas Databases : 2014
DOT National Transportation Integrated Search
2014-01-01
The National Transportation Atlas Databases 2014 (NTAD2014) is a set of nationwide geographic datasets of transportation facilities, transportation networks, associated infrastructure, and other political and administrative entities. These da...
National Transportation Atlas Databases : 2015
DOT National Transportation Integrated Search
2015-01-01
The National Transportation Atlas Databases 2015 (NTAD2015) is a set of nationwide geographic datasets of transportation facilities, transportation networks, associated infrastructure, and other political and administrative entities. These da...
Code of Federal Regulations, 2010 CFR
2010-10-01
... TRANSPORTATION NATIONAL TRANSIT DATABASE § 630.1 Purpose. The purpose of this part is to prescribe requirements and procedures necessary for compliance with the National Transit Database Reporting System and...
Code of Federal Regulations, 2014 CFR
2014-10-01
... TRANSPORTATION NATIONAL TRANSIT DATABASE § 630.1 Purpose. The purpose of this part is to prescribe requirements and procedures necessary for compliance with the National Transit Database Reporting System and...
Code of Federal Regulations, 2012 CFR
2012-10-01
... TRANSPORTATION NATIONAL TRANSIT DATABASE § 630.1 Purpose. The purpose of this part is to prescribe requirements and procedures necessary for compliance with the National Transit Database Reporting System and...
Code of Federal Regulations, 2013 CFR
2013-10-01
... TRANSPORTATION NATIONAL TRANSIT DATABASE § 630.1 Purpose. The purpose of this part is to prescribe requirements and procedures necessary for compliance with the National Transit Database Reporting System and...
Code of Federal Regulations, 2011 CFR
2011-10-01
... TRANSPORTATION NATIONAL TRANSIT DATABASE § 630.1 Purpose. The purpose of this part is to prescribe requirements and procedures necessary for compliance with the National Transit Database Reporting System and...
Farris, M Heath; Scott, Andrew R; Texter, Pamela A; Bartlett, Marta; Coleman, Patricia; Masters, David
2018-04-11
Single nucleotide polymorphisms (SNPs) located within the human genome have been shown to have utility as markers of identity in the differentiation of DNA from individual contributors. Massively parallel DNA sequencing (MPS) technologies and human genome SNP databases allow for the design of suites of identity-linked target regions, amenable to sequencing in a multiplexed and massively parallel manner. Therefore, tools are needed for leveraging the genotypic information found within SNP databases for the discovery of genomic targets that can be evaluated on MPS platforms. The SNP island target identification algorithm (TIA) was developed as a user-tunable system to leverage SNP information within databases. Using data within the 1000 Genomes Project SNP database, human genome regions were identified that contain globally ubiquitous identity-linked SNPs and that were responsive to targeted resequencing on MPS platforms. Algorithmic filters were used to exclude target regions that did not conform to user-tunable SNP island target characteristics. To validate the accuracy of TIA for discovering these identity-linked SNP islands within the human genome, SNP island target regions were amplified from 70 contributor genomic DNA samples using the polymerase chain reaction. Multiplexed amplicons were sequenced using the Illumina MiSeq platform, and the resulting sequences were analyzed for SNP variations. 166 putative identity-linked SNPs were targeted in the identified genomic regions. Of the 309 SNPs that provided discerning power across individual SNP profiles, 74 previously undefined SNPs were identified during evaluation of targets from individual genomes. Overall, DNA samples of 70 individuals were uniquely identified using a subset of the suite of identity-linked SNP islands. TIA offers a tunable genome search tool for the discovery of targeted genomic regions that are scalable in the population frequency and numbers of SNPs contained within the SNP island regions. 
It also allows the definition of sequence length and sequence variability of the target region as well as the less variable flanking regions for tailoring to MPS platforms. As shown in this study, TIA can be used to discover identity-linked SNP islands within the human genome, useful for differentiating individuals by targeted resequencing on MPS technologies.
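The general idea of scanning a SNP catalogue for dense "islands" can be illustrated with a simple window scan. This is a simplified sketch and not the published TIA: the window length, the minimum SNP count, the function name, and the toy coordinates are all assumptions, and the real algorithm applies further user-tunable filters (for example on population frequency and flanking-region variability) that are not modeled here.

```python
from bisect import bisect_right


def find_snp_islands(positions, window=100, min_snps=3):
    """Return (start, end, count) for each window that begins at a SNP
    and contains at least `min_snps` SNPs within `window` bases."""
    ordered = sorted(positions)
    islands = []
    for i, start in enumerate(ordered):
        end = start + window - 1
        # Index just past the last SNP that still falls inside the window.
        j = bisect_right(ordered, end)
        count = j - i
        if count >= min_snps:
            islands.append((start, end, count))
    return islands


# Invented SNP coordinates on a single chromosome.
toy_positions = [100, 130, 160, 5000, 5010, 9000]
islands = find_snp_islands(toy_positions, window=100, min_snps=3)
print(islands)
```

Anchoring each window at a SNP keeps the scan proportional to the number of SNPs rather than to chromosome length; overlapping hits would still need merging before primer design.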
DaVIE: Database for the Visualization and Integration of Epigenetic data
Fejes, Anthony P.; Jones, Meaghan J.; Kobor, Michael S.
2014-01-01
One of the challenges in the analysis of large data sets, particularly in a population-based setting, is the ability to perform comparisons across projects. This has to be done in such a way that the integrity of each individual project is maintained, while ensuring that the data are comparable across projects. These issues are beginning to be observed in human DNA methylation studies, as the Illumina 450k platform and next generation sequencing-based assays grow in popularity and decrease in price. This increase in productivity is enabling new insights into epigenetics, but also requires the development of pipelines and software capable of handling the large volumes of data. The specific problems inherent in creating a platform for the storage, comparison, integration, and visualization of DNA methylation data include data storage, algorithm efficiency and ability to interpret the results to derive biological meaning from them. Databases provide a ready-made solution to these issues, but as yet no tools exist that leverage these advantages while providing an intuitive user interface for interpreting results in a genomic context. We have addressed this void by integrating a database to store DNA methylation data with a web interface to query and visualize the database and a set of libraries for more complex analysis. The resulting platform is called DaVIE: Database for the Visualization and Integration of Epigenetics data. DaVIE can use data culled from a variety of sources, and the web interface includes the ability to group samples by sub-type, compare multiple projects and visualize genomic features in relation to sites of interest. We have used DaVIE to identify patterns of DNA methylation in specific projects and across different projects, identify outlier samples, and cross-check differentially methylated CpG sites identified in specific projects across large numbers of samples. 
A demonstration server has been set up using GEO data at http://echelon.cmmt.ubc.ca/dbaccess/, with login “guest” and password “guest.” Groups may download and install their own version of the server following the instructions on the project's wiki. PMID:25278960
75 FR 72873 - Privacy Act Of 1974; System of Records
Federal Register 2010, 2011, 2012, 2013, 2014
2010-11-26
...) is amending two existing systems of records 121VA19, ``National Patient Databases--VA'', and 136VA19E... being amended for additional databases. DATES: Comments on the amendment of these systems of records... system identified as 121VA19, ``National Patient Databases--VA,'' as set forth in the Federal Register...
Federal Register 2010, 2011, 2012, 2013, 2014
2012-06-25
...: Proposed Collection; Comment Request--National Hunger Clearinghouse Database Form AGENCY: Food and... Database Form. Form: FNS 543. OMB Number: 0584-0474. Expiration Date: 8/31/2012. Type of Request: Revision... Clearinghouse includes a database (FNS-543) of non-governmental, grassroots programs that work in the areas of...
National Transportation Atlas Databases : 2013
DOT National Transportation Integrated Search
2013-01-01
The National Transportation Atlas Databases 2013 (NTAD2013) is a set of nationwide geographic datasets of transportation facilities, transportation networks, associated infrastructure, and other political and administrative entities. These datasets i...
The National Solar Radiation Database (NSRDB)
DOE Office of Scientific and Technical Information (OSTI.GOV)
Sengupta, Manajit; Habte, Aron; Lopez, Anthony
This presentation provides a high-level overview of the National Solar Radiation Database (NSRDB), including sensing, measurement and forecasting, and discusses observations that are needed for research and product development.
Bannasch, Detlev; Mehrle, Alexander; Glatting, Karl-Heinz; Pepperkok, Rainer; Poustka, Annemarie; Wiemann, Stefan
2004-01-01
We have implemented LIFEdb (http://www.dkfz.de/LIFEdb) to link information regarding novel human full-length cDNAs generated and sequenced by the German cDNA Consortium with functional information on the encoded proteins produced in functional genomics and proteomics approaches. The database also serves as a sample-tracking system to manage the process from cDNA to experimental read-out and data interpretation. A web interface enables the scientific community to explore and visualize features of the annotated cDNAs and ORFs combined with experimental results, and thus helps to unravel new features of proteins with as yet unknown functions. PMID:14681468
Sato, Yukuto; Miya, Masaki; Fukunaga, Tsukasa; Sado, Tetsuya; Iwasaki, Wataru
2018-06-01
Fish mitochondrial genome (mitogenome) data form a fundamental basis for revealing vertebrate evolution and hydrosphere ecology. Here, we report recent functional updates of MitoFish, which is a database of fish mitogenomes with a precise annotation pipeline MitoAnnotator. Most importantly, we describe implementation of MiFish pipeline for metabarcoding analysis of fish mitochondrial environmental DNA, which is a fast-emerging and powerful technology in fish studies. MitoFish, MitoAnnotator, and MiFish pipeline constitute a key platform for studies of fish evolution, ecology, and conservation, and are freely available at http://mitofish.aori.u-tokyo.ac.jp/ (last accessed April 7th, 2018).
Developing a Nursing Database System in Kenya
Riley, Patricia L; Vindigni, Stephen M; Arudo, John; Waudo, Agnes N; Kamenju, Andrew; Ngoya, Japheth; Oywer, Elizabeth O; Rakuom, Chris P; Salmon, Marla E; Kelley, Maureen; Rogers, Martha; St Louis, Michael E; Marum, Lawrence H
2007-01-01
Objective To describe the development, initial findings, and implications of a national nursing workforce database system in Kenya. Principal Findings Creating a national electronic nursing workforce database provides more reliable information on nurse demographics, migration patterns, and workforce capacity. Data analyses are most useful for human resources for health (HRH) planning when workforce capacity data can be linked to worksite staffing requirements. As a result of establishing this database, the Kenya Ministry of Health has improved capability to assess its nursing workforce and document important workforce trends, such as out-migration. Current data identify the United States as the leading recipient country of Kenyan nurses. The overwhelming majority of Kenyan nurses who elect to out-migrate are among Kenya's most qualified. Conclusions The Kenya nursing database is a first step toward facilitating evidence-based decision making in HRH. This database is unique to developing countries in sub-Saharan Africa. Establishing an electronic workforce database requires long-term investment and sustained support by national and global stakeholders. PMID:17489921
An updated version of NPIDB includes new classifications of DNA–protein complexes and their families
Zanegina, Olga; Kirsanov, Dmitriy; Baulin, Eugene; Karyagina, Anna; Alexeevski, Andrei; Spirin, Sergey
2016-01-01
The recent upgrade of nucleic acid–protein interaction database (NPIDB, http://npidb.belozersky.msu.ru/) includes a newly elaborated classification of complexes of protein domains with double-stranded DNA and a classification of families of related complexes. Our classifications are based on contacting structural elements of both DNA: the major groove, the minor groove and the backbone; and protein: helices, beta-strands and unstructured segments. We took into account both hydrogen bonds and hydrophobic interaction. The analyzed material contains 1942 structures of protein domains from 748 PDB entries. We have identified 97 interaction modes of individual protein domain–DNA complexes and 17 DNA–protein interaction classes of protein domain families. We analyzed the sources of diversity of DNA–protein interaction modes in different complexes of one protein domain family. The observed interaction mode is sometimes influenced by artifacts of crystallization or diversity in secondary structure assignment. The interaction classes of domain families are more stable and thus possess more biological sense than a classification of single complexes. Integration of the classification into NPIDB allows the user to browse the database according to the interacting structural elements of DNA and protein molecules. For each family, we present average DNA shape parameters in contact zones with domains of the family. PMID:26656949
NASA Astrophysics Data System (ADS)
Dornback, M.; Hourigan, T.; Etnoyer, P.; McGuinn, R.; Cross, S. L.
2014-12-01
Research on deep-sea corals has expanded rapidly over the last two decades, as scientists began to realize their value as long-lived structural components of high biodiversity habitats and archives of environmental information. The NOAA Deep Sea Coral Research and Technology Program's National Database for Deep-Sea Corals and Sponges is a comprehensive resource for georeferenced data on these organisms in U.S. waters. The National Database currently includes more than 220,000 deep-sea coral records representing approximately 880 unique species. Database records from museum archives, commercial and scientific bycatch, and journal publications provide baseline information with relatively coarse spatial resolution dating back as far as 1842. These data are complemented by modern, in-situ submersible observations with high spatial resolution, from surveys conducted by NOAA and NOAA partners. Management of high volumes of modern high-resolution observational data can be challenging. NOAA is working with our data partners to incorporate these occurrence data into the National Database, along with images and associated information related to geoposition, time, biology, taxonomy, environment, provenance, and accuracy. NOAA is also working to link associated datasets collected by our program's research, to properly archive them to the NOAA National Data Centers, to build a robust metadata record, and to establish a standard protocol to simplify the process. Access to the National Database is provided through an online mapping portal. The map displays point-based records from the database. Records can be refined by taxon, region, time, and depth. The queries and extent used to view the map can also be used to download subsets of the database. The database, map, and website are already in use by NOAA, regional fishery management councils, and regional ocean planning bodies, but we envision it as a model that can expand to accommodate data on a global scale.
NASA Astrophysics Data System (ADS)
Ferreira, M.; Creveling, J.; Hilburn, I.; Karlsson, E.; Pepe-Ranney, C.; Spear, J.; Dawson, S.; Geobio2008, I.
2008-12-01
Silicified structures that exhibit a putative biologic component in their formation permeate the rock record as stromatolites. We have studied a silicified microbial structure from a hot spring in Yellowstone National Park using phenotypic, phylogenetic, and metagenomic analyses to determine microbial carbon metabolic pathways and the phylogenetic affiliations of microbes present in this unique structure. In this multi-faceted approach, dominant physiologies, specifically with regard to anaerobic and aerobic metabolisms, were inferred from 16S rRNA gene sequences and 454 sequencing data from bulk DNA samples of the structure. Carbon utilization as indicated by ECO Biolog plates showed abundant heterotrophy and heterotrophic diversity throughout the microbial structure. Microbes within the structure are able to utilize all tested sources of carbohydrates, lipids/fatty acids, and protein/amino acids as carbon sources. ECO plate testing of the hot spring water yielded considerably less carbohydrate consumption (only 4 out of 13 tested carbohydrates) and similar lipids/fatty acids and protein/amino acids consumption (2 out of 3 and 5 out of 5 tested sources, respectively). Full-length 16S rRNA gene sequences and metagenomic 454 pyrosequencing of community DNA showed limited diversity among primary producers. From the 16S data, the majority of the autotrophs are inferred to utilize the Calvin cycle for CO2 fixation, followed by 3-hydroxypropionate/4-hydroxybutyrate CO2 fixation. However, an analysis of the metagenomic data compared to the KEGG database does not show genes directly involved with Calvin cycle carbon fixation. Further BLAST searches of our data failed to find significant matches within our 6514 metagenomic sequences to known RuBisCo sequences taken from the NCBI database. This is likely due to a far under-sampled dataset of metagenomic sequences, and the low number (958) that had matches to the KEGG pathways database. 
Anaerobic versus aerobic physiology also can be estimated from the 16S clone libraries. Phylogenetic analysis of recovered 16S sequences suggests that 15% of the 16S sequences can be attributed to anaerobic microbes while 42% likely come from aerobes. The remaining 43% of 16S rRNA gene sequences belong to metabolically unassigned phyla both known and novel. This preliminary study demonstrates that the small spatially stratified silicified microbial structure present on the margins of a hot spring contains a rich and complex microbial community with different trophic levels and enzymatic pathways.
Microsatellite DNA in genomic survey sequences and UniGenes of loblolly pine
Craig S Echt; Surya Saha; Dennis L Deemer; C Dana Nelson
2011-01-01
Genomic DNA sequence databases are a potential and growing resource for simple sequence repeat (SSR) marker development in loblolly pine (Pinus taeda L.). Loblolly pine also has many expressed sequence tags (ESTs) available for microsatellite (SSR) marker development. We compared loblolly pine SSR densities in genome survey sequences (GSSs) to those in non-redundant...
National Solar Radiation Database 1991-2010 Update: User's Manual
DOE Office of Scientific and Technical Information (OSTI.GOV)
Wilcox, S. M.
This user's manual provides information on the updated 1991-2010 National Solar Radiation Database. Included are data format descriptions, data sources, production processes, and information about data uncertainty.
2013-09-30
profiles of right whales Eubalaena glacialis from the North Atlantic Right Whale Consortium; 2) DNA profiles of sperm whales Physeter macrocephalus...of other cetacean databases in Wildbook format (e.g., North Atlantic right whales, sperm whales and Hector’s dolphins); 8) Supported continuing...of sperm whales, using samples collected during the 5-year Voyage of the Odyssey; and 3) DNA profiles of Hector’s dolphins from Cloudy Bay, New
Bach, Evelise; Sant'Anna, Fernando Hayashi; Magrich Dos Passos, João Frederico; Balsanelli, Eduardo; de Baura, Valter Antonio; Pedrosa, Fábio de Oliveira; de Souza, Emanuel Maltempi; Passaglia, Luciane Maria Pereira
2017-08-31
The correct identification of bacteria from the Burkholderia cepacia complex (Bcc) is crucial for epidemiological studies and treatment of cystic fibrosis infections. However, genome-based identification tools are revealing many controversial Bcc species assignments. The aim of this work is to re-examine the taxonomic position of the soil bacterium B. cepacia 89 through polyphasic and genomic approaches. recA and 16S rRNA gene sequence analysis positioned strain 89 inside the Bcc group. However, based on the divergence score of seven concatenated allele sequences, and on values of average nucleotide identity and digital DNA:DNA hybridization, our results suggest that strain 89 is different from other Bcc species formerly described. Thus, we propose to classify Burkholderia sp. 89 as the novel species Burkholderia catarinensis sp. nov. with strain 89T (=DSM 103188T = BR 10601T) as the type strain. Moreover, our results call attention to some probable misidentifications of Bcc genomes at the National Center for Biotechnology Information database. © FEMS 2017. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.
Aberrant DNA methylation patterns of spermatozoa in men with unexplained infertility.
Urdinguio, Rocío G; Bayón, Gustavo F; Dmitrijeva, Marija; Toraño, Estela G; Bravo, Cristina; Fraga, Mario F; Bassas, Lluís; Larriba, Sara; Fernández, Agustín F
2015-05-01
Are there DNA methylation alterations in sperm that could explain the reduced biological fertility of male partners from couples with unexplained infertility? DNA methylation patterns, not only at specific loci but also at Alu Yb8 repetitive sequences, are altered in infertile individuals compared with fertile controls. Aberrant DNA methylation of sperm has been associated with human male infertility in patients demonstrating either deficiencies in the process of spermatogenesis or low semen quality. Case and control prospective study. This study compares 46 sperm samples obtained from 17 normospermic fertile men and 29 normospermic infertile patients. Illumina Infinium HD Human Methylation 450K arrays were used to identify genomic regions showing differences in sperm DNA methylation patterns between five fertile and seven infertile individuals. Additionally, global DNA methylation of sperm was measured using the Methylamp Global DNA Methylation Quantification Ultra kit (Epigentek) in 14 samples, and DNA methylation at several repetitive sequences (LINE-1, Alu Yb8, NBL2, D4Z4) measured by bisulfite pyrosequencing in 44 sperm samples. A sperm-specific DNA methylation pattern was obtained by comparing the sperm methylomes with the DNA methylomes of differentiated somatic cells using data obtained from methylation arrays (Illumina 450 K) of blood, neural and glial cells deposited in public databases. In this study we conduct, for the first time, a genome-wide study to identify alterations of sperm DNA methylation in individuals with unexplained infertility that may account for the differences in their biological fertility compared with fertile individuals. We have identified 2752 CpGs showing aberrant DNA methylation patterns, and more importantly, these differentially methylated CpGs were significantly associated with CpG sites which are specifically methylated in sperm when compared with somatic cells. 
We also found statistically significant (P < 0.001) associations between DNA hypomethylation and regions corresponding to those which, in somatic cells, are enriched in the repressive histone mark H3K9me3, and between DNA hypermethylation and regions enriched in H3K4me1 and CTCF, suggesting that the relationship between chromatin context and aberrant DNA methylation of sperm in infertile men could be locus-dependent. Finally, we also show that DNA methylation patterns, not only at specific loci but also at several repetitive sequences (LINE-1, Alu Yb8, NBL2, D4Z4), were lower in sperm than in somatic cells. Interestingly, sperm samples at Alu Yb8 repetitive sequences of infertile patients showed significantly lower DNA methylation levels than controls. Our results are descriptive and further studies would be needed to elucidate the functional effects of aberrant DNA methylation on male fertility. Overall, our data suggest that aberrant sperm DNA methylation might contribute to fertility impairment in couples with unexplained infertility and they provide a promising basis for future research. This work has been financially supported by Fundación Cientifica de la AECC (to R.G.U.); IUOPA (to G.F.B.); FICYT (to E.G.T.); the Spanish National Research Council (CSIC; 200820I172 to M.F.F.); Fundación Ramón Areces (to M.F.F); the Plan Nacional de I+D+I 2008-2011/2013-2016/FEDER (PI11/01728 to AF.F., PI12/01080 to M.F.F. and PI12/00361 to S.L.); the PN de I+D+I 2008-20011 and the Generalitat de Catalunya (2009SGR01490). A.F.F. is sponsored by ISCIII-Subdirección General de Evaluación y Fomento de la Investigación (CP11/00131). S.L. is sponsored by the Researchers Stabilization Program from the Spanish National Health System (CES09/020). The IUOPA is supported by the Obra Social Cajastur, Spain. © The Author 2015. Published by Oxford University Press on behalf of the European Society of Human Reproduction and Embryology. All rights reserved. 
"First generation" automated DNA sequencing technology.
Slatko, Barton E; Kieleczawa, Jan; Ju, Jingyue; Gardner, Andrew F; Hendrickson, Cynthia L; Ausubel, Frederick M
2011-10-01
Beginning in the 1980s, automation of DNA sequencing has greatly increased throughput, reduced costs, and enabled large projects to be completed more easily. The development of automation technology paralleled the development of other aspects of DNA sequencing: better enzymes and chemistry, separation and imaging technology, sequencing protocols, robotics, and computational advancements (including base-calling algorithms with quality scores, database developments, and sequence analysis programs). Despite the emergence of high-throughput sequencing platforms, automated Sanger sequencing technology remains useful for many applications. This unit provides background and a description of the "First-Generation" automated DNA sequencing technology. It also includes protocols for using the current Applied Biosystems (ABI) automated DNA sequencing machines. © 2011 by John Wiley & Sons, Inc.
Hammondia heydorni oocysts in the faeces of a greyhound in New Zealand.
Ellis, J T; Pomroy, W E
2003-02-01
To identify oocysts found in faecal material of a greyhound. Polymerase chain reaction (PCR) and DNA sequencing were used to study genomic DNA isolated from oocysts purified from faeces of a greyhound. Database searches with the DNA sequences obtained showed they were derived from Hammondia heydorni. A species-specific PCR was developed to detect H. heydorni DNA. Light microscopy in conjunction with PCR and DNA sequencing definitively identified the presence of H. heydorni oocysts in faeces of a greyhound. This study confirms the presence of H. heydorni in New Zealand and indicates the need to correctly identify similar oocysts from dogs, rather than assume they are Neospora caninum.
78 FR 25095 - Notice of an Extension of an Information Collection (1028-0092)
Federal Register 2010, 2011, 2012, 2013, 2014
2013-04-29
... the development of The National Map and other national geospatial databases. In FY 2010, projects for... including elevation, orthoimagery, hydrography and other layers in the national databases may be possible. We will accept applications from State, local or tribal governments and academic institutions to...
GABI-Kat SimpleSearch: new features of the Arabidopsis thaliana T-DNA mutant database.
Kleinboelting, Nils; Huep, Gunnar; Kloetgen, Andreas; Viehoever, Prisca; Weisshaar, Bernd
2012-01-01
T-DNA insertion mutants are very valuable for reverse genetics in Arabidopsis thaliana. Several projects have generated large sequence-indexed collections of T-DNA insertion lines, of which GABI-Kat is the second largest resource worldwide. User access to the collection and its Flanking Sequence Tags (FSTs) is provided by the front end SimpleSearch (http://www.GABI-Kat.de). Several significant improvements have been implemented recently. The database now relies on the TAIRv10 genome sequence and annotation dataset. All FSTs have been newly mapped using an optimized procedure that leads to improved accuracy of insertion site predictions. A fraction of the collection with weak FST yield was re-analysed by generating new FSTs. Along with newly found predictions for older sequences, about 20,000 new FSTs were included in the database. Information about groups of FSTs pointing to the same insertion site that is found in several lines but is real only in a single line is included, and many problematic FST-to-line links have been corrected using new wet-lab data. SimpleSearch currently contains data from ~71,000 lines with predicted insertions covering 62.5% of the 27,206 nuclear protein coding genes, and offers insertion allele-specific data from 9545 confirmed lines that are available from the Nottingham Arabidopsis Stock Centre.
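As a quick sanity check on the coverage figure quoted in the abstract above, the reported 62.5% of 27,206 nuclear protein coding genes can be converted into an approximate gene count. This is only an illustrative calculation; the function name is hypothetical and the rounded result is an estimate derived from the two published numbers, not a figure stated by the authors.

```python
def genes_with_insertions(total_genes: int, coverage_fraction: float) -> int:
    """Approximate number of genes covered, given a total and a coverage fraction."""
    if not 0.0 <= coverage_fraction <= 1.0:
        raise ValueError("coverage_fraction must be between 0 and 1")
    return round(total_genes * coverage_fraction)


# Figures taken from the abstract: 27,206 genes, 62.5% covered.
covered = genes_with_insertions(27206, 0.625)
print(f"Approximately {covered:,} genes have at least one predicted insertion")
```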
Castañón, Jesús; Román, José Pablo; Jessop, Theodore C; de Blas, Jesús; Haro, Rubén
2018-06-01
DNA-encoded libraries (DELs) have emerged as an efficient and cost-effective drug discovery tool for the exploration and screening of very large chemical space using small-molecule collections of unprecedented size. Herein, we report an integrated automation and informatics system designed to enhance the quality, efficiency, and throughput of the production and affinity selection of these libraries. The platform is governed by software developed according to a database-centric architecture to ensure data consistency, integrity, and availability. Through its versatile protocol management functionalities, this application captures the wide diversity of experimental processes involved with DEL technology, keeps track of working protocols in the database, and uses them to command robotic liquid handlers for the synthesis of libraries. This approach provides full traceability of building-blocks and DNA tags in each split-and-pool cycle. Affinity selection experiments and high-throughput sequencing reads are also captured in the database, and the results are automatically deconvoluted and visualized in customizable representations. Researchers can compare results of different experiments and use machine learning methods to discover patterns in data. As of this writing, the platform has been validated through the generation and affinity selection of various libraries, and it has become the cornerstone of the DEL production effort at Lilly.
ERIC Educational Resources Information Center
Avellone, Lauren; Scott, Sally
2017-01-01
The purpose of this research brief was to identify and provide an overview of national databases containing information about college students with disabilities. Eleven instruments from federal and university-based sources were described. Databases reflect a variety of survey methods, respondents, definitions of disability, and research questions.…
76 FR 68811 - Notice of Request for the Revision of Currently Approved Information Collection
Federal Register 2010, 2011, 2012, 2013, 2014
2011-11-07
... information collection: 49 U.S.C. 5335(a) and (b) National Transit Database (NTD). DATES: Comments must be... CONTACT: John D. Giorgis, National Transit Database Program Manager, FTA Office of Budget and Policy, (202... Transit Database. (OMB Number: 2132-0008). Background: 49 U.S.C. 5335(a) and (b) requires the Secretary of...
We discuss the initial design and application of the National Urban Database and Access Portal Tool (NUDAPT). This new project is sponsored by the USEPA and involves collaborations and contributions from many groups from federal and state agencies, and from private and academic i...
Increasing global participation in genetics research through DNA barcoding.
Adamowicz, Sarah J; Steinke, Dirk
2015-12-01
DNA barcoding--the sequencing of short, standardized DNA regions for specimen identification and species discovery--has promised to facilitate rapid access to biodiversity knowledge by diverse users. Here, we advance our opinion that increased global participation in genetics research is beneficial, both to scientists and for science, and explore the premise that DNA barcoding can help to democratize participation in genetics research. We examine publication patterns (2003-2014) in the DNA barcoding literature and compare trends with those in the broader, related domain of genomics. While genomics is the older and much larger field, the number of nations contributing to the published literature is similar between disciplines. Meanwhile, DNA barcoding exhibits a higher pace of growth in the number of publications as well as greater evenness among nations in their proportional contribution to total authorships. This exploration revealed DNA barcoding to be a highly international discipline, with growing participation by researchers in especially biodiverse nations. We briefly consider several of the challenges that may hinder further participation in genetics research, including access to training and molecular facilities as well as policy relating to the movement of genetic resources.
2008 rural national transit database
DOT National Transportation Integrated Search
2008-01-01
This spreadsheet includes the following data from the 2008 Rural National Transit Database: Sub-Recipient Information; Service Data; Revenue Vehicle Inventory; Counties Served. Each one of the categories above are in worksheets within t...
32 CFR 338.1 - Ordering DNA issuances.
Code of Federal Regulations, 2010 CFR
2010-07-01
... 32 National Defense 2 2010-07-01 2010-07-01 false Ordering DNA issuances. 338.1 Section 338.1... DOD INFORMATION AVAILABILITY TO THE PUBLIC OF DEFENSE NUCLEAR AGENCY (DNA) INSTRUCTIONS AND CHANGES THERETO § 338.1 Ordering DNA issuances. (a) The DNA issuances published in the DNA indexes are published...
32 CFR 338.1 - Ordering DNA issuances.
Code of Federal Regulations, 2011 CFR
2011-07-01
... 32 National Defense 2 2011-07-01 2011-07-01 false Ordering DNA issuances. 338.1 Section 338.1... DOD INFORMATION AVAILABILITY TO THE PUBLIC OF DEFENSE NUCLEAR AGENCY (DNA) INSTRUCTIONS AND CHANGES THERETO § 338.1 Ordering DNA issuances. (a) The DNA issuances published in the DNA indexes are published...
32 CFR 338.1 - Ordering DNA issuances.
Code of Federal Regulations, 2014 CFR
2014-07-01
... 32 National Defense 2 2014-07-01 2014-07-01 false Ordering DNA issuances. 338.1 Section 338.1... DOD INFORMATION AVAILABILITY TO THE PUBLIC OF DEFENSE NUCLEAR AGENCY (DNA) INSTRUCTIONS AND CHANGES THERETO § 338.1 Ordering DNA issuances. (a) The DNA issuances published in the DNA indexes are published...
32 CFR 338.1 - Ordering DNA issuances.
Code of Federal Regulations, 2012 CFR
2012-07-01
... 32 National Defense 2 2012-07-01 2012-07-01 false Ordering DNA issuances. 338.1 Section 338.1... DOD INFORMATION AVAILABILITY TO THE PUBLIC OF DEFENSE NUCLEAR AGENCY (DNA) INSTRUCTIONS AND CHANGES THERETO § 338.1 Ordering DNA issuances. (a) The DNA issuances published in the DNA indexes are published...
32 CFR 338.1 - Ordering DNA issuances.
Code of Federal Regulations, 2013 CFR
2013-07-01
... 32 National Defense 2 2013-07-01 2013-07-01 false Ordering DNA issuances. 338.1 Section 338.1... DOD INFORMATION AVAILABILITY TO THE PUBLIC OF DEFENSE NUCLEAR AGENCY (DNA) INSTRUCTIONS AND CHANGES THERETO § 338.1 Ordering DNA issuances. (a) The DNA issuances published in the DNA indexes are published...
The Iranian National Geodata Revision Strategy and Realization Based on Geodatabase
NASA Astrophysics Data System (ADS)
Haeri, M.; Fasihi, A.; Ayazi, S. M.
2012-07-01
In recent years, the use of spatial databases for storing and managing spatial data has become a hot topic in the field of GIS. Accordingly, the National Cartographic Center of Iran (NCC) periodically produces spatial data, which are usually stored in databases. One of the NCC's major projects was designing the National Topographic Database (NTDB). NCC decided to create a National Topographic Database of the entire country based on 1:25000 coverage maps. The NTDB standard was published in 1994, and its database was created at the same time. In NTDB, geometric data were stored in MicroStation design format (DGN), in which each feature has a link to its attribute data (stored in a Microsoft Access file). NTDB files were also produced sheet-wise and then stored in a file-based style. Besides map compilation, revision of existing maps has already started. The key problems for NCC are the revision strategy, the file-based storage of NTDB, and operator challenges (NCC operators mostly prefer to edit and revise geometry data in CAD environments). A geodatabase solution for national geodata, based on NTDB map files and operators' revision preferences, is introduced and released herein. The proposed solution extends the traditional methods to provide a seamless spatial database that can be revised in CAD and GIS environments simultaneously. The proposed system is a common data framework for creating a central repository for spatial data storage and management.
Chien, Tsair-Wei; Chang, Yu; Wang, Hsien-Yi
2018-02-01
Many researchers have used the National Health Insurance database to publish medical papers, which are often retrospective, population-based cohort studies. However, the authors' research domains and academic characteristics are still unclear. By searching the PubMed database (Pubmed.com), we used the keywords [Taiwan] and [National Health Insurance Research Database], then downloaded 2913 articles published from 1995 to 2017. Social network analysis (SNA), the Gini coefficient, and Google Maps were applied to these data to visualize: the most productive author; the pattern of coauthor collaboration teams; and the author's research domain denoted by abstract keywords and PubMed MESH (medical subject heading) terms. Utilizing the 2913 papers from Taiwan's National Health Insurance database, we chose the top 10 research teams shown on Google Maps and analyzed one author (Dr. Kao) who published 149 papers in the database in 2015. In the past 15 years, we found Dr. Kao had 2987 connections with other coauthors from 13 research teams. The co-occurring abstract keywords with the highest frequency are cohort study and National Health Insurance Research Database. The most frequently co-occurring MESH terms are tomography, X-ray computed, and positron-emission tomography. The strength of the author's distinct research domain is very low (Gini < 0.40). SNA incorporated with Google Maps and the Gini coefficient provides insight into the relationships between entities. The results obtained in this study can be applied for a comprehensive understanding of other productive authors in academia.
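The abstract above summarizes how concentrated an author's output is across research topics with a Gini coefficient and a cut-off of 0.40. A minimal sketch of the standard rank-based Gini formula follows; the topic counts are hypothetical illustration data, and the abstract does not say exactly which variant of the coefficient or which input counts the authors used.

```python
def gini(values):
    """Gini coefficient of non-negative values.

    0 means a perfectly even distribution; values near 1 mean that a few
    categories hold almost all of the total.
    """
    xs = sorted(values)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total == 0:
        return 0.0
    # Rank-weighted form: G = 2 * sum(i * x_i) / (n * total) - (n + 1) / n
    weighted = sum(rank * x for rank, x in enumerate(xs, start=1))
    return 2 * weighted / (n * total) - (n + 1) / n


# Hypothetical paper counts per MESH term for one author (not from the study).
papers_per_term = [30, 28, 25, 24, 22, 20]
g = gini(papers_per_term)
print(f"Gini = {g:.3f}; below the 0.40 cut-off: {g < 0.40}")
```

A low value, as in this made-up example, indicates that publications are spread fairly evenly across topics rather than concentrated in one distinct domain.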
Forensic DNA methylation profiling from evidence material for investigative leads
Lee, Hwan Young; Lee, Soong Deok; Shin, Kyoung-Jin
2016-01-01
DNA methylation is emerging as an attractive marker providing investigative leads to solve crimes in forensic genetics. The identification of body fluids that utilizes tissue-specific DNA methylation can contribute to solving crimes by predicting activity related to the evidence material. Age estimation based on DNA methylation is expected to reduce the number of potential suspects when the DNA profile from the evidence does not match any known person, including those stored in the forensic database. Moreover, variation in DNA methylation reflects environmental exposure, such as cigarette smoking and alcohol consumption, suggesting that it could be used as a marker for predicting the lifestyle of a potential suspect. In this review, we describe recent advances in our understanding of DNA methylation variations and the utility of DNA methylation as a forensic marker for advanced investigative leads from evidence materials. [BMB Reports 2016; 49(7): 359-369] PMID:27099236
The future of forensic DNA analysis
Butler, John M.
2015-01-01
The author's thoughts and opinions on where the field of forensic DNA testing is headed for the next decade are provided in the context of where the field has come over the past 30 years. Similar to the Olympic motto of ‘faster, higher, stronger’, forensic DNA protocols can be expected to become more rapid and sensitive and provide stronger investigative potential. New short tandem repeat (STR) loci have expanded the core set of genetic markers used for human identification in Europe and the USA. Rapid DNA testing is on the verge of enabling new applications. Next-generation sequencing has the potential to provide greater depth of coverage for information on STR alleles. Familial DNA searching has expanded capabilities of DNA databases in parts of the world where it is allowed. Challenges and opportunities that will impact the future of forensic DNA are explored including the need for education and training to improve interpretation of complex DNA profiles. PMID:26101278
Development of the National Institutes of Health Guidelines for Recombinant DNA Research.
Talbot, B
1983-01-01
Recombinant DNA is a technique of major importance in basic biomedical research and, increasingly, in industrial applications. Although the risks of this research remain hypothetical, scientists working in the field have spearheaded discussions of safety. The original National Institutes of Health (NIH) Guidelines for Recombinant DNA Research were issued in June 1976. They assigned each type of recombinant DNA experiment a specific level of "physical containment" and of "biological containment." Responsibility for overseeing the application of the guidelines belongs to the NIH Recombinant DNA Advisory Committee (RAC)--composed of scientists and laymen, including non-voting representatives from many Federal agencies--and local institutional biosafety committees at each university where recombinant DNA research is conducted. The NIH guidelines were subsequently adopted by other Federal agencies, but congressional proposals aimed at extending the guidelines to private industry did not result in national legislation. Some States and localities regulate recombinant DNA research, however, and many private companies have voluntarily submitted information on their recombinant DNA work for RAC and NIH approval. The NIH guidelines underwent a major revision in December 1978 and have been revised approximately every 3 months since then. NIH supports experiments to assess recombinant DNA risks and publishes and updates a plan for a risk assessment program. PMID:6611823
Zhi, Hui; Li, Xin; Wang, Peng; Gao, Yue; Gao, Baoqing; Zhou, Dianshuang; Zhang, Yan; Guo, Maoni; Yue, Ming; Shen, Weitao; Ning, Shangwei; Jin, Lianhong; Li, Xia
2018-01-04
Lnc2Meth (http://www.bio-bigdata.com/Lnc2Meth/), an interactive resource to identify regulatory relationships between human long non-coding RNAs (lncRNAs) and DNA methylation, is not only a manually curated collection and annotation of experimentally supported lncRNA-DNA methylation associations but also a platform that effectively integrates tools for calculating and identifying the differentially methylated lncRNAs and protein-coding genes (PCGs) in diverse human diseases. The resource provides: (i) advanced search possibilities, e.g. retrieval of the database by searching the lncRNA symbol of interest, DNA methylation patterns, regulatory mechanisms and disease types; (ii) abundant computationally calculated DNA methylation array profiles for the lncRNAs and PCGs; (iii) the prognostic values for each hit transcript calculated from the patients' clinical data; (iv) a genome browser to display the DNA methylation landscape of the lncRNA transcripts for a specific type of disease; (v) tools to re-annotate probes to lncRNA loci and identify the differential methylation patterns for lncRNAs and PCGs with user-supplied external datasets; (vi) an R package (LncDM) to complete the identification and visualization of differentially methylated lncRNAs on local computers. Lnc2Meth provides a timely and valuable resource that can be applied to significantly expand our understanding of the regulatory relationships between lncRNAs and DNA methylation in various human diseases. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.
A reservoir morphology database for the conterminous United States
Rodgers, Kirk D.
2017-09-13
The U.S. Geological Survey, in cooperation with the Reservoir Fisheries Habitat Partnership, combined multiple national databases to create one comprehensive national reservoir database and to calculate new morphological metrics for 3,828 reservoirs. These new metrics include, but are not limited to, shoreline development index, index of basin permanence, development of volume, and other descriptive metrics based on established morphometric formulas. The new database also contains modeled chemical and physical metrics. Because of the nature of the existing databases used to compile the Reservoir Morphology Database and the inherent missing data, some metrics were not populated. One comprehensive database will assist water-resource managers in their understanding of local reservoir morphology and water chemistry characteristics throughout the continental United States.
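The abstract above names its morphological metrics without giving the formulas. As a minimal sketch, assuming the standard limnological definitions (shoreline development index as shoreline length divided by the circumference of a circle of equal area, and development of volume as three times the ratio of mean to maximum depth), two of the metrics could be computed as follows. The function and parameter names are hypothetical and are not taken from the Reservoir Morphology Database.

```python
import math


def shoreline_development_index(shoreline_length_m: float, surface_area_m2: float) -> float:
    """Ratio of shoreline length to the circumference of a circle with the same area.

    Assumed textbook definition: D_L = L / (2 * sqrt(pi * A)).
    A perfectly circular reservoir gives 1.0; irregular shorelines give larger values.
    """
    if surface_area_m2 <= 0:
        raise ValueError("surface area must be positive")
    return shoreline_length_m / (2.0 * math.sqrt(math.pi * surface_area_m2))


def development_of_volume(mean_depth_m: float, max_depth_m: float) -> float:
    """Ratio of basin volume to that of a cone with the same area and maximum depth.

    Assumed textbook definition: D_V = 3 * mean depth / maximum depth.
    A conical basin gives 1.0.
    """
    if max_depth_m <= 0:
        raise ValueError("maximum depth must be positive")
    return 3.0 * mean_depth_m / max_depth_m
```

Under these assumed definitions, a circular basin of radius 1000 m (shoreline 2π·1000 m, area π·10^6 m²) has a shoreline development index of 1, which makes a convenient sanity check before applying the functions to real reservoir records.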
Concept for estimating mitochondrial DNA haplogroups using a maximum likelihood approach (EMMA)
Röck, Alexander W.; Dür, Arne; van Oven, Mannis; Parson, Walther
2013-01-01
The assignment of haplogroups to mitochondrial DNA haplotypes contributes substantial value for quality control, not only in forensic genetics but also in population and medical genetics. The availability of Phylotree, a widely accepted phylogenetic tree of human mitochondrial DNA lineages, led to the development of several (semi-)automated software solutions for haplogrouping. However, currently existing haplogrouping tools only make use of haplogroup-defining mutations, whereas private mutations (beyond the haplogroup level) can be additionally informative, allowing for enhanced haplogroup assignment. This is especially relevant in the case of (partial) control region sequences, which are mainly used in forensics. The present study makes three major contributions toward a more reliable, semi-automated estimation of mitochondrial haplogroups. First, a quality-controlled database consisting of 14,990 full mtGenomes downloaded from GenBank was compiled. Together with Phylotree, these mtGenomes serve as a reference database for haplogroup estimates. Second, the concept of fluctuation rates, i.e. a maximum likelihood estimation of the stability of mutations based on 19,171 full control region haplotypes for which raw lane data are available, is presented. Finally, an algorithm for estimating the haplogroup of an mtDNA sequence based on the combined database of full mtGenomes and Phylotree, which also incorporates the empirically determined fluctuation rates, is brought forward. On the basis of examples from the literature and EMPOP, the algorithm is validated, and both the strength of this approach and its utility for quality control of mitochondrial haplotypes are demonstrated. PMID:23948335
Differences in expression of retinal pigment epithelium mRNA between normal canines
2004-01-01
A reference database of differences in mRNA expression in normal healthy canine retinal pigment epithelium (RPE) has been established. This database identifies non-informative differences in mRNA expression that can be used in screening canine RPE for mutations associated with clinical effects on vision. Complementary DNA (cDNA) pools were prepared from mRNA harvested from RPE, amplified by PCR, and used in a subtractive hybridization protocol (representational difference analysis) to identify differences in RPE mRNA expression between canines. The effect of relatedness of the test canines on the frequency of occurrence of differences was evaluated by using 2 unrelated canines for comparison with 2 female sibling canines of blue heeler/bull terrier lineage. Differentially expressed cDNA species were cloned, sequenced, and identified by comparison to public database entries. The most frequently observed differentially expressed sequence from the unrelated canine comparison was cDNA with 21 base pairs (bp) identical to the human epithelial membrane protein 1 gene (present in 8 of 20 clones). Different clones from the same-sex sibling RPE contained repetitions of several short sequence motifs including the human epithelial membrane protein 1 (4 of 25 clones). Other prevalent differences between sibling RPE included sequences similar to a chicken genetic marker sequence motif (5 of 25), and 6 clones with homology to porcine major histocompatibility loci. In addition to identifying several repetitively occurring, non-informative, differentially expressed RPE mRNA species, the findings confirm that fewer differences occurred between siblings, highlighting the importance of using closely related subjects in representational difference analysis studies. PMID:15352545
Sakurai, Tetsuya; Kondou, Youichi; Akiyama, Kenji; Kurotani, Atsushi; Higuchi, Mieko; Ichikawa, Takanari; Kuroda, Hirofumi; Kusano, Miyako; Mori, Masaki; Saitou, Tsutomu; Sakakibara, Hitoshi; Sugano, Shoji; Suzuki, Makoto; Takahashi, Hideki; Takahashi, Shinya; Takatsuji, Hiroshi; Yokotani, Naoki; Yoshizumi, Takeshi; Saito, Kazuki; Shinozaki, Kazuo; Oda, Kenji; Hirochika, Hirohiko; Matsui, Minami
2011-01-01
Identification of gene function is important not only for basic research but also for applied science, especially with regard to improvements in crop production. For rapid and efficient elucidation of useful traits, we developed a system named FOX hunting (Full-length cDNA Over-eXpressor gene hunting) using full-length cDNAs (fl-cDNAs). A heterologous expression approach provides a solution for the high-throughput characterization of gene functions in agricultural plant species. Since fl-cDNAs contain all the information of functional mRNAs and proteins, we introduced rice fl-cDNAs into Arabidopsis plants for systematic gain-of-function mutation. We generated >30,000 independent Arabidopsis transgenic lines expressing rice fl-cDNAs (rice FOX Arabidopsis mutant lines). These rice FOX Arabidopsis lines were screened systematically for various criteria such as morphology, photosynthesis, UV resistance, element composition, plant hormone profile, metabolite profile/fingerprinting, bacterial resistance, and heat and salt tolerance. The information obtained from these screenings was compiled into a database named ‘RiceFOX’. This database contains around 18,000 records of rice FOX Arabidopsis lines and allows users to search against all the observed results, ranging from morphological to invisible traits. The number of searchable items is approximately 100; moreover, the rice FOX Arabidopsis lines can be searched by rice and Arabidopsis gene/protein identifiers, sequence similarity to the introduced rice fl-cDNA and traits. The RiceFOX database is available at http://ricefox.psc.riken.jp/. PMID:21186176
A VBA Desktop Database for Proposal Processing at National Optical Astronomy Observatories
NASA Astrophysics Data System (ADS)
Brown, Christa L.
National Optical Astronomy Observatories (NOAO) has developed a relational Microsoft Windows desktop database using Microsoft Access and the Microsoft Office programming language, Visual Basic for Applications (VBA). The database is used to track data relating to observing proposals from original receipt through the review process, scheduling, observing, and final statistical reporting. The database has automated proposal processing and distribution of information. It allows NOAO to collect and archive data so as to query and analyze information about our science programs in new ways.
8 CFR 338.12 - Endorsement by clerk of court in case name is changed.
Code of Federal Regulations, 2010 CFR
2010-01-01
... 8 Aliens and Nationality 1 2010-01-01 2010-01-01 false Endorsement by clerk of court in case name is changed. 338.12 Section 338.12 Aliens and Nationality DEPARTMENT OF HOMELAND SECURITY NATIONALITY... database for naturalization recordkeeping, the name change information will be maintained in that database...
The National Land Cover Database (NLCD) provides nationwide data on land cover and land cover change at the native 30-m spatial resolution of the Landsat Thematic Mapper (TM). The database is designed to provide five-year cyclical updating of United States land cover and associat...
ERIC Educational Resources Information Center
Breit-Smith, Allison; Cabell, Sonia Q.; Justice, Laura M.
2010-01-01
Purpose: The present article illustrates how the National Household Education Surveys (NHES; U.S. Department of Education, 2009) database might be used to address questions of relevance to researchers who are concerned with literacy development among young children. Following a general description of the NHES database, a study is provided that…
NABIC: A New Access Portal to Search, Visualize, and Share Agricultural Genomics Data.
Seol, Young-Joo; Lee, Tae-Ho; Park, Dong-Suk; Kim, Chang-Kug
2016-01-01
The National Agricultural Biotechnology Information Center developed an access portal to search, visualize, and share agricultural genomics data with a focus on South Korean information and resources. The portal features an agricultural biotechnology database containing a wide range of omics data from public and proprietary sources. We collected 28.4 TB of data from 162 agricultural organisms, with 10 types of omics data comprising next-generation sequencing sequence read archive, genome, gene, nucleotide, DNA chip, expressed sequence tag, interactome, protein structure, molecular marker, and single-nucleotide polymorphism datasets. Our genomic resources contain information on five animals, seven plants, and one fungus, which is accessed through a genome browser. We also developed a data submission and analysis system as a web service, with easy-to-use functions and cutting-edge algorithms, including those for handling next-generation sequencing data.
Use of national clinical databases for informing and for evaluating health care policies.
Black, Nick; Tan, Stefanie
2013-02-01
Policy-makers and analysts could make use of national clinical databases either to inform or to evaluate meso-level (organisation and delivery of health care) and macro-level (national) policies. Reviewing the use of 15 of the best established databases in England, we identify and describe four published examples of each use. These show that policy-makers can either make use of the data itself or of research based on the database. For evaluating policies, the major advantages are the huge sample sizes available, the generalisability of the data, its immediate availability and historic information. The principal methodological challenges involve the need for risk adjustment and time-series analysis. Given their usefulness in the policy arena, there are several reasons why national clinical databases have not been used more, some due to a lack of 'push' by their custodians and some to the lack of 'pull' by policy-makers. Greater exploitation of these valuable resources would be facilitated by policy-makers' and custodians' increased awareness, minimisation of legal restrictions on data use, improvements in the quality of databases and a library of examples of applications to policy. Copyright © 2012 Elsevier Ireland Ltd. All rights reserved.
Rudi, Knut; Kleiberg, Gro H; Heiberg, Ragnhild; Rosnes, Jan T
2007-08-01
The aim of this work was to evaluate restriction fragment melting curve analyses (RFMCA) as a novel approach for rapid classification of bacteria during food production. RFMCA was evaluated for bacteria isolated from sous vide food products, and raw materials used for sous vide production. We identified four major bacterial groups in the material analysed (cluster I-Streptococcus, cluster II-Carnobacterium/Bacillus, cluster III-Staphylococcus and cluster IV-Actinomycetales). The accuracy of RFMCA was evaluated by comparison with 16S rDNA sequencing. Of the strains satisfying the RFMCA quality filtering criteria (73%, n=57), those with both 16S rDNA sequence information and RFMCA data (n=45) gave identical group assignments with the two methods. RFMCA enabled rapid and accurate classification of bacteria that is database compatible. Potential applications of RFMCA in the food or pharmaceutical industry include developing classification models for the bacteria expected in a given product and then building an RFMCA database as part of the product quality control.
National Solar Radiation Database 1991-2005 Update: User's Manual
DOE Office of Scientific and Technical Information (OSTI.GOV)
Wilcox, S.
2007-04-01
This manual describes how to obtain and interpret the data products from the updated 1991-2005 National Solar Radiation Database (NSRDB). This is an update of the original 1961-1990 NSRDB released in 1992.
Feline mitochondrial DNA sampling for forensic analysis: when enough is enough!
Grahn, Robert A; Alhaddad, Hasan; Alves, Paulo C; Randi, Ettore; Waly, Nashwa E; Lyons, Leslie A
2015-05-01
Pet hair has a demonstrated value in resolving legal issues. Cat hair is chronically shed and it is difficult to leave a home with cats without some level of secondary transfer. The power of cat hair as an evidentiary resource may be underused because representative genetic databases are not available for exclusionary purposes. Mitochondrial control region databases are highly valuable for hair analyses and have been developed for the cat. In a representative worldwide data set, 83% of domestic cat mitotypes belong to one of twelve major types. Of the remaining 17%, 7.5% are unique within the published 1394 sample database. The current research evaluates the sample size necessary to establish a representative population for forensic comparison of the mitochondrial control region for the domestic cat. For most worldwide populations, randomly sampling 50 unrelated local individuals will achieve saturation at 95%. The 99% saturation is achieved by randomly sampling 60-170 cats, depending on the numbers of mitotypes available in the population at large. Likely due to the recent domestication of the cat and minimal localized population substructure, fewer cats are needed to meet mitochondria DNA control region database practical saturation than for humans or dogs. Coupled with the available worldwide feline control region database of nearly 1400 cats, minimal local sampling will be required to establish an appropriate comparative representative database and achieve significant exclusionary power. Copyright © 2014 Elsevier Ireland Ltd. All rights reserved.
Moudgal, Chandrika J; Garrahan, Kevin; Brady-Roberts, Eletha; Gavrelis, Naida; Arbogast, Michelle; Dun, Sarah
2008-11-15
The toxicity value database of the United States Environmental Protection Agency's (EPA) National Homeland Security Research Center has been in development since 2004. The toxicity value database includes a compilation of agent property, toxicity, dose-response, and health effects data for 96 agents: 84 chemical and radiological agents and 12 biotoxins. The database is populated with multiple toxicity benchmark values and agent property information from secondary sources, with web links to the secondary sources, where available. A selected set of primary literature citations and associated dose-response data are also included. The toxicity value database offers a powerful means to quickly and efficiently gather pertinent toxicity and dose-response data for a number of agents that are of concern to the nation's security. This database, in conjunction with other tools, will play an important role in understanding human health risks, and will provide a means for risk assessors and managers to make quick and informed decisions on the potential health risks and determine appropriate responses (e.g., cleanup) to agent release. A final, stand-alone MS Access working version of the toxicity value database was completed in November 2007.
Kang, Young Gon; Suh, Eunkyung; Lee, Jae-woo; Kim, Dong Wook; Cho, Kyung Hee; Bae, Chul-Young
2018-01-01
Purpose: A comprehensive health index is needed to measure an individual’s overall health and aging status, to predict the risk of death and age-related disease incidence, and to evaluate the effect of a health management program. The purpose of this study is to demonstrate the validity of estimated biological age (BA) in relation to all-cause mortality and age-related disease incidence based on the National Sample Cohort database. Patients and methods: This study was based on the National Sample Cohort database of the National Health Insurance Service – Eligibility database and the National Health Insurance Service – Medical and Health Examination database for the years 2002 through 2013. The BA model was developed based on the National Health Insurance Service – National Sample Cohort (NHIS – NSC) database, and Cox proportional hazard analysis was done for mortality and major age-related disease incidence. Results: For every 1-year increase in the difference between calculated BA and chronological age, the hazard ratio for mortality significantly increased by 1.6% (1.5% in men and 2.0% in women), and the hazard ratios for hypertension, diabetes mellitus, heart disease, stroke, and cancer incidence increased by 2.5%, 4.2%, 1.3%, 1.6%, and 0.4%, respectively (p<0.001). Conclusion: BA estimated by the model developed on the NHIS – NSC database is expected to be useful not only as an index for assessing health and aging status and predicting mortality and major age-related disease incidence, but also in various health care fields. PMID:29593385
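The per-year hazard ratio increases reported above can be extrapolated to larger gaps between biological and chronological age. As a minimal sketch, assuming the usual log-linear form of a Cox proportional hazards model (so that a per-unit hazard ratio compounds multiplicatively), the following shows the arithmetic. The function name and the 5-year gap are illustrative assumptions, not values or code from the study.

```python
def compounded_hazard_ratio(per_year_increase_pct: float, years: float) -> float:
    """Hazard ratio implied by a per-year percentage increase over a multi-year gap.

    Assumes a Cox model with a single linear covariate, so that
    HR(k years) = HR(1 year) ** k.
    """
    hr_per_year = 1.0 + per_year_increase_pct / 100.0
    return hr_per_year ** years


# Reported all-cause mortality figure: +1.6% per year of BA minus chronological age.
# The 5-year gap is a hypothetical input chosen for illustration.
mortality_hr_5y = compounded_hazard_ratio(1.6, 5)
print(f"Implied mortality hazard ratio for a 5-year gap: {mortality_hr_5y:.3f}")
```

Under this assumption a 5-year gap corresponds to a mortality hazard ratio of roughly 1.08, that is, about an 8% higher hazard rather than the 8.0% that simple addition would suggest; the difference grows with larger gaps.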
A bio-inspired structural health monitoring system based on ambient vibration
NASA Astrophysics Data System (ADS)
Lin, Tzu-Kang; Kiremidjian, Anne; Lei, Chi-Yang
2010-11-01
A structural health monitoring (SHM) system based on naïve Bayesian (NB) damage classification and DNA-like expression data was developed in this research. Adapted from the deoxyribonucleic acid (DNA) array concept in molecular biology, the proposed structural health monitoring system is constructed utilizing a double-tier regression process to extract the expression array from the structural time history recorded during external excitations. The extracted array is symbolized as the various genes of the structure from the viewpoint of molecular biology and reflects the possible damage conditions prevalent in the structure. A scaled down, six-story steel building mounted on the shaking table of the National Center for Research on Earthquake Engineering (NCREE) was used as the benchmark. The structural response at different damage levels and locations under ambient vibration was collected to support the database for the proposed SHM system. To improve the precision of detection in practical applications, the system was enhanced by an optimization process using the likelihood selection method. The obtained array representing the DNA array of the health condition of the structure was first evaluated and ranked. A total of 12 groups of expression arrays were regenerated from a combination of four damage conditions. To keep the length of the array unchanged, the best 16 coefficients from every expression array were selected to form the optimized SHM system. Test results from the ambient vibrations showed that the detection accuracy of the structural damage could be greatly enhanced by the optimized expression array, when compared to the original system. Practical verification also demonstrated that a rapid and reliable result could be given by the final system within 1 min. The proposed system implements the idea of transplanting the DNA array concept from molecular biology into the field of SHM.
More evidence for non-maternal inheritance of mitochondrial DNA?
Bandelt, H-J; Kong, Q-P; Parson, W; Salas, A
2005-12-01
A single case of paternal co-transmission of mitochondrial DNA (mtDNA) in humans has been reported so far. The aim of this study was to find potential instances of non-maternal inheritance of mtDNA. Published medical case studies (of single patients) were searched for irregular mtDNA patterns by comparing the given haplotype information for different clones or tissues with the worldwide mtDNA database as known to date, a method that has proved robust and reliable for the detection of flawed mtDNA sequence data. More than 20 studies were found reporting clear-cut instances with mtDNAs of different ancestries in single individuals. As examples, cases are reviewed from recent published reports which, at face value, may be taken as evidence for paternal inheritance of mtDNA or recombination. Multiple types (or recombinant types) of quite dissimilar mitochondrial DNA from different parts of the known mtDNA phylogeny are often reported in single individuals. From re-analyses and corrigenda of forensic mtDNA data, it is apparent that the phenomenon of mixed or mosaic mtDNA can be ascribed solely to contamination and sample mix-up.
National Water Quality Standards Database (NWQSD)
The National Water Quality Standards Database (WQSDB) provides access to EPA and state water quality standards (WQS) information in text, tables, and maps. This data source was last updated in December 2007 and will no longer be updated.
Freight Transportation Energy Use : Volume 3. Freight Network and Operations Database.
DOT National Transportation Integrated Search
1979-07-01
The data sources, procedures, and assumptions used to generate the TSC national freight network and operations database are documented. National rail, highway, waterway, and pipeline networks are presented, and estimates of facility capacity, travel ...
Report to Congress : review of the National Transit Database
DOT National Transportation Integrated Search
2000-05-30
This report presents the findings and recommendations of the evaluation of the Federal Transit Administration (FTA) National Transit Database (NTD), conducted in accordance with the direction of the House and Senate Committees of Appropriations, as s...
9 CFR 81.2 - Identification of deer, elk, and moose in interstate commerce.
Code of Federal Regulations, 2011 CFR
2011-01-01
... is linked to that animal in the CWD National Database. The second animal identification must be... CWD National Database. (Approved by the Office of Management and Budget under control number 0579-0237) ...
9 CFR 81.2 - Identification of deer, elk, and moose in interstate commerce.
Code of Federal Regulations, 2010 CFR
2010-01-01
... is linked to that animal in the CWD National Database. The second animal identification must be... CWD National Database. (Approved by the Office of Management and Budget under control number 0579-0237) ...
9 CFR 81.2 - Identification of deer, elk, and moose in interstate commerce.
Code of Federal Regulations, 2012 CFR
2012-01-01
... is linked to that animal in the CWD National Database. The second animal identification must be... CWD National Database. (Approved by the Office of Management and Budget under control number 0579-0237) ...
Hirabayashi, Satoshi; Nowak, David J
2016-08-01
Trees remove air pollutants through dry deposition processes depending upon forest structure, meteorology, and air quality that vary across space and time. Employing nationally available forest, weather, air pollution and human population data for 2010, computer simulations were performed for deciduous and evergreen trees with varying leaf area index for rural and urban areas in every county in the conterminous United States. The results populated a national database of annual air pollutant removal, concentration changes, and reductions in adverse health incidences and costs for NO2, O3, PM2.5 and SO2. The developed database enabled a first-order approximation of air quality and associated human health benefits provided by trees with any forest configuration anywhere in the conterminous United States over time. A comprehensive national database of tree effects on air quality and human health in the United States was developed. Copyright © 2016 Elsevier Ltd. All rights reserved.
Dwyer, Johanna T.; Picciano, Mary Frances; Betz, Joseph M.; Fisher, Kenneth D.; Saldanha, Leila G.; Yetley, Elizabeth A.; Coates, Paul M.; Radimer, Kathy; Bindewald, Bernadette; Sharpless, Katherine E.; Holden, Joanne; Andrews, Karen; Zhao, Cuiwei; Harnly, James; Wolf, Wayne R.; Perry, Charles R.
2013-01-01
Several activities of the Office of Dietary Supplements (ODS) at the National Institutes of Health involve enhancement of dietary supplement databases. These include an initiative with US Department of Agriculture to develop an analytically substantiated dietary supplement ingredient database (DSID) and collaboration with the National Center for Health Statistics to enhance the dietary supplement label database in the National Health and Nutrition Examination Survey (NHANES). The many challenges that must be dealt with in developing an analytically supported DSID include categorizing product types in the database, identifying nutrients, and other components of public health interest in these products and prioritizing which will be entered in the database first. Additional tasks include developing methods and reference materials for quantifying the constituents, finding qualified laboratories to measure the constituents, developing appropriate sample handling procedures, and finally developing representative sampling plans. Developing the NHANES dietary supplement label database has other challenges such as collecting information on dietary supplement use from NHANES respondents, constant updating and refining of information obtained, developing default values that can be used if the respondent cannot supply the exact supplement or strength that was consumed, and developing a publicly available label database. Federal partners and the research community are assisting in making an analytically supported dietary supplement database a reality. PMID:25309034
DNA Barcodes for Forensically Important Fly Species in Brazil.
Koroiva, Ricardo; de Souza, Mirian S; Roque, Fabio de Oliveira; Pepinelli, Mateus
2018-04-07
Here, we analyze 248 DNA barcode sequences of 35 fly species of forensic importance in Brazil. DNA barcoding can be effectively used for specimen identification of these species, allowing the unambiguous identification of 31 species, an overall success rate of 88%. Our results show a high rate of success for molecular identification using DNA barcoding sequences and open new perspectives for immature species identification, a subject on which limited forensic investigations exist in Tropical regions. We also address the implications of building a robust forensic DNA barcode database. A geographic bias is recognized for the COI dataset available for forensically important fly species in Brazil, with concentration of sequences from specimens collected mainly in sites located in the Cerrado, Mata Atlântica, and Pampa biomes.
A DNA sequence analysis package for the IBM personal computer.
Lagrimini, L M; Brentano, S T; Donelson, J E
1984-01-01
We present here a collection of DNA sequence analysis programs, called "PC Sequence" (PCS), which are designed to run on the IBM Personal Computer (PC). These programs are written in IBM PC compiled BASIC and take full advantage of the IBM PC's speed, error handling, and graphics capabilities. For a modest initial expense in hardware any laboratory can use these programs to quickly perform computer analysis on DNA sequences. They are written with the novice user in mind and require very little training or previous experience with computers. Also provided are a text editing program for creating and modifying DNA sequence files and a communications program which enables the PC to communicate with and collect information from mainframe computers and DNA sequence databases. PMID:6546433
Identification of Rays through DNA Barcoding: An Application for Ecologists
Cerutti-Pereyra, Florencia; Meekan, Mark G.; Wei, Nu-Wei V.; O'Shea, Owen; Bradshaw, Corey J. A.; Austin, Chris M.
2012-01-01
DNA barcoding potentially offers scientists who are not expert taxonomists a powerful tool to support the accuracy of field studies involving taxa that are diverse and difficult to identify. The taxonomy of rays has received reasonable attention in Australia, although the fauna in remote locations such as Ningaloo Reef, Western Australia is poorly studied and the identification of some species in the field is problematic. Here, we report an application of DNA-barcoding to the identification of 16 species (from 10 genera) of tropical rays as part of an ecological study. Analysis of the dataset combined across all samples grouped sequences into clearly defined operational taxonomic units, with two conspicuous exceptions: the Neotrygon kuhlii species complex and the Aetobatus species complex. In the field, the group that presented the most difficulties for identification was the spotted whiptail rays, referred to as the ‘uarnak’ complex. Two sets of problems limited the successful application of DNA barcoding: (1) the presence of cryptic species, species complexes with unresolved taxonomic status and intra-specific geographical variation, and (2) insufficient numbers of entries in online databases that have been verified taxonomically, and the presence of lodged sequences in databases with inconsistent names. Nevertheless, we demonstrate the potential of the DNA barcoding approach to confirm field identifications and to highlight species complexes where taxonomic uncertainty might confound ecological data. PMID:22701556
Bridging Plant and Human Radiation Response and DNA Repair through an In Silico Approach
Nikitaki, Zacharenia; Pavlopoulou, Athanasia; Holá, Marcela; Donà, Mattia; Michalopoulos, Ioannis; Balestrazzi, Alma; Angelis, Karel J.; Georgakilas, Alexandros G.
2017-01-01
The mechanisms of response to radiation exposure are conserved in plants and animals. The DNA damage response (DDR) pathways are the predominant molecular pathways activated upon exposure to radiation, both in plants and animals. The conserved features of DDR in plants and animals might facilitate interdisciplinary studies that cross traditional boundaries between animal and plant biology in order to expand the collection of biomarkers currently used for radiation exposure monitoring (REM) in environmental and biomedical settings. Genes implicated in trans-kingdom conserved DDR networks often triggered by ionizing radiation (IR) and UV light are deposited into biological databases. In this study, we have applied an innovative approach utilizing data pertinent to plant and human genes from publicly available databases towards the design of a ‘plant radiation biodosimeter’, that is, a plant and DDR gene-based platform that could serve as a REM reliable biomarker for assessing environmental radiation exposure and associated risk. From our analysis, in addition to REM biomarkers, a significant number of genes, both in human and Arabidopsis thaliana, not yet characterized as DDR, are suggested as possible DNA repair players. Last but not least, we provide an example on the applicability of an Arabidopsis thaliana—based plant system monitoring the role of cancer-related DNA repair genes BRCA1, BARD1 and PARP1 in processing DNA lesions. PMID:28587301
Bridging Plant and Human Radiation Response and DNA Repair through an In Silico Approach.
Nikitaki, Zacharenia; Pavlopoulou, Athanasia; Holá, Marcela; Donà, Mattia; Michalopoulos, Ioannis; Balestrazzi, Alma; Angelis, Karel J; Georgakilas, Alexandros G
2017-06-06
The mechanisms of response to radiation exposure are conserved in plants and animals. The DNA damage response (DDR) pathways are the predominant molecular pathways activated upon exposure to radiation, both in plants and animals. The conserved features of DDR in plants and animals might facilitate interdisciplinary studies that cross traditional boundaries between animal and plant biology in order to expand the collection of biomarkers currently used for radiation exposure monitoring (REM) in environmental and biomedical settings. Genes implicated in trans-kingdom conserved DDR networks often triggered by ionizing radiation (IR) and UV light are deposited into biological databases. In this study, we have applied an innovative approach utilizing data pertinent to plant and human genes from publicly available databases towards the design of a 'plant radiation biodosimeter', that is, a plant and DDR gene-based platform that could serve as a reliable REM biomarker for assessing environmental radiation exposure and associated risk. From our analysis, in addition to REM biomarkers, a significant number of genes, both in human and Arabidopsis thaliana, not yet characterized as DDR, are suggested as possible DNA repair players. Last but not least, we provide an example of the applicability of an Arabidopsis thaliana-based plant system monitoring the role of cancer-related DNA repair genes BRCA1, BARD1 and PARP1 in processing DNA lesions.
USDA-ARS's Scientific Manuscript database
For nearly 20 years, the National Food and Nutrient Analysis Program (NFNAP) has expanded and improved the quantity and quality of data in US Department of Agriculture’s (USDA) food composition databases through the collection and analysis of nationally representative food samples. This manuscript d...
van Wieren-de Wijer, Diane B M A; Maitland-van der Zee, Anke-Hilse; de Boer, Anthonius; Stricker, Bruno H Ch; Kroon, Abraham A; de Leeuw, Peter W; Bozkurt, O; Klungel, Olaf H
2009-04-01
To describe the design, recruitment and baseline characteristics of participants in a community pharmacy based pharmacogenetic study of antihypertensive drug treatment. Participants enrolled from the population-based Pharmaco-Morbidity Record Linkage System. We designed a nested case-control study in which we will assess whether specific genetic polymorphisms modify the effect of antihypertensive drugs on the risk of myocardial infarction. In this study, cases (myocardial infarction) and controls were recruited through community pharmacies that participate in PHARMO. The PHARMO database comprises drug dispensing histories of about 2,000,000 subjects from a representative sample of Dutch community pharmacies linked to the national registrations of hospital discharges. In total we selected 31010 patients (2777 cases and 28233 controls) from the PHARMO database, of whom 15973 (1871 cases, 14102 controls) were approached through their community pharmacy. Overall response rate was 36.3% (n = 5791, 794 cases, 4997 controls), whereas 32.1% (n = 5126, 701 cases, 4425 controls) gave informed consent to genotype their DNA. As expected, several cardiovascular risk factors such as smoking, body mass index, hypercholesterolemia, and diabetes mellitus were more common in cases than in controls. Furthermore, cases more often used beta-blockers and calcium-antagonists, whereas controls more often used thiazide diuretics, ACE-inhibitors, and angiotensin-II receptor blockers. We have demonstrated that it is feasible to select patients from a coded database for a pharmacogenetic study and to approach them through community pharmacies, achieving reasonable response rates and without violating privacy rules.
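The response and consent percentages quoted in the abstract above can be rechecked from the reported counts. The following is a minimal sketch of that arithmetic only; the variable and function names are illustrative assumptions and do not come from the study, and the per-group rates it computes are derived here rather than reported by the authors.

```python
# Counts as reported in the abstract (patients approached via community pharmacies).
approached = {"cases": 1871, "controls": 14102}
responded = {"cases": 794, "controls": 4997}
consented = {"cases": 701, "controls": 4425}


def overall_rate(numerator, denominator):
    """Percentage of all approached patients, rounded to one decimal place."""
    return round(100 * sum(numerator.values()) / sum(denominator.values()), 1)


def rate_by_group(numerator, denominator):
    """Percentage within each group (cases, controls); derived, not reported."""
    return {g: round(100 * numerator[g] / denominator[g], 1) for g in denominator}


total_approached = sum(approached.values())               # 15973, as reported
overall_response = overall_rate(responded, approached)    # reported as 36.3%
overall_consent = overall_rate(consented, approached)     # reported as 32.1%
response_by_group = rate_by_group(responded, approached)  # hypothetical breakdown

print(total_approached, overall_response, overall_consent, response_by_group)
```

Both overall percentages use the number of patients approached as the denominator, which is the reading that reproduces the figures given in the abstract.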