Sample records for discovery rate analysis

  1. Quantitative trait loci analysis using the false discovery rate.

    PubMed

    Benjamini, Yoav; Yekutieli, Daniel

    2005-10-01

    False discovery rate control has become an essential tool in any study that has a very large multiplicity problem. False discovery rate-controlling procedures have also been found to be very effective in QTL analysis, ensuring reproducible results with few falsely discovered linkages and offering increased power to discover QTL, although their acceptance has been slower than in microarray analysis, for example. The reason is partly because the methodological aspects of applying the false discovery rate to QTL mapping are not well developed. Our aim in this work is to lay a solid foundation for the use of the false discovery rate in QTL mapping. We review the false discovery rate criterion, the appropriate interpretation of the FDR, and alternative formulations of the FDR that appeared in the statistical and genetics literature. We discuss important features of the FDR approach, some stemming from new developments in FDR theory and methodology, which make it especially useful in linkage analysis. We review false discovery rate-controlling procedures--the BH, the resampling procedure, and the adaptive two-stage procedure--and discuss the validity of these procedures in single- and multiple-trait QTL mapping. Finally, we argue that the control of the false discovery rate has an important role in suggesting, indicating the significance of, and confirming QTL, and we present guidelines for its use.
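
    As a concrete illustration of the BH step-up procedure reviewed in this record, the following minimal Python sketch (function and variable names are ours, not the authors') flags discoveries at a target FDR level q:

        import numpy as np

        def benjamini_hochberg(pvals, q=0.05):
            """Boolean rejection mask under the BH step-up rule (a sketch)."""
            p = np.asarray(pvals)
            m = p.size
            order = np.argsort(p)                              # sort p-values ascending
            below = p[order] <= q * np.arange(1, m + 1) / m    # test p_(i) <= q*i/m
            k = below.nonzero()[0].max() + 1 if below.any() else 0
            reject = np.zeros(m, dtype=bool)
            reject[order[:k]] = True                           # reject the k smallest p-values
            return reject

        # Toy usage: 95 uniform null p-values plus 5 strong signals.
        rng = np.random.default_rng(0)
        pvals = np.concatenate([rng.uniform(size=95), np.full(5, 1e-4)])
        print(benjamini_hochberg(pvals).sum(), "discoveries at FDR 0.05")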

  2. Separate class true discovery rate degree of association sets for biomarker identification.

    PubMed

    Crager, Michael R; Ahmed, Murat

    2014-01-01

    In 2008, Efron showed that biological features in a high-dimensional study can be divided into classes and a separate false discovery rate (FDR) analysis can be conducted in each class using information from the entire set of features to assess the FDR within each class. We apply this separate class approach to true discovery rate degree of association (TDRDA) set analysis, which is used in clinical-genomic studies to identify sets of biomarkers having strong association with clinical outcome or state while controlling the FDR. Careful choice of classes based on prior information can increase the identification power of the separate class analysis relative to the overall analysis.
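
    A minimal sketch of the separate class idea, under our own simplifying assumptions (the authors' TDRDA set machinery is not reproduced here): run a BH-style FDR procedure within each pre-specified class of features rather than once over the pooled set.

        import numpy as np

        def bh_mask(p, q):
            """BH step-up rejection mask (same rule as in record 1's sketch)."""
            order = np.argsort(p)
            m = p.size
            ok = p[order] <= q * np.arange(1, m + 1) / m
            k = ok.nonzero()[0].max() + 1 if ok.any() else 0
            mask = np.zeros(m, dtype=bool)
            mask[order[:k]] = True
            return mask

        def separate_class_bh(pvals, classes, q=0.05):
            """Control the FDR separately within each pre-defined feature class."""
            pvals, classes = np.asarray(pvals), np.asarray(classes)
            reject = np.zeros(pvals.size, dtype=bool)
            for c in np.unique(classes):
                idx = np.where(classes == c)[0]
                reject[idx] = bh_mask(pvals[idx], q)   # class-wise BH
            return reject

    With well-chosen classes, a class enriched for true signals can yield more discoveries than it would receive from a single pooled analysis, which is the power gain the abstract describes.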

  3. Assessing differential expression in two-color microarrays: a resampling-based empirical Bayes approach.

    PubMed

    Li, Dongmei; Le Pape, Marc A; Parikh, Nisha I; Chen, Will X; Dye, Timothy D

    2013-01-01

    Microarrays are widely used for examining differential gene expression, identifying single nucleotide polymorphisms, and detecting methylation loci. Multiple testing methods in microarray data analysis aim at controlling both Type I and Type II error rates; however, real microarray data do not always fit their distribution assumptions. Smyth's ubiquitous parametric method, for example, inadequately accommodates violations of normality assumptions, resulting in inflated Type I error rates. The Significance Analysis of Microarrays, another widely used microarray data analysis method, is based on a permutation test and is robust to non-normally distributed data; however, its fold change criteria are problematic and can critically alter the conclusion of a study as a result of compositional changes of the control data set in the analysis. We propose a novel approach combining resampling with empirical Bayes methods: the Resampling-based empirical Bayes Methods. This approach not only reduces false discovery rates for non-normally distributed microarray data, but is also insensitive to the fold change threshold, since no control data set selection is needed. Through simulation studies, sensitivities, specificities, total rejections, and false discovery rates are compared across Smyth's parametric method, the Significance Analysis of Microarrays, and the Resampling-based empirical Bayes Methods. Differences in false discovery rate control among the approaches are illustrated through a preterm delivery methylation study. The results show that the Resampling-based empirical Bayes Methods offer significantly higher specificity and lower false discovery rates than Smyth's parametric method when data are not normally distributed. The Resampling-based empirical Bayes Methods also offer higher statistical power than the Significance Analysis of Microarrays method when the proportion of significantly differentially expressed genes is large, for both normally and non-normally distributed data. Finally, the Resampling-based empirical Bayes Methods are generalizable to next-generation sequencing RNA-seq data analysis.
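
    The resampling component can be illustrated in a heavily simplified form (our sketch, not the authors' Resampling-based empirical Bayes implementation): estimate the expected number of null rejections at a statistic cutoff by permuting sample labels.

        import numpy as np

        def permutation_fdr(x, y, threshold=2.0, n_perm=200, seed=0):
            """Estimate the FDR at a |t|-statistic cutoff by label permutation.

            x, y: (n_genes, n_samples) expression matrices for two conditions.
            """
            rng = np.random.default_rng(seed)
            def tstat(a, b):
                se = np.sqrt(a.var(1) / a.shape[1] + b.var(1) / b.shape[1] + 1e-12)
                return (a.mean(1) - b.mean(1)) / se
            r_obs = (np.abs(tstat(x, y)) >= threshold).sum()   # observed rejections
            data, n_x = np.hstack([x, y]), x.shape[1]
            null_counts = []
            for _ in range(n_perm):                            # permute condition labels
                cols = rng.permutation(data.shape[1])
                null_counts.append(
                    (np.abs(tstat(data[:, cols[:n_x]], data[:, cols[n_x:]])) >= threshold).sum())
            return np.mean(null_counts) / max(r_obs, 1)        # E[false] / observed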

  4. Identification of differentially expressed genes and false discovery rate in microarray studies.

    PubMed

    Gusnanto, Arief; Calza, Stefano; Pawitan, Yudi

    2007-04-01

    To highlight the development in microarray data analysis for the identification of differentially expressed genes, particularly via control of false discovery rate. The emergence of high-throughput technology such as microarrays raises two fundamental statistical issues: multiplicity and sensitivity. We focus on the biological problem of identifying differentially expressed genes. First, multiplicity arises due to testing tens of thousands of hypotheses, rendering the standard P value meaningless. Second, known optimal single-test procedures such as the t-test perform poorly in the context of highly multiple tests. The standard approach of dealing with multiplicity is too conservative in the microarray context. The false discovery rate concept is fast becoming the key statistical assessment tool replacing the P value. We review the false discovery rate approach and argue that it is more sensible for microarray data. We also discuss some methods to take into account additional information from the microarrays to improve the false discovery rate. There is growing consensus on how to analyse microarray data using the false discovery rate framework in place of the classical P value. Further research is needed on the preprocessing of the raw data, such as the normalization step and filtering, and on finding the most sensitive test procedure.

  5. Analysis of the rate of wildcat drilling and deposit discovery

    USGS Publications Warehouse

    Drew, L.J.

    1975-01-01

    The rate at which petroleum deposits were discovered during a 16-yr period (1957-72) was examined in relation to changes in a suite of economic and physical variables. The study area encompasses 11,000 mi² and is located on the eastern flank of the Powder River Basin. A two-stage multiple-regression model was used as a basis for this analysis. The variables employed in this model were: (1) the yearly wildcat drilling rate, (2) a measure of the extent of the physical exhaustion of the resource base of the region, (3) a proxy for the discovery expectation of the exploration operators active in the region, (4) an exploration price/cost ratio, and (5) the expected depths of the exploration targets sought. The rate at which wildcat wells were drilled was strongly correlated with the discovery expectation of the exploration operators. Small additional variations in the wildcat drilling rate were explained by the price/cost ratio and target-depth variables. The number of deposits discovered each year was highly dependent on the wildcat drilling rate, but the aggregate quantity of petroleum discovered each year was independent of the wildcat drilling rate. The independence between these last two variables is a consequence of the cyclical behavior of the exploration play mechanism. Although the discovery success ratio declined sharply during the initial phases of the two exploration plays which developed in the study area, a learning effect occurred whereby the discovery success ratio improved steadily with the passage of time during both exploration plays. © 1975 Plenum Publishing Corporation.
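
    A hedged sketch of the two-stage regression structure this record describes; the variable names and the exact specification below are illustrative assumptions, not taken from the paper:

        import numpy as np

        def two_stage_ols(expectation, price_cost, depth, drilling_rate, discoveries):
            """Stage 1: drilling rate on expectation, price/cost ratio, and target depth.
            Stage 2: yearly deposit discoveries on the fitted drilling rate."""
            X1 = np.column_stack([np.ones_like(expectation), expectation, price_cost, depth])
            b1, *_ = np.linalg.lstsq(X1, drilling_rate, rcond=None)
            fitted_rate = X1 @ b1                      # stage-1 fitted drilling rate
            X2 = np.column_stack([np.ones_like(fitted_rate), fitted_rate])
            b2, *_ = np.linalg.lstsq(X2, discoveries, rcond=None)
            return b1, b2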

  6. Controlling the joint local false discovery rate is more powerful than meta-analysis methods in joint analysis of summary statistics from multiple genome-wide association studies.

    PubMed

    Jiang, Wei; Yu, Weichuan

    2017-02-15

    In genome-wide association studies (GWASs) of common diseases/traits, we often analyze multiple GWASs with the same phenotype together to discover associated genetic variants with higher power. Since it is difficult to access data with detailed individual measurements, summary-statistics-based meta-analysis methods have become popular to jointly analyze datasets from multiple GWASs. In this paper, we propose a novel summary-statistics-based joint analysis method based on controlling the joint local false discovery rate (Jlfdr). We prove that our method is the most powerful summary-statistics-based joint analysis method when controlling the false discovery rate at a certain level. In particular, the Jlfdr-based method achieves higher power than commonly used meta-analysis methods when analyzing heterogeneous datasets from multiple GWASs. Simulation experiments demonstrate the superior power of our method over meta-analysis methods. Also, our method discovers more associations than meta-analysis methods from empirical datasets of four phenotypes. The R-package is available at: http://bioinformatics.ust.hk/Jlfdr.html. Supplementary data are available at Bioinformatics online.
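
    The local false discovery rate underlying Jlfdr can be sketched in one dimension (a toy two-group model of our own; the actual method works with the joint density of z-scores across studies):

        import numpy as np
        from scipy import stats

        def local_fdr(z, pi0=0.9, alt_sd=3.0):
            """Toy two-group local fdr: fdr(z) = pi0 * f0(z) / f(z).

            pi0 and the alternative density are assumed here, not estimated.
            """
            f0 = stats.norm.pdf(z)                     # theoretical null N(0, 1)
            f1 = stats.norm.pdf(z, scale=alt_sd)       # assumed alternative density
            f = pi0 * f0 + (1 - pi0) * f1              # two-group mixture density
            return pi0 * f0 / f

    In the joint setting, f0 and f become densities over the vector of z-scores from all studies, which is what lets heterogeneous effects across GWASs contribute to a single fdr value.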

  7. Retrospective analysis of natural products provides insights for future discovery trends.

    PubMed

    Pye, Cameron R; Bertin, Matthew J; Lokey, R Scott; Gerwick, William H; Linington, Roger G

    2017-05-30

    Understanding of the capacity of the natural world to produce secondary metabolites is important to a broad range of fields, including drug discovery, ecology, biosynthesis, and chemical biology, among others. Both the absolute number and the rate of discovery of natural products have increased significantly in recent years. However, there is a perception and concern that the fundamental novelty of these discoveries is decreasing relative to previously known natural products. This study presents a quantitative examination of the field from the perspective of both number of compounds and compound novelty using a dataset of all published microbial and marine-derived natural products. This analysis aimed to explore a number of key questions, such as how the rate of discovery of new natural products has changed over the past decades, how the average natural product structural novelty has changed as a function of time, whether exploring novel taxonomic space affords an advantage in terms of novel compound discovery, and whether it is possible to estimate how close we are to having described all of the chemical space covered by natural products. Our analyses demonstrate that most natural products being published today bear structural similarity to previously published compounds, and that the range of scaffolds readily accessible from nature is limited. However, the analysis also shows that the field continues to discover appreciable numbers of natural products with no structural precedent. Together, these results suggest that the development of innovative discovery methods will continue to yield compounds with unique structural and biological properties.

  8. Long-term trends in oil and gas discovery rates in lower 48 United States

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Woods, T.J.

    1985-09-01

    The Gas Research Institute (GRI), in association with Energy and Environmental Analysis, Inc. (EEA), has developed a data base characterizing the discovered oil and gas fields in the lower 48 United States. The number of fields in this data base reported to have been discovered since 1947 substantially exceeds the count presented in the AAPG survey of new-field discoveries since 1947. The greatest relative difference between the field counts is for fields larger than 10 million bbl of oil equivalent (BOE) (AAPG Class C fields or larger). Two factors contribute to the difference in reported discoveries by field size. First, the AAPG survey does not capture all new-field discoveries, particularly in the offshore. Second, the AAPG survey does not update field sizes past 6 years after the field discovery date. Because of reserve appreciation to discovered fields, discovery-trend data based on field-size data should be used with caution, particularly when field-size estimates have not been updated for a substantial period of time. Based on the GRI/EEA data base, the major decline in the discovery rates of large, new oil and gas fields in the lower 48 United States appears to have ended by the early 1960s. Since then, discovery rates seem to have improved. Thus, the outlook for future discoveries of large fields may be much better than previously believed.

  9. Retrospective analysis of natural products provides insights for future discovery trends

    PubMed Central

    Pye, Cameron R.; Bertin, Matthew J.; Lokey, R. Scott; Gerwick, William H.

    2017-01-01

    Understanding of the capacity of the natural world to produce secondary metabolites is important to a broad range of fields, including drug discovery, ecology, biosynthesis, and chemical biology, among others. Both the absolute number and the rate of discovery of natural products have increased significantly in recent years. However, there is a perception and concern that the fundamental novelty of these discoveries is decreasing relative to previously known natural products. This study presents a quantitative examination of the field from the perspective of both number of compounds and compound novelty using a dataset of all published microbial and marine-derived natural products. This analysis aimed to explore a number of key questions, such as how the rate of discovery of new natural products has changed over the past decades, how the average natural product structural novelty has changed as a function of time, whether exploring novel taxonomic space affords an advantage in terms of novel compound discovery, and whether it is possible to estimate how close we are to having described all of the chemical space covered by natural products. Our analyses demonstrate that most natural products being published today bear structural similarity to previously published compounds, and that the range of scaffolds readily accessible from nature is limited. However, the analysis also shows that the field continues to discover appreciable numbers of natural products with no structural precedent. Together, these results suggest that the development of innovative discovery methods will continue to yield compounds with unique structural and biological properties. PMID:28461474

  10. False Discovery Control in Large-Scale Spatial Multiple Testing

    PubMed Central

    Sun, Wenguang; Reich, Brian J.; Cai, T. Tony; Guindani, Michele; Schwartzman, Armin

    2014-01-01

    This article develops a unified theoretical and computational framework for false discovery control in multiple testing of spatial signals. We consider both point-wise and cluster-wise spatial analyses, and derive oracle procedures which optimally control the false discovery rate, false discovery exceedance, and false cluster rate, respectively. A data-driven finite approximation strategy is developed to mimic the oracle procedures on a continuous spatial domain. Our multiple testing procedures are asymptotically valid and can be effectively implemented using Bayesian computational algorithms for analysis of large spatial data sets. Numerical results show that the proposed procedures lead to more accurate error control and better power performance than conventional methods. We demonstrate our methods by analyzing time trends in tropospheric ozone in the eastern US. PMID:25642138

  11. Optimal False Discovery Rate Control for Dependent Data

    PubMed Central

    Xie, Jichun; Cai, T. Tony; Maris, John; Li, Hongzhe

    2013-01-01

    This paper considers the problem of optimal false discovery rate control when the test statistics are dependent. An optimal joint oracle procedure, which minimizes the false non-discovery rate subject to a constraint on the false discovery rate, is developed. A data-driven marginal plug-in procedure is then proposed to approximate the optimal joint procedure for multivariate normal data. It is shown that the marginal procedure is asymptotically optimal for multivariate normal data with a short-range dependent covariance structure. Numerical results show that the marginal procedure controls false discovery rate and leads to a smaller false non-discovery rate than several commonly used p-value based false discovery rate controlling methods. The procedure is illustrated by an application to a genome-wide association study of neuroblastoma and it identifies a few more genetic variants that are potentially associated with neuroblastoma than several p-value-based false discovery rate controlling procedures. PMID:23378870

  12. A note on the false discovery rate of novel peptides in proteogenomics.

    PubMed

    Zhang, Kun; Fu, Yan; Zeng, Wen-Feng; He, Kun; Chi, Hao; Liu, Chao; Li, Yan-Chang; Gao, Yuan; Xu, Ping; He, Si-Min

    2015-10-15

    Proteogenomics has been well accepted as a tool to discover novel genes. In most conventional proteogenomic studies, a global false discovery rate is used to filter out false positives for identifying credible novel peptides. However, it has been found that the actual level of false positives in novel peptides is often out of control and behaves differently for different genomes. To quantitatively model this problem, we theoretically analyze the subgroup false discovery rates of annotated and novel peptides. Our analysis shows that the annotation completeness ratio of a genome is the dominant factor influencing the subgroup FDR of novel peptides. Experimental results on two real datasets of Escherichia coli and Mycobacterium tuberculosis support our conjecture. Supplementary data are available at Bioinformatics online.
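
    The core point, that a well-controlled global FDR can coexist with a much larger subgroup FDR among novel peptides, follows from simple arithmetic (the numbers below are hypothetical, not from the paper):

        # Suppose a global 1% FDR yields 10,000 accepted peptides, 100 of them false.
        # If only 200 accepted peptides map to novel (unannotated) regions, and false
        # matches fall on novel regions in proportion to the unannotated fraction of
        # the genome (assume 50%), then:
        total_false = 100
        novel_accepted = 200
        false_in_novel = total_false * 0.50            # 50 expected false novels
        subgroup_fdr_novel = false_in_novel / novel_accepted
        print(f"novel-peptide subgroup FDR = {subgroup_fdr_novel:.0%}")  # 25%, not 1%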

  13. The Paradox of Equipoise: The Principle That Drives and Limits Therapeutic Discoveries in Clinical Research

    PubMed Central

    Djulbegovic, Benjamin

    2009-01-01

    Background: Progress in clinical medicine relies on the willingness of patients to take part in experimental clinical trials, particularly randomized controlled trials (RCTs). Before agreeing to enroll in clinical trials, patients require guarantees that they will not knowingly be harmed and will have the best possible chances of receiving the most favorable treatments. This guarantee is provided by the acknowledgment of uncertainty (equipoise), which removes ethical dilemmas and makes it easier for patients to enroll in clinical trials. Methods: Since the design of clinical trials is mostly affected by clinical equipoise, the “clinical equipoise hypothesis” has been postulated. If the uncertainty requirement holds, this means that investigators cannot predict what they are going to discover in any individual trial that they undertake. In some instances, new treatments will be superior to standard treatments, while in others, standard treatments will be superior to experimental treatments, and in still others, no difference will be detected between new and standard treatments. It is hypothesized that there must be a relationship between the overall pattern of treatment successes and the uncertainties that RCTs are designed to address. Results: An analysis of published trials shows that the results cannot be predicted at the level of individual trials. However, the results also indicate that the overall pattern of discovery of treatment success across a series of trials is predictable and is consistent with the clinical equipoise hypothesis. The analysis shows that we can discover no more than 25% to 50% of successful treatments when they are tested in RCTs. The analysis also indicates that this discovery rate is optimal in helping to preserve the clinical trial system; a high discovery rate (e.g., a 90% to 100% probability of success) is neither feasible nor desirable since under these circumstances, neither the patient nor the researcher has an interest in randomization. This in turn would halt the RCT system as we know it. Conclusions: The “principle or law of clinical discovery” described herein predicts the efficiency of the current system of RCTs at generating discoveries of new treatments. The principle is derived from the requirement for uncertainty or equipoise as a precondition for RCTs, the precept that paradoxically drives discoveries of new treatments while limiting the proportion and rate of new therapeutic discoveries. PMID:19910921

  14. The Drug Discovery and Development Industry in India—Two Decades of Proprietary Small‐Molecule R&D

    PubMed Central

    2017-01-01

    This review provides a comprehensive survey of proprietary drug discovery and development efforts performed by Indian companies between 1994 and mid‐2016. It is based on the identification and detailed analysis of pharmaceutical, biotechnology, and contract research companies active in proprietary new chemical entity (NCE) research and development (R&D) in India. Information on preclinical and clinical development compounds was collected by company, therapeutic indication, mode of action, target class, and development status. The analysis focuses on the overall pipeline and its evolution over two decades, contributions by type of company, therapeutic focus, attrition rates, and contribution to Western pharmaceutical pipelines through licensing agreements. This comprehensive analysis is the first of its kind, and, in our view, represents a significant contribution to the understanding of the current state of the drug discovery and development industry in India. PMID:28464443

  15. Cloud-based solution to identify statistically significant MS peaks differentiating sample categories.

    PubMed

    Ji, Jun; Ling, Jeffrey; Jiang, Helen; Wen, Qiaojun; Whitin, John C; Tian, Lu; Cohen, Harvey J; Ling, Xuefeng B

    2013-03-23

    Mass spectrometry (MS) has evolved to become the primary high-throughput tool for proteomics-based biomarker discovery. Until now, multiple challenges in protein MS data analysis remain: management of large-scale, complex data sets; MS peak identification and indexing; and high-dimensional differential peak analysis with concurrent statistical-test-based false discovery rate (FDR) control. "Turnkey" solutions are needed for biomarker investigations to rapidly process MS data sets and identify statistically significant peaks for subsequent validation. Here we present an efficient and effective solution that provides experimental biologists easy access to "cloud" computing capabilities for analyzing MS data. The web portal can be accessed at http://transmed.stanford.edu/ssa/. The web application supports large-scale online MS data upload and analysis through a simple user interface. This bioinformatic tool will facilitate the discovery of potential protein biomarkers using MS.

  16. The Drug Discovery and Development Industry in India-Two Decades of Proprietary Small-Molecule R&D.

    PubMed

    Differding, Edmond

    2017-06-07

    This review provides a comprehensive survey of proprietary drug discovery and development efforts performed by Indian companies between 1994 and mid-2016. It is based on the identification and detailed analysis of pharmaceutical, biotechnology, and contract research companies active in proprietary new chemical entity (NCE) research and development (R&D) in India. Information on preclinical and clinical development compounds was collected by company, therapeutic indication, mode of action, target class, and development status. The analysis focuses on the overall pipeline and its evolution over two decades, contributions by type of company, therapeutic focus, attrition rates, and contribution to Western pharmaceutical pipelines through licensing agreements. This comprehensive analysis is the first of its kind, and, in our view, represents a significant contribution to the understanding of the current state of the drug discovery and development industry in India.

  17. The variable sky of deep synoptic surveys

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Ridgway, Stephen T.; Matheson, Thomas; Mighell, Kenneth J.

    2014-11-20

    The discovery of variable and transient sources is an essential product of synoptic surveys. The alert stream will require filtering for personalized criteria—a process managed by a functionality commonly described as a Broker. In order to understand quantitatively the magnitude of the alert generation and Broker tasks, we have undertaken an analysis of the most numerous types of variable targets in the sky—Galactic stars, quasi-stellar objects (QSOs), active galactic nuclei (AGNs), and asteroids. It is found that the Large Synoptic Survey Telescope (LSST) will be capable of discovering ∼10^5 high latitude (|b| > 20°) variable stars per night at the beginning of the survey. (The corresponding number for |b| < 20° is orders of magnitude larger, but subject to caveats concerning extinction and crowding.) However, the number of new discoveries may well drop below 100 per night within less than one year. The same analysis applied to GAIA clarifies the complementarity of the GAIA and LSST surveys. Discoveries of AGNs and QSOs are each predicted to begin at ∼3000 per night and decrease by 50 times over four years. Supernovae are expected at ∼1100 per night, and after several survey years will dominate the new variable discovery rate. LSST asteroid discoveries will start at >10^5 per night, and if orbital determination has a 50% success rate per epoch, they will drop below 1000 per night within two years.

  18. New field discovery rates in lower 48 states

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Woods, T.J.; Hugman, R.; Vidas, H.

    1989-03-01

    Through 1982, AAPG reported new field discovery rates. In 1985, a paper demonstrated that through 1975 the AAPG survey of new field discoveries had significantly underreported the larger new field discoveries. This presentation updates the new field discovery data reported in that paper and extends the data through the mid-1980s. Regional details of the new field discoveries, including an explicit breakout of discoveries below 15,000 ft, are reported. The extent to which the observed relative stabilization in new field discoveries per wildcat reflects regional shifts in exploration activity is discussed. Finally, the rate of reserve growth reflected in the passage of particular fields through the AAPG field size categories is discussed.

  19. The Secant Rate of Corrosion: Correlating Observations of the USS Arizona Submerged in Pearl Harbor

    NASA Astrophysics Data System (ADS)

    Johnson, Donald L.; DeAngelis, Robert J.; Medlin, Dana J.; Johnson, Jon E.; Carr, James D.; Conlin, David L.

    2018-03-01

    Contrary to previous linear projections of steel corrosion in seawater, analysis of an inert marker embedded in USS Arizona concretion since the 7 December 1941 attack on Pearl Harbor reveals evidence that the effective corrosion rate decreases with time. The secant rate of corrosion, or SRC correlation, derived from this discovery could have a significant impact on failure analysis investigations for concreted shipwrecks or underwater structures. The correlation yields a lower rate of metal thinning than predicted. Development of the correlation is described.

  20. Enhancing the detection of barcoded reads in high throughput DNA sequencing data by controlling the false discovery rate.

    PubMed

    Buschmann, Tilo; Zhang, Rong; Brash, Douglas E; Bystrykh, Leonid V

    2014-08-07

    DNA barcodes are short unique sequences used to label DNA or RNA-derived samples in multiplexed deep sequencing experiments. During the demultiplexing step, barcodes must be detected and their position identified. In some cases (e.g., with PacBio SMRT), the position of the barcode and DNA context is not well defined. Many reads start inside the genomic insert so that adjacent primers might be missed. The matter is further complicated by coincidental similarities between barcode sequences and reference DNA. Therefore, a robust strategy is required in order to detect barcoded reads and avoid a large number of false positives or negatives. For mass inference problems such as this one, false discovery rate (FDR) methods are powerful and balanced solutions. Since existing FDR methods cannot be applied to this particular problem, we present an adapted FDR method that is suitable for the detection of barcoded reads as well as suggest possible improvements. In our analysis, barcode sequences showed high rates of coincidental similarities with the Mus musculus reference DNA. This problem became more acute when the length of the barcode sequence decreased and the number of barcodes in the set increased. The method presented in this paper controls the tail area-based false discovery rate to distinguish between barcoded and unbarcoded reads. This method helps to establish the highest acceptable minimal distance between reads and barcode sequences. In a proof of concept experiment we correctly detected barcodes in 83% of the reads with a precision of 89%. Sensitivity improved to 99% at 99% precision when the adjacent primer sequence was incorporated in the analysis. The analysis was further improved using a paired end strategy. Following an analysis of the data for sequence variants induced in the Atp1a1 gene of C57BL/6 murine melanocytes by ultraviolet light and conferring resistance to ouabain, we found no evidence of cross-contamination of DNA material between samples. Our method offers a proper quantitative treatment of the problem of detecting barcoded reads in a noisy sequencing environment. It is based on the false discovery rate statistics that allows a proper trade-off between sensitivity and precision to be chosen.
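
    A simplified sketch of the distance-threshold idea using shuffled decoy barcodes (our illustration only; the paper controls a tail area-based FDR with a more careful null model, and this toy assumes barcodes share one length and sit at the read start):

        import numpy as np

        def hamming(a, b):
            return sum(x != y for x, y in zip(a, b))

        def barcode_call_fdr(reads, barcodes, max_dist, seed=0):
            """Estimate the FDR of calling reads 'barcoded' at a distance cutoff."""
            rng = np.random.default_rng(seed)
            k = len(barcodes[0])
            decoys = ["".join(rng.permutation(list(b))) for b in barcodes]  # shuffled decoys
            def hits(refs):
                return sum(min(hamming(r[:k], ref) for ref in refs) <= max_dist
                           for r in reads)
            target_hits, decoy_hits = hits(barcodes), hits(decoys)
            return min(decoy_hits / max(target_hits, 1), 1.0)

    Scanning max_dist from 0 upward and keeping the estimated FDR below a target reproduces, in spirit, the "highest acceptable minimal distance" selection the abstract describes.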

  1. Knowledge Discovery from Posts in Online Health Communities Using Unified Medical Language System.

    PubMed

    Chen, Donghua; Zhang, Runtong; Liu, Kecheng; Hou, Lei

    2018-06-19

    Patient-reported posts in Online Health Communities (OHCs) contain a wealth of valuable information that can help establish knowledge-based support services for online patients. However, utilizing these posts to improve online patient services in the absence of appropriate medical and healthcare expert knowledge is difficult. Thus, we propose a comprehensive knowledge discovery method based on the Unified Medical Language System for the analysis of narrative posts in OHCs. First, we propose a domain-knowledge support framework for OHCs to provide a basis for post analysis. Second, we develop a Knowledge-Involved Topic Modeling (KI-TM) method to extract and expand explicit knowledge within the text. We propose four metrics, namely, explicit knowledge rate, latent knowledge rate, knowledge correlation rate, and perplexity, for the evaluation of the KI-TM method. Our experimental results indicate that our proposed method outperforms existing methods in terms of providing knowledge support. Our method enhances knowledge support for online patients and can help develop intelligent OHCs in the future.

  2. Observed oil and gas field size distributions: A consequence of the discovery process and prices of oil and gas

    USGS Publications Warehouse

    Drew, L.J.; Attanasi, E.D.; Schuenemeyer, J.H.

    1988-01-01

    If observed oil and gas field size distributions are obtained by random sampling, the fitted distributions should approximate that of the parent population of oil and gas fields. However, empirical evidence strongly suggests that larger fields tend to be discovered earlier in the discovery process than they would be by random sampling. Economic factors also can limit the number of small fields that are developed and reported. This paper examines observed size distributions in state and federal waters of offshore Texas. Results of the analysis demonstrate how the shape of the observable size distributions changes with significant hydrocarbon price changes. Comparison of state and federal observed size distributions in the offshore area shows how production cost differences also affect the shape of the observed size distribution. Methods for modifying the discovery rate estimation procedures when economic factors significantly affect the discovery sequence are presented. A primary conclusion of the analysis is that, because hydrocarbon price changes can significantly affect the observed discovery size distribution, one should not be confident about inferring the form and specific parameters of the parent field size distribution from the observed distributions. © 1988 International Association for Mathematical Geology.

  3. Steps and Pips in the History of the Cumulative Recorder

    ERIC Educational Resources Information Center

    Lattal, Kennon A.

    2004-01-01

    From its inception in the 1930s until very recent times, the cumulative recorder was the most widely used measurement instrument in the experimental analysis of behavior. It was an essential instrument in the discovery and analysis of schedules of reinforcement, providing the first real-time analysis of operant response rates and patterns. This…

  4. Glioblastoma: Vascular Habitats Detected at Preoperative Dynamic Susceptibility-weighted Contrast-enhanced Perfusion MR Imaging Predict Survival.

    PubMed

    Juan-Albarracín, Javier; Fuster-Garcia, Elies; Pérez-Girbés, Alexandre; Aparici-Robles, Fernando; Alberich-Bayarri, Ángel; Revert-Ventura, Antonio; Martí-Bonmatí, Luis; García-Gómez, Juan M

    2018-06-01

    Purpose: To determine if preoperative vascular heterogeneity of glioblastoma is predictive of overall survival of patients undergoing standard-of-care treatment by using an unsupervised multiparametric perfusion-based habitat-discovery algorithm. Materials and Methods: Preoperative magnetic resonance (MR) imaging including dynamic susceptibility-weighted contrast material-enhanced perfusion studies in 50 consecutive patients with glioblastoma were retrieved. Perfusion parameters of glioblastoma were analyzed and used to automatically draw four reproducible habitats that describe the tumor vascular heterogeneity: high-angiogenic and low-angiogenic regions of the enhancing tumor, potentially tumor-infiltrated peripheral edema, and vasogenic edema. Kaplan-Meier and Cox proportional hazard analyses were conducted to assess the prognostic potential of the hemodynamic tissue signature to predict patient survival. Results: Cox regression analysis yielded a significant correlation between patients' survival and maximum relative cerebral blood volume (rCBVmax) and maximum relative cerebral blood flow (rCBFmax) in high-angiogenic and low-angiogenic habitats (P < .01, false discovery rate-corrected P < .05). Moreover, rCBFmax in the potentially tumor-infiltrated peripheral edema habitat was also significantly correlated (P < .05, false discovery rate-corrected P < .05). Kaplan-Meier analysis demonstrated significant differences between the observed survival of populations divided according to the median of the rCBVmax or rCBFmax at the high-angiogenic and low-angiogenic habitats (log-rank test P < .05, false discovery rate-corrected P < .05), with an average survival increase of 230 days. Conclusion: Preoperative perfusion heterogeneity contains relevant information about overall survival in patients who undergo standard-of-care treatment. The hemodynamic tissue signature method automatically describes this heterogeneity, providing a set of vascular habitats with high prognostic capabilities.

  5. Search strategy has influenced the discovery rate of human viruses.

    PubMed

    Rosenberg, Ronald; Johansson, Michael A; Powers, Ann M; Miller, Barry R

    2013-08-20

    A widely held concern is that the pace of infectious disease emergence has been increasing. We have analyzed the rate of discovery of pathogenic viruses, the preeminent source of newly discovered causes of human disease, from 1897 through 2010. The rate was highest during 1950-1969, after which it moderated. This general picture masks two distinct trends: for arthropod-borne viruses, which comprised 39% of pathogenic viruses, the discovery rate peaked at three per year during 1960-1969, but subsequently fell nearly to zero by 1980; however, the rate of discovery of nonarboviruses remained stable at about two per year from 1950 through 2010. The period of highest arbovirus discovery coincided with a comprehensive program supported by The Rockefeller Foundation of isolating viruses from humans, animals, and arthropod vectors at field stations in Latin America, Africa, and India. The productivity of this strategy illustrates the importance of location, approach, long-term commitment, and sponsorship in the discovery of emerging pathogens.

  6. Compound annotation with real time cellular activity profiles to improve drug discovery.

    PubMed

    Fang, Ye

    2016-01-01

    In the past decade, a range of innovative strategies have been developed to improve the productivity of pharmaceutical research and development. In particular, compound annotation, combined with informatics, has provided unprecedented opportunities for drug discovery. In this review, a literature search from 2000 to 2015 was conducted to provide an overview of the compound annotation approaches currently used in drug discovery. Based on this, a framework related to a compound annotation approach using real-time cellular activity profiles for probe, drug, and biology discovery is proposed. Compound annotation with chemical structure, drug-like properties, bioactivities, genome-wide effects, clinical phenotypes, and textual abstracts has received significant attention in early drug discovery. However, these annotations are mostly associated with endpoint results. Advances in assay techniques have made it possible to obtain real-time cellular activity profiles of drug molecules under different phenotypes, so it is possible to generate compound annotation with real-time cellular activity profiles. Combining compound annotation with informatics, such as similarity analysis, presents a good opportunity to improve the rate of discovery of novel drugs and probes, and enhance our understanding of the underlying biology.

  7. Comparative mass spectrometry-based metabolomics strategies for the investigation of microbial secondary metabolites.

    PubMed

    Covington, Brett C; McLean, John A; Bachmann, Brian O

    2017-01-04

    Covering: 2000 to 2016. The labor-intensive process of microbial natural product discovery is contingent upon identifying discrete secondary metabolites of interest within complex biological extracts, which contain inventories of all extractable small molecules produced by an organism or consortium. Historically, compound isolation prioritization has been driven by observed biological activity and/or relative metabolite abundance and followed by dereplication via accurate mass analysis. Decades of discovery using variants of these methods has generated the natural pharmacopeia but also contributes to recent high rediscovery rates. However, genomic sequencing reveals substantial untapped potential in previously mined organisms, and can provide useful prescience of potentially new secondary metabolites that ultimately enables isolation. Recently, advances in comparative metabolomics analyses have been coupled to secondary metabolic predictions to accelerate bioactivity and abundance-independent discovery work flows. In this review we will discuss the various analytical and computational techniques that enable MS-based metabolomic applications to natural product discovery and discuss the future prospects for comparative metabolomics in natural product discovery.

  8. SFC/MS in drug discovery at Pfizer, La Jolla

    NASA Astrophysics Data System (ADS)

    Bolaños, Ben; Greig, Michael; Ventura, Manuel; Farrell, William; Aurigemma, Christine M.; Li, Haitao; Quenzer, Terri L.; Tivel, Kathleen; Bylund, Jessica M. R.; Tran, Phuong; Pham, Catherine; Phillipson, Doug

    2004-11-01

    We report the use of supercritical fluid chromatography/mass spectrometry (SFC/MS) for numerous applications in drug discovery at Pfizer, La Jolla. Namely, SFC/MS has been heavily relied upon for analysis and purification of a diverse set of compounds from the in-house chemical library. High-speed SFC/MS quality control of the purified compounds is made possible by high-flow-rate SFC combined with time-of-flight mass detection. The flexibility of SFC/MS systems has been extended with the integration of an atmospheric pressure photoionization (APPI) source for use with more nonpolar compounds and for enhanced signal-to-noise ratios. Further SFC/MS applications of note include chiral analysis for purification and assessment of enantiomers and SFC/MS analysis of difficult-to-separate hydrophobic peptides.

  9. Discrete False-Discovery Rate Improves Identification of Differentially Abundant Microbes.

    PubMed

    Jiang, Lingjing; Amir, Amnon; Morton, James T; Heller, Ruth; Arias-Castro, Ery; Knight, Rob

    2017-01-01

    Differential abundance testing is a critical task in microbiome studies that is complicated by the sparsity of data matrices. Here we adapt for microbiome studies a solution from the field of gene expression analysis to produce a new method, discrete false-discovery rate (DS-FDR), that greatly improves the power to detect differential taxa by exploiting the discreteness of the data. Additionally, DS-FDR is relatively robust to the number of noninformative features, and thus removes the problem of filtering taxonomy tables by an arbitrary abundance threshold. We show by using a combination of simulations and reanalysis of nine real-world microbiome data sets that this new method outperforms existing methods at the differential abundance testing task, producing a false-discovery rate that is up to threefold more accurate and halving the number of samples required to find a given difference (thus increasing the efficiency of microbiome experiments considerably). We therefore expect DS-FDR to be widely applied in microbiome studies. IMPORTANCE: DS-FDR can achieve higher statistical power to detect significant findings in sparse and noisy microbiome data compared to the commonly used Benjamini-Hochberg procedure and other FDR-controlling procedures.
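
    The discreteness-aware thresholding can be sketched as a scan over the observed statistic values, keeping the permutation-estimated FDR below a target (our simplification of the DS-FDR idea; names and signatures are ours):

        import numpy as np

        def dsfdr_threshold(stats_obs, stats_null, q=0.1):
            """Smallest statistic cutoff whose permutation-estimated FDR is <= q.

            stats_obs:  (n_features,) observed test statistics (larger = stronger).
            stats_null: (n_perm, n_features) statistics under permuted labels.
            """
            for t in np.sort(np.unique(stats_obs)):                # scan discrete cutoffs
                r = (stats_obs >= t).sum()                         # observed rejections
                v = (stats_null >= t).sum() / stats_null.shape[0]  # mean null rejections
                if r > 0 and v / r <= q:
                    return t
            return None                                            # no cutoff meets the target

    Because only statistic values that actually occur are considered, sparse, discrete data do not pay for p-value granularity they cannot attain, which is the source of the power gain the abstract reports.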

  10. False-positive rate determination of protein target discovery using a covalent modification- and mass spectrometry-based proteomics platform.

    PubMed

    Strickland, Erin C; Geer, M Ariel; Hong, Jiyong; Fitzgerald, Michael C

    2014-01-01

    Detection and quantitation of protein-ligand binding interactions is important in many areas of biological research. Stability of proteins from rates of oxidation (SPROX) is an energetics-based technique for identifying the proteins targets of ligands in complex biological mixtures. Knowing the false-positive rate of protein target discovery in proteome-wide SPROX experiments is important for the correct interpretation of results. Reported here are the results of a control SPROX experiment in which chemical denaturation data is obtained on the proteins in two samples that originated from the same yeast lysate, as would be done in a typical SPROX experiment except that one sample would be spiked with the test ligand. False-positive rates of 1.2-2.2% and <0.8% are calculated for SPROX experiments using Q-TOF and Orbitrap mass spectrometer systems, respectively. Our results indicate that the false-positive rate is largely determined by random errors associated with the mass spectral analysis of the isobaric mass tag (e.g., iTRAQ®) reporter ions used for peptide quantitation. Our results also suggest that technical replicates can be used to effectively eliminate such false positives that result from this random error, as is demonstrated in a SPROX experiment to identify yeast protein targets of the drug, manassantin A. The impact of ion purity in the tandem mass spectral analyses and of background oxidation on the false-positive rate of protein target discovery using SPROX is also discussed.

  11. Exploring Site-Specific N-Glycosylation Microheterogeneity of Haptoglobin using Glycopeptide CID Tandem Mass Spectra and Glycan Database Search

    PubMed Central

    Chandler, Kevin Brown; Pompach, Petr; Goldman, Radoslav

    2013-01-01

    Glycosylation is a common protein modification with a significant role in many vital cellular processes and human diseases, making the characterization of protein-attached glycan structures important for understanding cell biology and disease processes. Direct analysis of protein N-glycosylation by tandem mass spectrometry of glycopeptides promises site-specific elucidation of N-glycan microheterogeneity, something which detached N-glycan and de-glycosylated peptide analyses cannot provide. However, successful implementation of direct N-glycopeptide analysis by tandem mass spectrometry remains a challenge. In this work, we consider algorithmic techniques for the analysis of LC-MS/MS data acquired from glycopeptide-enriched fractions of enzymatic digests of purified proteins. We implement a computational strategy which takes advantage of the properties of CID fragmentation spectra of N-glycopeptides, matching the MS/MS spectra to peptide-glycan pairs from protein sequences and glycan structure databases. Significantly, we also propose a novel false-discovery-rate estimation technique to estimate and manage the number of false identifications. We use a human glycoprotein standard, haptoglobin, digested with trypsin and GluC, enriched for glycopeptides using HILIC chromatography, and analyzed by LC-MS/MS to demonstrate our algorithmic strategy and evaluate its performance. Our software, GlycoPeptideSearch (GPS), assigned glycopeptide identifications to 246 of the spectra at false-discovery-rate 5.58%, identifying 42 distinct haptoglobin peptide-glycan pairs at each of the four haptoglobin N-linked glycosylation sites. We further demonstrate the effectiveness of this approach by analyzing plasma-derived haptoglobin, identifying 136 N-linked glycopeptide spectra at false-discovery-rate 0.4%, representing 15 distinct glycopeptides on at least three of the four N-linked glycosylation sites. The software, GlycoPeptideSearch, is available for download from http://edwardslab.bmcb.georgetown.edu/GPS. PMID:23829323

  12. COMPASS: a suite of pre- and post-search proteomics software tools for OMSSA

    PubMed Central

    Wenger, Craig D.; Phanstiel, Douglas H.; Lee, M. Violet; Bailey, Derek J.; Coon, Joshua J.

    2011-01-01

    Here we present the Coon OMSSA Proteomic Analysis Software Suite (COMPASS): a free and open-source software pipeline for high-throughput analysis of proteomics data, designed around the Open Mass Spectrometry Search Algorithm. We detail a synergistic set of tools for protein database generation, spectral reduction, peptide false discovery rate analysis, peptide quantitation via isobaric labeling, protein parsimony and protein false discovery rate analysis, and protein quantitation. We strive for maximum ease of use, utilizing graphical user interfaces and working with data files in the original instrument vendor format. Results are stored in plain text comma-separated values files, which are easy to view and manipulate with a text editor or spreadsheet program. We illustrate the operation and efficacy of COMPASS through the use of two LC–MS/MS datasets. The first is a dataset of a highly annotated mixture of standard proteins and manually validated contaminants that exhibits the identification workflow. The second is a dataset of yeast peptides, labeled with isobaric stable isotope tags and mixed in known ratios, to demonstrate the quantitative workflow. For these two datasets, COMPASS performs equivalently or better than the current de facto standard, the Trans-Proteomic Pipeline. PMID:21298793

  13. Working with Data: Discovering Knowledge through Mining and Analysis; Systematic Knowledge Management and Knowledge Discovery; Text Mining; Methodological Approach in Discovering User Search Patterns through Web Log Analysis; Knowledge Discovery in Databases Using Formal Concept Analysis; Knowledge Discovery with a Little Perspective.

    ERIC Educational Resources Information Center

    Qin, Jian; Jurisica, Igor; Liddy, Elizabeth D.; Jansen, Bernard J; Spink, Amanda; Priss, Uta; Norton, Melanie J.

    2000-01-01

    These six articles discuss knowledge discovery in databases (KDD). Topics include data mining; knowledge management systems; applications of knowledge discovery; text and Web mining; text mining and information retrieval; user search patterns through Web log analysis; concept analysis; data collection; and data structure inconsistency. (LRW)

  14. On the Discovery of Evolving Truth

    PubMed Central

    Li, Yaliang; Li, Qi; Gao, Jing; Su, Lu; Zhao, Bo; Fan, Wei; Han, Jiawei

    2015-01-01

    In the era of big data, information regarding the same objects can be collected from increasingly more sources. Unfortunately, there usually exist conflicts among the information coming from different sources. To tackle this challenge, truth discovery, i.e., to integrate multi-source noisy information by estimating the reliability of each source, has emerged as a hot topic. In many real world applications, however, the information may come sequentially, and as a consequence, the truth of objects as well as the reliability of sources may be dynamically evolving. Existing truth discovery methods, unfortunately, cannot handle such scenarios. To address this problem, we investigate the temporal relations among both object truths and source reliability, and propose an incremental truth discovery framework that can dynamically update object truths and source weights upon the arrival of new data. Theoretical analysis is provided to show that the proposed method is guaranteed to converge at a fast rate. The experiments on three real world applications and a set of synthetic data demonstrate the advantages of the proposed method over state-of-the-art truth discovery methods. PMID:26705502
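
    The batch version of the underlying principle, alternating between inferring truths from weighted votes and re-weighting sources by their agreement with those truths, can be sketched as follows (a toy for binary claims with a weighting scheme of our own choosing; the paper's contribution is the incremental update as new data arrive, which is not shown):

        import numpy as np

        def truth_discovery(claims, n_iter=20):
            """Toy iterative truth discovery for binary claims.

            claims: (n_sources, n_objects) 0/1 matrix, no missing entries.
            Returns estimated truths and per-source reliability weights.
            """
            weights = np.ones(claims.shape[0])
            for _ in range(n_iter):
                votes = weights @ claims / weights.sum()       # weighted vote per object
                truths = (votes >= 0.5).astype(float)          # truth step
                errors = np.abs(claims - truths).mean(axis=1)  # each source's error rate
                weights = -np.log(np.clip(errors, 1e-6, 1 - 1e-6))  # reliable -> heavier
            return truths, weights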

  15. Palomar Planet-Crossing Asteroid Survey (PCAS): Recent discovery rate

    NASA Technical Reports Server (NTRS)

    Helin, Eleanor F.

    1992-01-01

    The discovery rate of Near-Earth Asteroids (NEA's) has increased significantly in the last decade. As greater numbers of NEA's are discovered, worldwide interest has grown, leading to new programs. With the introduction of CCD telescopes throughout the world, an increase of 1-2 orders of magnitude in the discovery rate can be anticipated. Nevertheless, it will take several decades of dedicated searching to achieve 95 percent completeness, even for large objects.

  16. Structure-function analysis of diacylglycerol acyltransferase sequences for metabolic engineering and drug discovery

    USDA-ARS?s Scientific Manuscript database

    Diacylglycerol acyltransferase families (DGATs) catalyze the final and rate-limiting step of triacylglycerol (TAG) biosynthesis in eukaryotic organisms. DGAT knockout mice are resistant to diet-induced obesity and lack milk secretion. Over-expression of DGATs increases TAG in plants. Therefore, unde...

  17. A statistical method for the conservative adjustment of false discovery rate (q-value).

    PubMed

    Lai, Yinglei

    2017-03-14

    The q-value is a widely used statistical method for estimating the false discovery rate (FDR), which is a conventional significance measure in the analysis of genome-wide expression data. The q-value is a random variable and it may underestimate the FDR in practice. An underestimated FDR can lead to unexpected false discoveries in follow-up validation experiments. This issue has not been well addressed in the literature, especially when a permutation procedure is necessary for p-value calculation. We propose a statistical method for the conservative adjustment of the q-value. In practice, it is usually necessary to calculate p-values by a permutation procedure; this is also considered in our adjustment method. We used simulation data as well as experimental microarray and sequencing data to illustrate the usefulness of our method. The conservativeness of our approach has been mathematically confirmed in this study. We have demonstrated the importance of conservative adjustment of the q-value, particularly in the situation where the proportion of differentially expressed genes is small or the overall differential expression signal is weak.
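
    For reference, the standard q-value computation that the proposed adjustment builds on can be sketched as follows (with pi0 = 1 as the maximally conservative plug-in; the paper's adjustment of the estimated q-value itself is not reproduced here):

        import numpy as np

        def qvalues(pvals, pi0=1.0):
            """Step-up q-values; pi0 = 1 gives the conservative BH-style estimate."""
            p = np.asarray(pvals)
            m = p.size
            order = np.argsort(p)
            raw = p[order] * m * pi0 / np.arange(1, m + 1)     # pi0 * p_(i) * m / i
            q = np.minimum.accumulate(raw[::-1])[::-1]         # enforce monotonicity
            out = np.empty(m)
            out[order] = np.minimum(q, 1.0)
            return out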

  18. A pleiotropy-informed Bayesian false discovery rate adapted to a shared control design finds new disease associations from GWAS summary statistics.

    PubMed

    Liley, James; Wallace, Chris

    2015-02-01

    Genome-wide association studies (GWAS) have been successful in identifying single nucleotide polymorphisms (SNPs) associated with many traits and diseases. However, at existing sample sizes, these variants explain only part of the estimated heritability. Leverage of GWAS results from related phenotypes may improve detection without the need for larger datasets. The Bayesian conditional false discovery rate (cFDR) constitutes an upper bound on the expected false discovery rate (FDR) across a set of SNPs whose p values for two diseases are both less than two disease-specific thresholds. Calculation of the cFDR requires only summary statistics and has several advantages over traditional GWAS analysis. However, existing methods require distinct control samples between studies. Here, we extend the technique to allow for some or all controls to be shared, increasing applicability. Several different SNP sets can be defined with the same cFDR value, and we show that the expected FDR across the union of these sets may exceed the expected FDR in any single set. We describe a procedure to establish an upper bound for the expected FDR among the union of such sets of SNPs. We apply our technique to pairwise analysis of p values from ten autoimmune diseases with variable sharing of controls, enabling discovery of 59 SNP-disease associations that do not reach GWAS significance after genomic control in individual datasets. Most of the SNPs we highlight have previously been confirmed using replication studies or larger GWAS, a useful validation of our technique; we report eight SNP-disease associations across five diseases not previously declared. Our technique extends and strengthens the previous algorithm and establishes robust limits on the expected FDR. This approach can improve SNP detection in GWAS and give insight into shared aetiology between phenotypically related conditions.
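
    The empirical cFDR estimate at the center of this approach can be sketched as follows (our illustration of the standard estimator; the paper's shared-control adjustment and the upper bound across SNP-set unions are not shown):

        import numpy as np

        def cfdr(p1, p2):
            """Empirical cFDR for study 1 conditional on study 2, per SNP:

            cFDR(i) ~ p1[i] * #{j: p2[j] <= p2[i]} / #{j: p1[j] <= p1[i] and p2[j] <= p2[i]}
            """
            p1, p2 = np.asarray(p1), np.asarray(p2)
            out = np.empty(p1.size)
            for i in range(p1.size):
                cond = p2 <= p2[i]                      # SNPs at least as extreme in study 2
                joint = ((p1 <= p1[i]) & cond).sum()    # ... and in study 1
                out[i] = min(p1[i] * cond.sum() / max(joint, 1), 1.0)
            return out

    Intuitively, a SNP with a modest p1 is promoted when SNPs with comparable p2 values are strongly enriched for small p1 values, which is how pleiotropy information enters the analysis.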

  19. Monitoring Growth of Hard Corals as Performance Indicators for Coral Reefs

    ERIC Educational Resources Information Center

    Crabbe, M. James; Karaviotis, Sarah; Smith, David J.

    2004-01-01

    Digital videophotography, computer image analysis and physical measurements have been used to monitor sedimentation rates, coral cover, genera richness, rugosity, and estimated recruitment dates of massive corals at three different sites in the Wakatobi Marine National Park, Indonesia, and on the reefs around Discovery Bay, Jamaica.…

  20. 45 CFR 304.20 - Availability and rate of Federal financial participation.

    Code of Federal Regulations, 2014 CFR

    2014-10-01

    ... development of evidence including the use of the polygraph and genetic tests; (C) Pre-trial discovery; (ii... regulations having the effect of law; (iii) Identifying competent laboratories that perform genetic tests as... transporting blood and other samples of genetic material, repeated testing when necessary, analysis of test...

  1. 45 CFR 304.20 - Availability and rate of Federal financial participation.

    Code of Federal Regulations, 2012 CFR

    2012-10-01

    ... development of evidence including the use of the polygraph and genetic tests; (C) Pre-trial discovery; (ii... regulations having the effect of law; (iii) Identifying competent laboratories that perform genetic tests as... transporting blood and other samples of genetic material, repeated testing when necessary, analysis of test...

  2. 45 CFR 304.20 - Availability and rate of Federal financial participation.

    Code of Federal Regulations, 2011 CFR

    2011-10-01

    ... development of evidence including the use of the polygraph and genetic tests; (C) Pre-trial discovery; (ii... regulations having the effect of law; (iii) Identifying competent laboratories that perform genetic tests as... transporting blood and other samples of genetic material, repeated testing when necessary, analysis of test...

  3. Characterization and correction of the false-discovery rates in resting state connectivity using functional near-infrared spectroscopy

    NASA Astrophysics Data System (ADS)

    Santosa, Hendrik; Aarabi, Ardalan; Perlman, Susan B.; Huppert, Theodore J.

    2017-05-01

    Functional near-infrared spectroscopy (fNIRS) is a noninvasive neuroimaging technique that uses low levels of red to near-infrared light to measure changes in cerebral blood oxygenation. Spontaneous (resting state) functional connectivity (sFC) has become a critical tool for cognitive neuroscience for understanding task-independent neural networks, revealing pertinent details differentiating healthy from disordered brain function, and discovering fluctuations in the synchronization of interacting individuals during hyperscanning paradigms. Two of the main challenges to sFC-NIRS analysis are (i) the slow temporal structure of both systemic physiology and the response of blood vessels, which introduces spurious correlations, and (ii) motion-related artifacts that result from movement of the fNIRS sensors on the participants' heads and can introduce non-normal and heavy-tailed noise structures. In this work, we systematically examine the false-discovery rates of several time- and frequency-domain metrics of functional connectivity for characterizing sFC-NIRS. Specifically, we detail the modifications to the statistical models of these methods needed to avoid high levels of false discovery related to these two sources of noise in fNIRS. We compare these analysis procedures using both simulated and experimental resting-state fNIRS data. Our proposed robust correlation method performs better in that it is more resilient to noise outliers caused by motion artifacts.

  4. Simulation of effect of anti-radar stealth principle

    NASA Astrophysics Data System (ADS)

    Zhao, Borao; Xing, Shuchen; Li, Chunyi

    1988-02-01

    The paper presents simulation methods and results for the anti-radar stealth principle, showing that anti-radar stealth aircraft can drastically reduce the combat efficiency of an air defense radar system. In particular, when anti-radar stealth aircraft are coordinated with jamming as a self-defense soft weapon, the discovery probability, response time and hit rate of the air defense radar system are much lower, and the jamming power and maximum exposure distance required for self-defense and long-range support are greatly reduced. The paper describes an assumed combat situation and the construction of a calculation model for the aircraft survival rate, as well as simulation results and analysis. Four figures show an enemy bomber attacking an airfield, as well as the effects of the radar effective reflecting surface on discovery probability, guidance radius, aircraft survival and exposure distance (for long-range support and jamming).

  5. Testing jumps via false discovery rate control.

    PubMed

    Yen, Yu-Min

    2013-01-01

    Many recently developed nonparametric jump tests can be viewed as multiple hypothesis testing problems. For such multiple hypothesis tests, it is well known that controlling the type I error often yields a large proportion of erroneous rejections, and the situation becomes even worse when jump occurrence is a rare event. To obtain more reliable results, we aim to control the false discovery rate (FDR), an efficient compound error measure for erroneous rejections in multiple testing problems. We perform the test via the Barndorff-Nielsen and Shephard (BNS) test statistic and control the FDR with the Benjamini and Hochberg (BH) procedure. We provide asymptotic results for the FDR control. Through simulations, we examine the relevant theoretical results and demonstrate the advantages of controlling the FDR. The hybrid approach is then applied to an empirical analysis of two benchmark stock indices with high frequency data.
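
    The BH procedure the record applies is simple enough to state in a few lines. The sketch below is a minimal NumPy implementation of the step-up rule (reject the k smallest p-values, where k is the largest index with p_(k) <= (k/m) * alpha); the function name is an assumption, not from the paper.

        import numpy as np

        def bh_reject(p, alpha=0.05):
            # Step-up rule: find the largest k with p_(k) <= (k / m) * alpha
            p = np.asarray(p, float)
            m = p.size
            order = np.argsort(p)
            below = p[order] <= alpha * np.arange(1, m + 1) / m
            k = np.max(np.nonzero(below)[0]) + 1 if below.any() else 0
            reject = np.zeros(m, dtype=bool)
            reject[order[:k]] = True   # reject the k smallest p-values
            return reject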

  6. Statistical detection of EEG synchrony using empirical Bayesian inference.

    PubMed

    Singh, Archana K; Asoh, Hideki; Takeda, Yuji; Phillips, Steven

    2015-01-01

    There is growing interest in understanding how the brain utilizes synchronized oscillatory activity to integrate information across functionally connected regions. Computing phase-locking values (PLV) between EEG signals is a popular method for quantifying such synchronizations and elucidating their role in cognitive tasks. However, high dimensionality in PLV data incurs a serious multiple testing problem. Standard multiple testing methods in neuroimaging research (e.g., false discovery rate, FDR) suffer severe loss of power, because they fail to exploit the complex dependence structure between hypotheses that vary in the spectral, temporal and spatial dimensions. Previously, we showed that hierarchical FDR and optimal discovery procedures could be effectively applied to PLV analysis to provide better power than FDR. In this article, we revisit the multiple comparison problem from a new empirical Bayes perspective and propose the application of the local FDR method (locFDR; Efron, 2001) for PLV synchrony analysis, which computes the FDR as a posterior probability that an observed statistic belongs to the null hypothesis. We demonstrate the application of Efron's empirical Bayes approach to PLV synchrony analysis for the first time. We use simulations to validate the specificity and sensitivity of locFDR, and a real EEG dataset from a visual search study for experimental validation. We also compare locFDR with hierarchical FDR and optimal discovery procedures in both simulation and experimental analyses. Our simulation results show that locFDR can effectively control false positives without compromising the power of PLV synchrony inference. Applying locFDR to the experimental data detected more significant discoveries than our previously proposed methods, whereas the standard FDR method failed to detect any significant discoveries.
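
    Efron's two-group local FDR that the record applies can be sketched compactly: fdr(z) = pi0 * f0(z) / f(z), where f0 is a null density and f is the density of all observed statistics. The version below uses the theoretical N(0,1) null and a kernel density estimate for f; the paper's empirical-null machinery is not reproduced, and pi0 = 1 is a deliberately conservative default.

        import numpy as np
        from scipy.stats import norm, gaussian_kde

        def local_fdr(z, pi0=1.0):
            # fdr(z) = pi0 * f0(z) / f(z); values are capped at 1
            z = np.asarray(z, float)
            f = gaussian_kde(z)(z)   # mixture density of all z-scores
            f0 = norm.pdf(z)         # theoretical N(0,1) null density
            return np.minimum(pi0 * f0 / f, 1.0)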

  7. SPIRE: Systematic protein investigative research environment.

    PubMed

    Kolker, Eugene; Higdon, Roger; Morgan, Phil; Sedensky, Margaret; Welch, Dean; Bauman, Andrew; Stewart, Elizabeth; Haynes, Winston; Broomall, William; Kolker, Natali

    2011-12-10

    SPIRE (Systematic Protein Investigative Research Environment) provides web-based experiment-specific mass spectrometry (MS) proteomics analysis (https://www.proteinspire.org). Its emphasis is on usability and integration of the best analytic tools. SPIRE provides an easy-to-use web interface and generates results in both interactive and simple data formats. In contrast to run-based approaches, SPIRE conducts the analysis based on the experimental design. It employs novel methods to generate false discovery rates and local false discovery rates (FDR, LFDR) and integrates the best and complementary open-source search and data analysis methods. The SPIRE approach of integrating X!Tandem, OMSSA and SpectraST can produce an increase in protein IDs (52-88%) over current combinations of scoring and single search engines while also providing accurate multi-faceted error estimation. One of SPIRE's primary assets is combining the results with data on protein function, pathways and protein expression from model organisms. We demonstrate some of SPIRE's capabilities by analyzing mitochondrial proteins from the wild type and 3 mutants of C. elegans. SPIRE also connects results to publicly available proteomics data through its Model Organism Protein Expression Database (MOPED). SPIRE can also provide analysis and annotation for user-supplied protein ID and expression data. Copyright © 2011. Published by Elsevier B.V.

  8. OGLE ATLAS OF CLASSICAL NOVAE. II. MAGELLANIC CLOUDS

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Mróz, P.; Udalski, A.; Poleski, R.

    2016-01-15

    The population of classical novae in the Magellanic Clouds was poorly known because of a lack of systematic studies. There were some suggestions that nova rates per unit mass in the Magellanic Clouds were higher than in any other galaxy. Here, we present an analysis of data collected over 16 years by the OGLE survey with the aim of characterizing the nova population in the Clouds. We found 20 eruptions of novae, half of which are new discoveries. We robustly measure nova rates of 2.4 ± 0.8 yr⁻¹ (LMC) and 0.9 ± 0.4 yr⁻¹ (SMC) and confirm that the K-band luminosity-specific nova rates in both Clouds are 2–3 times higher than in other galaxies. This can be explained by the star formation history in the Magellanic Clouds, specifically the re-ignition of the star formation rate a few Gyr ago. We also present the discovery of the intriguing system OGLE-MBR133.25.1160, which mimics recurrent nova eruptions.

  9. An Integrated Microfluidic Processor for DNA-Encoded Combinatorial Library Functional Screening

    PubMed Central

    2017-01-01

    DNA-encoded synthesis is rekindling interest in combinatorial compound libraries for drug discovery and in technology for automated and quantitative library screening. Here, we disclose a microfluidic circuit that enables functional screens of DNA-encoded compound beads. The device carries out library bead distribution into picoliter-scale assay reagent droplets, photochemical cleavage of compound from the bead, assay incubation, laser-induced fluorescence-based assay detection, and fluorescence-activated droplet sorting to isolate hits. DNA-encoded compound beads (10-μm diameter) displaying a photocleavable positive control inhibitor pepstatin A were mixed (1920 beads, 729 encoding sequences) with negative control beads (58 000 beads, 1728 encoding sequences) and screened for cathepsin D inhibition using a biochemical enzyme activity assay. The circuit sorted 1518 hit droplets for collection following 18 min incubation over a 240 min analysis. Visual inspection of a subset of droplets (1188 droplets) yielded a 24% false discovery rate (1166 pepstatin A beads; 366 negative control beads). Using template barcoding strategies, it was possible to count hit collection beads (1863) using next-generation sequencing data. Bead-specific barcodes enabled replicate counting, and the false discovery rate was reduced to 2.6% by only considering hit-encoding sequences that were observed on >2 beads. This work represents a complete distributable small molecule discovery platform, from microfluidic miniaturized automation to ultrahigh-throughput hit deconvolution by sequencing. PMID:28199790

  10. An Integrated Microfluidic Processor for DNA-Encoded Combinatorial Library Functional Screening.

    PubMed

    MacConnell, Andrew B; Price, Alexander K; Paegel, Brian M

    2017-03-13

    DNA-encoded synthesis is rekindling interest in combinatorial compound libraries for drug discovery and in technology for automated and quantitative library screening. Here, we disclose a microfluidic circuit that enables functional screens of DNA-encoded compound beads. The device carries out library bead distribution into picoliter-scale assay reagent droplets, photochemical cleavage of compound from the bead, assay incubation, laser-induced fluorescence-based assay detection, and fluorescence-activated droplet sorting to isolate hits. DNA-encoded compound beads (10-μm diameter) displaying a photocleavable positive control inhibitor pepstatin A were mixed (1920 beads, 729 encoding sequences) with negative control beads (58 000 beads, 1728 encoding sequences) and screened for cathepsin D inhibition using a biochemical enzyme activity assay. The circuit sorted 1518 hit droplets for collection following 18 min incubation over a 240 min analysis. Visual inspection of a subset of droplets (1188 droplets) yielded a 24% false discovery rate (1166 pepstatin A beads; 366 negative control beads). Using template barcoding strategies, it was possible to count hit collection beads (1863) using next-generation sequencing data. Bead-specific barcodes enabled replicate counting, and the false discovery rate was reduced to 2.6% by only considering hit-encoding sequences that were observed on >2 beads. This work represents a complete distributable small molecule discovery platform, from microfluidic miniaturized automation to ultrahigh-throughput hit deconvolution by sequencing.

  11. Global gene expression in channel catfish after vaccination with an attenuated Edwardsiella ictaluri

    USDA-ARS?s Scientific Manuscript database

    To understand the global gene expression in channel catfish after immersion vaccination with an attenuated Edwardsiella ictaluri (AquaVac ESCTM), microarray analysis of 65,182 UniGene transcripts was performed. With a filter of false-discovery rate less than 0.05 and fold change greater than 2, a t...

  12. Near-Earth asteroid discovery rate review

    NASA Technical Reports Server (NTRS)

    Helin, Eleanor F.

    1991-01-01

    Fifteen to twenty years ago, the discovery of 1 or 2 Near Earth Asteroids (NEAs) per year was typical: one or two from the systematic search program, the Palomar Planet Crossing Asteroid Survey (PCAS), plus incidental discoveries from a variety of other astronomical programs. Sky coverage and magnitude were both limited by slower emulsions, requiring longer exposures. The 1970's sky coverage of 15,000 to 25,000 sq. deg. per year led to about 1 NEA discovery every 13,000 sq. deg. Looking at the years from 1987 through 1990, and comparing 1987/1988 with 1989/1990, the worldwide discovery rate of NEAs went from 20 to 43. More specifically, PCAS's results, when grouped into the two-year periods, show an increase from 5 discoveries in the first period to 20 in the second, a fourfold increase; PCAS discoveries also went from about 25 pct. to about 50 pct. of the world total. The surge of discoveries enjoyed by PCAS in particular is attributed to new fine-grain sensitive emulsions, film hypering, more uniformity in the quality of the photographs, more equitable scheduling, better weather, and coordination of efforts. The maximum discovery rate appears to have been attained with the Palomar Schmidt.

  13. An extended sequential goodness-of-fit multiple testing method for discrete data.

    PubMed

    Castro-Conde, Irene; Döhler, Sebastian; de Uña-Álvarez, Jacobo

    2017-10-01

    The sequential goodness-of-fit (SGoF) multiple testing method has recently been proposed as an alternative to the familywise error rate- and the false discovery rate-controlling procedures in high-dimensional problems. For discrete data, the SGoF method may be very conservative. In this paper, we introduce an alternative SGoF-type procedure that takes into account the discreteness of the test statistics. Like the original SGoF, our new method provides weak control of the false discovery rate/familywise error rate but attains false discovery rate levels closer to the desired nominal level, and thus it is more powerful. We study the performance of this method in a simulation study and illustrate its application to a real pharmacovigilance data set.

  14. Handling Neighbor Discovery and Rendezvous Consistency with Weighted Quorum-Based Approach

    PubMed Central

    Own, Chung-Ming; Meng, Zhaopeng; Liu, Kehan

    2015-01-01

    Neighbor discovery and the power of sensors play an important role in the formation of Wireless Sensor Networks (WSNs) and mobile networks. Many asynchronous protocols based on wake-up time scheduling have been proposed to enable neighbor discovery among neighboring nodes for energy saving, especially given the difficulty of clock synchronization. Existing research on neighbor discovery divides into two families of methods: quorum-based protocols and co-primality-based protocols. They differ in how active time slots are arranged: the former selects quorums in a matrix, while the latter relies on numerical analysis. In our study, we propose the weighted heuristic quorum system (WQS), which builds on the quorum algorithm to eliminate redundant paths of active slots. We demonstrate the properties of our system: fewer active slots are required, the referring rate is balanced, and remaining power is taken into account, particularly when a device maintains rendezvous with discovered neighbors. The evaluation results showed that our proposed method can effectively reschedule the active slots and save computing time in the network system. PMID:26404297

  15. Forecasting petroleum discoveries in sparsely drilled areas: Nigeria and the North Sea

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Attanasi, E.D.; Root, D.H.

    1988-10-01

    Decline function methods for projecting future discoveries generally capture the crowding effects of wildcat wells on the discovery rate. However, these methods do not accommodate easily situations where exploration areas and horizons are expanding. In this paper, a method is presented that uses a mapping algorithm for separating these often countervailing influences. The method is applied to Nigeria and the North Sea. For an amount of future drilling equivalent to past drilling (825 wildcat wells), future discoveries (in resources found) for Nigeria are expected to decline by 68% per well but still amount to 8.5 billion barrels of oil equivalent (BOE). Similarly, for the total North Sea for an equivalent amount and mix among areas of past drilling (1322 wildcat wells), future discoveries are expected to amount to 17.9 billion BOE, whereas the average discovery rate per well is expected to decline by 71%.

  16. Forecasting petroleum discoveries in sparsely drilled areas: Nigeria and the North Sea

    USGS Publications Warehouse

    Attanasi, E.D.; Root, D.H.

    1988-01-01

    Decline function methods for projecting future discoveries generally capture the crowding effects of wildcat wells on the discovery rate. However, these methods do not accommodate easily situations where exploration areas and horizons are expanding. In this paper, a method is presented that uses a mapping algorithm for separating these often countervailing influences. The method is applied to Nigeria and the North Sea. For an amount of future drilling equivalent to past drilling (825 wildcat wells), future discoveries (in resources found) for Nigeria are expected to decline by 68% per well but still amount to 8.5 billion barrels of oil equivalent (BOE). Similarly, for the total North Sea for an equivalent amount and mix among areas of past drilling (1322 wildcat wells), future discoveries are expected to amount to 17.9 billion BOE, whereas the average discovery rate per well is expected to decline by 71%. © 1988 International Association for Mathematical Geology.

  17. Bioelectrical Markers of ADHD: Enhancement of Direct EEG Analysis

    ERIC Educational Resources Information Center

    Martín-Brufau, Ramón; Nombela Gómez, Manuel

    2017-01-01

    Introduction: Several methods to aid the diagnosis of ADHD have been proposed, grounded in EEG decomposition by the FFT method and the relationships discovered between different frequency bands, the most clarifying being the TBR rate in the prefrontal regions. This procedure requires complex gadgetry, so we evaluate the advantages of a simple…

  18. An effect size filter improves the reproducibility in spectral counting-based comparative proteomics.

    PubMed

    Gregori, Josep; Villarreal, Laura; Sánchez, Alex; Baselga, José; Villanueva, Josep

    2013-12-16

    The microarray community has shown that the low reproducibility observed in gene expression-based biomarker discovery studies is partially due to relying solely on p-values to generate the lists of differentially expressed genes. Their conclusions recommended complementing the p-value cutoff with the use of effect-size criteria. The aim of this work was to evaluate the influence of such an effect-size filter on spectral counting-based comparative proteomic analysis. The results proved that the filter increased the number of true positives and decreased the number of false positives and the false discovery rate of the dataset. These results were confirmed by simulation experiments in which the effect-size filter was used to systematically evaluate variable fractions of differentially expressed proteins. Our results suggest that relaxing the p-value cut-off followed by a post-test filter based on effect-size and signal-level thresholds can increase the reproducibility of the statistical results obtained in comparative proteomic analysis. Based on our work, we recommend using a filter consisting of a minimum absolute log2 fold change of 0.8 and a minimum signal of 2-4 SpC in the most abundant condition for the general practice of comparative proteomics. Quality control analysis of microarray-based gene expression studies pointed out that the low reproducibility observed in the lists of differentially expressed genes could be partially attributed to the fact that these lists are generated relying solely on p-values. Our study establishes that the implementation of an effect-size post-test filter improves the statistical results of spectral count-based quantitative proteomics: the filter increased the number of true positives while decreasing the false positives and the false discovery rate of the datasets. A post-test filter applying reasonable effect-size and signal-level thresholds thus helps to increase the reproducibility of statistical results in comparative proteomic analysis, and the implementation of feature-filtering approaches could improve proteomic biomarker discovery initiatives by increasing the reproducibility of results obtained among independent laboratories and MS platforms. This article is part of a Special Issue entitled: Standardization and Quality Control in Proteomics. Copyright © 2013 Elsevier B.V. All rights reserved.
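
    The recommended post-test filter is easy to express in code. The sketch below applies the thresholds quoted in the record (minimum absolute log2 fold change of 0.8 and a minimum of 2 spectral counts in the more abundant condition) after a relaxed p-value cut; the function name and the exact combination rule are assumptions for illustration.

        import numpy as np

        def effect_size_filter(pvals, log2fc, spc_max, p_cut=0.05,
                               min_abs_log2fc=0.8, min_spc=2):
            # Keep features that pass the (possibly relaxed) p-value cut-off
            # AND show both a minimum effect size and a minimum signal level.
            # spc_max: spectral counts in the more abundant condition.
            keep = (np.asarray(pvals) <= p_cut) \
                 & (np.abs(np.asarray(log2fc)) >= min_abs_log2fc) \
                 & (np.asarray(spc_max) >= min_spc)
            return keep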

  19. Nova Discovery Efficiency 1890-2014; Only 43%±6% of the Brightest Nova Are Discovered

    NASA Astrophysics Data System (ADS)

    Schaefer, Bradley E.

    2014-06-01

    Galactic nova discovery has always been the domain of the best amateur astronomers, with the only substantial exception being the use of the Harvard plates from 1890-1947. (Modern CCD surveys have not produced any significant nova discoveries.) From 1890-1946, novae were discovered by gentlemen who deeply knew the stars in the sky and who checked for new stars on every clear night. This all changed when war surplus binoculars became commonly available, so the various organizations (e.g., AAVSO, BAA) instructed their hunters to use binoculars to regularly search small areas of the Milky Way. In the 1970s the hunters largely switched to blinking photographs, while they switched to CCD images in the 1990s, all exclusively in Milky Way regions. Currently, most hunters use 'go-to' scopes to look deeply only in the Milky Way, use weekly or monthly cadences, never go outside to look up at the light-polluted skies, and do not have the stars memorized at all. This situation is good for catching many faint novae, but is inefficient for catching the more isotropic and systematically-fast bright novae. I have made an exhaustive analysis of all known novae to isolate the effects on the relative discovery efficiency as a function of decade, the elongation from the Sun, the Moon's phase, the declination, the peak magnitude, and the duration of the peak. For example, the relative efficiency for novae south of declination -33° is 0.5 before 1953, 0.2 from 1953-1990, and 0.8 after 1990. My analysis gives the overall discovery efficiency as 43%±6%, 30%, 22%, 12%, and 6% for novae peaking brighter than 2, 4, 6, 8, and 10 mag. Thus, the majority of first magnitude novae are being missed. The bright novae are lost because they are too close to the Sun, in the far south, and/or very fast. This is illustrated by the discovery rate for Vpeak<2 novae being once every five years before 1946, yet only one such nova (V1500 Cyg) has been seen in the last 68 years. A critical consequence of this result is that the true nova rate for our Milky Way must be roughly double the observed rate.

  20. Implementation of false discovery rate for exploring novel paradigms and trait dimensions with ERPs.

    PubMed

    Crowley, Michael J; Wu, Jia; McCreary, Scott; Miller, Kelly; Mayes, Linda C

    2012-01-01

    False discovery rate (FDR) is a multiple comparison procedure that targets the expected proportion of false discoveries among the discoveries. Employing FDR methods in event-related potential (ERP) research provides an approach for exploring new ERP paradigms and ERP-psychological trait/behavior relations. In Study 1, we examined neural responses to escape behavior from an aversive noise. In Study 2, we correlated a relatively unexplored trait dimension, ostracism, with neural response. In both situations we focused on the frontal cortical region, applying channel-by-time plots to display statistically significant uncorrected data and FDR-corrected data, controlling for multiple comparisons.

  1. POWER-ENHANCED MULTIPLE DECISION FUNCTIONS CONTROLLING FAMILY-WISE ERROR AND FALSE DISCOVERY RATES.

    PubMed

    Peña, Edsel A; Habiger, Joshua D; Wu, Wensong

    2011-02-01

    Improved procedures, in terms of smaller missed discovery rates (MDR), for performing multiple hypotheses testing with weak and strong control of the family-wise error rate (FWER) or the false discovery rate (FDR) are developed and studied. The improvement over existing procedures such as the Šidák procedure for FWER control and the Benjamini-Hochberg (BH) procedure for FDR control is achieved by exploiting possible differences in the powers of the individual tests. Results signal the need to take into account the powers of the individual tests and to have multiple hypotheses decision functions which are not limited to simply using the individual p-values, as is the case, for example, with the Šidák, Bonferroni, or BH procedures. They also enhance understanding of the role of the powers of individual tests, or more precisely the receiver operating characteristic (ROC) functions of decision processes, in the search for better multiple hypotheses testing procedures. A decision-theoretic framework is utilized, and through auxiliary randomizers the procedures could be used with discrete or mixed-type data or with rank-based nonparametric tests. This is in contrast to existing p-value based procedures whose theoretical validity is contingent on each of these p-value statistics being stochastically equal to or greater than a standard uniform variable under the null hypothesis. Proposed procedures are relevant in the analysis of high-dimensional "large M, small n" data sets arising in the natural, physical, medical, economic and social sciences, whose generation and creation is accelerated by advances in high-throughput technology, notably, but not limited to, microarray technology.

  2. Concordant integrative gene set enrichment analysis of multiple large-scale two-sample expression data sets.

    PubMed

    Lai, Yinglei; Zhang, Fanni; Nayak, Tapan K; Modarres, Reza; Lee, Norman H; McCaffrey, Timothy A

    2014-01-01

    Gene set enrichment analysis (GSEA) is an important approach to the analysis of coordinate expression changes at a pathway level. Although many statistical and computational methods have been proposed for GSEA, the issue of a concordant integrative GSEA of multiple expression data sets has not been well addressed. Among different related data sets collected for the same or similar study purposes, it is important to identify pathways or gene sets with concordant enrichment. We categorize the underlying true states of differential expression into three representative categories: no change, positive change and negative change. Due to data noise, what we observe from experiments may not indicate the underlying truth. Although these categories are not observed in practice, they can be considered in a mixture model framework. Then, we define the mathematical concept of concordant gene set enrichment and calculate its related probability based on a three-component multivariate normal mixture model. The related false discovery rate can be calculated and used to rank different gene sets. We used three published lung cancer microarray gene expression data sets to illustrate our proposed method. One analysis based on the first two data sets was conducted to compare our result with a previous published result based on a GSEA conducted separately for each individual data set. This comparison illustrates the advantage of our proposed concordant integrative gene set enrichment analysis. Then, with a relatively new and larger pathway collection, we used our method to conduct an integrative analysis of the first two data sets and also all three data sets. Both results showed that many gene sets could be identified with low false discovery rates. A consistency between both results was also observed. A further exploration based on the KEGG cancer pathway collection showed that a majority of these pathways could be identified by our proposed method. This study illustrates that we can improve detection power and discovery consistency through a concordant integrative analysis of multiple large-scale two-sample gene expression data sets.

  3. Discovery of potential protein biomarkers of lung adenocarcinoma in bronchoalveolar lavage fluid by SWATH MS data-independent acquisition and targeted data extraction.

    PubMed

    Ortea, I; Rodríguez-Ariza, A; Chicano-Gálvez, E; Arenas Vacas, M S; Jurado Gámez, B

    2016-04-14

    Lung cancer currently ranks as the neoplasia with the highest global mortality rate. Although some improvements have been introduced in recent years, new advances in diagnosis are required in order to increase survival rates. New mildly invasive endoscopy-based diagnostic techniques include the collection of bronchoalveolar lavage fluid (BALF), which is discarded after using a portion of the fluid for standard pathological procedures. BALF proteomic analysis can contribute to clinical practice with more sensitive biomarkers, and can complement cytohistological studies by aiding in the diagnosis, prognosis, and subtyping of lung cancer, as well as the monitoring of treatment response. The range of quantitative proteomics methodologies used for biomarker discovery is currently being broadened with the introduction of data-independent acquisition (DIA) analysis-related approaches that address the massive quantitation of the components of a proteome. Here we report for the first time a DIA-based quantitative proteomics study using BALF as the source for the discovery of potential lung cancer biomarkers. The results have been encouraging in terms of the number of identified and quantified proteins. A panel of candidate protein biomarkers for adenocarcinoma in BALF is reported; this points to the activation of the complement network as being strongly over-represented and suggests this pathway as a potential target for lung cancer research. In addition, the results reported for haptoglobin, complement C4-A, and glutathione S-transferase pi are consistent with previous studies, which indicates that these proteins deserve further consideration as potential lung cancer biomarkers in BALF. Our study demonstrates that the analysis of BALF proteins by liquid chromatography-tandem mass spectrometry (LC-MS/MS), combining a simple sample pre-treatment and SWATH DIA MS, is a useful method for the discovery of potential lung cancer biomarkers. Bronchoalveolar lavage fluid (BALF) analysis can contribute to clinical practice with more sensitive biomarkers, thus complementing cytohistological studies in order to aid in the diagnosis, prognosis, and subtyping of lung cancer, as well as the monitoring of treatment response. Here we report a panel of candidate protein biomarkers for adenocarcinoma in BALF. Forty-four proteins showed a fold-change higher than 3.75 among adenocarcinoma patients compared with controls. This report is the first DIA-based quantitative proteomics study to use bronchoalveolar lavage fluid (BALF) as a matrix for discovering potential biomarkers. The results are encouraging in terms of the number of identified and quantified proteins, demonstrating that the analysis of BALF proteins by a SWATH approach is a useful method for the discovery of potential biomarkers of pulmonary diseases. Copyright © 2016 Elsevier B.V. All rights reserved.

  4. Maximizing the sensitivity and reliability of peptide identification in large-scale proteomic experiments by harnessing multiple search engines.

    PubMed

    Yu, Wen; Taylor, J Alex; Davis, Michael T; Bonilla, Leo E; Lee, Kimberly A; Auger, Paul L; Farnsworth, Chris C; Welcher, Andrew A; Patterson, Scott D

    2010-03-01

    Despite recent advances in qualitative proteomics, the automatic identification of peptides with optimal sensitivity and accuracy remains a difficult goal. To address this deficiency, a novel algorithm, Multiple Search Engines, Normalization and Consensus, is described. The method employs six search engines and a re-scoring engine to search MS/MS spectra against protein and decoy sequences. After the peptide hits from each engine are normalized to error rates estimated from the decoy hits, peptide assignments are deduced using a minimum consensus model. These assignments are produced at a series of progressively relaxed false-discovery rates, thus enabling a comprehensive interpretation of the data set. Additionally, the estimated false-discovery rate was found to have good concordance with the observed false-positive rate calculated from known identities. Benchmarking against standard protein data sets (ISBv1, sPRG2006) and their published analyses demonstrated that the Multiple Search Engines, Normalization and Consensus algorithm consistently achieved significantly higher sensitivity in peptide identifications, which led to increased or more robust protein identifications in all data sets compared with prior methods. The sensitivity and the false-positive rate of peptide identification exhibit an inverse-proportional and linear relationship with the number of participating search engines.
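
    The minimum-consensus step described here reduces to a voting rule once each engine's hits have been filtered to a common decoy-estimated error rate. The sketch below shows only that voting step, with hypothetical names; the per-engine score normalization that precedes it in the record is assumed to have already been done.

        from collections import defaultdict

        def consensus_psms(engine_hits, min_engines=2):
            # engine_hits: dict mapping engine name -> set of (spectrum_id, peptide)
            # pairs, each set already filtered to a common decoy-estimated error rate.
            votes = defaultdict(int)
            for hits in engine_hits.values():
                for psm in hits:
                    votes[psm] += 1
            # keep assignments reported by at least `min_engines` engines
            return {psm for psm, n in votes.items() if n >= min_engines}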

  5. Toward a Quantitative Theory of Intellectual Discovery (Especially in Physics).

    ERIC Educational Resources Information Center

    Fowler, Richard G.

    1987-01-01

    Studies time intervals in a list of critical ideas in physics. Infers that the rate of growth of ideas has been proportional to the totality of known ideas multiplied by the totality of people in the world. Indicates that the rate of discovery in physics has been decreasing. (CW)

  6. The petroleum exponential (again)

    NASA Astrophysics Data System (ADS)

    Bell, Peter M.

    The U.S. production and reserves of liquid and gaseous petroleum have declined since 1960, at least in the lower 48 states. This decline stems from decreased discovery rates, as predicted by M. King Hubbert in the mid-1950's. Hubbert's once unpopular views were based on statistical analysis of the production history of the petroleum industry, and now, even with inclusion of the statistical perturbation caused by the Prudhoe Bay-North Alaskan Slope discovery (the largest oil field ever found in the United States), it seems clear again that production is following the exponential curve to depletion of the resource, to the end of the ultimate yield of petroleum from wells in the United States. In a recent report, C. Hall and C. Cleveland of Cornell University show that large atypical discoveries, such as the Prudhoe Bay find, are but minor influences on what now appears to be the crucial intersection of two exponentials [Science, 211, 576-579, 1981]: one, the production-per-drilled-foot curve of Hubbert, which crosses zero production no later than the year 2005; the other, a curve that plots the energy cost of drilling and extraction over time, that is, the cost-time rate of how much oil is used to drill and extract oil from the ground. The intersection, if no other discoveries the size of the Prudhoe Bay field are made, could come as early as 1990, the end of the present decade. The inclusion of each Prudhoe-Bay-size find extends the year of intersection by only about 6 years. Beyond that point, more than one barrel of petroleum would be expended for each barrel extracted from the ground. The oil exploration-extraction and refining industry is currently the second most energy-intensive industry in the U.S., and the message seems clear: either more efficient drilling and production techniques are discovered, or domestic production will cease well before the end of this century, if the Hubbert analysis as modified by Hall and Cleveland is correct.

  7. Mass spectrometry of peptides and proteins from human blood.

    PubMed

    Zhu, Peihong; Bowden, Peter; Zhang, Du; Marshall, John G

    2011-01-01

    It is difficult to convey the accelerating rate and growing importance of mass spectrometry applications to human blood proteins and peptides. Mass spectrometry can rapidly detect and identify the ionizable peptides from the proteins in a simple mixture and reveal many of their post-translational modifications. However, blood is a complex mixture that may contain many proteins first expressed in cells and tissues. The complete analysis of blood proteins is a daunting task that will rely on a wide range of disciplines from physics, chemistry, biochemistry, genetics, electromagnetic instrumentation, mathematics and computation. Therefore the comprehensive discovery and analysis of blood proteins will rank among the great technical challenges and require the cumulative sum of many of mankind's scientific achievements. A variety of methods have been used to fractionate, analyze and identify proteins from blood, each yielding a small piece of the whole and throwing the great size of the task into sharp relief. The approaches attempted to date clearly indicate that enumerating the proteins and peptides of blood can be accomplished. There is no doubt that the mass spectrometry of blood will be crucial to the discovery and analysis of proteins, enzyme activities, and post-translational processes that underlie the mechanisms of disease. At present both discovery and quantification of proteins from blood are commonly reaching sensitivities of ∼1 ng/mL. Copyright © 2010 Wiley Periodicals, Inc.

  8. Characterization of a Genomic Signature of Pregnancy in the Breast

    PubMed Central

    Belitskaya-Lévy, Ilana; Zeleniuch-Jacquotte, Anne; Russo, Jose; Russo, Irma H.; Bordás, Pal; Åhman, Janet; Afanasyeva, Yelena; Johansson, Robert; Lenner, Per; Li, Xiaochun; de Cicco, Ricardo López; Peri, Suraj; Ross, Eric; Russo, Patricia A.; Santucci-Pereira, Julia; Sheriff, Fathima S.; Slifker, Michael; Hallmans, Göran; Toniolo, Paolo; Arslan, Alan A.

    2012-01-01

    The objective of the current study was to comprehensively compare the genomic profiles in the breast of parous and nulliparous postmenopausal women to identify genes that permanently change their expression following pregnancy. The study was designed as a two-phase approach. In the discovery phase, we compared breast genomic profiles of 37 parous with 18 nulliparous postmenopausal women. In the validation phase, confirmation of the genomic patterns observed in the discovery phase was sought in an independent set of 30 parous and 22 nulliparous postmenopausal women. RNA was hybridized to Affymetrix HG_U133 Plus 2.0 oligonucleotide arrays containing probes to 54,675 transcripts; the arrays were scanned and the images analyzed using Affymetrix GCOS software. Surrogate variable analysis, logistic regression and significance analysis for microarrays were used to identify statistically significant differences in expression of genes. The False Discovery Rate (FDR) approach was used to control for multiple comparisons. We found that 208 genes (305 probe sets) were differentially expressed between parous and nulliparous women in both discovery and validation phases of the study at an FDR of 10% and with at least a 1.25-fold change. These genes are involved in regulation of transcription, centrosome organization, RNA splicing, cell cycle control, adhesion and differentiation. The results provide persuasive evidence that full-term pregnancy induces long-term genomic changes in the breast. The genomic signature of pregnancy could be used as an intermediate marker to assess potential chemopreventive interventions with hormones mimicking the effects of pregnancy for prevention of breast cancer. PMID:21622728

  9. MAGIC database and interfaces: an integrated package for gene discovery and expression.

    PubMed

    Cordonnier-Pratt, Marie-Michèle; Liang, Chun; Wang, Haiming; Kolychev, Dmitri S; Sun, Feng; Freeman, Robert; Sullivan, Robert; Pratt, Lee H

    2004-01-01

    The rapidly increasing rate at which biological data is being produced requires a corresponding growth in relational databases and associated tools that can help laboratories contend with that data. With this need in mind, we describe here a Modular Approach to a Genomic, Integrated and Comprehensive (MAGIC) Database. This Oracle 9i database derives from an initial focus in our laboratory on gene discovery via production and analysis of expressed sequence tags (ESTs), and subsequently on gene expression as assessed by both EST clustering and microarrays. The MAGIC Gene Discovery portion of the database focuses on information derived from DNA sequences and on its biological relevance. In addition to MAGIC SEQ-LIMS, which is designed to support activities in the laboratory, it contains several additional subschemas. The latter include MAGIC Admin for database administration, MAGIC Sequence for sequence processing as well as sequence and clone attributes, MAGIC Cluster for the results of EST clustering, MAGIC Polymorphism in support of microsatellite and single-nucleotide-polymorphism discovery, and MAGIC Annotation for electronic annotation by BLAST and BLAT. The MAGIC Microarray portion is a MIAME-compliant database with two components at present. These are MAGIC Array-LIMS, which makes possible remote entry of all information into the database, and MAGIC Array Analysis, which provides data mining and visualization. Because all aspects of interaction with the MAGIC Database are via a web browser, it is ideally suited not only for individual research laboratories but also for core facilities that serve clients at any distance.

  10. Coordinate based random effect size meta-analysis of neuroimaging studies.

    PubMed

    Tench, C R; Tanasescu, Radu; Constantinescu, C S; Auer, D P; Cottam, W J

    2017-06-01

    Low power in neuroimaging studies can make them difficult to interpret, and coordinate based meta-analysis (CBMA) may go some way toward mitigating this issue. CBMA has been used in many analyses to detect where published functional MRI or voxel-based morphometry studies testing similar hypotheses report significant summary results (coordinates) consistently. Only the reported coordinates and possibly t statistics are analysed, and statistical significance of clusters is determined by coordinate density. Here a method of performing coordinate based random effect size meta-analysis and meta-regression is introduced. The algorithm (ClusterZ) analyses both coordinates and reported t statistic or Z score, standardised by the number of subjects. Statistical significance is determined not by coordinate density, but by random effects meta-analyses of reported effects performed cluster-wise using standard statistical methods and taking account of censoring inherent in the published summary results. Type I error control is achieved using the false cluster discovery rate (FCDR), which is based on the false discovery rate. This controls both the family wise error rate under the null hypothesis that coordinates are randomly drawn from a standard stereotaxic space, and the proportion of significant clusters that are expected under the null. Such control is necessary to avoid propagating and even amplifying the very issues motivating the meta-analysis in the first place. ClusterZ is demonstrated on both numerically simulated data and on real data from reports of grey matter loss in multiple sclerosis (MS) and syndromes suggestive of MS, and of painful stimulus in healthy controls. The software implementation is available to download and use freely. Copyright © 2017 Elsevier Inc. All rights reserved.

  11. Target-decoy Based False Discovery Rate Estimation for Large-scale Metabolite Identification.

    PubMed

    Wang, Xusheng; Jones, Drew R; Shaw, Timothy I; Cho, Ji-Hoon; Wang, Yuanyuan; Tan, Haiyan; Xie, Boer; Zhou, Suiping; Li, Yuxin; Peng, Junmin

    2018-05-23

    Metabolite identification is a crucial step in mass spectrometry (MS)-based metabolomics. However, it is still challenging to assess the confidence of assigned metabolites. In this study, we report a novel method for estimating false discovery rate (FDR) of metabolite assignment with a target-decoy strategy, in which the decoys are generated through violating the octet rule of chemistry by adding small odd numbers of hydrogen atoms. The target-decoy strategy was integrated into JUMPm, an automated metabolite identification pipeline for large-scale MS analysis, and was also evaluated with two other metabolomics tools, mzMatch and mzMine 2. The reliability of FDR calculation was examined by false datasets, which were simulated by altering MS1 or MS2 spectra. Finally, we used the JUMPm pipeline coupled with the target-decoy strategy to process unlabeled and stable-isotope labeled metabolomic datasets. The results demonstrate that the target-decoy strategy is a simple and effective method for evaluating the confidence of high-throughput metabolite identification.
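
    The target-decoy estimate this record adapts to metabolomics has a compact generic form: sort all assignments by score and, at each threshold, estimate the FDR among accepted targets as the ratio of accepted decoys to accepted targets. The sketch below is that generic recipe, assuming one decoy per target; JUMPm's actual scoring and octet-rule decoy construction are not reproduced, and the function name is an assumption.

        import numpy as np

        def target_decoy_fdr(scores, is_decoy):
            # Sort hits best-score-first, then estimate FDR at each threshold
            # as (#decoys accepted) / (#targets accepted).
            order = np.argsort(scores)[::-1]
            decoy = np.asarray(is_decoy, bool)[order]
            n_decoy = np.cumsum(decoy)
            n_target = np.cumsum(~decoy)
            fdr = n_decoy / np.maximum(n_target, 1)
            # q-value style monotone correction: best achievable FDR at each rank
            qval = np.minimum.accumulate(fdr[::-1])[::-1]
            return order, qval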

  12. Thermochemical Analysis of Neutralization Reactions: An Introductory Discovery Experiment

    ERIC Educational Resources Information Center

    Mills, Kenneth V.; Gullmette, Louise W.

    2007-01-01

    The article describes a new discovery experiment that uses thermochemical analysis to study neutralization reactions, based on the neutralization of citric acid. The experiment reinforces students' understanding of stoichiometry and allows for the discovery of basic concepts of thermochemistry.

  13. An optimized protocol for generation and analysis of Ion Proton sequencing reads for RNA-Seq.

    PubMed

    Yuan, Yongxian; Xu, Huaiqian; Leung, Ross Ka-Kit

    2016-05-26

    Previous studies have compared running cost, time and other performance measures of popular sequencing platforms. However, comprehensive assessment of library construction and analysis protocols for the Proton sequencing platform remains unexplored. Unlike reads from Illumina sequencing platforms, Proton reads are heterogeneous in length and quality. When sequencing data from different platforms are combined, this can result in reads of varying length, and whether the commonly used software handles such data satisfactorily is unknown. Using universal human reference RNA as the initial material, RNaseIII and chemical fragmentation methods in library construction showed similar results in gene and junction discovery numbers and expression-level estimation accuracy. In contrast, sequencing quality, read length and the choice of software affected the mapping rate to a much larger extent. The unspliced aligner TMAP attained the highest mapping rate (97.27% to genome, 86.46% to transcriptome), though 47.83% of mapped reads were clipped. Long reads could paradoxically reduce mapping in junctions. With a reference annotation guide, the mapping rate of TopHat2 significantly increased from 75.79 to 92.09%, especially for long (>150 bp) reads. Sailfish, a k-mer based gene expression quantifier, attained results highly consistent with those of the TaqMan array, and the highest sensitivity. We provide for the first time reference statistics of library preparation methods, gene detection and quantification, and junction discovery for RNA-Seq on the Ion Proton platform. Chemical fragmentation performed equally as well as the enzyme-based method. The optimal Ion Proton sequencing options and analysis software have been evaluated.

  14. Low-z Type Ia Supernova Calibration

    NASA Astrophysics Data System (ADS)

    Hamuy, Mario

    The discovery of acceleration and dark energy in 1998 arguably constitutes one of the most revolutionary discoveries in astrophysics in recent years. This paradigm shift was possible thanks to one of the most traditional cosmological tests: the redshift-distance relation between galaxies. The discovery was based on a differential measurement of the expansion rate of the universe: the current rate provided by nearby (low-z) type Ia supernovae and the past rate measured from distant (high-z) supernovae. This paper focuses on the first part of this journey: the calibration of type Ia supernova luminosities and the local expansion rate of the universe, made possible by the introduction of CCD (charge-coupled device) digital photometry. The new technology permitted us in the early 1990s to convert supernovae into precise tools for measuring extragalactic distances through two key surveys: (1) the "Tololo Supernova Program", which made possible the critical discovery of the "peak luminosity-decline rate" relation for type Ia supernovae, the key idea underlying today's precise supernova cosmology, and (2) the Calán/Tololo project, which provided the low-z type Ia supernova sample for the discovery of acceleration.

  15. Natural products and drug discovery: a survey of stakeholders in industry and academia.

    PubMed

    Amirkia, Vafa; Heinrich, Michael

    2015-01-01

    In recent decades, natural products have undisputedly played a leading role in the development of novel medicines. Yet trends in pharmaceutical industry research investment indicate that natural product research is neither prioritized nor perceived as fruitful in drug discovery programmes compared with incremental structural modifications and large-volume HTS screening of synthetics. We seek to understand this phenomenon through insights from highly experienced natural product experts in industry and academia. We conducted a survey including a series of qualitative and quantitative questions related to current insights and prospective developments in natural product drug development. The survey was completed by a cross-section of 52 respondents in industry and academia. One recurrent theme is the dissonance between the high potential that individuals perceive in natural products as drug leads and the survey participants' assessment of overall industry- and company-level strategies and their success. The study's industry and academic respondents did not perceive current discovery efforts as more effective than in previous decades, yet industry contacts perceived higher hit rates in HTS efforts than academic respondents did. Surprisingly, many industry contacts were highly critical of prevalent company and industry-wide drug discovery strategies, indicating a high level of dissatisfaction within the industry. These findings support the notion that there is a growing gap between industry practitioners and academic experts in the perceived effectiveness of well-established, commercially widespread drug discovery strategies. This research seeks to shed light on this gap and to aid natural product discovery endeavours through an analysis of current bottlenecks in industry drug discovery programmes.

  16. DiscoverySpace: an interactive data analysis application

    PubMed Central

    Robertson, Neil; Oveisi-Fordorei, Mehrdad; Zuyderduyn, Scott D; Varhol, Richard J; Fjell, Christopher; Marra, Marco; Jones, Steven; Siddiqui, Asim

    2007-01-01

    DiscoverySpace is a graphical application for bioinformatics data analysis. Users can seamlessly traverse references between biological databases and draw together annotations in an intuitive tabular interface. Datasets can be compared using a suite of novel tools to aid in the identification of significant patterns. DiscoverySpace is of broad utility and its particular strength is in the analysis of serial analysis of gene expression (SAGE) data. The application is freely available online. PMID:17210078

  17. Normalization and microbial differential abundance strategies depend upon data characteristics.

    PubMed

    Weiss, Sophie; Xu, Zhenjiang Zech; Peddada, Shyamal; Amir, Amnon; Bittinger, Kyle; Gonzalez, Antonio; Lozupone, Catherine; Zaneveld, Jesse R; Vázquez-Baeza, Yoshiki; Birmingham, Amanda; Hyde, Embriette R; Knight, Rob

    2017-03-03

    Data from 16S ribosomal RNA (rRNA) amplicon sequencing present challenges to ecological and statistical interpretation. In particular, library sizes often vary over several orders of magnitude, and the data contain many zeros. Although we are typically interested in comparing the relative abundance of taxa in the ecosystems of two or more groups, we can only measure taxon relative abundance in specimens obtained from those ecosystems. Because the comparison of taxon relative abundance in the specimen is not equivalent to the comparison of taxon relative abundance in the ecosystem, this presents a special challenge. Second, because the relative abundances of taxa in the specimen (as well as in the ecosystem) sum to 1, these are compositional data; being constrained to the simplex rather than unconstrained in Euclidean space, many standard methods of analysis are not applicable. Here, we evaluate how these challenges impact the performance of existing normalization methods and differential abundance analyses. Effects on normalization: Most normalization methods enable successful clustering of samples according to biological origin when the groups differ substantially in their overall microbial composition. Rarefying more clearly clusters samples according to biological origin than other normalization techniques do for ordination metrics based on presence or absence. Alternate normalization measures are potentially vulnerable to artifacts due to library size. Effects on differential abundance testing: We build on previous work to evaluate seven proposed statistical methods using rarefied as well as raw data. Our simulation studies suggest that the false discovery rates of many differential abundance-testing methods are not increased by rarefying itself, although of course rarefying results in a loss of sensitivity due to elimination of a portion of the available data. For groups with large (~10×) differences in average library size, rarefying lowers the false discovery rate. DESeq2, without addition of a constant, increased sensitivity on smaller datasets (<20 samples per group) but tends towards a higher false discovery rate with more samples, very uneven (~10×) library sizes, and/or compositional effects. For drawing inferences regarding taxon abundance in the ecosystem, analysis of composition of microbiomes (ANCOM) is not only very sensitive (for >20 samples per group) but also, critically, the only method tested with good control of the false discovery rate. These findings guide which normalization and differential abundance techniques to use based on the data characteristics of a given study.
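
    Rarefying, the normalization the record evaluates, is just subsampling each sample's counts without replacement down to a common depth. The sketch below is a minimal NumPy version under the assumption that every row sums to at least `depth`; production tools implement the same idea with more safeguards, and the function name is an assumption.

        import numpy as np

        def rarefy(counts, depth, seed=None):
            # counts: (samples x taxa) integer matrix; each row must sum to >= depth
            rng = np.random.default_rng(seed)
            counts = np.asarray(counts)
            out = np.zeros_like(counts)
            for i, row in enumerate(counts):
                reads = np.repeat(np.arange(row.size), row)   # one entry per observed read
                kept = rng.choice(reads, size=depth, replace=False)
                out[i] = np.bincount(kept, minlength=row.size)
            return out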

  18. PERSONAL AND CIRCUMSTANTIAL FACTORS INFLUENCING THE ACT OF DISCOVERY.

    ERIC Educational Resources Information Center

    OSTRANDER, EDWARD R.

    How students say they learn was investigated. Interviews with a random sample of 74 women students posed questions about the nature, frequency, patterns, and circumstances under which acts of discovery take place in the academic setting. Students were assigned discovery ratings based on readings of typescripts. Each student was classified and…

  19. Specificity control for read alignments using an artificial reference genome-guided false discovery rate.

    PubMed

    Giese, Sven H; Zickmann, Franziska; Renard, Bernhard Y

    2014-01-01

    Accurate estimation, comparison and evaluation of read mapping error rates is a crucial step in the processing of next-generation sequencing data, as further analysis steps and interpretation assume the correctness of the mapping results. Current approaches are either focused on sensitivity estimation, and thereby disregard specificity, or are based on read simulations. Although continuously improving, read simulations are still prone to introducing bias into the mapping error quantitation and cannot capture all characteristics of an individual dataset. We introduce ARDEN (artificial reference driven estimation of false positives in next-generation sequencing data), a novel benchmark method that estimates error rates of read mappers based on real experimental reads, using an additionally generated artificial reference genome. It allows a dataset-specific computation of error rates and the construction of a receiver operating characteristic curve. It can thereby be used to optimize parameters for read mappers, to select read mappers for a specific problem, or to filter alignments based on quality estimation. The use of ARDEN is demonstrated in a general read mapper comparison, a parameter optimization for one read mapper and an application example in single-nucleotide polymorphism discovery, with a significant reduction in the number of false positive identifications. The ARDEN source code is freely available at http://sourceforge.net/projects/arden/.

  20. Isomer-specific profiling of N-glycans derived from human serum for potential biomarker discovery in pancreatic cancer.

    PubMed

    Liu, Yufei; Wang, Chang; Wang, Ran; Wu, Yike; Zhang, Liang; Liu, Bi-Feng; Cheng, Liming; Liu, Xin

    2018-06-15

    Glycosylation is one of the most important post-translational modifications of proteins. Recently, global profiling of the human serum glycome has become a noninvasive method for cancer-related biomarker discovery, and many studies have focused on compositional glycan profiling. In contrast, structure-specific glycan profiling may provide more potential biomarkers with higher specificity than compositional profiling. In this work, N-glycans released from human serum were neutralized with methylamine and reduced by ammonia-borane complex prior to profiling using nanoLC-ESI-MS with porous graphitized carbon (PGC), and the relative abundances of over 280 isomers were compared between pancreatic cancer (PC) cases (n = 32) and healthy controls (n = 32). Statistical analysis identified 25 isomer-specific biomarkers with significant differences (p-value < 0.05). ROC and PCA analyses were performed to assess the potential biomarkers identified as significantly altered in cancer. The AUCs of the significantly changed isomers ranged from 0.712 to 0.949. In addition, combining all potential biomarkers yielded a higher AUC of 0.976 with 93.5% sensitivity and 90.6% specificity. Overall, the proposed strategy, coupled with relative quantitative analysis of isomeric glycans, makes it possible to discover new biomarkers for the diagnosis of PC. Pancreatic cancer has a poor prognosis, with a five-year survival rate <5%; a strategy for accurate diagnosis of PC is therefore needed. In this paper, a dual-derivatization strategy for structure-specific glycan profiling has been used and, to the best of our knowledge, this is its first application to PC biomarker discovery, in which the separation, identification and relative quantification of isomeric glycans can be obtained simultaneously. In addition, in-depth analysis of isomeric glycans can fully describe the stereo- and regio-diversity of glycans, which might provide more potential information for PC biomarker discovery. Copyright © 2018 Elsevier B.V. All rights reserved.
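
    The gain from combining markers that the abstract reports can be reproduced in outline with scikit-learn: compute per-marker AUCs, then fit a logistic model over all markers and score the combined predictor. Everything below is synthetic and only illustrates the procedure, not the study's data.

        import numpy as np
        from sklearn.linear_model import LogisticRegression
        from sklearn.metrics import roc_auc_score

        rng = np.random.default_rng(1)
        X = rng.normal(size=(64, 25))                     # 64 subjects x 25 isomer abundances
        y = np.r_[np.zeros(32), np.ones(32)].astype(int)  # controls vs. cases
        X[y == 1] += 0.8                                  # inject signal for the example

        per_marker = [roc_auc_score(y, X[:, j]) for j in range(X.shape[1])]
        combined = LogisticRegression(max_iter=1000).fit(X, y).decision_function(X)
        # in-sample AUCs, for illustration only; a real study would cross-validate
        print(round(max(per_marker), 3), round(roc_auc_score(y, combined), 3))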

  1. Independent component analysis for the extraction of reliable protein signal profiles from MALDI-TOF mass spectra.

    PubMed

    Mantini, Dante; Petrucci, Francesca; Del Boccio, Piero; Pieragostino, Damiana; Di Nicola, Marta; Lugaresi, Alessandra; Federici, Giorgio; Sacchetta, Paolo; Di Ilio, Carmine; Urbani, Andrea

    2008-01-01

    Independent component analysis (ICA) is a signal processing technique that can be used to recover independent signals from a set of their linear mixtures. We propose ICA for the analysis of signals obtained from large proteomics investigations such as clinical multi-subject studies based on MALDI-TOF MS profiling. The method is validated on simulated and experimental data, demonstrating its ability to correctly extract protein profiles from MALDI-TOF mass spectra. A peak-detection comparison with one open-source and two commercial methods shows its superior reliability in reducing the false discovery rate of protein peak masses. Moreover, the integration of ICA with statistical tests for detecting differences in peak intensities between experimental groups allows the identification of protein peaks that could be indicators of a diseased state. This data-driven approach proves to be a promising tool for biomarker-discovery studies based on MALDI-TOF MS technology. The MATLAB implementation of the method described in the article and both simulated and experimental data are freely available at http://www.unich.it/proteomica/bioinf/.
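
    A minimal sketch of the ICA step using scikit-learn's FastICA on a spectra-by-m/z intensity matrix; shapes and names are illustrative assumptions, not the authors' MATLAB implementation.

        import numpy as np
        from sklearn.decomposition import FastICA

        rng = np.random.default_rng(0)
        # stand-in for a subjects-x-(m/z bins) MALDI-TOF intensity matrix
        spectra = np.abs(rng.normal(size=(40, 2000)))

        ica = FastICA(n_components=10, random_state=0, max_iter=1000)
        weights = ica.fit_transform(spectra)   # (40, 10) per-subject component weights
        profiles = ica.mixing_.T               # (10, 2000) recovered signal profiles
        print(weights.shape, profiles.shape)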

  2. MSblender: A probabilistic approach for integrating peptide identifications from multiple database search engines.

    PubMed

    Kwon, Taejoon; Choi, Hyungwon; Vogel, Christine; Nesvizhskii, Alexey I; Marcotte, Edward M

    2011-07-01

    Shotgun proteomics using mass spectrometry is a powerful method for protein identification but suffers from limited sensitivity in complex samples. Integrating peptide identifications from multiple database search engines is a promising strategy to increase the number of peptide identifications and reduce the volume of unassigned tandem mass spectra. Existing methods pool statistical significance scores such as p-values or posterior probabilities of peptide-spectrum matches (PSMs) from multiple search engines after high scoring peptides have been assigned to spectra, but these methods lack reliable control of identification error rates as data are integrated from different search engines. We developed a statistically coherent method for integrative analysis, termed MSblender. MSblender converts raw search scores from search engines into a probability score for every possible PSM and properly accounts for the correlation between search scores. The method reliably estimates false discovery rates and identifies more PSMs than any single search engine at the same false discovery rate. Increased identifications increment spectral counts for most proteins and allow quantification of proteins that would not have been quantified by individual search engines. We also demonstrate that enhanced quantification contributes to improved sensitivity in differential expression analyses.
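
    MSblender's final thresholding step, accepting PSMs by probability and reporting the implied FDR, can be sketched generically: the expected fraction of false PSMs among those accepted is the mean of (1 - p). This is a standard posterior-probability FDR estimate, not MSblender's own code.

        import numpy as np

        def fdr_from_posteriors(probs, threshold):
            """FDR of the PSM set accepted at `threshold`, estimated as
            the mean of (1 - p) over accepted PSMs."""
            probs = np.asarray(probs)
            accepted = probs[probs >= threshold]
            if accepted.size == 0:
                return 0.0, 0
            return float(np.mean(1.0 - accepted)), int(accepted.size)

        psm_probs = np.array([0.99, 0.97, 0.95, 0.90, 0.80, 0.40])
        print(fdr_from_posteriors(psm_probs, 0.90))  # (0.0475, 4)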

  3. MSblender: a probabilistic approach for integrating peptide identifications from multiple database search engines

    PubMed Central

    Kwon, Taejoon; Choi, Hyungwon; Vogel, Christine; Nesvizhskii, Alexey I.; Marcotte, Edward M.

    2011-01-01

    Shotgun proteomics using mass spectrometry is a powerful method for protein identification but suffers from limited sensitivity in complex samples. Integrating peptide identifications from multiple database search engines is a promising strategy to increase the number of peptide identifications and reduce the volume of unassigned tandem mass spectra. Existing methods pool statistical significance scores such as p-values or posterior probabilities of peptide-spectrum matches (PSMs) from multiple search engines after high scoring peptides have been assigned to spectra, but these methods lack reliable control of identification error rates as data are integrated from different search engines. We developed a statistically coherent method for integrative analysis, termed MSblender. MSblender converts raw search scores from search engines into a probability score for all possible PSMs and properly accounts for the correlation between search scores. The method reliably estimates false discovery rates and identifies more PSMs than any single search engine at the same false discovery rate. Increased identifications increment spectral counts for all detected proteins and allow quantification of proteins that would not have been quantified by individual search engines. We also demonstrate that enhanced quantification contributes to improved sensitivity in differential expression analyses. PMID:21488652

  4. Discovery of Host Factors and Pathways Utilized in Hantaviral Infection

    DTIC Science & Technology

    2016-09-01

    AWARD NUMBER: W81XWH-14-1-0204; TITLE: Discovery of Host Factors and Pathways Utilized in Hantaviral Infection; PRINCIPAL INVESTIGATOR: Paul… …after significance values were calculated and corrected for false discovery rate. The top hit is ATP6V0A1, a gene encoding a subunit of a vacuolar

  5. Improved False Discovery Rate Estimation Procedure for Shotgun Proteomics.

    PubMed

    Keich, Uri; Kertesz-Farkas, Attila; Noble, William Stafford

    2015-08-07

    Interpreting the potentially vast number of hypotheses generated by a shotgun proteomics experiment requires a valid and accurate procedure for assigning statistical confidence estimates to identified tandem mass spectra. Despite the crucial role such procedures play in most high-throughput proteomics experiments, the scientific literature has not reached a consensus about the best confidence estimation methodology. In this work, we evaluate, using theoretical and empirical analysis, four previously proposed protocols for estimating the false discovery rate (FDR) associated with a set of identified tandem mass spectra: two variants of the target-decoy competition protocol (TDC) of Elias and Gygi and two variants of the separate target-decoy search protocol of Käll et al. Our analysis reveals significant biases in the two separate target-decoy search protocols. Moreover, the one TDC protocol that provides an unbiased FDR estimate among the target PSMs does so at the cost of forfeiting a random subset of high-scoring spectrum identifications. We therefore propose the mix-max procedure to provide unbiased, accurate FDR estimates in the presence of well-calibrated scores. The method avoids biases associated with the two separate target-decoy search protocols and also avoids the propensity for target-decoy competition to discard a random subset of high-scoring target identifications.
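
    For reference, the target-decoy competition estimate that the paper analyzes has a simple canonical form: after each spectrum keeps only its best match from a concatenated target+decoy search, the FDR at a score cutoff is estimated as (decoys + 1)/targets above the cutoff. A sketch under those assumptions:

        def tdc_fdr(scores, is_decoy, cutoff):
            """Canonical TDC FDR estimate at `cutoff`; inputs describe the
            best (post-competition) match per spectrum. The +1 makes the
            estimate conservative."""
            targets = sum(1 for s, d in zip(scores, is_decoy) if s >= cutoff and not d)
            decoys = sum(1 for s, d in zip(scores, is_decoy) if s >= cutoff and d)
            return (decoys + 1) / max(targets, 1)

        scores = [9.1, 8.7, 8.2, 7.9, 7.5, 7.1, 6.8]
        is_decoy = [False, False, False, True, False, False, True]
        print(tdc_fdr(scores, is_decoy, cutoff=7.0))  # (1 + 1) / 5 = 0.4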

  6. Improved False Discovery Rate Estimation Procedure for Shotgun Proteomics

    PubMed Central

    2016-01-01

    Interpreting the potentially vast number of hypotheses generated by a shotgun proteomics experiment requires a valid and accurate procedure for assigning statistical confidence estimates to identified tandem mass spectra. Despite the crucial role such procedures play in most high-throughput proteomics experiments, the scientific literature has not reached a consensus about the best confidence estimation methodology. In this work, we evaluate, using theoretical and empirical analysis, four previously proposed protocols for estimating the false discovery rate (FDR) associated with a set of identified tandem mass spectra: two variants of the target-decoy competition protocol (TDC) of Elias and Gygi and two variants of the separate target-decoy search protocol of Käll et al. Our analysis reveals significant biases in the two separate target-decoy search protocols. Moreover, the one TDC protocol that provides an unbiased FDR estimate among the target PSMs does so at the cost of forfeiting a random subset of high-scoring spectrum identifications. We therefore propose the mix-max procedure to provide unbiased, accurate FDR estimates in the presence of well-calibrated scores. The method avoids biases associated with the two separate target-decoy search protocols and also avoids the propensity for target-decoy competition to discard a random subset of high-scoring target identifications. PMID:26152888

  7. The IPAC Image Subtraction and Discovery Pipeline for the Intermediate Palomar Transient Factory

    NASA Astrophysics Data System (ADS)

    Masci, Frank J.; Laher, Russ R.; Rebbapragada, Umaa D.; Doran, Gary B.; Miller, Adam A.; Bellm, Eric; Kasliwal, Mansi; Ofek, Eran O.; Surace, Jason; Shupe, David L.; Grillmair, Carl J.; Jackson, Ed; Barlow, Tom; Yan, Lin; Cao, Yi; Cenko, S. Bradley; Storrie-Lombardi, Lisa J.; Helou, George; Prince, Thomas A.; Kulkarni, Shrinivas R.

    2017-01-01

    We describe the near real-time transient-source discovery engine for the intermediate Palomar Transient Factory (iPTF), currently in operation at the Infrared Processing and Analysis Center (IPAC), Caltech. We coin this system the IPAC/iPTF Discovery Engine (or IDE). We review the algorithms used for PSF-matching, image subtraction, detection, photometry, and machine-learned (ML) vetting of extracted transient candidates. We also review the performance of our ML classifier. For a limiting signal-to-noise ratio of 4 in relatively unconfused regions, bogus candidates from processing artifacts and imperfect image subtractions outnumber real transients by ≃10:1. This can be considerably higher for image data with inaccurate astrometric and/or PSF-matching solutions. Despite this occasionally high contamination rate, the ML classifier is able to identify real transients with an efficiency (or completeness) of ≃97% for a maximum tolerable false-positive rate of 1% when classifying raw candidates. All subtraction-image metrics, source features, ML probability-based real-bogus scores, contextual metadata from other surveys, and possible associations with known Solar System objects are stored in a relational database for retrieval by the various science working groups. We review our efforts in mitigating false positives and our experience in optimizing the overall system in response to the multitude of science projects underway with iPTF.

  8. The IPAC Image Subtraction and Discovery Pipeline for the Intermediate Palomar Transient Factory

    NASA Technical Reports Server (NTRS)

    Masci, Frank J.; Laher, Russ R.; Rebbapragada, Umaa D.; Doran, Gary B.; Miller, Adam A.; Bellm, Eric; Kasliwal, Mansi; Ofek, Eran O.; Surace, Jason; Shupe, David L.; hide

    2016-01-01

    We describe the near real-time transient-source discovery engine for the intermediate Palomar Transient Factory (iPTF), currently in operation at the Infrared Processing and Analysis Center (IPAC), Caltech. We coin this system the IPAC/iPTF Discovery Engine (or IDE). We review the algorithms used for PSF-matching, image subtraction, detection, photometry, and machine-learned (ML) vetting of extracted transient candidates. We also review the performance of our ML classifier. For a limiting signal-to-noise ratio of 4 in relatively unconfused regions, bogus candidates from processing artifacts and imperfect image subtractions outnumber real transients by approximately 10:1. This can be considerably higher for image data with inaccurate astrometric and/or PSF-matching solutions. Despite this occasionally high contamination rate, the ML classifier is able to identify real transients with an efficiency (or completeness) of approximately 97% for a maximum tolerable false-positive rate of 1% when classifying raw candidates. All subtraction-image metrics, source features, ML probability-based real-bogus scores, contextual metadata from other surveys, and possible associations with known Solar System objects are stored in a relational database for retrieval by the various science working groups. We review our efforts in mitigating false positives and our experience in optimizing the overall system in response to the multitude of science projects underway with iPTF.

  9. Estimating the rate of biological introductions: Lessepsian fishes in the Mediterranean.

    PubMed

    Belmaker, Jonathan; Brokovich, Eran; China, Victor; Golani, Daniel; Kiflawi, Moshe

    2009-04-01

    Sampling issues preclude the direct use of the discovery rate of exotic species as a robust estimate of their rate of introduction. Recently, a method was advanced that allows maximum-likelihood estimation of both the observational probability and the introduction rate from the discovery record. Here, we propose an alternative approach that utilizes the discovery record of native species to control for sampling effort. Implemented in a Bayesian framework using Markov chain Monte Carlo simulations, the approach provides estimates of the rate of introduction of the exotic species, and of additional parameters such as the size of the species pool from which they are drawn. We illustrate the approach using Red Sea fishes recorded in the eastern Mediterranean, after crossing the Suez Canal, and show that the two approaches may lead to different conclusions. The analytical framework is highly flexible and could provide a basis for easy modification to other systems for which first-sighting data on native and introduced species are available.

  10. The asteroids - Accretion, differentiation, fragmentation, and irradiation

    NASA Technical Reports Server (NTRS)

    Wilkening, L. L.

    1979-01-01

    Various types of meteorites have experienced processes of condensation, accretion, metamorphism, differentiation, brecciation, irradiation and fragmentation. A typical view of meteorite formation has been that the processes following accretion take place in a few asteroidal-sized (approximately 100 km) objects. Discovery of decay products of now extinct Al-26 and Pd-107 in meteorites, discovery of isotopic heterogeneity among meteorite types, re-analysis of meteorite cooling rates, and continuing study of meteoritic compositions have led some meteoriticists to conclude that meteorites obtained their chemical, isotopic, and some textural characteristics in objects initially less than 10 km in diameter. Such a scenario, which is described in this paper, raises the possibility that some of these small planetesimals may have been 'condensation nuclei' for the formation of comets as well as the precursors of asteroids.

  11. Lithography hotspot discovery at 70nm DRAM 300mm fab: process window qualification using design base binning

    NASA Astrophysics Data System (ADS)

    Chen, Daniel; Chen, Damian; Yen, Ray; Cheng, Mingjen; Lan, Andy; Ghaskadvi, Rajesh

    2008-11-01

    Identifying hotspots, structures that limit the lithography process window, becomes increasingly important as the industry relies heavily on RET to print sub-wavelength designs. KLA-Tencor's patented Process Window Qualification (PWQ) methodology has been used for this purpose in various fabs. The PWQ methodology has three key advantages: (a) the PWQ layout, for the best sensitivity; (b) design-based binning, for pattern-repeater analysis; and (c) intelligent sampling, for the best DOI sampling rate. This paper evaluates two different analysis strategies for SEM review sampling successfully deployed at Inotera Memories, Inc. We propose a new approach combining location repeaters and pattern repeaters. In a recent case study, the new sampling flow reduced data analysis and sampling time from 6 hours to 1.5 hours while maintaining the maximum DOI sample rate.

  12. [Evaluation of performance of five bioinformatics software for the prediction of missense mutations].

    PubMed

    Chen, Qianting; Dai, Congling; Zhang, Qianjun; Du, Juan; Li, Wen

    2016-10-01

    To evaluate the prediction performance of five bioinformatics software tools (SIFT, PolyPhen2, MutationTaster, Provean, MutationAssessor). From our own database of genetic mutations collected over the past five years, the Chinese literature database, the Human Gene Mutation Database, and dbSNP, 121 missense mutations confirmed by functional studies and 121 missense mutations suspected to be pathogenic by pedigree analysis were used as the positive gold standard, while 242 missense mutations with minor allele frequency (MAF) >5% in dominant hereditary diseases were used as the negative gold standard. The selected mutations were predicted with the five tools. Based on the results, the performance of the five tools was evaluated in terms of sensitivity, specificity, positive predictive value, false positive rate, negative predictive value, false negative rate, false discovery rate, accuracy, and the receiver operating characteristic (ROC) curve. In terms of sensitivity, negative predictive value and false negative rate, the ranking was MutationTaster, PolyPhen2, Provean, SIFT, MutationAssessor. For specificity and false positive rate, the ranking was MutationTaster, Provean, MutationAssessor, SIFT, PolyPhen2. For positive predictive value and false discovery rate, the ranking was MutationTaster, Provean, MutationAssessor, PolyPhen2, SIFT. For area under the ROC curve (AUC) and accuracy, the ranking was MutationTaster, Provean, PolyPhen2, MutationAssessor, SIFT. The prediction performance of a given tool may differ when different parameters are used. Among the five tools, MutationTaster showed the best prediction performance.
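
    All of the evaluation metrics in this comparison follow directly from confusion-matrix counts; the short sketch below states the generic formulas (the counts are hypothetical, not the study's):

        def binary_metrics(tp, fp, tn, fn):
            """Standard evaluation metrics from confusion-matrix counts."""
            return {
                "sensitivity": tp / (tp + fn),           # true positive rate
                "specificity": tn / (tn + fp),
                "ppv": tp / (tp + fp),                   # positive predictive value
                "npv": tn / (tn + fn),                   # negative predictive value
                "false_positive_rate": fp / (fp + tn),
                "false_negative_rate": fn / (fn + tp),
                "false_discovery_rate": fp / (fp + tp),  # 1 - PPV
                "accuracy": (tp + tn) / (tp + fp + tn + fn),
            }

        # hypothetical counts for 242 positive and 242 negative mutations
        print(binary_metrics(tp=230, fp=40, tn=202, fn=12))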

  13. Bon-EV: an improved multiple testing procedure for controlling false discovery rates.

    PubMed

    Li, Dongmei; Xie, Zidian; Zand, Martin; Fogg, Thomas; Dye, Timothy

    2017-01-03

    Stability of multiple testing procedures, defined as the standard deviation of the total number of discoveries, can be used as an indicator of the variability of multiple testing procedures. Improving the stability of multiple testing procedures can help to increase the consistency of findings from replicated experiments. The Benjamini-Hochberg and Storey's q-value procedures are two commonly used multiple testing procedures for controlling false discoveries in genomic studies. Storey's q-value procedure has higher power and lower stability than the Benjamini-Hochberg procedure. To improve upon the stability of Storey's q-value procedure and maintain its high power in genomic data analysis, we propose a new multiple testing procedure, named Bon-EV, to control the false discovery rate (FDR) based on Bonferroni's approach. Simulation studies show that our proposed Bon-EV procedure can maintain the high power of Storey's q-value procedure and also result in better FDR control and higher stability than Storey's q-value procedure for samples of large size (30 in each group) and medium size (15 in each group), for independent, somewhat correlated, or highly correlated test statistics. When the sample size is small (5 in each group), our proposed Bon-EV procedure performs between the Benjamini-Hochberg procedure and Storey's q-value procedure. Examples using RNA-Seq data show that the Bon-EV procedure has higher stability than Storey's q-value procedure while maintaining equivalent power, and higher power than the Benjamini-Hochberg procedure. For medium or large sample sizes, the Bon-EV procedure has improved FDR control and stability compared with Storey's q-value procedure and improved power compared with the Benjamini-Hochberg procedure. The Bon-EV multiple testing procedure is available as the BonEV package in R for download at https://CRAN.R-project.org/package=BonEV.
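
    For orientation, the Benjamini-Hochberg step-up procedure that Bon-EV is compared against can be written in a few lines; this is the standard textbook algorithm, not the BonEV package itself.

        import numpy as np

        def benjamini_hochberg(pvals, alpha=0.05):
            """Boolean rejection mask controlling the FDR at `alpha`."""
            p = np.asarray(pvals)
            m = p.size
            order = np.argsort(p)
            ranked = p[order]
            # reject the k smallest p-values, where k is the largest index
            # with p_(k) <= (k / m) * alpha
            below = ranked <= (np.arange(1, m + 1) / m) * alpha
            reject = np.zeros(m, dtype=bool)
            if below.any():
                k = np.max(np.nonzero(below)[0])
                reject[order[:k + 1]] = True
            return reject

        pvals = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.3, 0.9]
        print(benjamini_hochberg(pvals))  # rejects the first two tests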

  14. ACFIS: a web server for fragment-based drug discovery

    PubMed Central

    Hao, Ge-Fei; Jiang, Wen; Ye, Yuan-Nong; Wu, Feng-Xu; Zhu, Xiao-Lei; Guo, Feng-Biao; Yang, Guang-Fu

    2016-01-01

    In order to foster innovation and improve the effectiveness of drug discovery, there is considerable interest in exploring unknown ‘chemical space’ to identify new bioactive compounds with novel and diverse scaffolds. Hence, fragment-based drug discovery (FBDD) has developed rapidly owing to its expansive search of ‘chemical space’, which can lead to a higher hit rate and ligand efficiency (LE). However, computational screening of fragments is always hampered by the promiscuous binding model. In this study, we developed a new web server, Auto Core Fragment in silico Screening (ACFIS). It includes three computational modules, PARA_GEN, CORE_GEN and CAND_GEN. ACFIS can generate a core fragment structure from the active molecule using fragment deconstruction analysis and perform in silico screening by growing fragments onto the junction of the core fragment structure. An integrated energy calculation rapidly identifies which fragments fit the binding site of a protein. We constructed a simple interface to enable users to view top-ranking molecules in 2D and their binding modes in 3D for further experimental exploration. This makes ACFIS a highly valuable tool for drug discovery. The ACFIS web server is free and open to all users at http://chemyang.ccnu.edu.cn/ccb/server/ACFIS/. PMID:27150808

  15. ACFIS: a web server for fragment-based drug discovery.

    PubMed

    Hao, Ge-Fei; Jiang, Wen; Ye, Yuan-Nong; Wu, Feng-Xu; Zhu, Xiao-Lei; Guo, Feng-Biao; Yang, Guang-Fu

    2016-07-08

    In order to foster innovation and improve the effectiveness of drug discovery, there is considerable interest in exploring unknown 'chemical space' to identify new bioactive compounds with novel and diverse scaffolds. Hence, fragment-based drug discovery (FBDD) has developed rapidly owing to its expansive search of 'chemical space', which can lead to a higher hit rate and ligand efficiency (LE). However, computational screening of fragments is always hampered by the promiscuous binding model. In this study, we developed a new web server, Auto Core Fragment in silico Screening (ACFIS). It includes three computational modules, PARA_GEN, CORE_GEN and CAND_GEN. ACFIS can generate a core fragment structure from the active molecule using fragment deconstruction analysis and perform in silico screening by growing fragments onto the junction of the core fragment structure. An integrated energy calculation rapidly identifies which fragments fit the binding site of a protein. We constructed a simple interface to enable users to view top-ranking molecules in 2D and their binding modes in 3D for further experimental exploration. This makes ACFIS a highly valuable tool for drug discovery. The ACFIS web server is free and open to all users at http://chemyang.ccnu.edu.cn/ccb/server/ACFIS/. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.

  16. Spacewatch search for near-Earth asteroids

    NASA Technical Reports Server (NTRS)

    Gehrels, Tom

    1991-01-01

    The objective of the Spacewatch Program is to develop new techniques for the discovery of near-earth asteroids and to prove the efficiency of the techniques. Extensive experience was obtained with the 0.91-m Spacewatch Telescope on Kitt Peak that now has the largest CCD detector in the world: a Tektronix 2048 x 2048 with 27-micron pixel size. During the past year, software and hardware for optimizing the discovery of near-earth asteroids were installed. As a result, automatic detection of objects that move with rates between 0.1 and 4 degrees per day has become routine since September 1990. Apparently, one or two near-earth asteroids are discovered per month, on average. The follow up is with astrometry over as long an arc as the geometry and faintness of the object allow, typically three months following the discovery observations. During the second half of 1990, replacing the 0.91-m mirror with a larger one, to increase the discovery rate, was considered. Studies and planning for this switch are proposed for funding during the coming year. It was also proposed that the Spacewatch Telescope be turned on the sky, instead of having the drive turned off, in order to increase the rate of discoveries by perhaps a factor of two.

  17. Enhancing the chemical selectivity in discovery-based analysis with tandem ionization time-of-flight mass spectrometry detection for comprehensive two-dimensional gas chromatography.

    PubMed

    Freye, Chris E; Moore, Nicholas R; Synovec, Robert E

    2018-02-16

    The complementary information provided by tandem ionization time-of-flight mass spectrometry (TI-TOFMS) is investigated for comparative discovery-based analysis, when coupled with comprehensive two-dimensional gas chromatography (GC × GC). The TI conditions implemented were a hard ionization energy (70 eV) collected concurrently with a soft ionization energy (14 eV). Tile-based Fisher ratio (F-ratio) analysis is used to analyze diesel fuel spiked with twelve analytes at a nominal concentration of 50 ppm. F-ratio analysis is a supervised discovery-based technique that compares two different sample classes, in this case spiked and unspiked diesel, to reduce the complex GC × GC-TI-TOFMS data into a hit list of class-distinguishing analyte features. Hit lists for the 70 eV and 14 eV data sets, and the single hit list produced when the two data sets are fused together, are all investigated. For the 70 eV hit list, eleven of the twelve analytes were found in the top thirteen hits. For the 14 eV hit list, nine of the twelve analytes were found in the top nine hits, with the other three analytes either not found or well down the hit list. As expected, the F-ratios per m/z used to calculate each average F-ratio per hit generally emphasized smaller fragment ions for the 70 eV data set and larger fragment ions for the 14 eV data set, supporting the notion that complementary information was provided. The discovery rate improved when F-ratio analysis was performed on the fused data sets, resulting in eleven of the twelve analytes appearing at the top of the single hit list. Using PARAFAC, analytes that were "discovered" were deconvoluted in order to obtain their identification via match values (MV). The locations of the analytes and the "F-ratio spectra" obtained from F-ratio analysis were used to guide the deconvolution. Eight of the twelve analytes were successfully deconvoluted and identified using the in-house library for the 70 eV data set. PARAFAC deconvolution of the two separate data sets provided increased confidence in the identification of "discovered" analytes. Herein, we explore the limit of analyte discovery and the limit of analyte identification, and demonstrate a general workflow for the investigation of key chemical features in complex samples. Copyright © 2018 Elsevier B.V. All rights reserved.
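
    The tile-based F-ratio statistic at the heart of this workflow is, per feature, the between-class variance divided by the within-class variance. A compact sketch (up to degrees-of-freedom constants, with invented data):

        import numpy as np

        def fisher_ratio(class_a, class_b):
            """Per-feature Fisher ratio for two (samples x features) arrays:
            between-class variance over within-class variance."""
            mean_a, mean_b = class_a.mean(0), class_b.mean(0)
            grand = np.vstack([class_a, class_b]).mean(0)
            n_a, n_b = len(class_a), len(class_b)
            between = n_a * (mean_a - grand) ** 2 + n_b * (mean_b - grand) ** 2
            within = ((class_a - mean_a) ** 2).sum(0) + ((class_b - mean_b) ** 2).sum(0)
            return between / within  # rank features by this to build a hit list

        rng = np.random.default_rng(0)
        spiked, unspiked = rng.normal(size=(6, 8)), rng.normal(size=(6, 8))
        spiked[:, 2] += 3.0  # one class-distinguishing feature
        print(fisher_ratio(spiked, unspiked).round(2))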

  18. Assessment of Metabolome Annotation Quality: A Method for Evaluating the False Discovery Rate of Elemental Composition Searches

    PubMed Central

    Matsuda, Fumio; Shinbo, Yoko; Oikawa, Akira; Hirai, Masami Yokota; Fiehn, Oliver; Kanaya, Shigehiko; Saito, Kazuki

    2009-01-01

    Background: In metabolomics research using mass spectrometry (MS), systematic searching of high-resolution mass data against compound databases is often the first step of metabolite annotation, used to determine elemental compositions possessing similar theoretical mass numbers. However, incorrect hits derived from errors in mass analyses will be included in the results of elemental composition searches. To assess the quality of peak annotation information, a novel methodology for false discovery rate (FDR) evaluation is presented in this study. Based on the FDR analyses, several aspects of an elemental composition search, including setting a threshold, estimating the FDR, and the types of elemental composition databases most reliable for searching, are discussed. Methodology/Principal Findings: The FDR can be determined from one measured value (i.e., the hit rate for search queries) and four parameters determined by Monte Carlo simulation. The results indicate that relatively high FDR values (30–50%) were obtained when searching time-of-flight (TOF)/MS data using the KNApSAcK and KEGG databases. In addition, searches against large all-in-one databases (e.g., PubChem) always produced unacceptable results (FDR >70%). The estimated FDRs suggest that the quality of search results can be improved not only by performing more accurate mass analysis but also by modifying the properties of the compound database. A theoretical analysis indicates that the FDR could be improved by using a smaller compound database with higher completeness. Conclusions/Significance: High accuracy mass analysis, such as Fourier transform (FT)-MS, is needed for reliable annotation (FDR <10%). In addition, a small, customized compound database is preferable for high-quality annotation of metabolome data. PMID:19847304
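
    The decoy-query intuition behind this kind of FDR estimate can be mimicked in a few lines: query the compound database with random masses to measure the accidental hit rate, and compare it with the hit rate of real queries. A toy sketch with an invented database, not the authors' Monte Carlo parameterization:

        import numpy as np

        rng = np.random.default_rng(0)
        db = np.sort(rng.uniform(100, 1000, size=20_000))  # toy compound masses (Da)

        def hit_rate(queries, tol_ppm=5):
            """Fraction of query masses matching a database entry within tol_ppm."""
            tol = queries * tol_ppm * 1e-6
            idx = np.searchsorted(db, queries)
            lo = np.abs(db[np.clip(idx - 1, 0, db.size - 1)] - queries) <= tol
            hi = np.abs(db[np.clip(idx, 0, db.size - 1)] - queries) <= tol
            return np.mean(lo | hi)

        real = rng.choice(db, 500) + rng.normal(0, 0.001, 500)  # true compounds + mass error
        decoy = rng.uniform(100, 1000, size=500)                # random (decoy) masses
        # the decoy hit rate approximates how many real-query hits arise by
        # chance, a rough stand-in for the FDR of the annotation step
        print(hit_rate(real), hit_rate(decoy))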

  19. A collaborative filtering-based approach to biomedical knowledge discovery.

    PubMed

    Lever, Jake; Gakkhar, Sitanshu; Gottlieb, Michael; Rashnavadi, Tahereh; Lin, Santina; Siu, Celia; Smith, Maia; Jones, Martin R; Krzywinski, Martin; Jones, Steven J M; Wren, Jonathan

    2018-02-15

    The increase in publication rates makes it challenging for an individual researcher to stay abreast of all relevant research in order to find novel research hypotheses. Literature-based discovery methods make use of knowledge graphs built using text mining and can infer future associations between biomedical concepts that are likely to occur in new publications. These predictions are a valuable resource for researchers exploring a research topic. Current methods for prediction are based on the local structure of the knowledge graph. A method that uses global knowledge from across the knowledge graph needs to be developed in order to make knowledge discovery a tool researchers use frequently. We propose an approach based on the singular value decomposition (SVD) that is able to combine data from across the knowledge graph through a reduced representation. Using co-occurrence data extracted from published literature, we show that SVD performs better than the leading methods for scoring discoveries. We also show the diminishing predictive power of knowledge discovery as we compare our predictions with real associations that appear further into the future. Finally, we examine the strengths and weaknesses of the SVD approach against another well-performing system using several predicted associations. All code and results files for this analysis can be accessed at https://github.com/jakelever/knowledgediscovery. sjones@bcgsc.ca. Supplementary data are available at Bioinformatics online. © The Author (2017). Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com
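
    The core of the SVD scoring can be sketched with numpy: factor the concept co-occurrence matrix at reduced rank and score never-seen pairs by their reconstructed values. The five-concept matrix below is a toy; the real system mines co-occurrences from published literature.

        import numpy as np

        # toy symmetric co-occurrence matrix over five biomedical concepts
        C = np.array([[0, 4, 2, 0, 0],
                      [4, 0, 3, 1, 0],
                      [2, 3, 0, 2, 0],
                      [0, 1, 2, 0, 3],
                      [0, 0, 0, 3, 0]], dtype=float)

        k = 2  # reduced rank
        U, s, Vt = np.linalg.svd(C)
        C_hat = U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]

        # score only pairs never observed together; higher = more likely future link
        candidates = [(i, j) for i in range(5) for j in range(i + 1, 5) if C[i, j] == 0]
        for i, j in sorted(candidates, key=lambda ij: -C_hat[ij]):
            print((i, j), round(C_hat[i, j], 3))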

  20. Simultaneous isoform discovery and quantification from RNA-seq.

    PubMed

    Hiller, David; Wong, Wing Hung

    2013-05-01

    RNA sequencing is a recent technology which has seen an explosion of methods addressing all levels of analysis, from read mapping to transcript assembly to differential expression modeling. In particular the discovery of isoforms at the transcript assembly stage is a complex problem and current approaches suffer from various limitations. For instance, many approaches use graphs to construct a minimal set of isoforms which covers the observed reads, then perform a separate algorithm to quantify the isoforms, which can result in a loss of power. Current methods also use ad-hoc solutions to deal with the vast number of possible isoforms which can be constructed from a given set of reads. Finally, while the need of taking into account features such as read pairing and sampling rate of reads has been acknowledged, most existing methods do not seamlessly integrate these features as part of the model. We present Montebello, an integrated statistical approach which performs simultaneous isoform discovery and quantification by using a Monte Carlo simulation to find the most likely isoform composition leading to a set of observed reads. We compare Montebello to Cufflinks, a popular isoform discovery approach, on a simulated data set and on 46.3 million brain reads from an Illumina tissue panel. On this data set Montebello appears to offer a modest improvement over Cufflinks when considering discovery and parsimony metrics. In addition Montebello mitigates specific difficulties inherent in the Cufflinks approach. Finally, Montebello can be fine-tuned depending on the type of solution desired.

  1. An epigenome-wide study of body mass index and DNA methylation in blood using participants from the Sister Study cohort.

    PubMed

    Wilson, L E; Harlid, S; Xu, Z; Sandler, D P; Taylor, J A

    2017-01-01

    The relationship between obesity and chronic disease risk is well-established; the underlying biological mechanisms driving this risk increase may include obesity-related epigenetic modifications. To explore this hypothesis, we conducted a genome-wide analysis of DNA methylation and body mass index (BMI) using data from a subset of women in the Sister Study. The Sister Study is a cohort of 50 884 US women who had a sister with breast cancer but were free of breast cancer themselves at enrollment. Study participants completed examinations that included measurements of height and weight, and provided blood samples. Blood DNA methylation data generated with the Illumina Infinium HumanMethylation27 BeadChip array, covering 27,589 CpG sites, were available for 871 women from a prior study of breast cancer and DNA methylation. To identify differentially methylated CpG sites associated with BMI, we analyzed these methylation data using robust linear regression with adjustment for age and case status. For those CpGs passing the false discovery rate significance level, we examined the association in a replication set comprising a non-overlapping group of 187 women from the Sister Study who had DNA methylation data generated using the Infinium HumanMethylation450 BeadChip array. Analysis of this expanded 450K array identified additional BMI-associated sites, which were investigated with targeted pyrosequencing. Four CpG sites reached genome-wide significance (false discovery rate (FDR) q<0.05) in the discovery set, and associations for all four were significant at strict Bonferroni correction in the replication set. An additional 23 sites passed FDR in the replication set, and five were replicated by pyrosequencing in the discovery set. Several of the genes identified, including ANGPT4, RORC, SOCS3, FSD2, XYLT1, ABCG1, STK39, ASB2 and CRHR2, have been linked to obesity and obesity-related chronic diseases. Our findings support the hypothesis that obesity-related epigenetic differences are detectable in blood and may be related to risk of chronic disease.

  2. Probing light sterile neutrino signatures at reactor and Spallation Neutron Source neutrino experiments

    NASA Astrophysics Data System (ADS)

    Kosmas, T. S.; Papoulias, D. K.; Tórtola, M.; Valle, J. W. F.

    2017-09-01

    We investigate the impact of a fourth sterile neutrino at reactor and Spallation Neutron Source neutrino detectors. Specifically, we explore the discovery potential of the TEXONO and COHERENT experiments to subleading sterile neutrino effects through the measurement of the coherent elastic neutrino-nucleus scattering event rate. Our dedicated χ2-sensitivity analysis employs realistic nuclear structure calculations adequate for high purity sub-keV threshold Germanium detectors.

  3. Nearing saturation of cancer driver gene discovery.

    PubMed

    Hsiehchen, David; Hsieh, Antony

    2018-06-15

    Extensive sequencing efforts of cancer genomes, such as The Cancer Genome Atlas (TCGA), have been undertaken to uncover bona fide cancer driver genes, which has enhanced our understanding of cancer and revealed therapeutic targets. However, the number of driver gene mutations is bounded, indicating that there must be a point at which further sequencing efforts become excessive. We found a significant positive correlation between sample size and identified driver gene mutations across 33 cancers sequenced by the TCGA, which is expected if additional sequencing is still leading to the identification of more driver genes. However, the rate at which new cancer driver genes are discovered with larger samples is declining rapidly. Our analysis provides a general guide for determining which cancer types would likely benefit from additional sequencing efforts, particularly those with relatively high rates of cancer driver gene discovery. Our results argue that the past strategy of indiscriminately sequencing as many specimens as possible for all cancer types is becoming inefficient. In addition, without significant investments into applying our knowledge of cancer genomes, we risk sequencing more cancer genomes for the sake of sequencing rather than for meaningful patient benefit.

  4. Differential DNA methylation patterns of polycystic ovarian syndrome in whole blood of Chinese women.

    PubMed

    Li, Shuxia; Zhu, Dongyi; Duan, Hongmei; Ren, Anran; Glintborg, Dorte; Andersen, Marianne; Skov, Vibe; Thomassen, Mads; Kruse, Torben; Tan, Qihua

    2017-03-28

    A universally common endocrinopathy in women of reproductive age, the polycystic ovarian syndrome (PCOS) is characterized by composite clinical phenotypes reflecting the contributions of ovarian dysfunction and metabolic abnormalities, with widely varying symptoms resulting from interference of the genome with the environment through integrative biological mechanisms, including epigenetics. We performed a genome-wide DNA methylation analysis of polycystic ovarian syndrome and identified a substantial number of genomic sites differentially methylated in the whole blood of PCOS patients versus healthy controls (52 sites, false discovery rate < 0.05 and corresponding p value < 5.68e-06). These sites highly consistently replicate biological pathways extensively implicated in immunity and immunity-related inflammatory disorders (false discovery rate < 0.05) that were reportedly regulated in the DNA methylome from ovarian tissue under the PCOS condition. Most importantly, our genome-wide profiling of PCOS patients revealed a large number of DNA methylation sites and enriched functional pathways significantly associated with diverse clinical features (levels of prolactin, estradiol and progesterone, and menstrual cycle) that could serve as a novel molecular basis for the clinical heterogeneity observed in PCOS women.

  5. Motif-based analysis of large nucleotide data sets using MEME-ChIP

    PubMed Central

    Ma, Wenxiu; Noble, William S; Bailey, Timothy L

    2014-01-01

    MEME-ChIP is a web-based tool for analyzing motifs in large DNA or RNA data sets. It can analyze peak regions identified by ChIP-seq, cross-linking sites identified by CLIP-seq and related assays, as well as sets of genomic regions selected using other criteria. MEME-ChIP performs de novo motif discovery, motif enrichment analysis, motif location analysis and motif clustering, providing a comprehensive picture of the DNA or RNA motifs that are enriched in the input sequences. MEME-ChIP performs two complementary types of de novo motif discovery: weight matrix–based discovery for high accuracy, and word-based discovery for high sensitivity. Motif enrichment analysis using DNA or RNA motifs from human, mouse, worm, fly and other model organisms provides even greater sensitivity. MEME-ChIP’s interactive HTML output groups and aligns significant motifs to ease interpretation. This protocol takes less than 3 h, and it provides motif discovery approaches that are distinct from and complementary to other online methods. PMID:24853928

  6. Petroleum-resource appraisal and discovery rate forecasting in partially explored regions

    USGS Publications Warehouse

    Drew, Lawrence J.; Schuenemeyer, J.H.; Root, David H.; Attanasi, E.D.

    1980-01-01

    PART A: A model of the discovery process can be used to predict the size distribution of future petroleum discoveries in partially explored basins. The parameters of the model are estimated directly from the historical drilling record, rather than being determined by assumptions or analogies. The model is based on the concept of the area of influence of a drill hole, which states that the area of a basin exhausted by a drill hole varies with the size and shape of targets in the basin and with the density of previously drilled wells. It also uses the concept of discovery efficiency, which measures the rate of discovery within several classes of deposit size. The model was tested using 25 years of historical exploration data (1949-74) from the Denver basin. From the trend in the discovery rate (the number of discoveries per unit area exhausted), the discovery efficiencies in each class of deposit size were estimated. Using pre-1956 discovery and drilling data, the model accurately predicted the size distribution of discoveries for the 1956-74 period. PART B: A stochastic model of the discovery process has been developed to predict, using past drilling and discovery data, the distribution of future petroleum deposits in partially explored basins, and the basic mathematical properties of the model have been established. The model has two exogenous parameters, the efficiency of exploration and the effective basin size. The first parameter is the ratio of the probability that an actual exploratory well will make a discovery to the probability that a randomly sited well will make a discovery. The second parameter, the effective basin size, is the area of that part of the basin in which drillers are willing to site wells. Methods for estimating these parameters from locations of past wells and from the sizes and locations of past discoveries were derived, and the properties of estimators of the parameters were studied by simulation. PART C: This study examines the temporal properties and determinants of petroleum exploration for firms operating in the Denver basin. Expectations associated with the favorability of a specific area are modeled by using distributed lag proxy variables (of previous discoveries) and predictions from a discovery process model. In the second part of the study, a discovery process model is linked with a behavioral well-drilling model in order to predict the supply of new reserves. Results of the study indicate that the positive effects of new discoveries on drilling increase for several periods and then diminish to zero within 2½ years after the deposit discovery date. Tests of alternative specifications of the argument of the distributed lag function using alternative minimum size classes of deposits produced little change in the model's explanatory power. This result suggests that, once an exploration play is underway, favorable operator expectations are sustained by the quantity of oil found per time period rather than by the discovery of specific size deposits. When predictions of the value of undiscovered deposits (generated from a discovery process model) were substituted for the expectations variable in models used to explain exploration effort, operator behavior was found to be consistent with these predictions. This result suggests that operators, on the average, were efficiently using information contained in the discovery history of the basin in carrying out their exploration plans.
Comparison of the two approaches to modeling unobservable operator expectations indicates that the two models produced very similar results. The integration of the behavioral well-drilling model and discovery process model to predict the additions to reserves per unit time was successful only when the quarterly predictions were aggregated to annual values. The accuracy of the aggregated predictions was also found to be reasonably robust to errors in predictions from the behavioral well-drilling equation.
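
    The area-of-influence and discovery-efficiency concepts in Part A invite a compact simulation: the chance that the next wildcat finds a deposit of a given size class is proportional to that class's remaining target area times its efficiency, so the discovery rate declines as classes are exhausted. The sketch below uses invented parameters purely for illustration, not the USGS model's estimates.

        import numpy as np

        rng = np.random.default_rng(0)
        basin_area = 10_000.0        # effective basin size (arbitrary units)
        target_area = [6.0, 1.5]     # per-deposit target area: large, small class
        remaining = [200, 50]        # undiscovered deposits by class
        efficiency = [2.0, 4.0]      # discovery efficiency by class

        discoveries = [0, 0]
        for well in range(2000):
            for i in (0, 1):  # each class checked independently (toy simplification)
                p_hit = efficiency[i] * remaining[i] * target_area[i] / basin_area
                if rng.random() < p_hit:
                    remaining[i] -= 1
                    discoveries[i] += 1
        print(discoveries, remaining)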

  7. Fostering First-Graders' Reasoning Strategies with the Most Basic Sums

    ERIC Educational Resources Information Center

    Purpura, David J.; Baroody, Arthur J.; Eiland, Michael D.; Reid, Erin E.

    2012-01-01

    In a meta-analysis of 164 studies, Alfieri, Brooks, Aldrich, and Tenenbaum (2010) found that assisted discovery learning was more effective than explicit instruction or unassisted discovery learning and that explicit instruction resulted in more favorable outcomes than unassisted discovery learning. In other words, "unassisted discovery does…

  8. Precision and recall estimates for two-hybrid screens

    PubMed Central

    Huang, Hailiang; Bader, Joel S.

    2009-01-01

    Motivation: Yeast two-hybrid screens are an important method to map pairwise protein interactions. This method can generate spurious interactions (false discoveries), and true interactions can be missed (false negatives). Previously, we reported a capture–recapture estimator for bait-specific precision and recall. Here, we present an improved method that better accounts for heterogeneity in bait-specific error rates. Result: For yeast, worm and fly screens, we estimate the overall false discovery rates (FDRs) to be 9.9%, 13.2% and 17.0% and the false negative rates (FNRs) to be 51%, 42% and 28%. Bait-specific FDRs and the estimated protein degrees are then used to identify protein categories that yield more (or fewer) false positive interactions and more (or fewer) interaction partners. While membrane proteins have been suggested to have elevated FDRs, the current analysis suggests that intrinsic membrane proteins may actually have reduced FDRs. Hydrophobicity is positively correlated with decreased error rates and fewer interaction partners. These methods will be useful for future two-hybrid screens, which could use ultra-high-throughput sequencing for deeper sampling of interacting bait–prey pairs. Availability: All software (C source) and datasets are available as supplemental files and at http://www.baderzone.org under the Lesser GPL v. 3 license. Contact: joel.bader@jhu.edu Supplementary information: Supplementary data are available at Bioinformatics online. PMID:19091773

  9. The relationship between target-class and the physicochemical properties of antibacterial drugs

    PubMed Central

    Mugumbate, Grace; Overington, John P.

    2015-01-01

    The discovery of novel mechanism of action (MOA) antibacterials has been associated with the concept that antibacterial drugs occupy a differentiated region of physicochemical space compared to human-targeted drugs: in broad terms, antibacterials have higher molecular weight, lower log P and higher polar surface area (PSA). By analysing the physicochemical properties of about 1700 approved drugs listed in the ChEMBL database, we show that antibacterials whose targets are riboproteins (i.e., complexes of RNA and protein) fall outside the conventional human ‘drug-like’ chemical space, whereas antibacterials that modulate bacterial protein targets generally comply with the ‘rule-of-five’ guidelines for classical oral human drugs. Our analysis suggests a strong target-class association for antibacterials, either protein-targeted or riboprotein-targeted. There is much discussion in the literature on the failure of screening approaches to deliver novel antibacterial lead series, and linkage of this poor success rate for antibacterials with the chemical space properties of screening collections. Our analysis suggests that consideration of target class may be an underappreciated factor in antibacterial lead discovery, that bacterial protein targets may well have binding site characteristics similar to those of human protein targets, and that the assumption that larger, more polar compounds are a key part of successful future antibacterial discovery should be questioned. PMID:25975639

  10. ADEPT, a dynamic next generation sequencing data error-detection program with trimming

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Feng, Shihai; Lo, Chien-Chi; Li, Po-E

    Illumina is the most widely used next generation sequencing technology and produces millions of short reads that contain errors. These sequencing errors constitute a major problem in applications such as de novo genome assembly, metagenomics analysis and single nucleotide polymorphism discovery. In this study, we present ADEPT, a dynamic error-detection method that uses the quality scores of each nucleotide and its neighboring nucleotides, together with their positions within the read, and compares them to the position-specific quality score distribution of all bases within the sequencing run. This method greatly improves upon other available methods in terms of the true positive rate of error discovery without affecting the false positive rate, particularly within the middle of reads. We conclude that ADEPT is the only tool to date that dynamically assesses errors within reads by comparing position-specific and neighboring base quality scores with the distribution of quality scores for the dataset being analyzed. The result is a method that is less prone to position-dependent under-prediction, which is one of the most prominent issues in error prediction. The outcome is that ADEPT improves upon prior efforts in identifying true errors, primarily within the middle of reads, while reducing the false positive rate.

  11. ADEPT, a dynamic next generation sequencing data error-detection program with trimming

    DOE PAGES

    Feng, Shihai; Lo, Chien-Chi; Li, Po-E; ...

    2016-02-29

    Illumina is the most widely used next generation sequencing technology and produces millions of short reads that contain errors. These sequencing errors constitute a major problem in applications such as de novo genome assembly, metagenomics analysis and single nucleotide polymorphism discovery. In this study, we present ADEPT, a dynamic error-detection method that uses the quality scores of each nucleotide and its neighboring nucleotides, together with their positions within the read, and compares them to the position-specific quality score distribution of all bases within the sequencing run. This method greatly improves upon other available methods in terms of the true positive rate of error discovery without affecting the false positive rate, particularly within the middle of reads. We conclude that ADEPT is the only tool to date that dynamically assesses errors within reads by comparing position-specific and neighboring base quality scores with the distribution of quality scores for the dataset being analyzed. The result is a method that is less prone to position-dependent under-prediction, which is one of the most prominent issues in error prediction. The outcome is that ADEPT improves upon prior efforts in identifying true errors, primarily within the middle of reads, while reducing the false positive rate.

  12. Analysis of latency performance of bluetooth low energy (BLE) networks.

    PubMed

    Cho, Keuchul; Park, Woojin; Hong, Moonki; Park, Gisu; Cho, Wooseong; Seo, Jihoon; Han, Kijun

    2014-12-23

    Bluetooth Low Energy (BLE) is a short-range wireless communication technology aiming at low-cost and low-power communication. The performance of classical Bluetooth device discovery has been intensively studied using analytical modeling and simulative methods, but these techniques are not applicable to BLE, since BLE fundamentally changes the design of the discovery mechanism, including the use of three advertising channels. Several recent works have analyzed BLE device discovery, but these studies are still far from thorough. It is thus necessary to develop a new, accurate model for the BLE discovery process. In particular, the wide range of parameter settings gives BLE devices considerable latitude to customize their discovery performance. This motivates our study of modeling the BLE discovery process and performing intensive simulation. This paper focuses on building an analytical model of the discovery probability, as well as the expected discovery latency, which are then validated via extensive experiments. Our analysis considers both continuous and discontinuous scanning modes. We analyze the sensitivity of these performance metrics to parameter settings to quantitatively examine the extent to which parameters influence the performance of the discovery process.

  13. Analysis of Latency Performance of Bluetooth Low Energy (BLE) Networks

    PubMed Central

    Cho, Keuchul; Park, Woojin; Hong, Moonki; Park, Gisu; Cho, Wooseong; Seo, Jihoon; Han, Kijun

    2015-01-01

    Bluetooth Low Energy (BLE) is a short-range wireless communication technology aiming at low-cost and low-power communication. The performance evaluation of classical Bluetooth device discovery has been intensively studied using analytical modeling and simulative methods, but these techniques are not applicable to BLE, since BLE fundamentally changes the design of the discovery mechanism, including the usage of three advertising channels. Several recent works have analyzed the topic of BLE device discovery, but these studies are still far from thorough. It is thus necessary to develop a new, accurate model for the BLE discovery process. In particular, the wide range of parameter settings introduces considerable scope for BLE devices to customize their discovery performance. This motivates our study of modeling the BLE discovery process and performing intensive simulation. This paper is focused on building an analytical model to investigate the discovery probability, as well as the expected discovery latency, which are then validated via extensive experiments. Our analysis considers both continuous and discontinuous scanning modes. We analyze the sensitivity of these performance metrics to parameter settings to quantitatively examine to what extent parameters influence the performance of the discovery process. PMID:25545266
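
    The analytical model itself is not reproduced in these records, but the discovery process it describes is straightforward to simulate. Below is a Monte Carlo sketch under simplifying assumptions: a single advertising channel rather than BLE's three, scan windows aligned to time zero, and illustrative interval values that are not taken from the paper.

```python
import random

def discovery_latency(adv_interval=0.1, scan_interval=0.1,
                      scan_window=0.05, adv_duration=0.001, horizon=30.0):
    """Simulate the time until an advertising packet lands inside a scan
    window (single-channel simplification; times in seconds). The 0-10 ms
    random advDelay added to each advertising interval follows the BLE spec.
    """
    t = random.uniform(0, adv_interval)       # random initial phase
    while t < horizon:
        # scanner listens during [k*scan_interval, k*scan_interval + scan_window)
        if (t % scan_interval) + adv_duration <= scan_window:
            return t                          # packet fully inside a window
        t += adv_interval + random.uniform(0, 0.01)
    return float("inf")                       # not discovered within horizon

random.seed(1)
trials = [discovery_latency() for _ in range(10_000)]
print("mean discovery latency: %.3f s" % (sum(trials) / len(trials)))
```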

  14. What Does Galileo's Discovery of Jupiter's Moons Tell Us about the Process of Scientific Discovery?

    ERIC Educational Resources Information Center

    Lawson, Anton E.

    2002-01-01

    Given that hypothetico-deductive reasoning has played a role in other important scientific discoveries, asks the question whether it plays a role in all important scientific discoveries. Explores and rejects as viable alternatives possible alternative scientific methods such as Baconian induction and combinatorial analysis. Discusses the…

  15. Where Have All the Interactions Gone? Estimating the Coverage of Two-Hybrid Protein Interaction Maps

    PubMed Central

    Huang, Hailiang; Jedynak, Bruno M; Bader, Joel S

    2007-01-01

    Yeast two-hybrid screens are an important method for mapping pairwise physical interactions between proteins. The fraction of interactions detected in independent screens can be very small, and an outstanding challenge is to determine the reason for the low overlap. Low overlap can arise from either a high false-discovery rate (interaction sets have low overlap because each set is contaminated by a large number of stochastic false-positive interactions) or a high false-negative rate (interaction sets have low overlap because each misses many true interactions). We extend capture–recapture theory to provide the first unified model for false-positive and false-negative rates for two-hybrid screens. Analysis of yeast, worm, and fly data indicates that 25% to 45% of the reported interactions are likely false positives. Membrane proteins have higher false-discovery rates on average, and signal transduction proteins have lower rates. The overall false-negative rate ranges from 75% for worm to 90% for fly, which arises from a roughly 50% false-negative rate due to statistical undersampling and a 55% to 85% false-negative rate due to proteins that appear to be systematically lost from the assays. Finally, statistical model selection conclusively rejects the Erdös-Rényi network model in favor of the power law model for yeast and the truncated power law for worm and fly degree distributions. Much as genome sequencing coverage estimates were essential for planning the human genome sequencing project, the coverage estimates developed here will be valuable for guiding future proteomic screens. All software and datasets are available in Datasets S1 and S2, Figures S1–S5, and Tables S1−S6, and are also available from our Web site, http://www.baderzone.org. PMID:18039026
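
    The coverage estimates rest on capture-recapture logic: if two independent screens detect n1 and n2 true interactions with m found by both, the implied total is n1*n2/m. A sketch with toy numbers follows; the paper's unified model also estimates false-positive rates, which this simple Lincoln-Petersen estimator ignores.

```python
def lincoln_petersen(n1, n2, overlap):
    """Estimate the total interaction count from two independent screens
    using the Lincoln-Petersen capture-recapture estimator."""
    if overlap == 0:
        raise ValueError("no overlap: estimate undefined")
    total = n1 * n2 / overlap
    fnr1 = 1 - n1 / total   # fraction of true interactions screen 1 missed
    fnr2 = 1 - n2 / total
    return total, fnr1, fnr2

# toy numbers (not from the paper): two screens with a small overlap
total, fnr1, fnr2 = lincoln_petersen(n1=1500, n2=1200, overlap=300)
print(f"estimated true interactions: {total:.0f}")
print(f"false-negative rates: {fnr1:.0%} and {fnr2:.0%}")
```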

  16. Science of the science, drug discovery and artificial neural networks.

    PubMed

    Patel, Jigneshkumar

    2013-03-01

    The drug discovery process often encounters complex problems that may be difficult to solve by human intelligence. Artificial Neural Networks (ANNs) are one of the Artificial Intelligence (AI) technologies used for solving such complex problems. ANNs are widely used for primary virtual screening of compounds, quantitative structure-activity relationship studies, receptor modeling, formulation development, pharmacokinetics and all other processes involving complex mathematical modeling. Despite such advanced technologies and a good understanding of biological systems, drug discovery is still a lengthy, expensive, difficult and inefficient process with a low rate of successful new therapeutic discovery. In this paper, the author discusses drug discovery science and ANNs from first principles, which may be helpful for understanding the application of ANNs to improving the efficiency of drug discovery.

  17. BinQuasi: a peak detection method for ChIP-sequencing data with biological replicates.

    PubMed

    Goren, Emily; Liu, Peng; Wang, Chao; Wang, Chong

    2018-04-19

    ChIP-seq experiments that are aimed at detecting DNA-protein interactions require biological replication to draw inferential conclusions; however, there is no current consensus on how to analyze ChIP-seq data with biological replicates. Very few methodologies exist for the joint analysis of replicated ChIP-seq data, with approaches ranging from combining the results of analyzing replicates individually to joint modeling of all replicates. Combining the results of individual replicates analyzed separately can lead to reduced peak classification performance compared to joint modeling. Currently available methods for joint analysis may fail to control the false discovery rate at the nominal level. We propose BinQuasi, a peak caller for replicated ChIP-seq data that jointly models biological replicates using a generalized linear model framework and employs a one-sided quasi-likelihood ratio test to detect peaks. When applied to simulated data and real datasets, BinQuasi performs favorably compared to existing methods, including better control of the false discovery rate than existing joint modeling approaches. BinQuasi offers a flexible approach to joint modeling of replicated ChIP-seq data which is preferable to combining the results of replicates analyzed individually. Source code is freely available for download at https://cran.r-project.org/package=BinQuasi, implemented in R. pliu@iastate.edu or egoren@iastate.edu. Supplementary material is available at Bioinformatics online.
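
    BinQuasi is an R package; as a rough Python illustration of the joint-modeling idea, the sketch below pools ChIP and input replicate counts for a single window in one quasi-Poisson GLM and runs a one-sided enrichment test. The actual BinQuasi model, its offsets, and its quasi-likelihood F-test differ, and the counts are toy values.

```python
import numpy as np
import statsmodels.api as sm
from scipy import stats

def one_sided_peak_test(chip_counts, input_counts):
    """One-sided test for ChIP enrichment over input in a genomic window,
    jointly modeling all replicates with a quasi-Poisson GLM (a sketch of
    the joint-modeling idea, not BinQuasi's actual QL F-test)."""
    counts = np.concatenate([chip_counts, input_counts])
    is_chip = np.concatenate([np.ones(len(chip_counts)),
                              np.zeros(len(input_counts))])
    X = sm.add_constant(is_chip)
    # scale="X2" estimates the dispersion from Pearson residuals
    fit = sm.GLM(counts, X, family=sm.families.Poisson()).fit(scale="X2")
    z = fit.params[1] / fit.bse[1]
    return stats.norm.sf(z)        # one-sided p-value: enrichment only

# one window with three ChIP and three input replicates (toy counts)
print(one_sided_peak_test([48, 55, 60], [20, 25, 18]))
```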

  18. Serendipity in Cancer Drug Discovery: Rational or Coincidence?

    PubMed

    Prasad, Sahdeo; Gupta, Subash C; Aggarwal, Bharat B

    2016-06-01

    Novel drug development leading to final approval by the US FDA can cost as much as two billion dollars. Why novel drug discovery is so expensive is unclear, but high failure rates at the preclinical and clinical stages are major reasons. Although therapies targeting a given cell signaling pathway or protein have become prominent in drug discovery, such treatments alone have done little to prevent or treat disease because most chronic diseases have been found to be multigenic. A review of the discovery of numerous drugs currently being used for various diseases including cancer, diabetes, and cardiovascular, pulmonary, and autoimmune diseases indicates that serendipity has played a major role in their discovery. In this review we provide evidence that rational drug discovery and targeted therapies have played minimal roles in drug discovery, and that serendipity and coincidence have played and continue to play major roles. The primary focus of this review is on cancer-related drug discovery. Copyright © 2016 Elsevier Ltd. All rights reserved.

  19. An integrative data analysis platform for gene set analysis and knowledge discovery in a data warehouse framework.

    PubMed

    Chen, Yi-An; Tripathi, Lokesh P; Mizuguchi, Kenji

    2016-01-01

    Data analysis is one of the most critical and challenging steps in drug discovery and disease biology. A user-friendly resource to visualize and analyse high-throughput data provides a powerful medium for both experimental and computational biologists to understand vastly different biological data types and obtain a concise, simplified and meaningful output for better knowledge discovery. We have previously developed TargetMine, an integrated data warehouse optimized for target prioritization. Here we describe how upgraded and newly modelled data types in TargetMine can now survey the wider biological and chemical data space, relevant to drug discovery and development. To enhance the scope of TargetMine from target prioritization to broad-based knowledge discovery, we have also developed a new auxiliary toolkit to assist with data analysis and visualization in TargetMine. This toolkit features interactive data analysis tools to query and analyse the biological data compiled within the TargetMine data warehouse. The enhanced system enables users to discover new hypotheses interactively by performing complicated searches with no programming and obtaining the results in an easy to comprehend output format. Database URL: http://targetmine.mizuguchilab.org. © The Author(s) 2016. Published by Oxford University Press.

  20. An integrative data analysis platform for gene set analysis and knowledge discovery in a data warehouse framework

    PubMed Central

    Chen, Yi-An; Tripathi, Lokesh P.; Mizuguchi, Kenji

    2016-01-01

    Data analysis is one of the most critical and challenging steps in drug discovery and disease biology. A user-friendly resource to visualize and analyse high-throughput data provides a powerful medium for both experimental and computational biologists to understand vastly different biological data types and obtain a concise, simplified and meaningful output for better knowledge discovery. We have previously developed TargetMine, an integrated data warehouse optimized for target prioritization. Here we describe how upgraded and newly modelled data types in TargetMine can now survey the wider biological and chemical data space, relevant to drug discovery and development. To enhance the scope of TargetMine from target prioritization to broad-based knowledge discovery, we have also developed a new auxiliary toolkit to assist with data analysis and visualization in TargetMine. This toolkit features interactive data analysis tools to query and analyse the biological data compiled within the TargetMine data warehouse. The enhanced system enables users to discover new hypotheses interactively by performing complicated searches with no programming and obtaining the results in an easy to comprehend output format. Database URL: http://targetmine.mizuguchilab.org PMID:26989145

  1. The Mobile Element Locator Tool (MELT): population-scale mobile element discovery and biology

    PubMed Central

    Gardner, Eugene J.; Lam, Vincent K.; Harris, Daniel N.; Chuang, Nelson T.; Scott, Emma C.; Pittard, W. Stephen; Mills, Ryan E.; Devine, Scott E.

    2017-01-01

    Mobile element insertions (MEIs) represent ∼25% of all structural variants in human genomes. Moreover, when they disrupt genes, MEIs can influence human traits and diseases. Therefore, MEIs should be fully discovered along with other forms of genetic variation in whole genome sequencing (WGS) projects involving population genetics, human diseases, and clinical genomics. Here, we describe the Mobile Element Locator Tool (MELT), which was developed as part of the 1000 Genomes Project to perform MEI discovery on a population scale. Using both Illumina WGS data and simulations, we demonstrate that MELT outperforms existing MEI discovery tools in terms of speed, scalability, specificity, and sensitivity, while also detecting a broader spectrum of MEI-associated features. Several run modes were developed to perform MEI discovery on local and cloud systems. In addition to using MELT to discover MEIs in modern humans as part of the 1000 Genomes Project, we also used it to discover MEIs in chimpanzees and ancient (Neanderthal and Denisovan) hominids. We detected diverse patterns of MEI stratification across these populations that likely were caused by (1) diverse rates of MEI production from source elements, (2) diverse patterns of MEI inheritance, and (3) the introgression of ancient MEIs into modern human genomes. Overall, our study provides the most comprehensive map of MEIs to date spanning chimpanzees, ancient hominids, and modern humans and reveals new aspects of MEI biology in these lineages. We also demonstrate that MELT is a robust platform for MEI discovery and analysis in a variety of experimental settings. PMID:28855259

  2. Open Access Could Transform Drug Discovery: A Case Study of JQ1.

    PubMed

    Arshad, Zeeshaan; Smith, James; Roberts, Mackenna; Lee, Wen Hwa; Davies, Ben; Bure, Kim; Hollander, Georg A; Dopson, Sue; Bountra, Chas; Brindley, David

    2016-01-01

    The cost to develop a new drug from target discovery to market is a staggering $1.8 billion, largely due to the very high attrition rate of drug candidates and the lengthy transition times during development. Open access is an emerging model of open innovation that places no restriction on the use of information and has the potential to accelerate the development of new drugs. To date, no quantitative assessment has taken place to determine the effects and viability of open access on the process of drug translation. This need is addressed within this study. The literature and intellectual property landscapes of the drug candidate JQ1, which was made available on an open access basis when discovered, and of conventionally developed equivalents that were not, are compared using the Web of Science and Thomson Innovation software, respectively. Results demonstrate that openly sharing the JQ1 molecule led to greater uptake by a wider and more multi-disciplinary research community. A comparative analysis of the patent landscapes for each candidate also found that the broader scientific diaspora of the publicly released JQ1 data enhanced innovation, evidenced by a greater number of downstream patents filed in relation to JQ1. The authors' findings counter the notion that open access drug discovery would leak commercial intellectual property. On the contrary, JQ1 serves as a test case to evidence that open access drug discovery can be an economic model that potentially improves the efficiency and cost of drug discovery and its subsequent commercialization.

  3. MRM as a discovery tool?

    PubMed

    Rudnick, Paul A

    2015-04-01

    Multiple-reaction monitoring (MRM) of peptides has been recognized as a promising technology because it is sensitive and robust. Borrowed from stable-isotope dilution (SID) methodologies in the field of small molecules, MRM is now routinely used in proteomics laboratories. While its usefulness in validating candidate targets is widely accepted, it has not been established as a discovery tool. Traditional thinking has been that MRM workflows cannot be multiplexed highly enough to profile efficiently, owing to slower instrument scan rates and the complexity of developing increasingly large scheduling methods. In this issue, Colangelo et al. (Proteomics 2015, 15, 1202-1214) describe a pipeline (xMRM) for discovery-style MRM using label-free methods (i.e. relative quantitation). Label-free methods come with cost benefits, as does MRM, whose data are easier to analyze than full-scan data. Their paper offers numerous improvements in method design and data analysis. The robustness of their pipeline was tested on rodent postsynaptic density fractions. There, they were able to accurately quantify 112 proteins at a CV% of 11.4, with only 2.5% of the 1697 transitions requiring user intervention. Colangelo et al. aim to extend the reach of MRM deeper into the realm of discovery proteomics, an area that is currently dominated by data-dependent and data-independent workflows. © 2015 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

  4. An automated assay for the assessment of cardiac arrest in fish embryo.

    PubMed

    Puybareau, Elodie; Genest, Diane; Barbeau, Emilie; Léonard, Marc; Talbot, Hugues

    2017-02-01

    Studies on fish embryo models are widely developed in research. They are used in several research fields including drug discovery or environmental toxicology. In this article, we propose an entirely automated assay to detect cardiac arrest in Medaka (Oryzias latipes) based on image analysis. We propose a multi-scale pipeline based on mathematical morphology. Starting from video sequences of entire wells in 24-well plates, we focus on the embryo, detect its heart, and ascertain whether or not the heart is beating based on intensity variation analysis. Our image analysis pipeline only uses commonly available operators. It has a low computational cost, allowing analysis at the same rate as acquisition. From an initial dataset of 3192 videos, 660 were discarded as unusable (20.7%), 655 of them correctly so (99.25%) and only 5 incorrectly so (0.75%). The 2532 remaining videos were used for our test. On these, 45 errors were made, leading to a success rate of 98.23%. Copyright © 2016 Elsevier Ltd. All rights reserved.
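
    The heartbeat decision at the core of the assay can be sketched as a spectral check on the heart-region intensity trace. The sketch below assumes the morphology-based heart localization has already produced a per-frame mean-intensity trace; the frequency band and the spectral-peak cutoff are illustrative choices, not the paper's parameters.

```python
import numpy as np

def heart_is_beating(trace, fps=30.0, band=(0.5, 5.0), peak_frac=0.2):
    """Return True when the intensity trace over the heart region shows a
    dominant periodic component in the plausible heartbeat band
    (~30-300 beats per minute)."""
    x = np.asarray(trace, dtype=float)
    x = x - x.mean()
    power = np.abs(np.fft.rfft(x)) ** 2
    freqs = np.fft.rfftfreq(len(x), d=1.0 / fps)
    in_band = (freqs >= band[0]) & (freqs <= band[1])
    # a beating heart concentrates spectral power at one in-band frequency
    return power[in_band].max() / power[1:].sum() > peak_frac

t = np.arange(300) / 30.0                          # 10 s of video at 30 fps
beating = 100 + 2 * np.sin(2 * np.pi * 2.2 * t)    # ~132 bpm heartbeat
arrest = 100 + 0.2 * np.random.default_rng(4).normal(size=300)
print(heart_is_beating(beating), heart_is_beating(arrest))  # True False
```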

  5. Accelerating the Rate of Astronomical Discovery

    NASA Astrophysics Data System (ADS)

    Norris, Ray P.; Ruggles, Clive L. N.

    2010-05-01

    Special Session 5 on Accelerating the Rate of Astronomical Discovery addressed a range of potential limits to progress - paradigmatic, technological, organisational, and political - examining each issue both from modern and historical perspectives, and drawing lessons to guide future progress. A number of issues were identified which potentially regulate the flow of discoveries, such as the balance between large strongly-focussed projects and instruments, designed to answer the most fundamental questions confronting us, and the need to maintain a creative environment with room for unorthodox thinkers and bold, high risk, projects. Also important is the need to maintain historical and cultural perspectives, and the need to engage the minds of the most brilliant young people on the planet, regardless of their background, ethnicity, gender, or geography.

  6. SpS5: Accelerating the Rate of Astronomical Discovery

    NASA Astrophysics Data System (ADS)

    Norris, Ray P.

    2010-11-01

    Special Session 5 on Accelerating the Rate of Astronomical Discovery addressed a range of potential limits to progress: paradigmatic, technological, organizational, and political. It examined each issue both from modern and historical perspectives, and drew lessons to guide future progress. A number of issues were identified which may regulate the flow of discoveries, such as the balance between large strongly-focussed projects and instruments, designed to answer the most fundamental questions confronting us, and the need to maintain a creative environment with room for unorthodox thinkers and bold, high risk, projects. Also important is the need to maintain historical and cultural perspectives, and the need to engage the minds of the most brilliant young people on the planet, regardless of their background, ethnicity, gender, or geography.

  7. Final Report for Geometric Analysis for Data Reduction and Structure Discovery DE-FG02-10ER25983, STRIPES award # DE-SC0004096

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Vixie, Kevin R.

    This is the final report for the project "Geometric Analysis for Data Reduction and Structure Discovery", in which insights and tools from geometric analysis were developed and exploited for their potential to address large-scale data challenges.

  8. Application in pesticide analysis: Liquid chromatography - A review of the state of science for biomarker discovery and identification

    EPA Science Inventory

    Book Chapter 18, titled Application in pesticide analysis: Liquid chromatography - A review of the state of science for biomarker discovery and identification, will be published in the book titled High Performance Liquid Chromatography in Pesticide Residue Analysis (Part of the C...

  9. Climatic shocks associate with innovation in science and technology.

    PubMed

    De Dreu, Carsten K W; van Dijk, Mathijs A

    2018-01-01

    Human history is shaped by landmark discoveries in science and technology. However, across both time and space the rate of innovation is erratic: Periods of relative inertia alternate with bursts of creative science and rapid cascades of technological innovations. While the origins of the rise and fall in rates of discovery and innovation remain poorly understood, they may reflect adaptive responses to exogenously emerging threats and pressures. Here we examined this possibility by fitting annual rates of scientific discovery and technological innovation to climatic variability and its associated economic pressures and resource scarcity. In time-series data from Europe (1500-1900CE), we indeed found that rates of innovation are higher during prolonged periods of cold (versus warm) surface temperature and during the presence (versus absence) of volcanic dust veils. This negative temperature-innovation link was confirmed in annual time-series for France, Germany, and the United Kingdom (1901-1965CE). Combined, across almost 500 years and over 5,000 documented innovations and discoveries, a 0.5°C increase in temperature associates with a sizable 0.30-0.60 standard deviation decrease in innovation. Results were robust to controlling for fluctuations in population size. Furthermore, and consistent with economic theory and micro-level data on group innovation, path analyses revealed that the relation between harsher climatic conditions between 1500-1900CE and more innovation is mediated by climate-induced economic pressures and resource scarcity.

  10. Innovative Methodology in the Discovery of Novel Drug Targets in the Free-Living Amoebae

    PubMed

    Baig, Abdul Mannan

    2018-04-25

    Despite advances in drug discovery and modifications in the chemotherapeutic regimens, human infections caused by free-living amoebae (FLA) have high mortality rates (~95%). The FLA that cause fatal human cerebral infections include Naegleria fowleri, Balamuthia mandrillaris and Acanthamoeba spp. Novel drug-target discovery remains the only viable option to tackle these central nervous system (CNS) infections and lower the mortality rates caused by the FLA. Of these FLA, N. fowleri causes primary amoebic meningoencephalitis (PAM), while A. castellanii and B. mandrillaris are known to cause granulomatous amoebic encephalitis (GAE). The infections caused by the FLA have been treated with drugs like Rifampin, Fluconazole, Amphotericin-B and Miltefosine. Miltefosine is an anti-leishmanial agent and an experimental anti-cancer drug. With only rare incidences of success, these drugs have remained unsuccessful at lowering the mortality rates of the cerebral infections caused by FLA. Recently, with the help of bioinformatic computational tools and the available genomic data of the FLA, discovery of newer drug targets has become possible. These cellular targets are proteins that are either unique to the FLA or shared between humans and these unicellular eukaryotes. The latter group of proteins has been shown to be targeted by some FDA-approved drugs prescribed in non-infectious diseases. This review outlines the bioinformatic methodologies that can be used in the discovery of such novel drug targets, chronicles the in-vitro assays done on them in the past, and discusses the translational value of such target discoveries in human diseases caused by FLA. Copyright© Bentham Science Publishers; For any queries, please email at epub@benthamscience.org.

  11. Open discovery: An integrated live Linux platform of Bioinformatics tools.

    PubMed

    Vetrivel, Umashankar; Pilla, Kalabharath

    2008-01-01

    Historically, live Linux distributions for Bioinformatics have paved the way for portability of the Bioinformatics workbench in a platform-independent manner. However, most of the existing live Linux distributions limit their usage to sequence analysis and basic molecular visualization programs and are devoid of data persistence. Hence, Open Discovery, a live Linux distribution, has been developed with the capability to perform complex tasks like molecular modeling, docking and molecular dynamics in a swift manner. Furthermore, it is also equipped with a complete sequence analysis environment and is capable of running Windows executable programs in a Linux environment. Open Discovery portrays the advanced customizable configuration of Fedora, with data persistency accessible via USB drive or DVD. The Open Discovery is distributed free under the Academic Free License (AFL) and can be downloaded from http://www.OpenDiscovery.org.in.

  12. Tertiary oil discoveries whet explorer interest off Tunisia

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Long, M.

    Prospects for increased Tertiary oil production in the S. Mediterranean have brightened with discoveries off Tunisia, but more evaluation is needed before commercial potential is known. Several groups of U.S. and European companies have tested oil in the relatively unexplored Miocene in the Gulf of Hammamet. These include groups operated by Buttes Resources Tunisia, Elf-Aquitaine Tunisia, and Shell Tunirex. Oil test rates of 1,790 to 1,800 bpd have been reported by the Buttes group in 2 Gulf of Hammamet wells. The initial discovery probably was the first Tertiary oil ever tested in that part of the Mediterranean. The discoveries have helped boost exploratory interest in the northern waters of Tunisia and northeast toward Sicily. There are reports more U.S. and European companies are requesting exploration permits from the government of Tunisia. Companies with permits are planning new exploration for 1978. Probably the most significant discovery to date has been the Buttes group's 1 Jasmine (2 BGH). The group tested high-quality 39.5°-gravity oil at a rate of 1,790 bpd. Test flow was from the Sabri Sand at 6,490 to 6,590 ft. The well was drilled in 458 ft of water.

  13. High-throughput discovery of rare human nucleotide polymorphisms by Ecotilling

    PubMed Central

    Till, Bradley J.; Zerr, Troy; Bowers, Elisabeth; Greene, Elizabeth A.; Comai, Luca; Henikoff, Steven

    2006-01-01

    Human individuals differ from one another at only ∼0.1% of nucleotide positions, but these single nucleotide differences account for most heritable phenotypic variation. Large-scale efforts to discover and genotype human variation have been limited to common polymorphisms. However, these efforts overlook rare nucleotide changes that may contribute to phenotypic diversity and genetic disorders, including cancer. Thus, there is an increasing need for high-throughput methods to robustly detect rare nucleotide differences. Toward this end, we have adapted the mismatch discovery method known as Ecotilling for the discovery of human single nucleotide polymorphisms. To increase throughput and reduce costs, we developed a universal primer strategy and implemented algorithms for automated band detection. Ecotilling was validated by screening 90 human DNA samples for nucleotide changes in 5 gene targets and by comparing results to public resequencing data. To increase throughput for discovery of rare alleles, we pooled samples 8-fold and found Ecotilling to be efficient relative to resequencing, with a false negative rate of 5% and a false discovery rate of 4%. We identified 28 new rare alleles, including some that are predicted to damage protein function. The detection of rare damaging mutations has implications for models of human disease. PMID:16893952

  14. The development of biomarkers to reduce attrition rate in drug discovery focused on oncology and central nervous system.

    PubMed

    Safavi, Maliheh; Sabourian, Reyhaneh; Abdollahi, Mohammad

    2016-10-01

    The task of discovery and development of novel therapeutic agents remains an expensive, uncertain, time-consuming, competitive, and inefficient enterprise. Due to a steady increase in the cost and time of drug development and the considerable amount of resources required, a predictive tool is needed for assessing the safety and efficacy of a new chemical entity. This study is focused on the high attrition rate in discovery and development of oncology and central nervous system (CNS) medicines, because the failure rate of these medicines is higher than others. Some approaches valuable in reducing attrition rates are proposed and the judicious use of biomarkers is discussed. Unlike the significant progress made in identifying and characterizing novel mechanisms of disease processes and targeted therapies, the process of novel drug development is associated with an unacceptably high attrition rate. The application of clinically qualified predictive biomarkers holds great promise for further development of therapeutic targets, improved survival, and ultimately personalized medicine sets for patients. Decisions such as candidate selection, development risks, dose ranging, early proof of concept/principle, and patient stratification are based on the measurements of biologically and/or clinically validated biomarkers.

  15. Biomarker Discovery and Verification of Esophageal Squamous Cell Carcinoma Using Integration of SWATH/MRM.

    PubMed

    Hou, Guixue; Lou, Xiaomin; Sun, Yulin; Xu, Shaohang; Zi, Jin; Wang, Quanhui; Zhou, Baojin; Han, Bo; Wu, Lin; Zhao, Xiaohang; Lin, Liang; Liu, Siqi

    2015-09-04

    We propose an efficient integration of SWATH with MRM for biomarker discovery and verification when the corresponding ion library is well established. We strictly controlled the false positive rate associated with SWATH MS signals and carefully selected the target peptides coupled with SWATH and MRM. We collected 10 samples of esophageal squamous cell carcinoma (ESCC) tissues paired with tumors and adjacent regions and quantified 1758 unique proteins with FDR 1% at protein level using SWATH, in which 467 proteins were abundance-dependent with ESCC. After carefully evaluating the SWATH MS signals of the up-regulated proteins, we selected 120 proteins for MRM verification. MRM analysis of the pooled and individual esophageal tissues resulted in 116 proteins that exhibited similar abundance response modes to ESCC that were acquired with SWATH. Because the ESCC-related proteins consisted of a high percentile of secreted proteins, we conducted the MRM assay on patient sera that were collected from pre- and postoperation. Of the 116 target proteins, 42 were identified in the ESCC sera, including 11 with lowered abundances postoperation. Coupling SWATH and MRM is thus feasible and efficient for the discovery and verification of cancer-related protein biomarkers.

  16. Public-Private Partnerships in Lead Discovery: Overview and Case Studies.

    PubMed

    Gottwald, Matthias; Becker, Andreas; Bahr, Inke; Mueller-Fahrnow, Anke

    2016-09-01

    The pharmaceutical industry is faced with significant challenges in its efforts to discover new drugs that address unmet medical needs. Safety concerns and lack of efficacy are the two main technical reasons for attrition. Improved early research tools including predictive in silico, in vitro, and in vivo models, as well as a deeper understanding of the disease biology, therefore have the potential to improve success rates. The combination of internal activities with external collaborations in line with the interests and needs of all partners is a successful approach to foster innovation and to meet the challenges. Collaboration can take place in different ways, depending on the requirements of the participants. In this review, the value of public-private partnership approaches will be discussed, using examples from the Innovative Medicines Initiative (IMI). These examples describe consortia approaches to develop tools and processes for improving target identification and validation, as well as lead identification and optimization. The project "Kinetics for Drug Discovery" (K4DD), focusing on the adoption of drug-target binding kinetics analysis in the drug discovery decision-making process, is described in more detail. © 2016 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

  17. Nanosurveyor: a framework for real-time data processing

    DOE PAGES

    Daurer, Benedikt J.; Krishnan, Hari; Perciano, Talita; ...

    2017-01-31

    Background: The ever-improving brightness of accelerator-based sources is enabling novel observations and discoveries with faster frame rates, larger fields of view, higher resolution, and higher dimensionality. Results: Here we present an integrated software/algorithmic framework designed to capitalize on high-throughput experiments through efficient kernels and load-balanced workflows that are scalable in design. We describe the streamlined processing pipeline for ptychography data analysis. Conclusions: The pipeline provides throughput, compression, and resolution as well as rapid feedback to the microscope operators.

  18. Mississippi State University Center for Air Sea Technology. FY93 and FY 94 Research Program in Navy Ocean Modeling and Prediction

    DTIC Science & Technology

    1994-09-30

    relational versus object oriented DBMS, knowledge discovery, data models, metadata, data filtering, clustering techniques, and synthetic data. A secondary...The first was the investigation of AI/ES applications (knowledge discovery, data mining, and clustering). Here CAST collaborated with Dr. Fred Petry...knowledge discovery system based on clustering techniques; implemented an on-line data browser to the DBMS; completed preliminary efforts to apply object

  19. New Perspectives on How to Discover Drugs from Herbal Medicines: CAM's Outstanding Contribution to Modern Therapeutics.

    PubMed

    Pan, Si-Yuan; Zhou, Shu-Feng; Gao, Si-Hua; Yu, Zhi-Ling; Zhang, Shuo-Feng; Tang, Min-Ke; Sun, Jian-Ning; Ma, Dik-Lung; Han, Yi-Fan; Fong, Wang-Fun; Ko, Kam-Ming

    2013-01-01

    With tens of thousands of plant species on earth, we are endowed with an enormous wealth of medicinal remedies from Mother Nature. Natural products and their derivatives represent more than 50% of all the drugs in modern therapeutics. Because of the low success rate and the huge capital investment needed, the research and development of conventional drugs is very costly and difficult. Over the past few decades, researchers have focused on drug discovery from herbal medicines or botanical sources, an important group of complementary and alternative medicine (CAM) therapy. With a long history of herbal usage for the clinical management of a variety of diseases in indigenous cultures, the success rate of developing a new drug from herbal medicinal preparations should, in theory, be higher than that from chemical synthesis. While the endeavor for drug discovery from herbal medicines is "experience driven," the search for a therapeutically useful synthetic drug, like "looking for a needle in a haystack," is a daunting task. In this paper, we first illustrate various approaches to drug discovery from herbal medicines. Typical examples of successful drug discovery from botanical sources are given. In addition, problems in drug discovery from herbal medicines are described and possible solutions proposed. Finally, the prospects for drug discovery from herbal medicines in the postgenomic era are considered, with future directions in this area of drug development.

  20. Allchin's Shoehorn, or Why Science Is Hypothetico-Deductive.

    ERIC Educational Resources Information Center

    Lawson, Anton E.

    2003-01-01

    Criticizes Allchin's article about Lawson's analysis of Galileo's discovery of Jupiter's moons. Suggests that a careful analysis of the way humans spontaneously process information and reason supports a general hypothetico-deductive theory of human information processing, reasoning, and scientific discovery. (SOE)

  1. 75 FR 22394 - Combined Notice of Filings No. 2

    Federal Register 2010, 2011, 2012, 2013, 2014

    2010-04-28

    ... 21, 2010. Take notice that the Commission has received the following Natural Gas Pipeline Rate and Refund Report filings: Docket Numbers: RP10-539-001. Applicants: Discovery Gas Transmission LLC. Description: Discovery Gas Transmission, LLC submits Substitute First Revised Sheet 225 et al. to FERC Gas...

  2. [The problem of small "n" and big "P" in neuropsycho-pharmacology, or how to keep the rate of false discoveries under control].

    PubMed

    Petschner, Péter; Bagdy, György; Tóthfalusi, Laszló

    2015-03-01

    One of the characteristics of many methods used in neuropsychopharmacology is that a large number of parameters (P) are measured in relatively few subjects (n). Functional magnetic resonance imaging, electroencephalography (EEG) and genomic studies are typical examples. For example, one microarray chip can contain thousands of probes. Therefore, in studies using microarray chips, P may be several thousand-fold larger than n. Statistical analysis of such studies is a challenging task, and such studies are referred to in the statistical literature as the small "n", big "P" problem. The problem has many facets, including the controversies associated with multiple hypothesis testing. A typical scenario in this context is when two or more groups are compared on individual attributes. If the increased classification error due to the multiple testing is neglected, then several highly significant differences will be discovered. But in reality, some of these significant differences are coincidental, not reproducible findings. Several methods have been proposed to solve this problem. In this review we discuss two of the proposed solutions: algorithms to compare sets, and statistical hypothesis tests controlling the false discovery rate.
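
    The standard tool for keeping the rate of false discoveries under control in such settings is the Benjamini-Hochberg procedure. A self-contained sketch on simulated p-values follows; the 9,900/100 null/signal split and the Beta-distributed signal p-values are illustrative assumptions.

```python
import numpy as np

def benjamini_hochberg(pvals, q=0.05):
    """Return a boolean mask of discoveries under BH FDR control at level q."""
    p = np.asarray(pvals)
    m = len(p)
    order = np.argsort(p)
    below = p[order] <= q * np.arange(1, m + 1) / m
    k = below.nonzero()[0].max() + 1 if below.any() else 0
    reject = np.zeros(m, dtype=bool)
    reject[order[:k]] = True     # reject the k smallest p-values
    return reject

# small "n", big "P": 10,000 probes, of which only 100 carry real signal
rng = np.random.default_rng(2)
pvals = np.concatenate([rng.uniform(size=9_900),         # true nulls
                        rng.beta(0.05, 1.0, size=100)])  # true signals
rej = benjamini_hochberg(pvals, q=0.05)
fdp = rej[:9_900].sum() / max(rej.sum(), 1)   # realized false discovery proportion
print(f"discoveries: {rej.sum()}, empirical FDP: {fdp:.1%}")
```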

  3. Thermoplastic nanofluidic devices for biomedical applications.

    PubMed

    Weerakoon-Ratnayake, Kumuditha M; O'Neil, Colleen E; Uba, Franklin I; Soper, Steven A

    2017-01-31

    Microfluidics is now moving into a developmental stage where basic discoveries are being transitioned into the commercial sector so that these discoveries can affect, for example, healthcare. Thus, high production rate microfabrication technologies, such as thermal embossing and/or injection molding, are being used to produce low-cost consumables appropriate for commercial applications. Based on recent reports, it is clear that nanofluidics offers some attractive process capabilities that may provide unique venues for biomolecular analyses that cannot be realized at the microscale. Thus, it would be attractive to consider, early in the developmental cycle of nanofluidics, production pipelines that can generate devices possessing sub-150 nm dimensions in a high-production mode and at low cost to accommodate the commercialization of this exciting technology. Recently, functional sub-150 nm thermoplastic nanofluidic devices have been reported that can provide high process yield rates, which can enable commercial translation of nanofluidics. This review presents an overview of recent advancements in the fabrication, assembly, surface modification and the characterization of thermoplastic nanofluidic devices. Also, several examples in which nanoscale phenomena have been exploited for the analysis of biomolecules are highlighted. Lastly, some general conclusions and future outlooks are presented.

  4. Visual representation of scientific information.

    PubMed

    Wong, Bang

    2011-02-15

    Great technological advances have enabled researchers to generate an enormous amount of data. Data analysis is replacing data generation as the rate-limiting step in scientific research. With this wealth of information, we have an opportunity to understand the molecular causes of human diseases. However, the unprecedented scale, resolution, and variety of data pose new analytical challenges. Visual representation of data offers insights that can lead to new understanding, whether the purpose is analysis or communication. This presentation shows how art, design, and traditional illustration can enable scientific discovery. Examples will be drawn from the Broad Institute's Data Visualization Initiative, aimed at establishing processes for creating informative visualization models.

  5. Identification of genomic loci associated with resting heart rate and shared genetic predictors with all-cause mortality.

    PubMed

    Eppinga, Ruben N; Hagemeijer, Yanick; Burgess, Stephen; Hinds, David A; Stefansson, Kari; Gudbjartsson, Daniel F; van Veldhuisen, Dirk J; Munroe, Patricia B; Verweij, Niek; van der Harst, Pim

    2016-12-01

    Resting heart rate is a heritable trait correlated with life span. Little is known about the genetic contribution to resting heart rate and its relationship with mortality. We performed a genome-wide association discovery and replication analysis starting with 19.9 million genetic variants and studying up to 265,046 individuals to identify 64 loci associated with resting heart rate (P < 5 × 10⁻⁸); 46 of these were novel. We then used the genetic variants identified to study the association between resting heart rate and all-cause mortality. We observed that a genetically predicted resting heart rate increase of 5 beats per minute was associated with a 20% increase in mortality risk (hazard ratio 1.20, 95% confidence interval 1.11-1.28, P = 8.20 × 10⁻⁷), translating to a reduction in life expectancy of 2.9 years for males and 2.6 years for females. Our findings provide evidence for shared genetic predictors of resting heart rate and all-cause mortality.
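
    The reported effect is easy to manipulate on the log-hazard scale. The sketch below back-computes the standard error from the published confidence interval and rescales the per-5-bpm effect to 1 bpm; the log-linear rescaling is an assumption for illustration, not a result from the paper.

```python
import math

def hazard_ratio_ci(beta, se, z=1.96):
    """Convert a log-hazard coefficient and standard error into a hazard
    ratio with a 95% confidence interval."""
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)

# back out the abstract's numbers: HR 1.20 (1.11-1.28) per 5 bpm
beta = math.log(1.20)
se = (math.log(1.28) - math.log(1.11)) / (2 * 1.96)
print("HR %.2f (%.2f-%.2f) per 5 bpm" % hazard_ratio_ci(beta, se))
# matches the published interval up to rounding of the reported figures

# rescale to a 1 bpm increase under a log-linear dose assumption
print("HR %.3f per 1 bpm" % math.exp(beta / 5))
```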

  6. Shilling attack detection for recommender systems based on credibility of group users and rating time series.

    PubMed

    Zhou, Wei; Wen, Junhao; Qu, Qiang; Zeng, Jun; Cheng, Tian

    2018-01-01

    Recommender systems are vulnerable to shilling attacks. Forged user-generated content data, such as user ratings and reviews, are used by attackers to manipulate recommendation rankings. Shilling attack detection in recommender systems is of great significance to maintain the fairness and sustainability of recommender systems. The current studies have problems in terms of the poor universality of algorithms, difficulty in selection of user profile attributes, and lack of an optimization mechanism. In this paper, a shilling behaviour detection structure based on abnormal group user findings and rating time series analysis is proposed. This paper adds to the current understanding in the field by studying the credibility evaluation model in-depth based on the rating prediction model to derive proximity-based predictions. A method for detecting suspicious ratings based on suspicious time windows and target item analysis is proposed. Suspicious rating time segments are determined by constructing a time series, and data streams of the rating items are examined and suspicious rating segments are checked. To analyse features of shilling attacks by a group user's credibility, an abnormal group user discovery method based on time series and time window is proposed. Standard testing datasets are used to verify the effect of the proposed method.

  7. Shilling attack detection for recommender systems based on credibility of group users and rating time series

    PubMed Central

    Zhou, Wei; Wen, Junhao; Qu, Qiang; Zeng, Jun; Cheng, Tian

    2018-01-01

    Recommender systems are vulnerable to shilling attacks. Forged user-generated content data, such as user ratings and reviews, are used by attackers to manipulate recommendation rankings. Shilling attack detection in recommender systems is of great significance to maintain the fairness and sustainability of recommender systems. The current studies have problems in terms of the poor universality of algorithms, difficulty in selection of user profile attributes, and lack of an optimization mechanism. In this paper, a shilling behaviour detection structure based on abnormal group user findings and rating time series analysis is proposed. This paper adds to the current understanding in the field by studying the credibility evaluation model in-depth based on the rating prediction model to derive proximity-based predictions. A method for detecting suspicious ratings based on suspicious time windows and target item analysis is proposed. Suspicious rating time segments are determined by constructing a time series, and data streams of the rating items are examined and suspicious rating segments are checked. To analyse features of shilling attacks by a group user’s credibility, an abnormal group user discovery method based on time series and time window is proposed. Standard testing datasets are used to verify the effect of the proposed method. PMID:29742134
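
    A minimal sketch of the suspicious-time-window idea described in these two records: bucket an item's rating stream into fixed windows and flag windows that show both a burst in rating volume and a shifted mean rating. The thresholds and the toy stream are illustrative, not the paper's method.

```python
from collections import defaultdict

def suspicious_windows(ratings, window=86_400, burst=3.0, shift=1.0):
    """Flag time windows in one item's rating stream that look like a push
    attack: many more ratings than usual whose mean deviates from the
    item's overall mean. ratings: list of (timestamp_seconds, value)."""
    if not ratings:
        return []
    overall = sum(r for _, r in ratings) / len(ratings)
    buckets = defaultdict(list)
    for ts, r in ratings:
        buckets[ts // window].append(r)
    typical = len(ratings) / len(buckets)      # mean ratings per window
    flagged = []
    for w, rs in sorted(buckets.items()):
        mean = sum(rs) / len(rs)
        if len(rs) > burst * typical and abs(mean - overall) > shift:
            flagged.append((w * window, len(rs), round(mean, 2)))
    return flagged

# toy stream: one steady rating per day, then a one-day burst of 5-star votes
stream = [(day * 86_400, 3) for day in range(30)]
stream += [(30 * 86_400 + i, 5) for i in range(20)]
print(suspicious_windows(stream))   # flags the burst on day 30
```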

  8. Open discovery: An integrated live Linux platform of Bioinformatics tools

    PubMed Central

    Vetrivel, Umashankar; Pilla, Kalabharath

    2008-01-01

    Historically, live Linux distributions for Bioinformatics have paved the way for portability of the Bioinformatics workbench in a platform-independent manner. However, most of the existing live Linux distributions limit their usage to sequence analysis and basic molecular visualization programs and are devoid of data persistence. Hence, Open Discovery, a live Linux distribution, has been developed with the capability to perform complex tasks like molecular modeling, docking and molecular dynamics in a swift manner. Furthermore, it is also equipped with a complete sequence analysis environment and is capable of running Windows executable programs in a Linux environment. Open Discovery portrays the advanced customizable configuration of Fedora, with data persistency accessible via USB drive or DVD. Availability: The Open Discovery is distributed free under the Academic Free License (AFL) and can be downloaded from http://www.OpenDiscovery.org.in PMID:19238235

  9. Academic drug discovery: current status and prospects.

    PubMed

    Everett, Jeremy R

    2015-01-01

    The contraction in pharmaceutical drug discovery operations in the past decade has been counter-balanced by a significant rise in the number of academic drug discovery groups. In addition, pharmaceutical companies that used to operate in completely independent, vertically integrated operations for drug discovery, are now collaborating more with each other, and with academic groups. We are in a new era of drug discovery. This review provides an overview of the current status of academic drug discovery groups, their achievements and the challenges they face, together with perspectives on ways to achieve improved outcomes. Academic groups have made important contributions to drug discovery, from its earliest days and continue to do so today. However, modern drug discovery and development is exceedingly complex, and has high failure rates, principally because human biology is complex and poorly understood. Academic drug discovery groups need to play to their strengths and not just copy what has gone before. However, there are lessons to be learnt from the experiences of the industrial drug discoverers and four areas are highlighted for attention: i) increased validation of targets; ii) elimination of false hits from high throughput screening (HTS); iii) increasing the quality of molecular probes; and iv) investing in a high-quality informatics infrastructure.

  10. Webinar: Airborne Data Discovery and Analysis with Toolsets for Airborne Data (TAD)

    Atmospheric Science Data Center

    2016-10-18

    Webinar: Airborne Data Discovery and Analysis with Toolsets for Airborne Data (TAD), Wednesday, October 26, 2016.

  11. The optimal power puzzle: scrutiny of the monotone likelihood ratio assumption in multiple testing.

    PubMed

    Cao, Hongyuan; Sun, Wenguang; Kosorok, Michael R

    2013-01-01

    In single hypothesis testing, power is a non-decreasing function of type I error rate; hence it is desirable to test at the nominal level exactly to achieve optimal power. The puzzle lies in the fact that for multiple testing, under the false discovery rate paradigm, such a monotonic relationship may not hold. In particular, exact false discovery rate control may lead to a less powerful testing procedure if a test statistic fails to fulfil the monotone likelihood ratio condition. In this article, we identify different scenarios wherein the condition fails and give caveats for conducting multiple testing in practical settings.

  12. Quantitative Analysis of Tissue Samples by Combining iTRAQ Isobaric Labeling with Selected/Multiple Reaction Monitoring (SRM/MRM).

    PubMed

    Narumi, Ryohei; Tomonaga, Takeshi

    2016-01-01

    Mass spectrometry-based phosphoproteomics is an indispensable technique used in the discovery and quantification of phosphorylation events on proteins in biological samples. The application of this technique to tissue samples is especially useful for the discovery of biomarkers as well as biological studies. We herein describe the application of a large-scale phosphoproteome analysis and SRM/MRM-based quantitation to develop a strategy for the systematic discovery and validation of biomarkers using tissue samples.

  13. Building Cognition: The Construction of Computational Representations for Scientific Discovery.

    PubMed

    Chandrasekharan, Sanjay; Nersessian, Nancy J

    2015-11-01

    Novel computational representations, such as simulation models of complex systems and video games for scientific discovery (Foldit, EteRNA etc.), are dramatically changing the way discoveries emerge in science and engineering. The cognitive roles played by such computational representations in discovery are not well understood. We present a theoretical analysis of the cognitive roles such representations play, based on an ethnographic study of the building of computational models in a systems biology laboratory. Specifically, we focus on a case of model-building by an engineer that led to a remarkable discovery in basic bioscience. Accounting for such discoveries requires a distributed cognition (DC) analysis, as DC focuses on the roles played by external representations in cognitive processes. However, DC analyses by and large have not examined scientific discovery, and they mostly focus on memory offloading, particularly how the use of existing external representations changes the nature of cognitive tasks. In contrast, we study discovery processes and argue that discoveries emerge from the processes of building the computational representation. The building process integrates manipulations in imagination and in the representation, creating a coupled cognitive system of model and modeler, where the model is incorporated into the modeler's imagination. This account extends DC significantly, and we present some of the theoretical and application implications of this extended account. Copyright © 2014 Cognitive Science Society, Inc.

  14. Climatic shocks associate with innovation in science and technology

    PubMed Central

    van Dijk, Mathijs A.

    2018-01-01

    Human history is shaped by landmark discoveries in science and technology. However, across both time and space the rate of innovation is erratic: Periods of relative inertia alternate with bursts of creative science and rapid cascades of technological innovations. While the origins of the rise and fall in rates of discovery and innovation remain poorly understood, they may reflect adaptive responses to exogenously emerging threats and pressures. Here we examined this possibility by fitting annual rates of scientific discovery and technological innovation to climatic variability and its associated economic pressures and resource scarcity. In time-series data from Europe (1500–1900CE), we indeed found that rates of innovation are higher during prolonged periods of cold (versus warm) surface temperature and during the presence (versus absence) of volcanic dust veils. This negative temperature–innovation link was confirmed in annual time-series for France, Germany, and the United Kingdom (1901–1965CE). Combined, across almost 500 years and over 5,000 documented innovations and discoveries, a 0.5°C increase in temperature associates with a sizable 0.30–0.60 standard deviation decrease in innovation. Results were robust to controlling for fluctuations in population size. Furthermore, and consistent with economic theory and micro-level data on group innovation, path analyses revealed that the relation between harsher climatic conditions between 1500–1900CE and more innovation is mediated by climate-induced economic pressures and resource scarcity. PMID:29364910

  15. Integration of lyoplate based flow cytometry and computational analysis for standardized immunological biomarker discovery.

    PubMed

    Villanova, Federica; Di Meglio, Paola; Inokuma, Margaret; Aghaeepour, Nima; Perucha, Esperanza; Mollon, Jennifer; Nomura, Laurel; Hernandez-Fuentes, Maria; Cope, Andrew; Prevost, A Toby; Heck, Susanne; Maino, Vernon; Lord, Graham; Brinkman, Ryan R; Nestle, Frank O

    2013-01-01

    Discovery of novel immune biomarkers for monitoring of disease prognosis and response to therapy in immune-mediated inflammatory diseases is an important unmet clinical need. Here, we establish a novel framework for immunological biomarker discovery, comparing a conventional (liquid) flow cytometry platform (CFP) and a unique lyoplate-based flow cytometry platform (LFP) in combination with advanced computational data analysis. We demonstrate that LFP had higher sensitivity compared to CFP, with increased detection of cytokines (IFN-γ and IL-10) and activation markers (Foxp3 and CD25). Fluorescent intensity of cells stained with lyophilized antibodies was increased compared to cells stained with liquid antibodies. LFP, using a plate loader, allowed medium-throughput processing of samples with comparable intra- and inter-assay variability between platforms. Automated computational analysis identified novel immunophenotypes that were not detected with manual analysis. Our results establish a new flow cytometry platform for standardized and rapid immunological biomarker discovery with wide application to immune-mediated diseases.

  16. Integration of Lyoplate Based Flow Cytometry and Computational Analysis for Standardized Immunological Biomarker Discovery

    PubMed Central

    Villanova, Federica; Di Meglio, Paola; Inokuma, Margaret; Aghaeepour, Nima; Perucha, Esperanza; Mollon, Jennifer; Nomura, Laurel; Hernandez-Fuentes, Maria; Cope, Andrew; Prevost, A. Toby; Heck, Susanne; Maino, Vernon; Lord, Graham; Brinkman, Ryan R.; Nestle, Frank O.

    2013-01-01

    Discovery of novel immune biomarkers for monitoring of disease prognosis and response to therapy in immune-mediated inflammatory diseases is an important unmet clinical need. Here, we establish a novel framework for immunological biomarker discovery, comparing a conventional (liquid) flow cytometry platform (CFP) and a unique lyoplate-based flow cytometry platform (LFP) in combination with advanced computational data analysis. We demonstrate that LFP had higher sensitivity compared to CFP, with increased detection of cytokines (IFN-γ and IL-10) and activation markers (Foxp3 and CD25). Fluorescent intensity of cells stained with lyophilized antibodies was increased compared to cells stained with liquid antibodies. LFP, using a plate loader, allowed medium-throughput processing of samples with comparable intra- and inter-assay variability between platforms. Automated computational analysis identified novel immunophenotypes that were not detected with manual analysis. Our results establish a new flow cytometry platform for standardized and rapid immunological biomarker discovery with wide application to immune-mediated diseases. PMID:23843942

  17. Experimental Null Method to Guide the Development of Technical Procedures and to Control False-Positive Discovery in Quantitative Proteomics.

    PubMed

    Shen, Xiaomeng; Hu, Qiang; Li, Jun; Wang, Jianmin; Qu, Jun

    2015-10-02

    Comprehensive and accurate evaluation of data quality and false-positive biomarker discovery is critical to direct the method development/optimization for quantitative proteomics, which nonetheless remains challenging largely due to the high complexity and unique features of proteomic data. Here we describe an experimental null (EN) method to address this need. Because the method experimentally measures the null distribution (either technical or biological replicates) using the same proteomic samples, the same procedures and the same batch as the case-vs-control experiment, it correctly reflects the collective effects of technical variability (e.g., variation/bias in sample preparation, LC-MS analysis, and data processing) and project-specific features (e.g., characteristics of the proteome and biological variation) on the performance of quantitative analysis. To show a proof of concept, we employed the EN method to assess the quantitative accuracy and precision and the ability to quantify subtle ratio changes between groups using different experimental and data-processing approaches and in various cellular and tissue proteomes. It was found that choices of quantitative features, sample size, experimental design, data-processing strategies, and quality of chromatographic separation can profoundly affect quantitative precision and accuracy of label-free quantification. The EN method was also demonstrated as a practical tool to determine the optimal experimental parameters and rational ratio cutoff for reliable protein quantification in specific proteomic experiments, for example, to identify the necessary number of technical/biological replicates per group that affords sufficient power for discovery. Furthermore, we assessed the ability of the EN method to estimate levels of false-positives in the discovery of altered proteins, using two concocted sample sets mimicking proteomic profiling using technical and biological replicates, respectively, where the true-positives/negatives are known and span a wide concentration range. It was observed that the EN method correctly reflects the null distribution in a proteomic system and accurately measures the false altered-protein discovery rate (FADR). In summary, the EN method provides a straightforward, practical, and accurate alternative to statistics-based approaches for the development and evaluation of proteomic experiments and can be universally adapted to various types of quantitative techniques.
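
    The core idea, measuring the null distribution from replicate-vs-replicate comparisons run alongside the case-vs-control comparison and deriving a ratio cutoff from it, can be sketched in a few lines. This is a schematic illustration with simulated intensities, not the authors' pipeline:

        import numpy as np

        rng = np.random.default_rng(0)

        # Simulated stand-ins: rows = proteins, columns = replicate runs.
        null_a = rng.lognormal(0, 0.2, (500, 3))   # replicate group 1 (no true change)
        null_b = rng.lognormal(0, 0.2, (500, 3))   # replicate group 2 (no true change)
        case = rng.lognormal(0, 0.2, (500, 3))
        control = rng.lognormal(0, 0.2, (500, 3))

        def log2_ratio(x, y):
            """Per-protein log2 ratio of mean intensities between two groups."""
            return np.log2(x.mean(axis=1) / y.mean(axis=1))

        null_ratios = log2_ratio(null_a, null_b)    # experimentally measured null
        case_ratios = log2_ratio(case, control)     # case-vs-control comparison

        # Rational ratio cutoff: the tails of the measured null distribution.
        lo, hi = np.percentile(null_ratios, [2.5, 97.5])
        called = (case_ratios < lo) | (case_ratios > hi)
        print(f"cutoff [{lo:.2f}, {hi:.2f}]; proteins called changed: {called.sum()}")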

  17. Linnorm: improved statistical analysis for single cell RNA-seq expression data

    PubMed Central

    Yip, Shun H.; Wang, Panwen; Kocher, Jean-Pierre A.; Sham, Pak Chung

    2017-01-01

    Linnorm is a novel normalization and transformation method for the analysis of single cell RNA sequencing (scRNA-seq) data. Linnorm is developed to remove technical noise and simultaneously preserve biological variations in scRNA-seq data, such that existing statistical methods can be improved. Using real scRNA-seq data, we compared Linnorm with existing normalization methods, including NODES, SAMstrt, SCnorm, scran, DESeq and TMM. Linnorm shows advantages in speed, technical noise removal and preservation of cell heterogeneity, which can improve existing methods in the discovery of novel subtypes, pseudo-temporal ordering of cells, clustering analysis, etc. Linnorm also performs better than existing DEG analysis methods, including BASiCS, NODES, SAMstrt, Seurat and DESeq2, in false positive rate control and accuracy. PMID:28981748

  18. Error analysis of stochastic gradient descent ranking.

    PubMed

    Chen, Hong; Tang, Yi; Li, Luoqing; Yuan, Yuan; Li, Xuelong; Tang, Yuanyan

    2013-06-01

    Ranking is always an important task in machine learning and information retrieval, e.g., collaborative filtering, recommender systems, drug discovery, etc. A kernel-based stochastic gradient descent algorithm with the least squares loss is proposed for ranking in this paper. The implementation of this algorithm is simple, and an expression of the solution is derived via a sampling operator and an integral operator. An explicit convergence rate for learning a ranking function is given in terms of the suitable choices of the step size and the regularization parameter. The analysis technique used here is capacity independent and is novel in error analysis of ranking learning. Experimental results on real-world data have shown the effectiveness of the proposed algorithm in ranking tasks, which verifies the theoretical analysis in ranking error.
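
    The family of algorithms the abstract describes is straightforward to sketch. The following is a generic pairwise least-squares kernel SGD for ranking, not the authors' exact formulation; the kernel, step size, and regularization parameter are illustrative choices:

        import numpy as np

        def gaussian_kernel(x, z, sigma=1.0):
            return np.exp(-np.sum((x - z) ** 2) / (2 * sigma ** 2))

        def kernel_sgd_rank(X, y, steps=500, eta=0.05, lam=0.01, sigma=1.0):
            """Learn a ranking function f by SGD in an RKHS.

            Each step samples a pair (i, j) and descends the least squares
            ranking loss (f(x_i) - f(x_j) - (y_i - y_j))**2 with a ridge
            penalty; f is kept as a kernel expansion over visited points.
            """
            rng = np.random.default_rng(0)
            centers, coefs = [], []

            def f(x):
                return sum(c * gaussian_kernel(x, z, sigma)
                           for c, z in zip(coefs, centers))

            for _ in range(steps):
                i, j = rng.integers(len(X), size=2)
                resid = (f(X[i]) - f(X[j])) - (y[i] - y[j])
                coefs = [(1 - eta * lam) * c for c in coefs]  # shrink (regularize)
                centers += [X[i], X[j]]
                coefs += [-eta * resid, eta * resid]          # gradient step
            return f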

  19. Discovery and New Frontiers Project Budget Analysis Tool

    NASA Technical Reports Server (NTRS)

    Newhouse, Marilyn E.

    2011-01-01

    The Discovery and New Frontiers (D&NF) programs are multi-project, uncoupled programs that currently comprise 13 missions in phases A through F. The ability to fly frequent science missions to explore the solar system is the primary measure of program success. The program office uses a Budget Analysis Tool to perform "what-if" analyses and compare mission scenarios to the current program budget, and rapidly forecast the programs' ability to meet their launch rate requirements. The tool allows the user to specify the total mission cost (fixed year), mission development and operations profile by phase (percent total mission cost and duration), launch vehicle, and launch date for multiple missions. The tool automatically applies inflation and rolls up the total program costs (in real year dollars) for comparison against available program budget. Thus, the tool allows the user to rapidly and easily explore a variety of launch rates and analyze the effect of changes in future mission or launch vehicle costs, the differing development profiles or operational durations of a future mission, or a replan of a current mission on the overall program budget. Because the tool also reports average monthly costs for the specified mission profile, the development or operations cost profile can easily be validated against program experience for similar missions. While specifically designed for predicting overall program budgets for programs that develop and operate multiple missions concurrently, the basic concept of the tool (rolling up multiple, independently budgeted lines) could easily be adapted to other applications.
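
    The rollup logic the abstract describes (spread each mission's fixed-year cost over its phases, inflate to real-year dollars, and sum across missions by year) is simple to sketch. All mission data, phase splits, and the inflation rate below are invented placeholders, not program values:

        INFLATION = 0.027   # assumed flat annual inflation rate (illustrative)
        BASE_YEAR = 2011    # fixed-year dollars are quoted in this year

        missions = [
            # (name, total cost in fixed-year $M, start year,
            #  phases as {name: (fraction of total cost, duration in years)})
            ("Mission A", 450.0, 2012, {"development": (0.7, 4), "operations": (0.3, 3)}),
            ("Mission B", 600.0, 2014, {"development": (0.8, 5), "operations": (0.2, 2)}),
        ]

        profile = {}  # year -> summed real-year $M across all missions
        for name, total, start, phases in missions:
            year = start
            for phase, (fraction, years) in phases.items():
                annual = total * fraction / years              # fixed-year $ per year
                for _ in range(years):
                    real = annual * (1 + INFLATION) ** (year - BASE_YEAR)
                    profile[year] = profile.get(year, 0.0) + real
                    year += 1

        for year in sorted(profile):   # compare each year against available budget
            print(year, round(profile[year], 1))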

  1. Structure-based discovery and binding site analysis of histamine receptor ligands.

    PubMed

    Kiss, Róbert; Keserű, György M

    2016-12-01

    The application of structure-based drug discovery in histamine receptor projects was previously hampered by the lack of experimental structures. The publication of the first X-ray structure of the histamine H1 receptor has been followed by several successful virtual screens and binding site analysis studies of H1-antihistamines. This structure together with several other recently solved aminergic G-protein coupled receptors (GPCRs) enabled the development of more realistic homology models for H2, H3 and H4 receptors. Areas covered: In this paper, the authors review the development of histamine receptor models and their application in drug discovery. Expert opinion: In the authors' opinion, the application of atomistic histamine receptor models has played a significant role in understanding key ligand-receptor interactions as well as in the discovery of novel chemical starting points. The recently solved H1 receptor structure is a major milestone in structure-based drug discovery; however, our analysis also demonstrates that for building H3 and H4 receptor homology models, other GPCRs may be more suitable as templates. For these receptors, the authors envisage that the development of higher quality homology models will significantly contribute to the discovery and optimization of novel H3 and H4 ligands.

  2. Distributed Drug Discovery: Advancing Chemical Education through Contextualized Combinatorial Solid-Phase Organic Laboratories

    ERIC Educational Resources Information Center

    Scott, William L.; Denton, Ryan E.; Marrs, Kathleen A.; Durrant, Jacob D.; Samaritoni, J. Geno; Abraham, Milata M.; Brown, Stephen P.; Carnahan, Jon M.; Fischer, Lindsey G.; Glos, Courtney E.; Sempsrott, Peter J.; O'Donnell, Martin J.

    2015-01-01

    The Distributed Drug Discovery (D3) program trains students in three drug discovery disciplines (synthesis, computational analysis, and biological screening) while addressing the important challenge of discovering drug leads for neglected diseases. This article focuses on implementation of the synthesis component in the second-semester…

  3. Collected Notes on the Workshop for Pattern Discovery in Large Databases

    NASA Technical Reports Server (NTRS)

    Buntine, Wray (Editor); Delalto, Martha (Editor)

    1991-01-01

    These collected notes are a record of material presented at the Workshop. Core data analysis tasks that have traditionally required statistical or pattern recognition techniques are addressed. These tasks include classification, discrimination, clustering, supervised and unsupervised learning, and discovery and diagnosis, i.e., general pattern discovery.

  4. New Perspectives on How to Discover Drugs from Herbal Medicines: CAM's Outstanding Contribution to Modern Therapeutics

    PubMed Central

    Pan, Si-Yuan; Zhou, Shu-Feng; Gao, Si-Hua; Yu, Zhi-Ling; Zhang, Shuo-Feng; Tang, Min-Ke; Sun, Jian-Ning; Han, Yi-Fan; Fong, Wang-Fun; Ko, Kam-Ming

    2013-01-01

    With tens of thousands of plant species on earth, we are endowed with an enormous wealth of medicinal remedies from Mother Nature. Natural products and their derivatives represent more than 50% of all the drugs in modern therapeutics. Because of the low success rate and the huge capital investment needed, the research and development of conventional drugs is very costly and difficult. Over the past few decades, researchers have focused on drug discovery from herbal medicines or botanical sources, an important group of complementary and alternative medicine (CAM) therapy. With a long history of herbal usage for the clinical management of a variety of diseases in indigenous cultures, the success rate of developing a new drug from herbal medicinal preparations should, in theory, be higher than that from chemical synthesis. While the endeavor for drug discovery from herbal medicines is "experience driven," the search for a therapeutically useful synthetic drug, like "looking for a needle in a haystack," is a daunting task. In this paper, we first illustrated various approaches of drug discovery from herbal medicines. Typical examples of successful drug discovery from botanical sources were given. In addition, problems in drug discovery from herbal medicines were described and possible solutions were proposed. Finally, the prospects for drug discovery from herbal medicines in the postgenomic era were discussed, along with future directions in this area of drug development. PMID:23634172

  5. Competitive intelligence and patent analysis in drug discovery.

    PubMed

    Grandjean, Nicolas; Charpiot, Brigitte; Pena, Carlos Andres; Peitsch, Manuel C

    2005-01-01

    Patents are a major source of information in drug discovery and, when properly processed and analyzed, can yield a wealth of information on competitors' activities, R&D trends, emerging fields, and collaborations, among others. This review discusses the current state of the art in textual data analysis and exploration methods as applied to patent analysis. © 2005 Elsevier Ltd. All rights reserved.

  6. Genome-wide gene–environment interaction analysis for asbestos exposure in lung cancer susceptibility

    PubMed Central

    Wei, Qingyi

    2012-01-01

    Asbestos exposure is a known risk factor for lung cancer. Although recent genome-wide association studies (GWASs) have identified some novel loci for lung cancer risk, few addressed genome-wide gene–environment interactions. To determine gene–asbestos interactions in lung cancer risk, we conducted genome-wide gene–environment interaction analyses at levels of single nucleotide polymorphisms (SNPs), genes and pathways, using our published Texas lung cancer GWAS dataset. This dataset included 317,498 SNPs from 1,154 lung cancer cases and 1,137 cancer-free controls. The initial SNP-level P-values for interactions between genetic variants and self-reported asbestos exposure were estimated by unconditional logistic regression models with adjustment for age, sex, smoking status and pack-years. The P-value for the most significant SNP rs13383928 was 2.17×10^-6, which did not reach genome-wide statistical significance. Using a versatile gene-based test approach, we found that the top significant gene was C7orf54, located on 7q32.1 (P = 8.90×10^-5). Interestingly, most of the other significant genes were located on 11q13. When we used an improved gene-set-enrichment analysis approach, we found that the Fas signaling pathway and the antigen processing and presentation pathway were most significant (nominal P < 0.001; false discovery rate < 0.05) among 250 pathways containing 17,572 genes. We believe that our analysis is a pilot study that first describes the gene–asbestos interaction in lung cancer risk at levels of SNPs, genes and pathways. Our findings suggest that immune function regulation-related pathways may be mechanistically involved in asbestos-associated lung cancer risk. Abbreviations: CI, confidence interval; E, environment; FDR, false discovery rate; G, gene; GSEA, gene-set-enrichment analysis; GWAS, genome-wide association studies; i-GSEA, improved gene-set-enrichment analysis approach; OR, odds ratio; SNP, single nucleotide polymorphism. PMID:22637743
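
    The SNP-level test described above, an unconditional logistic regression with a gene-by-exposure interaction term followed by multiplicity control, can be sketched as follows. The data are simulated, the covariates are reduced to age alone, and the sizes are toy-scale; this illustrates the form of the analysis, not the study's code:

        import numpy as np
        import statsmodels.api as sm
        from statsmodels.stats.multitest import multipletests

        rng = np.random.default_rng(1)
        n, n_snps = 400, 50                       # toy sizes for illustration
        snps = rng.integers(0, 3, (n, n_snps))    # additive genotypes 0/1/2
        exposure = rng.integers(0, 2, n)          # self-reported exposure (0/1)
        age = rng.normal(60, 8, n)
        status = rng.integers(0, 2, n)            # case/control outcome

        pvals = []
        for k in range(n_snps):
            # Columns: intercept, SNP, exposure, SNP x exposure, age
            X = sm.add_constant(np.column_stack(
                [snps[:, k], exposure, snps[:, k] * exposure, age]))
            fit = sm.Logit(status, X).fit(disp=0)
            pvals.append(fit.pvalues[3])          # the G x E interaction term

        # Benjamini-Hochberg FDR over all interaction tests
        reject, _, _, _ = multipletests(pvals, alpha=0.05, method="fdr_bh")
        print(int(reject.sum()), "interactions significant at FDR 0.05")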

  7. A Tutorial on Multiple Testing: False Discovery Control

    NASA Astrophysics Data System (ADS)

    Chatelain, F.

    2016-09-01

    This paper presents an overview of criteria and methods in multiple testing, with an emphasis on false discovery rate control. The popular Benjamini and Hochberg procedure is described. The rationale for this approach is explained through a simple Bayesian interpretation. Some state-of-the-art variations and extensions are also presented.
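
    As a concrete reference, the Benjamini and Hochberg step-up rule mentioned above takes only a few lines; this is a generic textbook implementation, not code from the tutorial:

        import numpy as np

        def benjamini_hochberg(pvals, alpha=0.05):
            """Boolean mask of rejected hypotheses under BH FDR control.

            Sort the m p-values; find the largest k with p_(k) <= alpha*k/m;
            reject every hypothesis whose p-value is at or below p_(k).
            """
            p = np.asarray(pvals)
            m = len(p)
            order = np.argsort(p)
            below = p[order] <= alpha * np.arange(1, m + 1) / m
            reject = np.zeros(m, dtype=bool)
            if below.any():
                k = np.nonzero(below)[0].max()     # last index under the BH line
                reject[order[:k + 1]] = True
            return reject

        # Example: 900 null tests plus 100 tests with inflated signal
        rng = np.random.default_rng(0)
        p = np.concatenate([rng.uniform(size=900), rng.uniform(size=100) ** 6])
        print(benjamini_hochberg(p).sum(), "discoveries at FDR 0.05")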

  8. A petroleum discovery-rate forecast revisited-The problem of field growth

    USGS Publications Warehouse

    Drew, L.J.; Schuenemeyer, J.H.

    1992-01-01

    A forecast of the future rates of discovery of crude oil and natural gas for the 123,027-km² Miocene/Pliocene trend in the Gulf of Mexico was made in 1980. This forecast was evaluated in 1988 by comparing two sets of data: (1) the actual versus the forecasted number of fields discovered, and (2) the actual versus the forecasted volumes of crude oil and natural gas discovered with the drilling of 1,820 wildcat wells along the trend between January 1, 1977, and December 31, 1985. The forecast specified that this level of drilling would result in the discovery of 217 fields containing 1.78 billion barrels of oil equivalent; however, 238 fields containing 3.57 billion barrels of oil equivalent were actually discovered. This underestimation is attributed to biases introduced by field growth and, to a lesser degree, the artificially low, pre-1970's price of natural gas that prevented many smaller gas fields from being brought into production at the time of their discovery; most of these fields contained less than 50 billion cubic feet of producible natural gas. © 1992 Oxford University Press.

  9. Network-based discovery through mechanistic systems biology. Implications for applications--SMEs and drug discovery: where the action is.

    PubMed

    Benson, Neil

    2015-08-01

    Phase II attrition remains the most important challenge for drug discovery. Tackling the problem requires improved understanding of the complexity of disease biology. Systems biology approaches to this problem can, in principle, deliver this. This article reviews the reports of the application of mechanistic systems models to drug discovery questions and discusses the added value. Although we are on the journey to the virtual human, the length, path and rate of learning from this remain an open question. Success will be dependent on the will to invest and make the most of the insight generated along the way. Copyright © 2015 Elsevier Ltd. All rights reserved.

  10. Drug Discovery for Neglected Diseases: Molecular Target-Based and Phenotypic Approaches

    PubMed Central

    2013-01-01

    Drug discovery for neglected tropical diseases is carried out using both target-based and phenotypic approaches. In this paper, target-based approaches are discussed, with a particular focus on human African trypanosomiasis. Target-based drug discovery can be successful, but careful selection of targets is required. There are still very few fully validated drug targets in neglected diseases, and there is a high attrition rate in target-based drug discovery for these diseases. Phenotypic screening is a powerful method in both neglected and non-neglected diseases and has been very successfully used. Identification of molecular targets from phenotypic approaches can be a way to identify potential new drug targets. PMID:24015767

  11. Perspectives on the agrochemical industry and agrochemical discovery.

    PubMed

    Sparks, Thomas C; Lorsbach, Beth A

    2017-04-01

    Agrochemicals have been critical to the production of food and fiber, as well as the control of vectors of disease. The need for the discovery and development of new agrochemicals continues unabated due to the loss of existing products through the development of resistance, the desire for products with more favorable environmental and toxicological profiles, shifting pest spectra, and changing agricultural needs and practices. As presented in the associated analysis of the agrochemical industry, the rising costs and complexities of agrochemical discovery have, in part, led to increasing consolidation, especially in the USA and Europe. However, as demonstrated by the present analysis, the discovery of new agrochemicals continues in spite of the challenges. © 2016 Society of Chemical Industry.

  12. Collaborative Core Research Program for Chemical-Biological Warfare Defense

    DTIC Science & Technology

    2015-01-04

    Discovery through High Throughput Screening (HTS) and Fragment-Based Drug Design (FBDD). Current pharmaceutical approaches involving drug discovery… structural analysis and docking, generally known as fragment-based drug design (FBDD). The main advantage of using these approaches is that…

  13. On the Detectability of Interstellar Objects Like 1I/'Oumuamua

    NASA Astrophysics Data System (ADS)

    Ragozzine, Darin

    2018-04-01

    Almost since Oort's 1950 hypothesis of a tenuously bound cloud of comets, planetary formation theorists have realized that the process of planet formation must have ejected very large numbers of planetesimals into interstellar space. Unfortunately, these objects are distributed over galactic volumes, while they are only likely to be detectable if they pass within a few AU of Earth, resulting in an incredibly sparse detectable population. Furthermore, hypotheses for the formation and distribution of these bodies allow for uncertainties of orders of magnitude in the expected detection rate: our analysis suggested LSST would discover 0.01-100 objects during its lifetime (Cook et al. 2016). The discovery of 1I/'Oumuamua by a survey less powerful than LSST indicates either a low-probability event or that the properties of this population are on the more favorable end of the spectrum. We revisit the detailed detection analysis of Cook et al. 2016 in light of the detection of 1I/'Oumuamua. We use these results to better understand 1I/'Oumuamua and to update our assessment of future detections of interstellar objects. We highlight some key questions that can be answered only by additional discoveries.

  14. [Biomarkers: "Found in translation"].

    PubMed

    Lockhart, Brian P; Walther, Bernard

    2009-04-01

    Despite continued increase in global Pharma R & D expenditure, the number of innovative drugs obtaining market approval has declined since 1994. The pharmaceutical industry is now entering a crucial juncture where increasing rates of attrition in clinical drug development as well as increasing development timelines are impacted by external factors such as intense regulatory pricing and safety pressures, increasing sales erosion due to generics, as well as exponential increases in the costs of bringing a drug to market. Despite these difficulties, numerous opportunities exist such as multiple unmet medical needs, the increasing incidence of certain diseases such as Alzheimer's disease, cancer, diabetes and obesity due to demographic changes, as well as the emergence of evolving markets such as China, India, and Eastern Europe. Consequently, Pharma is now responding to this challenge by improving both the productivity and the innovation in its drug discovery and development pipelines. In this regard, the advent of new technologies and expertise such as genomics, proteomics, structural biology, and molecular informatics in an integrated systems biology approach also provides a powerful opportunity for Pharma to address some of these difficulties. The key features behind this new strategy imply a discovery process based on an improved understanding of the molecular mechanism of diseases and drugs, translational research that places the patient at the center of the research process, and the application of biomarkers throughout the discovery and development phases. Moreover, new paradigms are required to improve target validation and develop more predictive cellular and animal models of human pathologies, a greater capacity in informatics-based analysis, and, consequently, a greater access to the vast sources of accumulating biological data and its integrated analysis. In the present review, we will address some of these issues and in particular emphasize how the application of biomarkers could potentially lead to improved productivity, quality, and innovation in drug discovery and ultimately better and safer medicines with improved therapeutic efficacy in specific pathologies for targeted patients.

  15. A comparative review of estimates of the proportion unchanged genes and the false discovery rate

    PubMed Central

    Broberg, Per

    2005-01-01

    Background In the analysis of microarray data one generally produces a vector of p-values that for each gene give the likelihood of obtaining equally strong evidence of change by pure chance. The distribution of these p-values is a mixture of two components corresponding to the changed genes and the unchanged ones. The focus of this article is how to estimate the proportion unchanged and the false discovery rate (FDR) and how to make inferences based on these concepts. Six published methods for estimating the proportion of unchanged genes are reviewed, two alternatives are presented, and all are tested on both simulated and real data. All estimates but one make do without any parametric assumptions concerning the distributions of the p-values. Furthermore, the estimation and use of the FDR and the closely related q-value is illustrated with examples. Five published estimates of the FDR and one new estimate are presented and tested. Implementations in R code are available. Results A simulation model based on the distribution of real microarray data plus two real data sets were used to assess the methods. The proposed alternative methods for estimating the proportion unchanged fared very well, and gave evidence of low bias and very low variance. Different methods perform well depending upon whether there are few or many regulated genes. Furthermore, the methods for estimating FDR showed a varying performance, and were sometimes misleading. The new method had a very low error. Conclusion The concept of the q-value or false discovery rate is useful in practical research, despite some theoretical and practical shortcomings. However, it seems possible to challenge the performance of the published methods, and there is likely scope for further developing the estimates of the FDR. The new methods provide the scientist with more options to choose a suitable method for any particular experiment. The article advocates the use of the conjoint information regarding false positive and negative rates as well as the proportion unchanged when identifying changed genes. PMID:16086831
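
    Two of the quantities reviewed above are easy to illustrate: a Storey-type estimate of the proportion unchanged, and q-values derived from it. The sketch below is one standard construction, not any of the specific estimators compared in the paper:

        import numpy as np

        def proportion_unchanged(p, lam=0.5):
            """Storey-type estimate: p-values above lam come mostly from the
            uniform null component, so their density estimates the unchanged
            fraction."""
            p = np.asarray(p)
            return min(1.0, (p > lam).mean() / (1.0 - lam))

        def qvalues(p):
            """q-value of each test: the smallest FDR at which it is called."""
            p = np.asarray(p)
            m = len(p)
            pi0 = proportion_unchanged(p)
            order = np.argsort(p)
            q = pi0 * m * p[order] / np.arange(1, m + 1)
            q = np.minimum.accumulate(q[::-1])[::-1]   # enforce monotonicity
            out = np.empty(m)
            out[order] = np.clip(q, 0.0, 1.0)
            return out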

  16. Recent development in software and automation tools for high-throughput discovery bioanalysis.

    PubMed

    Shou, Wilson Z; Zhang, Jun

    2012-05-01

    Bioanalysis with LC-MS/MS has been established as the method of choice for quantitative determination of drug candidates in biological matrices in drug discovery and development. The LC-MS/MS bioanalytical support for drug discovery, especially for early discovery, often requires high-throughput (HT) analysis of large numbers of samples (hundreds to thousands per day) generated from many structurally diverse compounds (tens to hundreds per day) with a very quick turnaround time, in order to provide important activity and liability data to move discovery projects forward. Another important consideration for discovery bioanalysis is its fit-for-purpose quality requirement depending on the particular experiments being conducted at this stage, and it is usually not as stringent as those required in bioanalysis supporting drug development. These aforementioned attributes of HT discovery bioanalysis made it an ideal candidate for using software and automation tools to eliminate manual steps, remove bottlenecks, improve efficiency and reduce turnaround time while maintaining adequate quality. In this article we will review various recent developments that facilitate automation of individual bioanalytical procedures, such as sample preparation, MS/MS method development, sample analysis and data review, as well as fully integrated software tools that manage the entire bioanalytical workflow in HT discovery bioanalysis. In addition, software tools supporting the emerging high-resolution accurate MS bioanalytical approach are also discussed.

  17. The dark energy survey Y1 supernova search: Survey strategy compared to forecasts and the photometric type Ia SN volumetric rate

    NASA Astrophysics Data System (ADS)

    Fischer, John Arthur

    For 70 years, the physics community operated under the assumption that the expansion of the Universe must be slowing due to gravitational attraction. Then, in 1998, two teams of scientists used Type Ia supernovae to discover that cosmic expansion was actually accelerating due to a mysterious "dark energy." As a result, Type Ia supernovae have become the most cosmologically important transient events in the last 20 years, with a large amount of effort going into their discovery as well as understanding their progenitor systems. One such probe for understanding Type Ia supernovae is to use rate measurements to determine the time delay between star formation and supernova explosion. For the last 30 years, the discovery of individual Type Ia supernova events has been accelerating. However, those discoveries were happening in time-domain surveys that probed only a portion of the redshift range where expansion was impacted by dark energy. The Dark Energy Survey (DES) is the first project in the "next generation" of time-domain surveys that will discover thousands of Type Ia supernovae out to a redshift of 1.2 (where dark energy becomes subdominant), and DES will have better systematic uncertainties over that redshift range than any survey to date. In order to gauge the discovery effectiveness of this survey, we will use the first season's 469 photometrically typed supernovae and compare them with simulations in order to update the full survey Type Ia projections from 3500 to 2250. We will then use 165 of the 469 supernovae out to a redshift of 0.6 to measure the supernova rate both as a function of comoving volume and of the star formation rate as it evolves with redshift. We find the most statistically significant prompt fraction of any survey to date (with a 3.9σ prompt fraction detection). We will also reinforce the already existing tension in the measurement of the delayed fraction between high (z > 1.2) and low-redshift rate measurements, where we find no significant evidence of a delayed fraction at all in our photometric sample.

  18. Open innovation for phenotypic drug discovery: The PD2 assay panel.

    PubMed

    Lee, Jonathan A; Chu, Shaoyou; Willard, Francis S; Cox, Karen L; Sells Galvin, Rachelle J; Peery, Robert B; Oliver, Sarah E; Oler, Jennifer; Meredith, Tamika D; Heidler, Steven A; Gough, Wendy H; Husain, Saba; Palkowitz, Alan D; Moxham, Christopher M

    2011-07-01

    Phenotypic lead generation strategies seek to identify compounds that modulate complex, physiologically relevant systems, an approach that is complementary to traditional, target-directed strategies. Unlike gene-specific assays, phenotypic assays interrogate multiple molecular targets and signaling pathways in a target "agnostic" fashion, which may reveal novel functions for well-studied proteins and discover new pathways of therapeutic value. Significantly, existing compound libraries may not have sufficient chemical diversity to fully leverage a phenotypic strategy. To address this issue, Eli Lilly and Company launched the Phenotypic Drug Discovery Initiative (PD(2)), a model of open innovation whereby external research groups can submit compounds for testing in a panel of Lilly phenotypic assays. This communication describes the statistical validation, operations, and initial screening results from the first PD(2) assay panel. Analysis of PD(2) submissions indicates that chemical diversity from open source collaborations complements internal sources. Screening results for the first 4691 compounds submitted to PD(2) have confirmed hit rates from 1.6% to 10%, with the majority of active compounds exhibiting acceptable potency and selectivity. Phenotypic lead generation strategies, in conjunction with novel chemical diversity obtained via open-source initiatives such as PD(2), may provide a means to identify compounds that modulate biology by novel mechanisms and expand the innovation potential of drug discovery.

  19. Better cancer biomarker discovery through better study design.

    PubMed

    Rundle, Andrew; Ahsan, Habibul; Vineis, Paolo

    2012-12-01

    High-throughput laboratory technologies coupled with sophisticated bioinformatics algorithms have tremendous potential for discovering novel biomarkers, or profiles of biomarkers, that could serve as predictors of disease risk, response to treatment or prognosis. We discuss methodological issues in wedding high-throughput approaches for biomarker discovery with the case-control study designs typically used in biomarker discovery studies, especially focusing on nested case-control designs. We review principles for nested case-control study design in relation to biomarker discovery studies and describe how the efficiency of biomarker discovery can be affected by study design choices. We develop a simulated prostate cancer cohort data set and a series of biomarker discovery case-control studies nested within the cohort to illustrate how study design choices can influence the biomarker discovery process. Common elements of nested case-control design, incidence density sampling and matching of controls to cases, are not typically factored correctly into biomarker discovery analyses, inducing bias in the discovery process. We illustrate how incidence density sampling and matching of controls to cases reduce the apparent specificity of truly valid biomarkers 'discovered' in a nested case-control study. We also propose and demonstrate a new case-control matching protocol, which we call 'antimatching', that improves the efficiency of biomarker discovery studies. For valid, but as yet undiscovered, biomarkers, disjunctions between correctly designed epidemiologic studies and the practice of biomarker discovery reduce the likelihood that true biomarkers will be discovered and increase the false-positive discovery rate. © 2012 The Authors. European Journal of Clinical Investigation © 2012 Stichting European Society for Clinical Investigation Journal Foundation.
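
    Incidence density sampling, one of the design elements discussed above, is mechanically simple: controls for each case are drawn from subjects still at risk at that case's event time. A minimal sketch, with illustrative field names:

        import random

        def incidence_density_controls(cohort, case, n_controls=1, seed=0):
            """Sample controls at risk at the case's event time.

            `cohort` is a list of dicts with illustrative keys: 'id',
            'event_time' (None for subjects who never become cases) and
            'followup_end'. A subject is at risk at time t if follow-up
            extends past t and they have not yet become a case.
            """
            t = case["event_time"]
            at_risk = [s for s in cohort
                       if s["id"] != case["id"]
                       and s["followup_end"] >= t
                       and (s["event_time"] is None or s["event_time"] > t)]
            rng = random.Random(seed)
            return rng.sample(at_risk, min(n_controls, len(at_risk)))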

  20. Computational modeling in melanoma for novel drug discovery.

    PubMed

    Pennisi, Marzio; Russo, Giulia; Di Salvatore, Valentina; Candido, Saverio; Libra, Massimo; Pappalardo, Francesco

    2016-06-01

    There is a growing body of evidence highlighting the applications of computational modeling in the field of biomedicine. It has recently been applied to the in silico analysis of cancer dynamics. In the era of precision medicine, this analysis may allow the discovery of new molecular targets useful for the design of novel therapies and for overcoming resistance to anticancer drugs. According to its molecular behavior, melanoma represents an interesting tumor model in which computational modeling can be applied. Melanoma is an aggressive tumor of the skin with a poor prognosis for patients with advanced disease as it is resistant to current therapeutic approaches. This review discusses the basics of computational modeling in melanoma drug discovery and development. Discussion includes the in silico discovery of novel molecular drug targets, the optimization of immunotherapies and personalized medicine trials. Mathematical and computational models are gradually being used to help understand biomedical data produced by high-throughput analysis. The use of advanced computer models allowing the simulation of complex biological processes provides hypotheses and supports experimental design. The research in fighting aggressive cancers, such as melanoma, is making great strides. Computational models represent the key component to complement these efforts. Due to the combinatorial complexity of new drug discovery, a systematic approach based only on experimentation is not possible. Computational and mathematical models are necessary for bringing cancer drug discovery into the era of omics, big data and personalized medicine.

  1. The Screening Compound Collection: A Key Asset for Drug Discovery.

    PubMed

    Boss, Christoph; Hazemann, Julien; Kimmerlin, Thierry; von Korff, Modest; Lüthi, Urs; Peter, Oliver; Sander, Thomas; Siegrist, Romain

    2017-10-25

    In this case study on an essential instrument of modern drug discovery, we summarize our successful efforts in the last four years toward enhancing the Actelion screening compound collection. A key organizational step was the establishment of the Compound Library Committee (CLC) in September 2013. This cross-functional team consisting of computational scientists, medicinal chemists and a biologist was endowed with a significant annual budget for regular new compound purchases. Based on an initial library analysis performed in 2013, the CLC developed a New Library Strategy. The established continuous library turn-over mode and the screening library size of 300,000 compounds were maintained, while the structural library quality was increased. This was achieved by shifting the selection criteria from 'druglike' to 'leadlike' structures, enriching for non-flat structures, aiming for compound novelty, and increasing the ratio of higher cost 'Premium Compounds'. Novel chemical space was gained by adding natural compounds, macrocycles, designed and focused libraries to the collection, and through mutual exchanges of proprietary compounds with agrochemical companies. A comparative analysis in 2016 provided evidence for the positive impact of these measures. Screening the improved library has provided several highly promising hits, including a macrocyclic compound, that are currently followed up in different Hit-to-Lead and Lead Optimization programs. It is important to state that the goal of the CLC was not to achieve higher HTS hit rates, but to increase the chances that identified hits serve as the basis of successful early drug discovery programs. The experience gathered so far validates the New Library Strategy.

  2. Emilio Segrè and Spontaneous Fission

    Science.gov Websites

    …fissioned instead. The discovery of fission led in turn to the discovery of the chain reaction… …material apart before it had a chance to undergo an efficient chain reaction… The possibility of a chain reaction… If a similar rate was found in plutonium, it might rule out the use of that element as…

  3. Integration of Guided Discovery in the Teaching of Real Analysis

    ERIC Educational Resources Information Center

    Dumitrascu, Dorin

    2009-01-01

    I discuss my experience with teaching an advanced undergraduate Real Analysis class using both lecturing and the small-group guided discovery method. The article is structured as follows. The first section is about the organizational and administrative components of the class. In the second section I give examples of successes and difficulties…

  4. Empirical Bayes method for reducing false discovery rates of correlation matrices with block diagonal structure.

    PubMed

    Pacini, Clare; Ajioka, James W; Micklem, Gos

    2017-04-12

    Correlation matrices are important in inferring relationships and networks between regulatory or signalling elements in biological systems. With currently available technology, sample sizes for experiments are typically small, meaning that these correlations can be difficult to estimate. At a genome-wide scale, estimation of correlation matrices can also be computationally demanding. We develop an empirical Bayes approach to improve covariance estimates for gene expression, where we assume the covariance matrix takes a block diagonal form. Our method shows lower false discovery rates than existing methods on simulated data. Applied to a real data set from Bacillus subtilis, we demonstrate its ability to detect known regulatory units and interactions between them. We demonstrate that, compared to existing methods, our method is able to find significant covariances and also to control false discovery rates, even when the sample size is small (n=10). The method can be used to find potential regulatory networks, and it may also be used as a pre-processing step for methods that calculate, for example, partial correlations, so enabling the inference of the causal and hierarchical structure of the networks.
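
    The paper's block-diagonal empirical Bayes estimator is not reproduced here, but the underlying idea, shrinking a noisy sample covariance toward a structured target when n is small, can be illustrated with the closely related Ledoit-Wolf estimator available in scikit-learn:

        import numpy as np
        from sklearn.covariance import LedoitWolf

        rng = np.random.default_rng(0)
        X = rng.normal(size=(10, 50))    # n=10 samples, p=50 genes: n << p

        sample_cov = np.cov(X, rowvar=False)   # unstable at this sample size
        lw = LedoitWolf().fit(X)               # shrinks toward a scaled identity
        print("shrinkage weight:", round(float(lw.shrinkage_), 3))

        # Correlations derived from the shrunken covariance are better behaved
        d = np.sqrt(np.diag(lw.covariance_))
        corr = lw.covariance_ / np.outer(d, d)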

  5. The Localized Discovery and Recovery for Query Packet Losses in Wireless Sensor Networks with Distributed Detector Clusters

    PubMed Central

    Teng, Rui; Leibnitz, Kenji; Miura, Ryu

    2013-01-01

    An essential application of wireless sensor networks is to successfully respond to user queries. Query packet losses occur in the query dissemination due to wireless communication problems such as interference, multipath fading, packet collisions, etc. The losses of query messages at sensor nodes result in the failure of sensor nodes reporting the requested data. Hence, the reliable and successful dissemination of query messages to sensor nodes is a non-trivial problem. The target of this paper is to enable highly successful query delivery to sensor nodes by localized and energy-efficient discovery, and recovery of query losses. We adopt local and collective cooperation among sensor nodes to increase the success rate of distributed discoveries and recoveries. To enable the scalability in the operations of discoveries and recoveries, we employ a distributed name resolution mechanism at each sensor node to allow sensor nodes to self-detect the correlated queries and query losses, and then efficiently locally respond to the query losses. We prove that the collective discovery of query losses has a high impact on the success of query dissemination and reveal that scalability can be achieved by using the proposed approach. We further study the novel features of the cooperation and competition in the collective recovery at PHY and MAC layers, and show that the appropriate number of detectors can achieve optimal successful recovery rate. We evaluate the proposed approach with both mathematical analyses and computer simulations. The proposed approach enables a high rate of successful delivery of query messages and it results in short route lengths to recover from query losses. The proposed approach is scalable and operates in a fully distributed manner. PMID:23748172

  6. The Search for an Effective Clinical Behavior Analysis: The Nonlinear Thinking of Israel Goldiamond

    PubMed Central

    Layng, T.V Joe

    2009-01-01

    This paper has two purposes; the first is to reintroduce Goldiamond's constructional approach to clinical behavior analysis and to the field of behavior analysis as a whole, which, unfortunately, remains largely unaware of his nonlinear functional analysis and its implications. The approach is not simply a set of clinical techniques; instead it describes how basic, applied, and formal analyses may intersect to provide behavior-analytic solutions where the emphasis is on consequential selection. The paper takes the reader through a cumulative series of explorations, discoveries, and insights that hopefully brings the reader into contact with the power and comprehensiveness of Goldiamond's approach, and leads to an investigation of the original works cited. The second purpose is to provide the context of a life of scientific discovery that attempts to elucidate the variables and events that informed one of the most extraordinary scientific journeys in the history of behavior analysis, and expose the reader (especially young ones) to the exciting process of discovery followed by one of the field's most brilliant thinkers. One may perhaps consider this article a tribute to Goldiamond and his work, but the tribute is really to the process of scientific discovery over a professional lifetime. PMID:22478519

  7. Metadata Effectiveness in Internet Discovery: An Analysis of Digital Collection Metadata Elements and Internet Search Engine Keywords

    ERIC Educational Resources Information Center

    Yang, Le

    2016-01-01

    This study analyzed digital item metadata and keywords from Internet search engines to learn what metadata elements actually facilitate discovery of digital collections through Internet keyword searching and how significantly each metadata element affects the discovery of items in a digital repository. The study found that keywords from Internet…

  8. Nonintrusive Flow Rate Determination Through Space Shuttle Water Coolant Loop Floodlight Coldplate

    NASA Technical Reports Server (NTRS)

    Werlink, Rudolph; Johnson, Harry; Margasahayam, Ravi

    1997-01-01

    Using a Nonintrusive Flow Measurement System (NFMS), the flow rates through the Space Shuttle water coolant coldplate were determined. The objective of this in situ flow measurement was to prove or disprove that a potential blockage inside the affected coldplate had contributed to a reduced flow rate and the subsequent ice formation on the Space Shuttle Discovery. Flow through the coldplate was originally calculated to be 35 to 38 pounds per hour. This application of ultrasonic technology advanced the envelope of flow measurements through use of 1/4-inch-diameter tubing, which resulted in extremely low flow rates (5 to 30 pounds per hour). In situ measurements on the orbiters Discovery and Atlantis indicated both vehicles, on average, experienced similar flow rates through the coldplate (around 25 pounds per hour), but lower rates than the designed flow. Based on the noninvasive checks, further invasive troubleshooting was eliminated. Permanent monitoring using the NFMS was recommended.

  9. Discovery and Classification in Astronomy

    NASA Astrophysics Data System (ADS)

    Dick, Steven J.

    2012-01-01

    Three decades after Martin Harwit's pioneering Cosmic Discovery (1981), and following on the recent IAU Symposium "Accelerating the Rate of Astronomical Discovery,” we have revisited the problem of discovery in astronomy, emphasizing new classes of objects. 82 such classes have been identified and analyzed, including 22 in the realm of the planets, 36 in the realm of the stars, and 24 in the realm of the galaxies. We find an extended structure of discovery, consisting of detection, interpretation and understanding, each with its own nuances and a microstructure including conceptual, technological and social roles. This is true with a remarkable degree of consistency over the last 400 years of telescopic astronomy, ranging from Galileo's discovery of satellites, planetary rings and star clusters, to the discovery of quasars and pulsars. Telescopes have served as "engines of discovery” in several ways, ranging from telescope size and sensitivity (planetary nebulae and spiral galaxies), to specialized detectors (TNOs) and the opening of the electromagnetic spectrum for astronomy (pulsars, pulsar planets, and most active galaxies). A few classes (radiation belts, the solar wind and cosmic rays), were initially discovered without the telescope. Classification also plays an important role in discovery. While it might seem that classification marks the end of discovery, or a post-discovery phase, in fact it often marks the beginning, even a pre-discovery phase. Nowhere is this more clearly seen than in the classification of stellar spectra, long before dwarfs, giants and supergiants were known, or their evolutionary sequence recognized. Classification may also be part of a post-discovery phase, as in the MK system of stellar classification, constructed after the discovery of stellar luminosity classes. Some classes are declared rather than discovered, as in the case of gas and ice giant planets, and, infamously, Pluto as a dwarf planet.

  10. How molecular profiling could revolutionize drug discovery.

    PubMed

    Stoughton, Roland B; Friend, Stephen H

    2005-04-01

    Information from genomic, proteomic and metabolomic measurements has already benefited target discovery and validation, assessment of efficacy and toxicity of compounds, identification of disease subgroups and the prediction of responses of individual patients. Greater benefits can be expected from the application of these technologies on a significantly larger scale; by simultaneously collecting diverse measurements from the same subjects or cell cultures; by exploiting the steadily improving quantitative accuracy of the technologies; and by interpreting the emerging data in the context of underlying biological models of increasing sophistication. The benefits of applying molecular profiling to drug discovery and development will include much lower failure rates at all stages of the drug development pipeline, faster progression from discovery through to clinical trials and more successful therapies for patient subgroups. Upheavals in existing organizational structures in the current 'conveyor belt' models of drug discovery might be required to take full advantage of these methods.

  11. Linnorm: improved statistical analysis for single cell RNA-seq expression data.

    PubMed

    Yip, Shun H; Wang, Panwen; Kocher, Jean-Pierre A; Sham, Pak Chung; Wang, Junwen

    2017-12-15

    Linnorm is a novel normalization and transformation method for the analysis of single cell RNA sequencing (scRNA-seq) data. Linnorm is developed to remove technical noise and simultaneously preserve biological variations in scRNA-seq data, such that existing statistical methods can be improved. Using real scRNA-seq data, we compared Linnorm with existing normalization methods, including NODES, SAMstrt, SCnorm, scran, DESeq and TMM. Linnorm shows advantages in speed, technical noise removal and preservation of cell heterogeneity, which can improve existing methods in the discovery of novel subtypes, pseudo-temporal ordering of cells, clustering analysis, etc. Linnorm also performs better than existing DEG analysis methods, including BASiCS, NODES, SAMstrt, Seurat and DESeq2, in false positive rate control and accuracy. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.

  12. Can Functional Magnetic Resonance Imaging Improve Success Rates in CNS Drug Discovery?

    PubMed Central

    Borsook, David; Hargreaves, Richard; Becerra, Lino

    2011-01-01

    Introduction The bar for developing new treatments for CNS disease is getting progressively higher and fewer novel mechanisms are being discovered, validated and developed. The high costs of drug discovery necessitate early decisions to ensure the best molecules and hypotheses are tested in expensive late stage clinical trials. The discovery of brain imaging biomarkers that can bridge preclinical to clinical CNS drug discovery and provide a 'language of translation' affords the opportunity to improve the objectivity of decision-making. Areas Covered This review discusses the benefits, challenges and potential issues of using a science-based biomarker strategy to change the paradigm of CNS drug development and increase success rates in the discovery of new medicines. The authors have summarized PubMed and Google Scholar based publication searches to identify recent advances in functional, structural and chemical brain imaging and have discussed how these techniques may be useful in defining CNS disease state and drug effects during drug development. Expert opinion The use of novel brain imaging biomarkers holds the bold promise of making neuroscience drug discovery smarter by increasing the objectivity of decision making, thereby improving the probability of success of identifying useful drugs to treat CNS diseases. Functional imaging holds the promise to: (1) define pharmacodynamic markers as an index of target engagement; (2) improve translational medicine paradigms to predict efficacy; (3) evaluate CNS efficacy and safety based on brain activation; (4) determine brain activity drug dose-response relationships; and (5) provide an objective evaluation of symptom response and disease modification. PMID:21765857

  13. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Shen, M.; Vermeulen, R.; Rajaraman, P.

    The high incidence of lung cancer in Xuanwei County, China has been attributed to exposure to indoor smoky coal emissions that contain polycyclic aromatic hydrocarbons (PAHs). The inflammatory response induced by coal smoke components may promote lung tumor development. We studied the association between single nucleotide polymorphisms (SNPs) in genes involved in innate immunity and lung cancer risk in a population-based case-control study (122 cases and 122 controls) in Xuanwei. A total of 1,360 tag SNPs in 149 gene regions were included in the analysis. FCER2 rs7249320 was the most significant SNP (OR: 0.30; 95% CI: 0.16-0.55; P: 0.0001; false discovery rate value, 0.13) for variant carriers. The gene regions ALOX12B/ALOX15B and KLK2 were associated with increased lung cancer risk globally (false discovery rate value < 0.15). In addition, there were positive interactions between KLK15 rs3745523 and smoky coal use (OR: 9.40; P-interaction = 0.07) and between FCER2 rs7249320 and KLK2 rs2739476 (OR: 10.77; P-interaction = 0.003). Our results suggest that genetic polymorphisms in innate immunity genes may play a role in the genesis of lung cancer caused by PAH-containing coal smoke. Integrin/receptor and complement pathways as well as IgE regulation are particularly noteworthy.

  14. Improving sensitivity in proteome studies by analysis of false discovery rates for multiple search engines.

    PubMed

    Jones, Andrew R; Siepen, Jennifer A; Hubbard, Simon J; Paton, Norman W

    2009-03-01

    LC-MS experiments can generate large quantities of data, for which a variety of database search engines are available to make peptide and protein identifications. Decoy databases are becoming widely used to place statistical confidence in result sets, allowing the false discovery rate (FDR) to be estimated. Different search engines produce different identification sets so employing more than one search engine could result in an increased number of peptides (and proteins) being identified, if an appropriate mechanism for combining data can be defined. We have developed a search engine independent score, based on FDR, which allows peptide identifications from different search engines to be combined, called the FDR Score. The results demonstrate that the observed FDR is significantly different when analysing the set of identifications made by all three search engines, by each pair of search engines or by a single search engine. Our algorithm assigns identifications to groups according to the set of search engines that have made the identification, and re-assigns the score (combined FDR Score). The combined FDR Score can differentiate between correct and incorrect peptide identifications with high accuracy, allowing on average 35% more peptide identifications to be made at a fixed FDR than using a single search engine.
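
    The decoy-based estimate underlying the FDR Score is simple: at any score threshold, the number of decoy hits passing approximates the number of false target hits. The sketch below shows only that basic estimate; the paper's combined FDR Score additionally groups identifications by which search engines agree, which is not reproduced here:

        import numpy as np

        def decoy_fdr(target_scores, decoy_scores, threshold):
            """FDR at a score threshold from a target-decoy search."""
            t = np.sum(np.asarray(target_scores) >= threshold)
            d = np.sum(np.asarray(decoy_scores) >= threshold)
            return d / max(t, 1)

        def fdr_score_per_psm(target_scores, decoy_scores):
            """Assign each target PSM the FDR computed at its own score."""
            return np.array([decoy_fdr(target_scores, decoy_scores, s)
                             for s in target_scores])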

  15. Building high-quality assay libraries for targeted analysis of SWATH MS data.

    PubMed

    Schubert, Olga T; Gillet, Ludovic C; Collins, Ben C; Navarro, Pedro; Rosenberger, George; Wolski, Witold E; Lam, Henry; Amodei, Dario; Mallick, Parag; MacLean, Brendan; Aebersold, Ruedi

    2015-03-01

    Targeted proteomics by selected/multiple reaction monitoring (S/MRM) or, on a larger scale, by SWATH (sequential window acquisition of all theoretical spectra) MS (mass spectrometry) typically relies on spectral reference libraries for peptide identification. Quality and coverage of these libraries are therefore of crucial importance for the performance of the methods. Here we present a detailed protocol that has been successfully used to build high-quality, extensive reference libraries supporting targeted proteomics by SWATH MS. We describe each step of the process, including data acquisition by discovery proteomics, assertion of peptide-spectrum matches (PSMs), generation of consensus spectra and compilation of MS coordinates that uniquely define each targeted peptide. Crucial steps such as false discovery rate (FDR) control, retention time normalization and handling of post-translationally modified peptides are detailed. Finally, we show how to use the library to extract SWATH data with the open-source software Skyline. The protocol takes 2-3 d to complete, depending on the extent of the library and the computational resources available.
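
    One of the steps above, retention time normalization, typically amounts to fitting observed retention times of shared reference peptides against their library-scale values and applying the fitted map. A minimal linear version, with invented anchor values:

        import numpy as np

        def rt_normalizer(observed_rt, library_rt):
            """Least-squares line mapping observed RT onto the library scale."""
            slope, intercept = np.polyfit(observed_rt, library_rt, 1)
            return lambda rt: slope * np.asarray(rt) + intercept

        # Hypothetical anchor peptides: observed minutes vs. library iRT units
        observed = np.array([12.1, 25.4, 38.0, 55.2])
        library = np.array([0.0, 25.0, 50.0, 75.0])
        to_library_scale = rt_normalizer(observed, library)
        print(to_library_scale(30.0))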

  16. The analysis of the market success of FDA approvals by probing top 100 bestselling drugs.

    PubMed

    Polanski, Jaroslaw; Bogocz, Jacek; Tkocz, Aleksandra

    2016-05-01

    Target-oriented drug discovery is the main research paradigm of contemporary drug discovery. In target-oriented approaches, we attempt to maximize in vitro drug potency by finding the optimal fit to the target. This can result in a higher molecular complexity, in particular, the higher molecular weight (MW) of the drugs. However, a comparison of the successful developments of pharmaceuticals with the general trends that can be observed in medicinal chemistry resulted in the conclusion that the so-called molecular obesity is an important reason for the attrition rate of drugs. When analyzing the list of top 100 drug bestsellers versus all of the FDA approvals, we discovered that on average lower-complexity (MW, ADMET score) drugs are winners of the top 100 list in terms of numbers but that, especially, up to some optimal MW value, a higher molecular complexity can pay off with higher incomes. This indicates that slim drugs are doing better but that fat drugs are bigger fishes to catch.

  18. A metabolomics guided exploration of marine natural product chemical space.

    PubMed

    Floros, Dimitrios J; Jensen, Paul R; Dorrestein, Pieter C; Koyama, Nobuhiro

    2016-09-01

    Natural products from culture collections have enormous impact in advancing discovery programs for metabolites of biotechnological importance. These discovery efforts rely on the metabolomic characterization of strain collections. Many emerging approaches compare metabolomic profiles of such collections, but few enable the analysis and prioritization of thousands of samples from diverse organisms while delivering chemistry-specific readouts. In this work we utilize untargeted LC-MS/MS-based metabolomics together with molecular networking to analyze and prioritize a large collection of marine microbial extracts. This approach annotated 76 molecular families (a spectral match rate of 28%), including clinically and biotechnologically important molecules such as valinomycin, actinomycin D, and desferrioxamine E. Targeting a molecular family produced primarily by one microorganism led to the isolation and structure elucidation of two new molecules designated maridric acids A and B. Molecular networking-guided exploration of large culture collections allows for rapid dereplication of known molecules and can highlight producers of unique metabolites. These methods, together with large culture collections and growing databases, allow for data-driven strain prioritization with a focus on novel chemistries.
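
    A minimal sketch of the molecular networking idea referenced above: spectra are linked when their cosine similarity passes a threshold, and the resulting connected components approximate molecular families. Real pipelines align peaks by m/z tolerance; the fixed-length intensity vectors and the 0.7 threshold here are simplifying assumptions.

```python
# Connect MS/MS spectra whose cosine similarity exceeds a threshold; the
# connected components of the resulting graph are "molecular families".
import numpy as np

def cosine(a, b):
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

spectra = {
    "extract_1": np.array([0.0, 1.0, 0.2, 0.0, 0.5]),
    "extract_2": np.array([0.1, 0.9, 0.3, 0.0, 0.4]),
    "extract_3": np.array([0.8, 0.0, 0.0, 0.7, 0.0]),
}

edges, names = [], list(spectra)
for i, a in enumerate(names):
    for b in names[i + 1:]:
        score = cosine(spectra[a], spectra[b])
        if score >= 0.7:          # similarity threshold for drawing an edge
            edges.append((a, b, round(score, 3)))
print(edges)
```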

  19. BioPig: Developing Cloud Computing Applications for Next-Generation Sequence Analysis

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Bhatia, Karan; Wang, Zhong

    Next Generation sequencing is producing ever larger data sizes with a growth rate outpacing Moore's Law. The data deluge has made many of the current sequence analysis tools obsolete because they do not scale with data. Here we present BioPig, a collection of cloud computing tools to scale data analysis and management. Pig is a flexible data scripting language that uses Apache's Hadoop data structure and map-reduce framework to process very large data files in parallel and combine the results. BioPig extends Pig with capabilities for sequence analysis. We show the performance of BioPig on a variety of bioinformatics tasks, including screening sequence contaminants, Illumina QA/QC, and gene discovery from metagenome data sets, using the Rumen metagenome as an example.
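
    An illustrative Python sketch of the MapReduce pattern that Pig and Hadoop provide, applied to toy k-mer counting; this is a stand-in for the style of computation BioPig scales out, not BioPig's actual API.

```python
# MapReduce in miniature: mappers emit partial k-mer counts per read, and an
# associative reducer merges them (which is what makes the pattern parallel).
from collections import Counter
from functools import reduce

def mapper(read, k=3):
    """Emit k-mer counts for one read."""
    return Counter(read[i:i + k] for i in range(len(read) - k + 1))

def reducer(counts_a, counts_b):
    """Merge partial counts; associative, so it parallelizes."""
    counts_a.update(counts_b)
    return counts_a

reads = ["GATTACA", "TACAGAT", "ATTACAG"]
total = reduce(reducer, map(mapper, reads), Counter())
print(total.most_common(5))
```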

  20. Discovery and problem solving: Triangulation as a weak heuristic

    NASA Technical Reports Server (NTRS)

    Rochowiak, Daniel

    1987-01-01

    Recently the artificial intelligence community has turned its attention to the process of discovery and found that the history of science is a fertile source for what Darden has called compiled hindsight. Such hindsight generates weak heuristics for discovery that do not guarantee that discoveries will be made but do have proven worth in leading to discoveries. Triangulation is one such heuristic that is grounded in historical hindsight. This heuristic is explored within the general framework of the BACON, GLAUBER, STAHL, DALTON, and SUTTON programs. In triangulation different bases of information are compared in an effort to identify gaps between the bases. Thus, assuming that the bases of information are relevantly related, the gaps that are identified should be good locations for discovery and robust analysis.
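
    A toy sketch of the triangulation heuristic just described: given two relevantly related bases of information, the gaps between them are flagged as candidate loci for discovery. The "facts" are invented placeholders.

```python
# Triangulation as set comparison: facts present in one base of information
# but absent from the other mark gaps worth investigating.
base_a = {"element_x reacts with acid", "element_x conducts heat"}
base_b = {"element_x conducts heat", "element_x has unknown density"}

gaps = base_a.symmetric_difference(base_b)   # in one base but not the other
for gap in sorted(gaps):
    print("candidate locus for discovery:", gap)
```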

  1. Computational modeling approaches to quantitative structure-binding kinetics relationships in drug discovery.

    PubMed

    De Benedetti, Pier G; Fanelli, Francesca

    2018-03-21

    Simple comparative correlation analyses and quantitative structure-kinetics relationship (QSKR) models highlight the interplay of kinetic rates and binding affinity as an essential feature in drug design and discovery. The choice of the molecular series, and their structural variations, used in QSKR modeling is fundamental to understanding the mechanistic implications of ligand and/or drug-target binding and/or unbinding processes. Here, we discuss the implications of linear correlations between kinetic rates and binding affinity constants and the relevance of the computational approaches to QSKR modeling. Copyright © 2018 Elsevier Ltd. All rights reserved.

  2. Discovery of remarkable subpulse drifting pattern in PSR B0818-41

    NASA Astrophysics Data System (ADS)

    Bhattacharyya, B.; Gupta, Y.; Gil, J.; Sendyk, M.

    The study of pulsars showing systematic subpulse drift patterns provides important clues for understanding the pulsar emission mechanism. Pulsars with wide profiles provide extra insights because of the presence of multiple drift bands (e.g., PSR B0826-34). We report the discovery of a remarkable subpulse drift pattern in a relatively less studied wide-profile pulsar, PSR B0818-41, using the GMRT. We find simultaneous occurrence of three drift regions with two drift rates: an inner region with a steeper apparent drift rate flanked on each side by a region of slower apparent drift rate. Furthermore, the two closely spaced drift regions always maintain a constant phase relationship. Such unique drift properties are very rare. We interpret the observed drift pattern as created by the intersection of our line of sight (LOS) with two conal rings in an inner-LOS (negative beta) geometry. We argue that the carousel rotation periodicity (P_4) and the number of sparks (N_sp) are the same for the rings and claim that P_4 is close to the measured P_3. Based on our analysis and interpretations, we simulate the radiation from B0818-41. The simulations support our interpretations and reproduce the average profile and the observed drift pattern. The results of our study show that PSR B0818-41 is a powerful system for exploring the pulsar radio emission mechanism, the implications of which are also discussed in our work.

  3. Pharmacovigilance data mining with methods based on false discovery rates: a comparative simulation study.

    PubMed

    Ahmed, I; Thiessard, F; Miremont-Salamé, G; Bégaud, B; Tubert-Bitter, P

    2010-10-01

    The early detection of adverse reactions caused by drugs that are already on the market is the prime concern of pharmacovigilance efforts; the methods in use for postmarketing surveillance are aimed at detecting signals pointing to potential safety concerns, on the basis of reports from health-care providers and from information available in various databases. Signal detection methods based on the estimation of the false discovery rate (FDR) have recently been proposed. They address the limitation of arbitrary detection thresholds in the automatic methods in current use, including those last updated by the US Food and Drug Administration and the World Health Organization's Uppsala Monitoring Centre. We used two simulation procedures to compare the false-positive performance of three current methods (the reporting odds ratio (ROR), the information component (IC), and the gamma Poisson shrinkage (GPS)) and of two FDR-based methods derived from the GPS model and Fisher's test. Large differences in FDR were associated with the signal-detection methods currently in use. These differences ranged from 0.01 to 12% in an analysis restricted to signals with at least three reports. The numbers of signals generated were also highly variable. Among fixed-size lists of signals, the FDR was lower when the FDR-based approaches were used. Overall, the outcomes of both simulation studies suggest that an improvement in effectiveness can be expected from use of the FDR-based GPS method.
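
    For intuition, a sketch of one common FDR-based ranking formulation, in which the estimated FDR of a top-k signal list is the mean posterior probability of no association within it (as in GPS-style shrinkage models); the drug-event pairs and probabilities are invented.

```python
# Rank drug-event pairs by posterior null probability; the estimated FDR of
# the top-k list is the running mean of those probabilities.

signals = [  # (drug-event pair, posterior probability of no association)
    ("drug_a/rash",      0.001),
    ("drug_b/nausea",    0.010),
    ("drug_c/dizziness", 0.080),
    ("drug_d/headache",  0.300),
]
signals.sort(key=lambda s: s[1])

running_null = 0.0
for k, (pair, p_null) in enumerate(signals, start=1):
    running_null += p_null
    print(f"top-{k} list (last added: {pair}): estimated FDR = {running_null / k:.3f}")
```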

  4. Counting complete? Finalising the plant inventory of a global biodiversity hotspot

    PubMed Central

    Colville, Jonathan F.; Joppa, Lucas N.; Huyser, Onno; Manning, John

    2017-01-01

    The Cape Floristic Region—the world’s smallest and third richest botanical hotspot—has benefited from sustained levels of taxonomic effort and exploration for almost three centuries, but how close is this to resulting in a near-complete plant species inventory? We analyse a core component of this flora over a 250-year period for trends in taxonomic effort and species discovery linked to ecological and conservation attributes. We show that >40% of the current total of species was described within the first 100 years of exploration, followed by a continued steady rate of description. We propose that <1% of the flora is still to be described. We document a relatively constant cohort of taxonomists, working over 250 years at what we interpret to be their ‘taxonomic maximum.’ Rates of description of new species were independent of plant growth-form but narrow-range taxa have constituted a significantly greater proportion of species discoveries since 1950. This suggests that the fraction of undiscovered species predominantly comprises localised endemics that are thus of high conservation concern. Our analysis provides important real-world insights for other hotspots in the context of global strategic plans for biodiversity in informing considerations of the likely effort required in attaining set targets of comprehensive plant inventories. In a time of unprecedented biodiversity loss, we argue for a focused research agenda across disciplines to increase the rate of species descriptions in global biodiversity hotspots. PMID:28243528

  5. An Analysis of the Peculiar Type IIn Supernova 1995N

    NASA Astrophysics Data System (ADS)

    Baird, M. D.; Garnavich, P. M.; Schlegel, E. M.; Challis, P. M.; Kirshner, R. P.

    1998-12-01

    SN 1995N is a peculiar type IIn supernova. Spectroscopic and photometric data for this analysis were gathered between May 10, 1995 (two days after discovery) and July 18, 1998. A total of twenty-two photometric images and eight spectra were obtained at the FLWO and MMTO. The photometric data show that a broad maximum at R=17.0 occurred in late October 1995, followed by a very slow decline at a rate of 2.39 mmag/day in R and 1.37 mmag/day in V. The R decay rate corresponds to a half-life of 315 days, which is much longer than that of ^56Co. The spectra show broad hydrogen (1500 km/s FWHM) and oxygen (10000 km/s FWZI) emission features along with many unresolved emission lines. Some of the more interesting narrow lines identified correspond to high ionization states of iron, such as Fe VII and Fe X, which indicate temperatures as high as 10^6 K. These high ionization states, the X-ray detection by Lewin et al. (1996, IAUC 6445) and the slow photometric decay suggest that SN 1995N is powered by a shock propagating through a dense circumstellar environment. From the earliest observations the energy output appears dominated by the interaction and not by radioactivity, implying that the progenitor exploded well before the discovery of SN 1995N. The situation may be similar to SN 1987A, where the rise in emission from a circumstellar interaction is only now beginning and is expected to peak some 15 years after the supernova explosion.
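
    The quoted half-life follows directly from the decline rate, since flux scales as 10^(-0.4 m); a quick check:

```python
# A decline of r mag/day halves the flux after (2.5 * log10(2)) / r days.
import math

r = 2.39e-3                               # R-band decline rate, mag/day
half_life = 2.5 * math.log10(2.0) / r     # days for the flux to drop by half
print(f"{half_life:.0f} days")            # ~315, as reported for SN 1995N
```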

  6. Detection of Chorus Elements and other Wave Signatures Using Geometric Computational Techniques in the Van Allen radiation belts

    NASA Astrophysics Data System (ADS)

    Sengupta, A.; Kletzing, C.; Howk, R.; Kurth, W. S.

    2017-12-01

    An important goal of the Van Allen Probes mission is to understand wave-particle interactions that can energize relativistic electrons in the Earth's Van Allen radiation belts. The EMFISIS instrumentation suite provides measurements of the wave electric and magnetic fields of features such as chorus that participate in these interactions. Geometric signal processing discovers structural relationships, e.g., connectivity across ridge-like features in chorus elements, to reveal properties such as the dominant angles of an element (frequency sweep rate) and the integrated power along a given chorus element. These techniques disambiguate wave features from background hiss-like chorus, enabling autonomous discovery of chorus elements across the large volumes of EMFISIS data. At the scale of individual or overlapping chorus elements, topological pattern recognition techniques enable interpretation of chorus microstructure by discovering connectivity and other geometric features within the wave signature of a single chorus element or between overlapping chorus elements. Thus chorus wave features can be quantified and studied at multiple scales of spectral geometry using geometric signal processing techniques. We present recently developed computational techniques that exploit the spectral geometry of chorus elements and whistlers to enable large-scale automated discovery, detection and statistical analysis of these events over EMFISIS data. Specifically, we present case studies across a diverse portfolio of chorus elements and discuss the performance of our algorithms regarding precision of detection as well as interpretation of chorus microstructure. We also provide large-scale statistical analysis of the distribution of dominant sweep rates and other properties of the detected chorus elements.

  7. VaDiR: an integrated approach to Variant Detection in RNA.

    PubMed

    Neums, Lisa; Suenaga, Seiji; Beyerlein, Peter; Anders, Sara; Koestler, Devin; Mariani, Andrea; Chien, Jeremy

    2018-02-01

    Advances in next-generation DNA sequencing technologies are now enabling detailed characterization of sequence variations in cancer genomes. With whole-genome sequencing, variations in coding and non-coding sequences can be discovered, but its cost currently limits its general use in research. Whole-exome sequencing is used to characterize sequence variations in coding regions, but the cost of capture reagents and biases in capture rate limit its full use in research. Additional limitations include uncertainty in assigning the functional significance of mutations observed in non-coding regions or in genes that are not expressed in cancer tissue. We investigated the feasibility of uncovering mutations from expressed genes using RNA sequencing datasets with a method called Variant Detection in RNA (VaDiR) that integrates three variant callers, namely SNPiR, RVBoost, and MuTect2. The combination of all three methods, which we called Tier 1 variants, produced the highest precision with true positive mutations from RNA-seq that could be validated at the DNA level. We also found that the integration of Tier 1 variants with those called by MuTect2 and SNPiR produced the highest recall with acceptable precision. Finally, we observed a higher rate of mutation discovery in genes that are expressed at higher levels. Our method, VaDiR, provides a possibility of uncovering mutations from RNA sequencing datasets that could be useful in further functional analysis. In addition, our approach allows orthogonal validation of DNA-based mutation discovery by providing complementary sequence variation analysis from paired RNA/DNA sequencing datasets.
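
    A minimal sketch of the consensus logic described above, with Tier 1 defined as the intersection of the three callers' outputs; the variant keys are invented and real VCF parsing is omitted.

```python
# Tier 1 = variants reported by all three callers (highest precision);
# augmenting Tier 1 with the MuTect2/SNPiR intersection raises recall.
snpir   = {("chr1", 1000, "A", "G"), ("chr2", 2000, "C", "T")}
rvboost = {("chr1", 1000, "A", "G"), ("chr3", 3000, "G", "A")}
mutect2 = {("chr1", 1000, "A", "G"), ("chr2", 2000, "C", "T")}

tier1 = snpir & rvboost & mutect2
high_recall = tier1 | (mutect2 & snpir)
print("Tier 1:", tier1)
print("Tier 1 + MuTect2/SNPiR:", high_recall)
```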

  8. Successes in drug discovery and design.

    PubMed

    2004-04-01

    The Society for Medicines Research (SMR) held a one-day meeting on case histories in drug discovery on December 4, 2003, at the National Heart and Lung Institute in London. These meetings have been organized by the SMR biannually for many years, and this latest meeting proved extremely popular, attracting a capacity audience of more than 130 registrants. The purpose of these meetings is educational; they allow those interested in drug discovery to hear key learnings from recent successful drug discovery programs. There was no overall linking theme between the talks, other than each success story has led to the introduction of a new and improved product of therapeutic use. The drug discovery stories covered in the meeting were extremely varied and, put together, they emphasized that each successful story is unique and special. This meeting is also special for the SMR because it presents the "SMR Award for Drug Discovery" in recognition of outstanding achievement and contribution in the area. It should be remembered that drug discovery is an extremely risky business and an extremely costly and complicated process in which the success rate is, at best, low. (c) 2004 Prous Science. All rights reserved.

  9. Strategies for bringing drug delivery tools into discovery.

    PubMed

    Kwong, Elizabeth; Higgins, John; Templeton, Allen C

    2011-06-30

    The past decade has yielded a significant body of literature discussing approaches for development and discovery collaboration in the pharmaceutical industry. As a result, collaborations between discovery groups and development scientists have increased considerably. The productivity of pharma companies in delivering new drugs to the market, however, has not increased, and development costs continue to rise. Inability to predict clinical and toxicological response underlies the high attrition rate of leads at every step of drug development. A partial solution to this high attrition rate could be provided by better preclinical pharmacokinetics measurements that inform PD response based on key pathways that drive disease progression and therapeutic response. A critical link between these key pharmacology, pharmacokinetics and toxicology studies is the formulation. The challenges in pre-clinical formulation development include limited availability of compounds, rapid turn-around requirements, and the frequently un-optimized physical properties of lead compounds. Despite these challenges, this paper illustrates some successes resulting from close collaboration between formulation scientists and discovery teams. This close collaboration has resulted in the development of formulations that meet biopharmaceutical needs from early-stage preclinical in vivo model development through toxicity testing and development risk assessment of pre-clinical drug candidates. Published by Elsevier B.V.

  10. STS-131 Discovery Launch

    NASA Image and Video Library

    2010-04-04

    Contrails are seen as workers leave the Launch Control Center after the launch of the space shuttle Discovery and the start of the STS-131 mission at NASA Kennedy Space Center in Cape Canaveral, Fla. on Monday April 5, 2010. Discovery is carrying a multi-purpose logistics module filled with science racks for the laboratories aboard the station. The mission has three planned spacewalks, with work to include replacing an ammonia tank assembly, retrieving a Japanese experiment from the station’s exterior, and switching out a rate gyro assembly on the station’s truss structure. Photo Credit: (NASA/Bill Ingalls)

  11. STS-131 Discovery Launch

    NASA Image and Video Library

    2010-04-04

    NASA Administrator Charles Bolden looks out the window of Firing Room Four in the Launch Control Center during the launch of the space shuttle Discovery and the start of the STS-131 mission at NASA Kennedy Space Center in Cape Canaveral, Fla. on Monday April 5, 2010. Discovery is carrying a multi-purpose logistics module filled with science racks for the laboratories aboard the station. The mission has three planned spacewalks, with work to include replacing an ammonia tank assembly, retrieving a Japanese experiment from the station’s exterior, and switching out a rate gyro assembly on the station’s truss structure. Photo Credit: (NASA/Bill Ingalls)

  12. ON MODEL SELECTION STRATEGIES TO IDENTIFY GENES UNDERLYING BINARY TRAITS USING GENOME-WIDE ASSOCIATION DATA.

    PubMed

    Wu, Zheyang; Zhao, Hongyu

    2012-01-01

    For more fruitful discoveries of genetic variants associated with diseases in genome-wide association studies, it is important to know whether joint analysis of multiple markers is more powerful than the commonly used single-marker analysis, especially in the presence of gene-gene interactions. This article provides a statistical framework to rigorously address this question through analytical power calculations for common model search strategies to detect binary trait loci: marginal search, exhaustive search, forward search, and two-stage screening search. Our approach incorporates linkage disequilibrium, random genotypes, and correlations among score test statistics of logistic regressions. We derive analytical results under two power definitions: the power of finding all the associated markers and the power of finding at least one associated marker. We also consider two types of error controls: the discovery number control and the Bonferroni type I error rate control. After demonstrating the accuracy of our analytical results by simulations, we apply them to consider a broad genetic model space to investigate the relative performances of different model search strategies. Our analytical study provides rapid computation as well as insights into the statistical mechanism of capturing genetic signals under different genetic models including gene-gene interactions. Even though we focus on genetic association analysis, our results on the power of model selection procedures are clearly very general and applicable to other studies.

  13. ON MODEL SELECTION STRATEGIES TO IDENTIFY GENES UNDERLYING BINARY TRAITS USING GENOME-WIDE ASSOCIATION DATA

    PubMed Central

    Wu, Zheyang; Zhao, Hongyu

    2013-01-01

    For more fruitful discoveries of genetic variants associated with diseases in genome-wide association studies, it is important to know whether joint analysis of multiple markers is more powerful than the commonly used single-marker analysis, especially in the presence of gene-gene interactions. This article provides a statistical framework to rigorously address this question through analytical power calculations for common model search strategies to detect binary trait loci: marginal search, exhaustive search, forward search, and two-stage screening search. Our approach incorporates linkage disequilibrium, random genotypes, and correlations among score test statistics of logistic regressions. We derive analytical results under two power definitions: the power of finding all the associated markers and the power of finding at least one associated marker. We also consider two types of error controls: the discovery number control and the Bonferroni type I error rate control. After demonstrating the accuracy of our analytical results by simulations, we apply them to consider a broad genetic model space to investigate the relative performances of different model search strategies. Our analytical study provides rapid computation as well as insights into the statistical mechanism of capturing genetic signals under different genetic models including gene-gene interactions. Even though we focus on genetic association analysis, our results on the power of model selection procedures are clearly very general and applicable to other studies. PMID:23956610

  14. Gene-to-metabolite network for biosynthesis of lignans in MeJA-elicited Isatis indigotica hairy root cultures

    PubMed Central

    Chen, Ruibing; Li, Qing; Tan, Hexin; Chen, Junfeng; Xiao, Ying; Ma, Ruifang; Gao, Shouhong; Zerbe, Philipp; Chen, Wansheng; Zhang, Lei

    2015-01-01

    Root and leaf tissue of Isatis indigotica show notable anti-viral efficacy and are widely used as “Banlangen” and “Daqingye” in traditional Chinese medicine. The plants' pharmacological activity is attributed to phenylpropanoids, especially a group of lignan metabolites. However, the biosynthesis of lignans in I. indigotica remains opaque. This study describes the discovery and analysis of biosynthetic genes and AP2/ERF-type transcription factors involved in lignan biosynthesis in I. indigotica. MeJA treatment revealed differential expression of three genes involved in phenylpropanoid backbone biosynthesis (IiPAL, IiC4H, Ii4CL), five genes involved in lignan biosynthesis (IiCAD, IiC3H, IiCCR, IiDIR, and IiPLR), and 112 putative AP2/ERF transcription factors. In addition, four intermediates of lariciresinol biosynthesis were found to be induced. Based on these results, a canonical correlation analysis using Pearson's correlation coefficient was performed to construct gene-to-metabolite networks and identify putative key genes and rate-limiting reactions in lignan biosynthesis. Over-expression of IiC3H, identified as a key pathway gene, was used for metabolic engineering of I. indigotica hairy roots, and resulted in an increase in lariciresinol production. These findings illustrate the utility of canonical correlation analysis for the discovery and metabolic engineering of key metabolic genes in plants. PMID:26579184
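
    A small sketch of the gene-to-metabolite correlation step described above: Pearson correlation between each gene's expression profile and a metabolite's abundance across conditions flags candidate key genes. All values are invented.

```python
# Correlate toy gene expression profiles with a metabolite level across
# six conditions; high |r| marks a candidate key gene.
import numpy as np

genes = {"IiC3H": np.array([1.0, 1.4, 2.1, 2.9, 3.6, 4.2]),
         "IiPAL": np.array([1.0, 1.1, 1.0, 1.2, 1.1, 1.0])}
lariciresinol = np.array([0.5, 0.8, 1.5, 2.2, 3.0, 3.5])

for gene, expr in genes.items():
    r = np.corrcoef(expr, lariciresinol)[0, 1]
    print(f"{gene}: Pearson r with lariciresinol = {r:.2f}")
```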

  15. Analysis of pharmacology data and the prediction of adverse drug reactions and off-target effects from chemical structure.

    PubMed

    Bender, Andreas; Scheiber, Josef; Glick, Meir; Davies, John W; Azzaoui, Kamal; Hamon, Jacques; Urban, Laszlo; Whitebread, Steven; Jenkins, Jeremy L

    2007-06-01

    Preclinical Safety Pharmacology (PSP) attempts to anticipate adverse drug reactions (ADRs) during early phases of drug discovery by testing compounds in simple, in vitro binding assays (that is, preclinical profiling). The selection of PSP targets is based largely on circumstantial evidence of their contribution to known clinical ADRs, inferred from findings in clinical trials, animal experiments, and molecular studies going back more than forty years. In this work we explore PSP chemical space and its relevance for the prediction of adverse drug reactions. Firstly, in silico (computational) Bayesian models for 70 PSP-related targets were built, which are able to detect 93% of the ligands binding at IC50 ≤ 10 μM at an overall correct classification rate of about 94%. Secondly, employing the World Drug Index (WDI), a model for adverse drug reactions was built directly based on normalized side-effect annotations in the WDI, which does not require any underlying functional knowledge. This is, to our knowledge, the first attempt to predict adverse drug reactions across hundreds of categories from chemical structure alone. On average 90% of the adverse drug reactions observed with known, clinically used compounds were detected, an overall correct classification rate of 92%. Drugs withdrawn from the market (Rapacuronium, Suprofen) were tested in the model and their predicted ADRs align well with known ADRs. The analysis was repeated for acetylsalicylic acid and Benperidol which are still on the market. Importantly, features of the models are interpretable and back-projectable to chemical structure, raising the possibility of rationally engineering out adverse effects. By combining PSP and ADR models new hypotheses linking targets and adverse effects can be proposed and examples for the opioid mu and the muscarinic M2 receptors, as well as for cyclooxygenase-1 are presented. It is hoped that the generation of predictive models for adverse drug reactions is able to help support early SAR to accelerate drug discovery and decrease late stage attrition in drug discovery projects. In addition, models such as the ones presented here can be used for compound profiling in all development stages.
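
    As a rough illustration of the kind of Bayesian ligand-based model described, here is a Bernoulli naive Bayes classifier over binary structural fingerprints; the fingerprints and labels are invented, and the original work used its own, much larger Bayesian modeling setup.

```python
# Predict target binding from presence/absence of structural features.
import numpy as np
from sklearn.naive_bayes import BernoulliNB

# rows: compounds, columns: binary structural fingerprint bits
X = np.array([[1, 0, 1, 1],
              [1, 1, 1, 0],
              [0, 0, 0, 1],
              [0, 1, 0, 0]])
y = np.array([1, 1, 0, 0])   # 1 = binds the target (e.g., IC50 <= 10 uM)

model = BernoulliNB().fit(X, y)
print(model.predict_proba([[1, 0, 1, 0]]))  # [P(inactive), P(active)]
```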

  16. Structure based drug discovery for designing leads for the non-toxic metabolic targets in multi drug resistant Mycobacterium tuberculosis.

    PubMed

    Kaur, Divneet; Mathew, Shalu; Nair, Chinchu G S; Begum, Azitha; Jainanarayan, Ashwin K; Sharma, Mukta; Brahmachari, Samir K

    2017-12-21

    The problem of drug resistance and bacterial persistence in tuberculosis is a cause of global alarm. Although the UN's Sustainable Development Goals for 2030 have targeted a TB-free world, the treatment gap exists and only a few new drug candidates are in the pipeline. In spite of the large amount of information from medicinal chemistry to 'omics' data, there has been little effort from pharmaceutical companies to generate pipelines for the development of novel drug candidates against multidrug-resistant Mycobacterium tuberculosis. In the present study, we describe an integrated methodology utilizing systems-level information to optimize ligand selection and lower the failure rates at the pre-clinical and clinical levels. Metabolic targets (Rv2763c, Rv3247c, Rv1094, Rv3607c, Rv3048c, Rv2965c, Rv2361c, Rv0865, Rv0321, Rv0098, Rv0390, Rv3588c, Rv2244, Rv2465c and Rv2607) in M. tuberculosis, identified using our previous Systems Biology and data-intensive genome-level analysis, have been used to design potential lead molecules which are likely to be non-toxic. Various in silico drug discovery tools have been utilized to generate small-molecule leads for each of the 15 targets with available crystal structures. The present study resulted in the identification of 20 novel lead molecules, including 4 FDA-approved drugs (droxidopa, tetroxoprim, domperidone and nemonapride) which can be taken further for drug repurposing. This comprehensive integrated methodology, with both experimental and in silico approaches, has the potential to tackle not only the MDR form of Mtb but also the most important persister population of the bacterium, with a potential to reduce failures in TB drug discovery. We propose an integrated approach of systems and structural biology for identifying targets that addresses the high attrition rate issue in lead identification and drug development. We expect that this systems-level analysis will be applicable to the identification of drug candidates for other pathogenic organisms as well.

  17. Jointly determining significance levels of primary and replication studies by controlling the false discovery rate in two-stage genome-wide association studies.

    PubMed

    Jiang, Wei; Yu, Weichuan

    2017-01-01

    In genome-wide association studies, we normally discover associations between genetic variants and diseases/traits in primary studies, and validate the findings in replication studies. We consider the associations identified in both primary and replication studies as true findings. An important question under this two-stage setting is how to determine significance levels in both studies. In traditional methods, the significance levels of the primary and replication studies are determined separately. We argue that this separate determination strategy reduces the power of the overall two-stage study. Therefore, we propose a novel method to determine significance levels jointly. Our method is a reanalysis method that needs summary statistics from both studies. We find the most powerful significance levels while controlling the false discovery rate in the two-stage study. To enjoy the power improvement from the joint determination method, we need to select single nucleotide polymorphisms for replication at a less stringent significance level. This is a common practice in studies designed for discovery purposes. We suggest this practice is also suitable in studies with a validation purpose in order to identify more true findings. Simulation experiments show that our method can provide more power than traditional methods and that the false discovery rate is well controlled. Empirical experiments on datasets of five diseases/traits demonstrate that our method can help identify more associations. The R package is available at: http://bioinformatics.ust.hk/RFdr.html
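
    For contrast with the joint method proposed above, here is a sketch of the traditional separate-determination strategy: a lenient threshold in the primary study followed by Benjamini-Hochberg control in the replication study. The thresholds and p-values are toy values, not the authors' procedure.

```python
# Two-stage selection: lenient stage-1 filter, then BH step-up in stage 2.

def benjamini_hochberg(pvals, q=0.05):
    """Return indices rejected by the BH step-up procedure at level q."""
    order = sorted(range(len(pvals)), key=lambda i: pvals[i])
    n, cutoff = len(pvals), 0
    for rank, i in enumerate(order, start=1):
        if pvals[i] <= q * rank / n:
            cutoff = rank
    return set(order[:cutoff])

primary_p     = [1e-6, 3e-4, 0.02, 0.40]
replication_p = [1e-4, 0.01, 0.03, 0.50]

carried = [i for i, p in enumerate(primary_p) if p < 1e-3]   # lenient stage 1
rejected = benjamini_hochberg([replication_p[i] for i in carried], q=0.05)
print("validated SNP indices:", [carried[j] for j in sorted(rejected)])
```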

  18. The in silico drug discovery toolbox: applications in lead discovery and optimization.

    PubMed

    Bruno, Agostino; Costantino, Gabriele; Sartori, Luca; Radi, Marco

    2017-11-06

    Discovery and development of a new drug is a long-lasting and expensive journey that takes around 15 years from the starting idea to approval and marketing of a new medication. Although R&D expenditures have been constantly increasing in the last few years, the number of new drugs introduced into the market has been steadily declining. This is mainly due to preclinical and clinical safety issues, which still account for about 40% of drug discontinuations. From this point of view, it is clear that if we want to increase the drug-discovery success rate and reduce the costs associated with developing a new drug, a comprehensive evaluation/prediction of potential safety issues should be conducted as early as possible in the drug discovery phase. In the present review, we analyse the early steps of the drug-discovery pipeline, describing the sequence of steps from disease selection to lead optimization and focusing on the most common in silico tools used to assess attrition risks and build a mitigation plan. Copyright© Bentham Science Publishers; For any queries, please email at epub@benthamscience.org.

  19. Large Scale Mass Spectrometry-based Identifications of Enzyme-mediated Protein Methylation Are Subject to High False Discovery Rates*

    PubMed Central

    Hart-Smith, Gene; Yagoub, Daniel; Tay, Aidan P.; Pickford, Russell; Wilkins, Marc R.

    2016-01-01

    All large scale LC-MS/MS post-translational methylation site discovery experiments require methylpeptide spectrum matches (methyl-PSMs) to be identified at acceptably low false discovery rates (FDRs). To meet estimated methyl-PSM FDRs, methyl-PSM filtering criteria are often determined using the target-decoy approach. The efficacy of this methyl-PSM filtering approach has, however, yet to be thoroughly evaluated. Here, we conduct a systematic analysis of methyl-PSM FDRs across a range of sample preparation workflows (each differing in their exposure to the alcohols methanol and isopropyl alcohol) and mass spectrometric instrument platforms (each employing a different mode of MS/MS dissociation). Through 13CD3-methionine labeling (heavy-methyl SILAC) of Saccharomyces cerevisiae cells and in-depth manual data inspection, accurate lists of true positive methyl-PSMs were determined, allowing methyl-PSM FDRs to be compared with target-decoy approach-derived methyl-PSM FDR estimates. These results show that global FDR estimates produce extremely unreliable methyl-PSM filtering criteria; we demonstrate that this is an unavoidable consequence of the high number of amino acid combinations capable of producing peptide sequences that are isobaric to methylated peptides of a different sequence. Separate methyl-PSM FDR estimates were also found to be unreliable due to prevalent sources of false positive methyl-PSMs that produce high peptide identity score distributions. Incorrect methylation site localizations, peptides containing cysteinyl-S-β-propionamide, and methylated glutamic or aspartic acid residues can partially, but not wholly, account for these false positive methyl-PSMs. Together, these results indicate that the target-decoy approach is an unreliable means of estimating methyl-PSM FDRs and methyl-PSM filtering criteria. We suggest that orthogonal methylpeptide validation (e.g. heavy-methyl SILAC or its offshoots) should be considered a prerequisite for obtaining high confidence methyl-PSMs in large scale LC-MS/MS methylation site discovery experiments and make recommendations on how to reduce methyl-PSM FDRs in samples not amenable to heavy isotope labeling. Data are available via ProteomeXchange with the data identifier PXD002857. PMID:26699799
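
    The isobaric problem described above is easy to see numerically: several amino acid substitutions shift the monoisotopic mass by exactly the methyl increment, so precursor mass alone cannot distinguish a methylated peptide from an unmethylated peptide of a different sequence.

```python
# Methylation adds ~14.01565 Da (CH2) -- the same monoisotopic difference
# that separates several amino acid pairs.
METHYL = 14.01565  # Da

residue_mass = {"G": 57.02146, "A": 71.03711, "S": 87.03203, "T": 101.04768,
                "D": 115.02694, "E": 129.04259, "N": 114.04293, "Q": 128.05858}

for light, heavy in [("G", "A"), ("S", "T"), ("D", "E"), ("N", "Q")]:
    delta = residue_mass[heavy] - residue_mass[light]
    print(f"{light}->{heavy}: dm = {delta:.5f} Da "
          f"(off by {abs(delta - METHYL) * 1000:.2f} mDa from methylation)")
```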

  20. Fragment-based drug discovery and molecular docking in drug design.

    PubMed

    Wang, Tao; Wu, Mian-Bin; Chen, Zheng-Jie; Chen, Hua; Lin, Jian-Ping; Yang, Li-Rong

    2015-01-01

    Fragment-based drug discovery (FBDD) has caused a revolution in the process of drug discovery and design, with many FBDD-derived leads having entered clinical trials or been approved in the past few years. Compared with traditional high-throughput screening, it displays obvious advantages such as efficiently covering chemical space and achieving higher hit rates. In this review, we focus on the most recent developments of FBDD for improving drug discovery, illustrating the process and the importance of FBDD. In particular, the computational strategies applied in the process of FBDD and molecular-docking programs are highlighted. In most cases, docking is used for predicting ligand-receptor interaction modes and for hit identification by structure-based virtual screening. Successful cases of typical significance and the hits identified most recently are discussed.

  1. Insecticide discovery: an evaluation and analysis.

    PubMed

    Sparks, Thomas C

    2013-09-01

    There is an on-going need for the discovery and development of new insecticides due to the loss of existing products through the development of resistance, the desire for products with more favorable environmental and toxicological profiles, shifting pest spectrums, and changing agricultural practices. Since 1960, the number of research-based companies in the US and Europe involved in the discovery of new insecticidal chemistries has been declining. In part this is a reflection of the increasing costs of the discovery and development of new pesticides. Likewise, the number of compounds that need to be screened for every product developed has, until recently, been climbing. In the past two decades the agrochemical industry has been able to develop a range of new products that have more favorable mammalian vs. insect selectivity. This review provides an analysis of the time required for the discovery, or more correctly the building process, for a wide range of insecticides developed during the last 60 years. An examination of the data around the time requirements for the discovery of products based on external patents, prior internal products, or entirely new chemistry provides some unexpected observations. In light of the increasing costs of discovery and development, coupled with fewer companies willing or able to make the investment, insecticide resistance management takes on greater importance as a means to preserve existing and new insecticides. Copyright © 2013 Elsevier Inc. All rights reserved.

  2. Argo_CUDA: Exhaustive GPU based approach for motif discovery in large DNA datasets.

    PubMed

    Vishnevsky, Oleg V; Bocharnikov, Andrey V; Kolchanov, Nikolay A

    2018-02-01

    The development of chromatin immunoprecipitation sequencing (ChIP-seq) technology has revolutionized the genetic analysis of the basic mechanisms underlying transcription regulation and has led to the accumulation of information about a huge number of DNA sequences. Many web services are currently available for de novo motif discovery in datasets containing information about DNA/protein binding. The enormous diversity of motifs makes finding them challenging, and to cope with this difficulty researchers use various stochastic approaches. Unfortunately, the efficiency of motif discovery programs declines dramatically as the query set size increases, so only a fraction of the top "peak" ChIP-Seq segments can be analyzed, or the area of analysis must be narrowed. Thus, motif discovery in massive datasets remains a challenging issue. The Argo_CUDA (Compute Unified Device Architecture) web service is designed to process massive DNA data. It is a program for the detection of degenerate oligonucleotide motifs of fixed length written in the 15-letter IUPAC code. Argo_CUDA is a fully exhaustive approach based on high-performance GPU technologies. Compared with existing motif discovery web services, Argo_CUDA shows good prediction quality on simulated sets. The analysis of ChIP-Seq sequences revealed motifs which correspond to known transcription factor binding sites.
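
    A minimal sketch of matching a degenerate IUPAC-coded motif against a DNA sequence, the core operation in this kind of exhaustive motif search; the motif and sequence are examples, and Argo_CUDA's actual GPU implementation is far more elaborate.

```python
# Each IUPAC letter denotes a set of bases; a window matches when every
# position falls in the corresponding set.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T",
         "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
         "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT"}

def find_motif(seq, motif):
    hits = []
    for i in range(len(seq) - len(motif) + 1):
        window = seq[i:i + len(motif)]
        if all(base in IUPAC[sym] for base, sym in zip(window, motif)):
            hits.append(i)
    return hits

print(find_motif("ACGTGACGTTACG", "RCGT"))  # R = A or G; prints [0, 5]
```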

  3. Large scale analysis of the mutational landscape in HT-SELEX improves aptamer discovery

    PubMed Central

    Hoinka, Jan; Berezhnoy, Alexey; Dao, Phuong; Sauna, Zuben E.; Gilboa, Eli; Przytycka, Teresa M.

    2015-01-01

    High-Throughput (HT) SELEX combines SELEX (Systematic Evolution of Ligands by EXponential Enrichment), a method for aptamer discovery, with massively parallel sequencing technologies. This emerging technology provides data for a global analysis of the selection process and for simultaneous discovery of a large number of candidates but currently lacks dedicated computational approaches for their analysis. To close this gap, we developed novel in-silico methods to analyze HT-SELEX data and utilized them to study the emergence of polymerase errors during HT-SELEX. Rather than considering these errors as a nuisance, we demonstrated their utility for guiding aptamer discovery. Our approach builds on two main advancements in aptamer analysis: AptaMut—a novel technique allowing for the identification of polymerase errors conferring an improved binding affinity relative to the ‘parent’ sequence and AptaCluster—an aptamer clustering algorithm which is to our best knowledge, the only currently available tool capable of efficiently clustering entire aptamer pools. We applied these methods to an HT-SELEX experiment developing aptamers against Interleukin 10 receptor alpha chain (IL-10RA) and experimentally confirmed our predictions thus validating our computational methods. PMID:25870409

  4. Management, Analysis, and Visualization of Experimental and Observational Data – The Convergence of Data and Computing

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Bethel, E. Wes; Greenwald, Martin; Kleese van Dam, Kerstin

    Scientific user facilities—particle accelerators, telescopes, colliders, supercomputers, light sources, sequencing facilities, and more—operated by the U.S. Department of Energy (DOE) Office of Science (SC) generate ever increasing volumes of data at unprecedented rates from experiments, observations, and simulations. At the same time there is a growing community of experimentalists that require real-time data analysis feedback, to enable them to steer their complex experimental instruments to optimized scientific outcomes and new discoveries. Recent efforts in DOE-SC have focused on articulating the data-centric challenges and opportunities facing these science communities. Key challenges include difficulties coping with data size, rate, and complexity in the context of both real-time and post-experiment data analysis and interpretation. Solutions will require algorithmic and mathematical advances, as well as hardware and software infrastructures that adequately support data-intensive scientific workloads. This paper presents the summary findings of a workshop held by DOE-SC in September 2015, convened to identify the major challenges and the research that is needed to meet those challenges.

  5. Management, Analysis, and Visualization of Experimental and Observational Data -- The Convergence of Data and Computing

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Bethel, E. Wes; Greenwald, Martin; Kleese van Dam, Kersten

    Scientific user facilities---particle accelerators, telescopes, colliders, supercomputers, light sources, sequencing facilities, and more---operated by the U.S. Department of Energy (DOE) Office of Science (SC) generate ever increasing volumes of data at unprecedented rates from experiments, observations, and simulations. At the same time there is a growing community of experimentalists that require real-time data analysis feedback, to enable them to steer their complex experimental instruments to optimized scientific outcomes and new discoveries. Recent efforts in DOE-SC have focused on articulating the data-centric challenges and opportunities facing these science communities. Key challenges include difficulties coping with data size, rate, and complexity in the context of both real-time and post-experiment data analysis and interpretation. Solutions will require algorithmic and mathematical advances, as well as hardware and software infrastructures that adequately support data-intensive scientific workloads. This paper presents the summary findings of a workshop held by DOE-SC in September 2015, convened to identify the major challenges and the research that is needed to meet those challenges.

  6. 78 FR 2433 - Notice of Inventory Completion: Fort Collins Museum of Discovery, Fort Collins, CO

    Federal Register 2010, 2011, 2012, 2013, 2014

    2013-01-11

    ... Discovery). Although specific provenience of the human remains is unknown, osteological analysis conducted by physical anthropologists and by independent forensic scientists determined that the remains are of...

  7. Factors affecting survival of patients in the acute phase of upper cervical spine injuries.

    PubMed

    Morita, Tomonori; Takebayashi, Tsuneo; Irifune, Hideto; Ohnishi, Hirofumi; Hirayama, Suguru; Yamashita, Toshihiko

    2017-04-01

    In recent years, the mortality rates of upper cervical spine injuries, such as odontoid fractures, have been reported as not especially high in some studies but as significantly high in others. Furthermore, the relationship between survival rates and various clinical features during the acute phase of injury has not been well documented because reports are few. This study aimed to evaluate survival rates and acute-phase clinical features of upper cervical spine injuries. We conducted a retrospective review of all patients who were transported to the advanced emergency medical center and underwent computed tomography of the cervical spine at our hospital between January 2006 and December 2015. We excluded patients who were discovered in a state of cardiopulmonary arrest (CPA) and could not be resuscitated after transportation. Of the 215 consecutive patients with cervical spine injuries, we examined 40 patients (18.6%) diagnosed with upper cervical spine injury (males, 28; females, 12; median age, 58.5 years). Age, sex, mechanism of injury, degree of paralysis, level of cervical injury, injury severity score (ISS), and incidence of CPA at discovery were evaluated and compared between patients classified into the survival and mortality groups. The survival rate was 77.5% (31/40 patients). Complete paralysis was observed in 32.5% of patients. The median ISS was 34.0 points, and 14 patients (35.0%) presented with CPA at discovery. Age, the proportion of patients with complete paralysis, a high ISS, and the incidence of CPA at discovery were significantly higher in the mortality group (p = 0.038, p = 0.038, p < 0.001, and p < 0.001, respectively). Elderly people were more likely to experience upper cervical spine injuries, and their mortality rate was significantly higher than that of injured younger people. In addition, complete paralysis, a high ISS, and CPA at discovery were significantly more frequent in the mortality group.

  8. Discovery of Hubble's Law as a Series of Type III Errors

    ERIC Educational Resources Information Center

    Belenkiy, Ari

    2015-01-01

    Recently much attention has been paid to the history of the discovery of Hubble's law--the linear relation between the rate of recession of the remote galaxies and distance to them from Earth. Though historians of cosmology now mention several names associated with this law instead of just one, the motivation of each actor of that remarkable…

  9. Sky-plane discovery rates for Near Earth Object discoveries from Pan-STARRS1 - implications for future search strategies

    NASA Astrophysics Data System (ADS)

    Wainscoat, Richard J.; Chambers, Kenneth C.; Chastel, Serge; Denneau, Larry; Lilly Schunova, Eva; Micheli, Marco; Weryk, Robert J.

    2016-10-01

    The Pan-STARRS1 telescope has been spending most of its time for the last 2.5 years searching the sky for Near Earth Objects (NEOs). The surveyed area covers the entire northern sky and extends south to -49 degrees declination. Because Pan-STARRS1 has a large field-of-view, it has been able to survey large areas of the sky, and we are now able to examine NEO discovery rates relative to ecliptic latitude. Most contemporary searches, including Pan-STARRS1, have been spending large amounts of their observing time during the dark moon period searching for NEOs close to the ecliptic. The rationale for this is that many objects have low inclination, and all objects in orbit around the Sun must cross the ecliptic. New search capabilities are now available, including Pan-STARRS2 and the upgraded camera in Catalina Sky Survey's G96 telescope. These allow NEO searches to be conducted over wider areas of the sky, and to extend further from the ecliptic. We have examined the discovery rates relative to location on the sky for new NEOs from Pan-STARRS1, and find that the new NEO discoveries are less concentrated on the ecliptic than might be expected. This finding also holds for larger objects. The southern sky has proven to be very productive in new NEO discoveries - this is a direct consequence of the major NEO surveys being located in the northern hemisphere. Our preliminary findings suggest that NEO searches should extend to at least 30 degrees from the ecliptic during the more sensitive dark moon period. At least 6,000 deg^2 should therefore be searched each lunation. This is possible with the newly augmented NEO search assets, and repeat coverage will be needed in order to recover most of the NEO candidates found. However, weather challenges will likely make full and repeated coverage of such a large area of sky difficult to achieve. Some simple coordination between observing sites will likely lead to improvement in efficiency.

  10. Characterization of individual mouse cerebrospinal fluid proteomes

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Smith, Jeffrey S.; Angel, Thomas E.; Chavkin, Charles

    2014-03-20

    Analysis of cerebrospinal fluid (CSF) offers key insight into the status of the central nervous system. Characterization of murine CSF proteomes can provide a valuable resource for studying central nervous system injury and disease in animal models. However, the small volume of CSF in mice has thus far limited individual mouse proteome characterization. Through non-terminal CSF extractions in C57Bl/6 mice and high-resolution liquid chromatography-mass spectrometry analysis of individual murine samples, we report the most comprehensive proteome characterization of individual murine CSF to date. Utilizing stringent protein inclusion criteria that required the identification of at least two unique peptides (1% false discovery rate at the peptide level) we identified a total of 566 unique proteins, including 128 proteins from three individual CSF samples that have been previously identified in brain tissue. Our methods and analysis provide a mechanism for individual murine CSF proteome analysis.
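
    A sketch of the stated inclusion rule (at least two unique peptides per protein), assuming peptide-level FDR filtering has already been applied; the peptide-to-protein assignments are toy values.

```python
# Keep only proteins supported by >= 2 unique peptides.
from collections import defaultdict

peptides = [("ALBU_MOUSE", "LVNEVTEFAK"), ("ALBU_MOUSE", "QTALVELLK"),
            ("APOE_MOUSE", "LGPLVEQGR")]

by_protein = defaultdict(set)
for protein, peptide in peptides:
    by_protein[protein].add(peptide)

confident = {p for p, peps in by_protein.items() if len(peps) >= 2}
print(confident)  # only proteins with >= 2 unique peptides survive
```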

  11. Composing compound libraries for hit discovery--rationality-driven preselection or random choice by structural diversity?

    PubMed

    Weidel, Elisabeth; Negri, Matthias; Empting, Martin; Hinsberger, Stefan; Hartmann, Rolf W

    2014-01-01

    In order to identify new scaffolds for drug discovery, surface plasmon resonance is frequently used to screen structurally diverse libraries. Usually, hit rates are low and identification processes are time-consuming. Hence, approaches which improve hit rates and thus reduce the required library size are needed. In this work, we studied three frequently used strategies for their applicability to identifying inhibitors of PqsD. In two of them, target-specific aspects such as inhibition of a homologous protein or predicted binding determined by virtual screening were used for compound preselection. Finally, a fragment library covering a large chemical space was screened and served as a comparison. Indeed, higher hit rates were observed for the methods employing preselected libraries, indicating that target-oriented compound selection provides a time-effective alternative.

  12. Oncology drug discovery: planning a turnaround.

    PubMed

    Toniatti, Carlo; Jones, Philip; Graham, Hilary; Pagliara, Bruno; Draetta, Giulio

    2014-04-01

    We have made remarkable progress in our understanding of the pathophysiology of cancer. This improved understanding has resulted in increasingly effective targeted therapies that are better tolerated than conventional cytotoxic agents and even curative in some patients. Unfortunately, the success rate of drug approval has been limited, and therapeutic improvements have been marginal, with too few exceptions. In this article, we review the current approach to oncology drug discovery and development, identify areas in need of improvement, and propose strategies to improve patient outcomes. We also suggest future directions that may improve the quality of preclinical and early clinical drug evaluation, which could lead to higher approval rates of anticancer drugs.

  13. MicroRNA array normalization: an evaluation using a randomized dataset as the benchmark.

    PubMed

    Qin, Li-Xuan; Zhou, Qin

    2014-01-01

    MicroRNA arrays possess a number of unique data features that challenge the assumption key to many normalization methods. We assessed the performance of existing normalization methods using two microRNA array datasets derived from the same set of tumor samples: one dataset was generated using a blocked randomization design when assigning arrays to samples and hence was free of confounding array effects; the second dataset was generated without blocking or randomization and exhibited array effects. The randomized dataset was assessed for differential expression between two tumor groups and treated as the benchmark. The non-randomized dataset was assessed for differential expression after normalization and compared against the benchmark. Normalization improved the true positive rate significantly in the non-randomized data but still possessed a false discovery rate as high as 50%. Adding a batch adjustment step before normalization further reduced the number of false positive markers while maintaining a similar number of true positive markers, which resulted in a false discovery rate of 32% to 48%, depending on the specific normalization method. We concluded the paper with some insights on possible causes of false discoveries to shed light on how to improve normalization for microRNA arrays.
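
    For reference, a sketch of one widely used normalization method (quantile normalization) of the general kind evaluated in such comparisons; this is not a claim about the specific methods the study assessed, and the data are invented.

```python
# Quantile normalization: force every array (column) to share the same
# value distribution while preserving within-array ranks.
import numpy as np

def quantile_normalize(x):
    ranks = np.argsort(np.argsort(x, axis=0), axis=0)   # per-column ranks
    mean_by_rank = np.sort(x, axis=0).mean(axis=1)      # reference distribution
    return mean_by_rank[ranks]

arrays = np.array([[5.0, 4.0, 3.0],
                   [2.0, 1.0, 4.0],
                   [3.0, 4.5, 6.0],
                   [4.0, 2.0, 5.0]])   # rows: microRNAs, columns: arrays
print(quantile_normalize(arrays))
```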

  14. MicroRNA Array Normalization: An Evaluation Using a Randomized Dataset as the Benchmark

    PubMed Central

    Qin, Li-Xuan; Zhou, Qin

    2014-01-01

    MicroRNA arrays possess a number of unique data features that challenge the assumption key to many normalization methods. We assessed the performance of existing normalization methods using two microRNA array datasets derived from the same set of tumor samples: one dataset was generated using a blocked randomization design when assigning arrays to samples and hence was free of confounding array effects; the second dataset was generated without blocking or randomization and exhibited array effects. The randomized dataset was assessed for differential expression between two tumor groups and treated as the benchmark. The non-randomized dataset was assessed for differential expression after normalization and compared against the benchmark. Normalization improved the true positive rate significantly in the non-randomized data but still possessed a false discovery rate as high as 50%. Adding a batch adjustment step before normalization further reduced the number of false positive markers while maintaining a similar number of true positive markers, which resulted in a false discovery rate of 32% to 48%, depending on the specific normalization method. We concluded the paper with some insights on possible causes of false discoveries to shed light on how to improve normalization for microRNA arrays. PMID:24905456

  15. iPTF14yb: The First Discovery of a Gamma-Ray Burst Afterglow Independent of a High-Energy Trigger

    NASA Technical Reports Server (NTRS)

    Cenko, S. Bradley; Urban, Alex L.; Perley, Daniel A.; Horesh, Assaf; Corsi, Alessandra; Fox, Derek B.; Cao, Yi; Kasliwal, Mansi M.; Lien, Amy; Arcavi, Iair; et al.

    2015-01-01

    We report here the discovery by the Intermediate Palomar Transient Factory (iPTF) of iPTF14yb, a luminous (M_r ≈ -27.8 mag), cosmological (redshift 1.9733), rapidly fading optical transient. We demonstrate, based on probabilistic arguments and a comparison with the broader population, that iPTF14yb is the optical afterglow of the long-duration gamma-ray burst GRB 140226A. This marks the first unambiguous discovery of a GRB afterglow prior to (and thus entirely independent of) an associated high-energy trigger. We estimate the rate of iPTF14yb-like sources (i.e., cosmologically distant relativistic explosions) based on iPTF observations, inferring an all-sky value of R_rel = 610/yr (68% confidence interval of 110-2000/yr). Our derived rate is consistent (within the large uncertainty) with the all-sky rate of on-axis GRBs derived by the Swift satellite. Finally, we briefly discuss the implications of the nondetection to date of bona fide "orphan" afterglows (i.e., those lacking detectable high-energy emission) on GRB beaming and the degree of baryon loading in these relativistic jets.

  16. iPTF14yb: The First Discovery of a Gamma-Ray Burst Afterglow Independent of a High-energy Trigger

    NASA Astrophysics Data System (ADS)

    Cenko, S. Bradley; Urban, Alex L.; Perley, Daniel A.; Horesh, Assaf; Corsi, Alessandra; Fox, Derek B.; Cao, Yi; Kasliwal, Mansi M.; Lien, Amy; Arcavi, Iair; Bloom, Joshua S.; Butler, Nat R.; Cucchiara, Antonino; de Diego, José A.; Filippenko, Alexei V.; Gal-Yam, Avishay; Gehrels, Neil; Georgiev, Leonid; Jesús González, J.; Graham, John F.; Greiner, Jochen; Kann, D. Alexander; Klein, Christopher R.; Knust, Fabian; Kulkarni, S. R.; Kutyrev, Alexander; Laher, Russ; Lee, William H.; Nugent, Peter E.; Prochaska, J. Xavier; Ramirez-Ruiz, Enrico; Richer, Michael G.; Rubin, Adam; Urata, Yuji; Varela, Karla; Watson, Alan M.; Wozniak, Przemek R.

    2015-04-01

    We report here the discovery by the Intermediate Palomar Transient Factory (iPTF) of iPTF14yb, a luminous (Mr ≈ -27.8 mag), cosmological (redshift 1.9733), rapidly fading optical transient. We demonstrate, based on probabilistic arguments and a comparison with the broader population, that iPTF14yb is the optical afterglow of the long-duration gamma-ray burst GRB 140226A. This marks the first unambiguous discovery of a GRB afterglow prior to (and thus entirely independent of) an associated high-energy trigger. We estimate the rate of iPTF14yb-like sources (i.e., cosmologically distant relativistic explosions) based on iPTF observations, inferring an all-sky value of R_rel = 610 yr^-1 (68% confidence interval of 110-2000 yr^-1). Our derived rate is consistent (within the large uncertainty) with the all-sky rate of on-axis GRBs derived by the Swift satellite. Finally, we briefly discuss the implications of the nondetection to date of bona fide “orphan” afterglows (i.e., those lacking detectable high-energy emission) on GRB beaming and the degree of baryon loading in these relativistic jets.
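
    The quoted interval has the shape of a classical Poisson confidence interval for a single detected event, scaled by the survey's effective exposure. A rough sketch (the exposure value below is invented to roughly reproduce the quoted numbers; it is not taken from the paper):

      from scipy.stats import gamma

      n_detected = 1
      exposure_allsky_yr = 0.0016   # hypothetical iPTF all-sky-equivalent exposure

      # Central 68% frequentist interval for a Poisson mean with n counts,
      # converted to a rate by dividing by the exposure.
      lo = gamma.ppf(0.16, n_detected) / exposure_allsky_yr
      hi = gamma.ppf(0.84, n_detected + 1) / exposure_allsky_yr
      print(f"rate ~ {n_detected / exposure_allsky_yr:.0f}/yr "
            f"(68% CI {lo:.0f}-{hi:.0f}/yr)")   # ~625/yr, CI ~109-2057/yr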

  17. Databases and Web Tools for Cancer Genomics Study

    PubMed Central

    Yang, Yadong; Dong, Xunong; Xie, Bingbing; Ding, Nan; Chen, Juan; Li, Yongjun; Zhang, Qian; Qu, Hongzhu; Fang, Xiangdong

    2015-01-01

    Publicly-accessible resources have promoted the advance of scientific discovery. The era of genomics and big data has brought the need for collaboration and data sharing in order to make effective use of this new knowledge. Here, we describe the web resources for cancer genomics research and rate them on the basis of the diversity of cancer types, sample size, omics data comprehensiveness, and user experience. The resources reviewed include data repositories and analysis tools, and we hope this introduction will promote awareness and facilitate the use of these resources in the cancer research community. PMID:25707591

  18. A Methodology for the Analysis of Programmer Productivity and Effort Estimation within the Framework of Software Conversion.

    DTIC Science & Technology

    1984-05-01

    The coefficient of KCA (programmer's knowledge of the program) initially seemed to be an error... productivity, as expected. An interesting manifestation, supporting a discovery by Oliver, was exhibited by the rating of a programmer's knowledge of...

  19. Cryo-EM in drug discovery: achievements, limitations and prospects.

    PubMed

    Renaud, Jean-Paul; Chari, Ashwin; Ciferri, Claudio; Liu, Wen-Ti; Rémigy, Hervé-William; Stark, Holger; Wiesmann, Christian

    2018-06-08

    Cryo-electron microscopy (cryo-EM) of non-crystalline single particles is a biophysical technique that can be used to determine the structure of biological macromolecules and assemblies. Historically, its potential for application in drug discovery has been heavily limited by two issues: the minimum size of the structures it can be used to study and the resolution of the images. However, recent technological advances - including the development of direct electron detectors and more effective computational image analysis techniques - are revolutionizing the utility of cryo-EM, leading to a burst of high-resolution structures of large macromolecular assemblies. These advances have raised hopes that single-particle cryo-EM might soon become an important tool for drug discovery, particularly if they could enable structural determination for 'intractable' targets that are still not accessible to X-ray crystallographic analysis. This article describes the recent advances in the field and critically assesses their relevance for drug discovery as well as discussing at what stages of the drug discovery pipeline cryo-EM can be useful today and what to expect in the near future.

  20. Modern approaches to accelerate discovery of new antischistosomal drugs.

    PubMed

    Neves, Bruno Junior; Muratov, Eugene; Machado, Renato Beilner; Andrade, Carolina Horta; Cravo, Pedro Vitor Lemos

    2016-06-01

    The almost exclusive use of only praziquantel for the treatment of schistosomiasis has raised concerns about the possible emergence of drug-resistant schistosomes. Consequently, there is an urgent need for new antischistosomal drugs. The identification of leads and the generation of high quality data are crucial steps in the early stages of schistosome drug discovery projects. Herein, the authors focus on the current developments in antischistosomal lead discovery, specifically referring to the use of automated in vitro target-based and whole-organism screens and virtual screening of chemical databases. They highlight the strengths and pitfalls of each of the above-mentioned approaches, and suggest possible roadmaps towards the integration of several strategies, which may contribute to optimizing research outputs and lead to more successful and cost-effective drug discovery endeavors. Increasing partnerships and access to funding for drug discovery have strengthened the battle against schistosomiasis in recent years. However, the authors believe this battle also includes innovative strategies to overcome scientific challenges. In this context, significant advances of in vitro screening as well as computer-aided drug discovery have contributed to increase the success rate and reduce the costs of drug discovery campaigns. Although some of these approaches were already used in current antischistosomal lead discovery pipelines, the integration of these strategies in a solid workflow should allow the production of new treatments for schistosomiasis in the near future.

  1. Beacons in Time: Maarten Schmidt and the Discovery of Quasars.

    ERIC Educational Resources Information Center

    Preston, Richard

    1988-01-01

    Tells the story of Maarten Schmidt and the discovery of quasars. Discusses the decomposition of light, crucial observations and solving astronomical mysteries. Describes spectroscopic analysis used in astronomy and its application to quasars. (CW)

  2. Modeling Emergence in Neuroprotective Regulatory Networks

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Sanfilippo, Antonio P.; Haack, Jereme N.; McDermott, Jason E.

    2013-01-05

    The use of predictive modeling in the analysis of gene expression data can greatly accelerate the pace of scientific discovery in biomedical research by enabling in silico experimentation to test disease triggers and potential drug therapies. Techniques that focus on modeling emergence, such as agent-based modeling and multi-agent simulations, are of particular interest as they support the discovery of pathways that may have never been observed in the past. Thus far, these techniques have been primarily applied at the multi-cellular level, or have focused on signaling and metabolic networks. We present an approach where emergence modeling is extended to regulatory networks and demonstrate its application to the discovery of neuroprotective pathways. An initial evaluation of the approach indicates that emergence modeling provides novel insights for the analysis of regulatory networks that can advance the discovery of acute treatments for stroke and other diseases.

  3. Discovery of innovative therapeutics: today's realities and tomorrow's vision. 2. Pharma's challenges and their commitment to innovation.

    PubMed

    Abou-Gharbia, Magid; Childers, Wayne E

    2014-07-10

    The pharmaceutical industry is facing enormous challenges, including reduced efficiency, stagnant success rate, patent expirations for key drugs, fierce price competition from generics, high regulatory hurdles, and the industry's perceived tarnished image. Pharma has responded by embarking on a range of initiatives. Other sectors, including NIH, have also responded. Academic drug discovery groups have appeared to support the transition of innovative academic discoveries and ideas into attractive drug discovery opportunities. Part 1 of this two-part series discussed the criticisms that have been leveled at the pharmaceutical industry over the past 3 decades and summarized the supporting data for and against these criticisms. This second installment will focus on the current challenges facing the pharmaceutical industry and Pharma's responses, focusing on the industry's changing perspective and new business models for coping with the loss of talent and declining clinical pipelines as well as presenting some examples of recent drug discovery successes.

  4. Associating quantitative behavioral traits with gene expression in the brain: searching for diamonds in the hay.

    PubMed

    Reiner-Benaim, Anat; Yekutieli, Daniel; Letwin, Noah E; Elmer, Gregory I; Lee, Norman H; Kafkafi, Neri; Benjamini, Yoav

    2007-09-01

    Gene expression and phenotypic functionality can best be associated when they are measured quantitatively within the same experiment. The analysis of such a complex experiment is presented, searching for associations between measures of exploratory behavior in mice and gene expression in brain regions. The analysis of such experiments raises several methodological problems. First and foremost, the size of the pool of potential discoveries being screened is enormous, yet only a few biologically relevant findings are expected, making the problem of multiple testing especially severe. We present solutions based on screening by testing related hypotheses, then testing the hypotheses of interest. In one variant the subset is selected directly; in the other, a tree of hypotheses is tested hierarchically; both variants control the False Discovery Rate (FDR). Other problems in such experiments are that the level of data aggregation may differ between the quantitative traits (one per animal) and the gene expression measurements (pooled across animals); that the association may not be linear; and that only a few replications exist at the resolution of interest. We offer solutions to these problems as well. The hierarchical FDR testing strategies presented here can serve, beyond the structure of our motivating example, any complex microarray study. Supplementary data are available at Bioinformatics online.
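
    The screen-then-test idea can be sketched in a few lines. The variant below (Simes-combined family p-values, BH across families, then BH within selected families at a level shrunk by the selection fraction) is one common formulation, not necessarily the authors' exact procedure:

      import numpy as np

      def bh_reject(pvals, q):
          # Benjamini-Hochberg step-up: reject the k smallest p-values,
          # where k is the largest i with p_(i) <= q*i/m.
          p = np.asarray(pvals)
          order = np.argsort(p)
          m = len(p)
          passed = p[order] <= q * np.arange(1, m + 1) / m
          k = np.max(np.nonzero(passed)[0]) + 1 if passed.any() else 0
          keep = np.zeros(m, bool)
          keep[order[:k]] = True
          return keep

      def hierarchical_fdr(families, q=0.05):
          # families: list of arrays of within-family p-values
          simes = [min(np.sort(f) * len(f) / np.arange(1, len(f) + 1))
                   for f in families]                     # screening stage
          selected = bh_reject(simes, q)
          q_inner = q * selected.sum() / len(families)    # selection-adjusted level
          return [bh_reject(f, q_inner) if sel else np.zeros(len(f), bool)
                  for f, sel in zip(families, selected)]

      rng = np.random.default_rng(0)
      fams = [rng.uniform(size=10) for _ in range(4)] + [np.array([1e-5, .2, .9])]
      print([r.sum() for r in hierarchical_fdr(fams)])    # rejections per family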

  5. Accounting for control mislabeling in case-control biomarker studies.

    PubMed

    Rantalainen, Mattias; Holmes, Chris C

    2011-12-02

    In biomarker discovery studies, uncertainty associated with case and control labels is often overlooked. By omitting to take into account label uncertainty, model parameters and the predictive risk can become biased, sometimes severely. The most common situation is when the control set contains an unknown number of undiagnosed, or future, cases. This has a marked impact in situations where the model needs to be well-calibrated, e.g., when the prediction performance of a biomarker panel is evaluated. Failing to account for class label uncertainty may lead to underestimation of classification performance and bias in parameter estimates. This can further impact on meta-analysis for combining evidence from multiple studies. Using a simulation study, we outline how conventional statistical models can be modified to address class label uncertainty leading to well-calibrated prediction performance estimates and reduced bias in meta-analysis. We focus on the problem of mislabeled control subjects in case-control studies, i.e., when some of the control subjects are undiagnosed cases, although the procedures we report are generic. The uncertainty in control status is a particular situation common in biomarker discovery studies in the context of genomic and molecular epidemiology, where control subjects are commonly sampled from the general population with an established expected disease incidence rate.
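
    A toy simulation (my own illustration, not the paper's model) shows both the bias and the simplest fix: if the expected incidence of hidden cases among labeled controls is known, the control mean can be deconvolved from the mixture:

      import numpy as np

      rng = np.random.default_rng(1)
      incidence = 0.10                       # assumed rate of hidden cases
      cases = rng.normal(1.0, 1.0, 1000)     # biomarker level in true cases
      true_ctrl = rng.normal(0.0, 1.0, 900)
      hidden = rng.normal(1.0, 1.0, 100)     # undiagnosed cases mislabeled as controls
      labeled_ctrl = np.concatenate([true_ctrl, hidden])

      naive_effect = cases.mean() - labeled_ctrl.mean()
      # Mixture correction: E[labeled ctrl] = (1-pi)*mu_ctrl + pi*mu_case
      mu_ctrl = (labeled_ctrl.mean() - incidence * cases.mean()) / (1 - incidence)
      corrected_effect = cases.mean() - mu_ctrl
      print(f"naive {naive_effect:.2f} vs corrected {corrected_effect:.2f}")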

  6. Discovery of Empirical Components by Information Theory

    DTIC Science & Technology

    2016-08-10

    Discovery of Empirical Components by Information Theory; Amit Singer, Trustees of Princeton University; dates covered: 15 Feb 2013 to 14 Feb 2016. ...they draw not only from traditional linear algebra based numerical analysis or approximation theory, but also from information theory and graph theory...

  7. Perspectives on NMR in drug discovery: a technique comes of age

    PubMed Central

    Pellecchia, Maurizio; Bertini, Ivano; Cowburn, David; Dalvit, Claudio; Giralt, Ernest; Jahnke, Wolfgang; James, Thomas L.; Homans, Steve W.; Kessler, Horst; Luchinat, Claudio; Meyer, Bernd; Oschkinat, Hartmut; Peng, Jeff; Schwalbe, Harald; Siegal, Gregg

    2009-01-01

    In the past decade, the potential of harnessing the ability of nuclear magnetic resonance (NMR) spectroscopy to monitor intermolecular interactions as a tool for drug discovery has been increasingly appreciated in academia and industry. In this Perspective, we highlight some of the major applications of NMR in drug discovery, focusing on hit and lead generation, and provide a critical analysis of its current and potential utility. PMID:19172689

  8. iProphet: Multi-level Integrative Analysis of Shotgun Proteomic Data Improves Peptide and Protein Identification Rates and Error Estimates*

    PubMed Central

    Shteynberg, David; Deutsch, Eric W.; Lam, Henry; Eng, Jimmy K.; Sun, Zhi; Tasman, Natalie; Mendoza, Luis; Moritz, Robert L.; Aebersold, Ruedi; Nesvizhskii, Alexey I.

    2011-01-01

    The combination of tandem mass spectrometry and sequence database searching is the method of choice for the identification of peptides and the mapping of proteomes. Over the last several years, the volume of data generated in proteomic studies has increased dramatically, which challenges the computational approaches previously developed for these data. Furthermore, a multitude of search engines have been developed that identify different, overlapping subsets of the sample peptides from a particular set of tandem mass spectrometry spectra. We present iProphet, the new addition to the widely used open-source suite of proteomic data analysis tools Trans-Proteomics Pipeline. Applied in tandem with PeptideProphet, it provides more accurate representation of the multilevel nature of shotgun proteomic data. iProphet combines the evidence from multiple identifications of the same peptide sequences across different spectra, experiments, precursor ion charge states, and modified states. It also allows accurate and effective integration of the results from multiple database search engines applied to the same data. The use of iProphet in the Trans-Proteomics Pipeline increases the number of correctly identified peptides at a constant false discovery rate as compared with both PeptideProphet and another state-of-the-art tool Percolator. As the main outcome, iProphet permits the calculation of accurate posterior probabilities and false discovery rate estimates at the level of sequence identical peptide identifications, which in turn leads to more accurate probability estimates at the protein level. Fully integrated with the Trans-Proteomics Pipeline, it supports all commonly used MS instruments, search engines, and computer platforms. The performance of iProphet is demonstrated on two publicly available data sets: data from a human whole cell lysate proteome profiling experiment representative of typical proteomic data sets, and from a set of Streptococcus pyogenes experiments more representative of organism-specific composite data sets. PMID:21876204
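
    The step from per-identification posterior probabilities to an FDR estimate is standard and easy to sketch: the FDR of an accepted set is the average posterior error probability of its members. This is illustrative only; iProphet's actual models are considerably more involved:

      import numpy as np

      def fdr_at_threshold(posteriors, threshold):
          # Expected fraction of incorrect identifications among those
          # whose posterior probability clears the threshold.
          accepted = np.asarray(posteriors)
          accepted = accepted[accepted >= threshold]
          if accepted.size == 0:
              return 0.0
          return float(np.mean(1.0 - accepted))

      probs = [0.99, 0.97, 0.95, 0.90, 0.80, 0.60]   # invented posteriors
      print(f"FDR at P>=0.9: {fdr_at_threshold(probs, 0.9):.3f}")   # 0.048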

  9. Net present value approaches for drug discovery.

    PubMed

    Svennebring, Andreas M; Wikberg, Jarl Es

    2013-12-01

    Three dedicated approaches to the calculation of the risk-adjusted net present value (rNPV) in drug discovery projects under different assumptions are suggested. The probability of finding a candidate drug suitable for clinical development and the time to the initiation of clinical development are assumed to be flexible, in contrast to previously used models. The rNPV of the post-discovery cash flows is calculated as the probability-weighted average of the rNPV at each potential time of initiation of clinical development. Practical considerations on how to set probability rates, in particular during the initiation and termination of a project, are discussed.
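
    One reading of this flexible-timing scheme, with invented numbers (a sketch of the idea, not the authors' published model):

      def rnpv(cash_flows, success_prob, rate):
          """Risk-adjusted NPV of a cash-flow list starting at year 0."""
          return sum(success_prob * cf / (1 + rate) ** t
                     for t, cf in enumerate(cash_flows))

      clinical_cf = [-10, -20, -30, 150]        # illustrative, $M per year
      p_reach_clinic = 0.35                     # P(a candidate drug is found)
      start_probs = {2: 0.5, 3: 0.3, 4: 0.2}    # P(clinical start year | found)
      rate = 0.12

      # Probability-weighted average over potential initiation times,
      # each discounted back from its start year.
      value = sum(p_start * rnpv(clinical_cf, p_reach_clinic, rate)
                  / (1 + rate) ** year
                  for year, p_start in start_probs.items())
      print(f"rNPV ~ ${value:.1f}M")   # ~$14.2M with these inputs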

  10. STS-131 Discovery Launch

    NASA Image and Video Library

    2010-04-05

    201004050001hq (5 April 2010) --- NASA Administrator Charles Bolden looks out the window of Firing Room Four in the Launch Control Center during the launch of the space shuttle Discovery and the start of the STS-131 mission at NASA Kennedy Space Center in Cape Canaveral, Fla. on April 5, 2010. Discovery is carrying a multi-purpose logistics module filled with science racks for the laboratories aboard the International Space Station. The mission has three planned spacewalks, with work to include replacing an ammonia tank assembly, retrieving a Japanese experiment from the station's exterior, and switching out a rate gyro assembly on the station's truss structure. Photo Credit: NASA/Bill Ingalls

  11. An investigation of the false discovery rate and the misinterpretation of p-values

    PubMed Central

    Colquhoun, David

    2014-01-01

    If you use p=0.05 to suggest that you have made a discovery, you will be wrong at least 30% of the time. If, as is often the case, experiments are underpowered, you will be wrong most of the time. This conclusion is demonstrated from several points of view. First, tree diagrams show the close analogy with the screening test problem. Similar conclusions are drawn from repeated simulations of t-tests. These mimic what is done in real life, which makes the results more persuasive. The simulation method is also used to evaluate the extent to which effect sizes are over-estimated, especially in underpowered experiments. A script is supplied to allow readers to run the simulations themselves, with numbers appropriate for their own work. It is concluded that if you wish to keep your false discovery rate below 5%, you need to use a three-sigma rule, or to insist on p≤0.001. And never use the word ‘significant’. PMID:26064558
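
    The core simulation argument is easy to reproduce. In the sketch below (my paraphrase of the setup: a 10% prevalence of real effects and modest power), roughly half of the p<0.05 "discoveries" are false:

      import numpy as np
      from scipy.stats import ttest_ind

      rng = np.random.default_rng(42)
      n, n_tests, prior_real, effect = 8, 20000, 0.10, 1.0
      false_pos = true_pos = 0
      for _ in range(n_tests):
          real = rng.random() < prior_real        # is there a true effect?
          a = rng.normal(0, 1, n)
          b = rng.normal(effect if real else 0, 1, n)
          if ttest_ind(a, b).pvalue < 0.05:       # a "discovery"
              true_pos += real
              false_pos += not real
      print(f"false discovery rate: {false_pos / (false_pos + true_pos):.2f}")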

  12. Discovery of the leinamycin family of natural products by mining actinobacterial genomes

    PubMed Central

    Xu, Zhengren; Guo, Zhikai; Hindra; Ma, Ming; Zhou, Hao; Gansemans, Yannick; Zhu, Xiangcheng; Huang, Yong; Zhao, Li-Xing; Jiang, Yi; Cheng, Jinhua; Van Nieuwerburgh, Filip; Suh, Joo-Won; Duan, Yanwen

    2017-01-01

    Nature’s ability to generate diverse natural products from simple building blocks has inspired combinatorial biosynthesis. The knowledge-based approach to combinatorial biosynthesis has allowed the production of designer analogs by rational metabolic pathway engineering. While successful, structural alterations are limited, with designer analogs often produced in compromised titers. The discovery-based approach to combinatorial biosynthesis complements the knowledge-based approach by exploring the vast combinatorial biosynthesis repertoire found in Nature. Here we showcase the discovery-based approach to combinatorial biosynthesis by targeting the domain of unknown function and cysteine lyase domain (DUF–SH) didomain, specific for sulfur incorporation from the leinamycin (LNM) biosynthetic machinery, to discover the LNM family of natural products. By mining bacterial genomes from public databases and the actinomycetes strain collection at The Scripps Research Institute, we discovered 49 potential producers that could be grouped into 18 distinct clades based on phylogenetic analysis of the DUF–SH didomains. Further analysis of the representative genomes from each of the clades identified 28 lnm-type gene clusters. Structural diversities encoded by the LNM-type biosynthetic machineries were predicted based on bioinformatics and confirmed by in vitro characterization of selected adenylation proteins and isolation and structural elucidation of the guangnanmycins and weishanmycins. These findings demonstrate the power of the discovery-based approach to combinatorial biosynthesis for natural product discovery and structural diversity and highlight Nature’s rich biosynthetic repertoire. Comparative analysis of the LNM-type biosynthetic machineries provides outstanding opportunities to dissect Nature’s biosynthetic strategies and apply these findings to combinatorial biosynthesis for natural product discovery and structural diversity. PMID:29229819

  13. Discovery of the leinamycin family of natural products by mining actinobacterial genomes.

    PubMed

    Pan, Guohui; Xu, Zhengren; Guo, Zhikai; Hindra; Ma, Ming; Yang, Dong; Zhou, Hao; Gansemans, Yannick; Zhu, Xiangcheng; Huang, Yong; Zhao, Li-Xing; Jiang, Yi; Cheng, Jinhua; Van Nieuwerburgh, Filip; Suh, Joo-Won; Duan, Yanwen; Shen, Ben

    2017-12-26

    Nature's ability to generate diverse natural products from simple building blocks has inspired combinatorial biosynthesis. The knowledge-based approach to combinatorial biosynthesis has allowed the production of designer analogs by rational metabolic pathway engineering. While successful, structural alterations are limited, with designer analogs often produced in compromised titers. The discovery-based approach to combinatorial biosynthesis complements the knowledge-based approach by exploring the vast combinatorial biosynthesis repertoire found in Nature. Here we showcase the discovery-based approach to combinatorial biosynthesis by targeting the domain of unknown function and cysteine lyase domain (DUF-SH) didomain, specific for sulfur incorporation from the leinamycin (LNM) biosynthetic machinery, to discover the LNM family of natural products. By mining bacterial genomes from public databases and the actinomycetes strain collection at The Scripps Research Institute, we discovered 49 potential producers that could be grouped into 18 distinct clades based on phylogenetic analysis of the DUF-SH didomains. Further analysis of the representative genomes from each of the clades identified 28 lnm-type gene clusters. Structural diversities encoded by the LNM-type biosynthetic machineries were predicted based on bioinformatics and confirmed by in vitro characterization of selected adenylation proteins and isolation and structural elucidation of the guangnanmycins and weishanmycins. These findings demonstrate the power of the discovery-based approach to combinatorial biosynthesis for natural product discovery and structural diversity and highlight Nature's rich biosynthetic repertoire. Comparative analysis of the LNM-type biosynthetic machineries provides outstanding opportunities to dissect Nature's biosynthetic strategies and apply these findings to combinatorial biosynthesis for natural product discovery and structural diversity.

  14. SkyDiscovery: Humans and Machines Working Together

    NASA Astrophysics Data System (ADS)

    Donalek, Ciro; Fang, K.; Drake, A. J.; Djorgovski, S. G.; Graham, M. J.; Mahabal, A.; Williams, R.

    2011-01-01

    Synoptic sky surveys are now discovering tens to hundreds of transient events every clear night, and that data rate is expected to increase dramatically as we move towards the LSST. A key problem is classification of transients, which determines their scientific interest and possible follow-up. Some of the relevant information is contextual, and easily recognizable by humans looking at images, but it is very hard to encode in the data pipelines. Crowdsourcing (aka Citizen Science) provides one possible way to gather such information. SkyDiscovery.org is a website that allows experts and citizen science enthusiasts to work together and share information in a collaborative scientific discovery environment. Currently there are two projects running on the website. In the Event Classification project, users help find candidate transients through a series of questions related to the images shown. Event classification depends very much on contextual information, and humans are remarkably effective at recognizing noise in incomplete heterogeneous data and figuring out which contextual information is important. In the SNHunt project, users are asked to look for new objects appearing in images of galaxies taken by the Catalina Real-time Transient Survey, in order to find all the supernovae occurring in nearby bright galaxies. Images are served alongside other tools that can help the discovery. A multi-level approach allows the complexity of the interface to be tailored to the expertise level of the user. An entry-level user can just review images and validate events as being real, while a more advanced user can interact with the data associated with an event. The data gathered will not only be analyzed and used directly for specific science projects, but will also be used to train well-defined algorithms for automating such data analysis in the future.

  15. Response to comments on "Can we name Earth's species before they go extinct?".

    PubMed

    Costello, Mark J; May, Robert M; Stork, Nigel E

    2013-07-19

    Mora et al. disputed that most species will be discovered before they go extinct, but not our main recommendations to accelerate species' discoveries. We show that our conclusions would be unaltered by discoveries of more microscopic species, reinforce our estimates of species description and extinction rates, and show that taxonomic effort has never been greater and that there are 2 to 8 million species on Earth.

  16. An explanation of resisted discoveries based on construal-level theory.

    PubMed

    Fang, Hui

    2015-02-01

    New discoveries and theories are crucial for the development of science, but they are often initially resisted by the scientific community. This paper analyses resistance to scientific discoveries that supplement previous research results or conclusions with new phenomena, such as long chains in macromolecules, Alfvén waves, parity nonconservation in weak interactions and quasicrystals. Construal-level theory is used to explain that the probability of new discoveries may be underestimated because of psychological distance. Thus, the insufficiently examined scope of an accepted theory may lead to overstating the suitable scope and underestimating the probability of its undiscovered counter-examples. Therefore, psychological activity can result in people instinctively resisting new discoveries. Direct evidence can help people judge the validity of a hypothesis with rational thinking. The effects of authorities and textbooks on the resistance to discoveries are also discussed. From the results of our analysis, suggestions are provided to reduce resistance to real discoveries, which will benefit the development of science.

  17. Controlling the Rate of GWAS False Discoveries

    PubMed Central

    Brzyski, Damian; Peterson, Christine B.; Sobczyk, Piotr; Candès, Emmanuel J.; Bogdan, Malgorzata; Sabatti, Chiara

    2017-01-01

    With the rise of both the number and the complexity of traits of interest, control of the false discovery rate (FDR) in genetic association studies has become an increasingly appealing and accepted target for multiple comparison adjustment. While a number of robust FDR-controlling strategies exist, the nature of this error rate is intimately tied to the precise way in which discoveries are counted, and the performance of FDR-controlling procedures is satisfactory only if there is a one-to-one correspondence between what scientists describe as unique discoveries and the number of rejected hypotheses. The presence of linkage disequilibrium between markers in genome-wide association studies (GWAS) often leads researchers to consider the signal associated with multiple neighboring SNPs as indicating the existence of a single genomic locus with possible influence on the phenotype. This a posteriori aggregation of rejected hypotheses results in inflation of the relevant FDR. We propose a novel approach to FDR control that is based on prescreening to identify the level of resolution of distinct hypotheses. We show how FDR-controlling strategies can be adapted to account for this initial selection both with theoretical results and simulations that mimic the dependence structure to be expected in GWAS. We demonstrate that our approach is versatile and useful when the data are analyzed using both tests based on single markers and multiple regression. We provide an R package that allows practitioners to apply our procedure on standard GWAS format data, and illustrate its performance on lipid traits in the Northern Finland Birth Cohort 1966 study. PMID:27784720

  18. Controlling the Rate of GWAS False Discoveries.

    PubMed

    Brzyski, Damian; Peterson, Christine B; Sobczyk, Piotr; Candès, Emmanuel J; Bogdan, Malgorzata; Sabatti, Chiara

    2017-01-01

    With the rise of both the number and the complexity of traits of interest, control of the false discovery rate (FDR) in genetic association studies has become an increasingly appealing and accepted target for multiple comparison adjustment. While a number of robust FDR-controlling strategies exist, the nature of this error rate is intimately tied to the precise way in which discoveries are counted, and the performance of FDR-controlling procedures is satisfactory only if there is a one-to-one correspondence between what scientists describe as unique discoveries and the number of rejected hypotheses. The presence of linkage disequilibrium between markers in genome-wide association studies (GWAS) often leads researchers to consider the signal associated with multiple neighboring SNPs as indicating the existence of a single genomic locus with possible influence on the phenotype. This a posteriori aggregation of rejected hypotheses results in inflation of the relevant FDR. We propose a novel approach to FDR control that is based on prescreening to identify the level of resolution of distinct hypotheses. We show how FDR-controlling strategies can be adapted to account for this initial selection both with theoretical results and simulations that mimic the dependence structure to be expected in GWAS. We demonstrate that our approach is versatile and useful when the data are analyzed using both tests based on single markers and multiple regression. We provide an R package that allows practitioners to apply our procedure on standard GWAS format data, and illustrate its performance on lipid traits in the Northern Finland Birth Cohort 1966 study.
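
    A toy version of counting discoveries at locus rather than SNP resolution (my paraphrase of the idea, not the authors' R package): aggregate SNP p-values within pre-defined LD clusters, then apply BH across clusters so that the FDR refers to distinct loci:

      import numpy as np

      def locus_level_bh(snp_pvals, locus_ids, q=0.05):
          loci = {}
          for p, locus in zip(snp_pvals, locus_ids):
              loci.setdefault(locus, []).append(p)
          # Bonferroni-aggregate within each locus, then BH across loci
          agg = {k: min(min(v) * len(v), 1.0) for k, v in loci.items()}
          names, p = zip(*sorted(agg.items(), key=lambda kv: kv[1]))
          m = len(p)
          k_max = max((i + 1 for i in range(m) if p[i] <= q * (i + 1) / m),
                      default=0)
          return set(names[:k_max])

      pvals = [1e-8, 2e-7, 0.3, 0.04, 0.5, 1e-4]
      loci = ["L1", "L1", "L1", "L2", "L2", "L3"]
      print(locus_level_bh(pvals, loci))   # {'L1', 'L3'}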

  19. PTGER4 gene variant rs76523431 is a candidate risk factor for radiological joint damage in rheumatoid arthritis patients: a genetic study of six cohorts.

    PubMed

    Rodriguez-Rodriguez, Luis; Ivorra-Cortes, Jose; Carmona, F David; Martín, Javier; Balsa, Alejandro; van Steenbergen, Hanna W; van der Helm-van Mil, Annette H M; González-Álvaro, Isidoro; Fernandez-Gutiérrez, Benjamín

    2015-11-05

    Prostaglandin E receptor 4 (PTGER4) is implicated in immune regulation and bone metabolism. The aim of this study was to analyze its role in radiological joint damage in rheumatoid arthritis (RA). Six independent cohorts of patients with RA of European or North American descent were included, comprising 1789 patients with 5083 sets of X-rays. The Hospital Clínico San Carlos Rheumatoid Arthritis, Princesa Early Arthritis Register Longitudinal study, and Hospital Universitario de La Paz early arthritis (Spain) cohorts were used as discovery cohorts, and the Leiden Early Arthritis Clinic (The Netherlands), Wichita (United States), and National Databank for Rheumatic Diseases (United States and Canada) cohorts as replication cohorts. First, the PTGER4 rs6896969 single-nucleotide polymorphism (SNP) was genotyped using TaqMan assays and available Illumina Immunochip data and studied in the discovery and replication cohorts. Second, the PTGER4 gene and adjacent regions were analyzed using Immunochip genotyping data in the discovery cohorts. On the basis of pooled p values, linkage disequilibrium structure of the region, and location in regions with transcriptional properties, SNPs were selected for replication. The results from discovery, replication, and overall cohorts were pooled using inverse-variance-weighted meta-analysis. Influence of the polymorphisms on the overall radiological damage (constant effect) and on damage progression over time (time-varying effect) was analyzed. The rs6896969 polymorphism showed a significant association with radiological damage in the constant effect pooled analysis of the discovery cohorts, although no significant association was observed in the replication cohorts or the overall pooled analysis. Regarding the analysis of the PTGER4 region, 976 variants were analyzed in the discovery cohorts. From the constant and time-varying effect analyses, 12 and 20 SNPs, respectively, were selected for replication. Only the rs76523431 variant showed a significant association with radiographic progression in the time-varying effect pooled analysis of the discovery, replication, and overall cohorts. The overall pooled effect size was 1.10 (95% confidence interval 1.05-1.14, p = 2.10 × 10^-5), meaning that radiographic yearly progression was 10% greater for each copy of the minor allele. The PTGER4 gene is a candidate risk factor for radiological progression in RA.
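
    Fixed-effect, inverse-variance-weighted pooling of the kind used here is a one-liner; the numbers below are illustrative, not the study's data:

      import numpy as np

      def ivw_pool(effects, ses):
          """Pool per-cohort effect estimates; returns (pooled, pooled SE)."""
          w = 1.0 / np.asarray(ses) ** 2          # weights = inverse variances
          pooled = np.sum(w * np.asarray(effects)) / np.sum(w)
          return pooled, np.sqrt(1.0 / np.sum(w))

      # log effect sizes and standard errors from three hypothetical cohorts
      beta, se = ivw_pool([0.10, 0.08, 0.12], [0.04, 0.05, 0.06])
      print(f"pooled effect {np.exp(beta):.3f}, 95% CI "
            f"{np.exp(beta - 1.96*se):.3f}-{np.exp(beta + 1.96*se):.3f}")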

  20. Data-Driven Process Discovery: A Discrete Time Algebra for Relational Signal Analysis

    DTIC Science & Technology

    1996-12-01

    ...would also like to thank Dr. Mark Oxley for his assistance in developing this abstract algebra and the mathematical notation found herein. ... "spaces with the recognition of cues in a specific space" [21]. Up to now, most of the Artificial Intelligence (AI) 'discovery' work has emphasized one...

  1. Discovery of Three New Millisecond Pulsars in Terzan 5

    NASA Astrophysics Data System (ADS)

    Cadelano, M.; Ransom, S. M.; Freire, P. C. C.; Ferraro, F. R.; Hessels, J. W. T.; Lanzoni, B.; Pallanca, C.; Stairs, I. H.

    2018-03-01

    We report on the discovery of three new millisecond pulsars (MSPs; namely J1748‑2446aj, J1748‑2446ak, and J1748‑2446al) in the inner regions of the dense stellar system Terzan 5. These pulsars have been discovered thanks to a method, alternative to the classical search routines, that exploited the large set of archival observations of Terzan 5 acquired with the Green Bank Telescope over five years (from 2010 to 2015). This technique allowed the analysis of stacked power spectra obtained by combining ∼206 hr of observation. J1748‑2446aj has a spin period of ∼2.96 ms, J1748‑2446ak of ∼1.89 ms (thus it is the fourth fastest pulsar in the cluster) and J1748‑2446al of ∼5.95 ms. All three MSPs are isolated, and currently we have timing solutions only for J1748‑2446aj and J1748‑2446ak. For these two systems, we evaluated the contribution to the measured spin-down rate of the acceleration due to the cluster potential field, thus estimating the intrinsic spin-down rates, which are in agreement with those typically measured for MSPs in globular clusters (GCs). Our results increase the number of pulsars known in Terzan 5 to 37, which now hosts 25% of the entire pulsar population identified, so far, in GCs.
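
    The stacking trick rests on a simple fact: a periodic signal too weak to stand out in any single power spectrum accumulates linearly when normalized spectra from many epochs are summed, while the noise grows only as its square root. A toy demonstration with invented numbers (not the authors' pipeline):

      import numpy as np

      rng = np.random.default_rng(3)
      fs, n, f_spin = 1000.0, 4096, 337.0     # sample rate (Hz), samples, spin freq
      t = np.arange(n) / fs

      def one_epoch():
          # weak sinusoid buried in unit-variance noise
          return 0.05 * np.sin(2 * np.pi * f_spin * t) + rng.normal(0, 1, n)

      powers = [np.abs(np.fft.rfft(one_epoch()))**2 for _ in range(200)]
      stacked = np.sum([p / np.median(p) for p in powers], axis=0)
      freqs = np.fft.rfftfreq(n, 1 / fs)
      print(f"peak at {freqs[np.argmax(stacked[1:]) + 1]:.1f} Hz")   # ~337 Hz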

  2. Cultivation of an obligate acidophilic ammonia oxidizer from a nitrifying acid soil.

    PubMed

    Lehtovirta-Morley, Laura E; Stoecker, Kilian; Vilcinskas, Andreas; Prosser, James I; Nicol, Graeme W

    2011-09-20

    Nitrification is a fundamental component of the global nitrogen cycle and leads to significant fertilizer loss and atmospheric and groundwater pollution. Nitrification rates in acidic soils (pH < 5.5), which comprise 30% of the world's soils, equal or exceed those of neutral soils. Paradoxically, autotrophic ammonia oxidizing bacteria and archaea, which perform the first stage in nitrification, demonstrate little or no growth in suspended liquid culture below pH 6.5, at which ammonia availability is reduced by ionization. Here we report the discovery and cultivation of a chemolithotrophic, obligately acidophilic thaumarchaeal ammonia oxidizer, "Candidatus Nitrosotalea devanaterra," from an acidic agricultural soil. Phylogenetic analysis places the organism within a previously uncultivated thaumarchaeal lineage that has been observed in acidic soils. Growth of the organism is optimal in the pH range 4 to 5 and is restricted to the pH range 4 to 5.5, unlike all previously cultivated ammonia oxidizers. Growth of this organism and associated ammonia oxidation and autotrophy also occur during nitrification in soil at pH 4.5. The discovery of Nitrosotalea devanaterra provides a previously unsuspected explanation for high rates of nitrification in acidic soils, and confirms the vital role that thaumarchaea play in terrestrial nitrogen cycling. Growth at extremely low ammonia concentration (0.18 nM) also challenges accepted views on ammonia uptake and metabolism and indicates novel mechanisms for ammonia oxidation at low pH.

  3. LINNAEUS: BOOSTING NEAR EARTH ASTEROID CHARACTERIZATION RATES

    NASA Astrophysics Data System (ADS)

    Elvis, Martin; Beeson, C.; Galache, J.; DeMeo, F.; Evans, I.; Evans, J.; Konidaris, N.; Najita, J.; Allen, L.; Christensen, E.; Spahr, T.

    2013-10-01

    Near Earth objects (NEOs) are being discovered at a rate of about 1000 per year, and this rate is set to double by 2015. However, physical characterization of NEOs proceeds at only ~100 per year for each type of follow-up observation. We have proposed the LINNAEUS program to NASA to raise the characterization rate of NEOs to the rate of their discovery. This rate matching is necessary as any given NEO is only available for a relatively short time (days to weeks), and they are usually fainter on subsequent apparitions. Hence follow-up observations must be initiated rapidly, without time to cherry-pick the optimum objects. LINNAEUS concentrates on NEO composition. Optical spectra, preferably extending into the near-infrared, provide compositions that can distinguish major compositional classes of NEOs with reasonable confidence (Bus and Binzel 2002, DeMeo et al. 2009). Armed with a taxonomic type, the albedo, pV, of an NEO is better constrained, leading to more accurate sizes and masses. Time-resolved spectroscopy can give indications of period, axial ratio and surface homogeneity. A reasonable program of spectroscopy could keep pace with the NEO discovery rate. A ground-based telescope can observe faint NEOs about 210 nights a year, after time lost to weather, bright time, and equipment downtime (e.g. Gemini), for a total of ~2000 hours/year. At 1 hour per NEO spectrum, a well-run, dedicated telescope could obtain almost 2000 spectra per year, about the rate required. If near-IR spectra are required then a 4 m or larger telescope is necessary to reach magnitude 20. However, if the Bus-Binzel taxonomy suffices then only optical spectra are needed and a 2 meter class telescope is sufficient. LINNAEUS would use 50% of the KPNO 2.1 m telescope with an IFU spectrometer, the SED-machine (Ben-Ami et al. 2013), to obtain time-resolved optical spectra of 1200-2000 NEOs/year, or 4200-7000 in 3.5 years observing in an NEOO program. Robust pipeline analysis will release taxonomic types via the Minor Planet Center within 24 hours and a full archive of spectra and products will be provided.
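
    The observing-budget arithmetic, written out (the hours-per-night figure is an assumed average chosen to be consistent with the ~2000 hr/yr quoted above):

      nights_per_year = 210        # usable nights after weather, bright time, downtime
      hours_per_night = 9.5        # assumed average of dark hours per usable night
      hours_per_spectrum = 1.0
      hours_total = nights_per_year * hours_per_night       # ~1995 hr/yr
      spectra_per_year = hours_total / hours_per_spectrum
      print(f"~{spectra_per_year:.0f} NEO spectra per year") # ~1995, roughly matching
                                                             # the discovery rate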

  4. Cheminformatics in Drug Discovery, an Industrial Perspective.

    PubMed

    Chen, Hongming; Kogej, Thierry; Engkvist, Ola

    2018-05-18

    Cheminformatics has established itself as a core discipline within large-scale drug discovery operations. It would be impossible to handle the amount of data generated today in a small-molecule drug discovery project without persons skilled in cheminformatics. In addition, due to the increased emphasis on "Big Data", machine learning, and artificial intelligence, not only in society in general but also in drug discovery, it is expected that the cheminformatics field will be even more important in the future. Traditional areas like virtual screening, library design, and high-throughput screening analysis are highlighted in this review. Applying machine learning in drug discovery is an area that has become very important. Applications of machine learning in early drug discovery have been extended from predicting ADME properties and target activity to tasks like de novo molecular design and prediction of chemical reactions.

  5. A Discovery Experiment: Carbon Dioxide Soap Bubble Dynamics.

    ERIC Educational Resources Information Center

    Millikan, Roger C.

    1978-01-01

    The observation of soap bubbles in a beaker of carbon dioxide gas helps students to feel the pleasure that comes from understanding nature, from applying that understanding to real problems, and from making unexpected discoveries that yield to analysis. (Author/BB)

  6. SACCHARIS: an automated pipeline to streamline discovery of carbohydrate active enzyme activities within polyspecific families and de novo sequence datasets.

    PubMed

    Jones, Darryl R; Thomas, Dallas; Alger, Nicholas; Ghavidel, Ata; Inglis, G Douglas; Abbott, D Wade

    2018-01-01

    Deposition of new genetic sequences in online databases is expanding at an unprecedented rate. As a result, sequence identification continues to outpace functional characterization of carbohydrate active enzymes (CAZymes). In this paradigm, the discovery of enzymes with novel functions is often hindered by high volumes of uncharacterized sequences, particularly when the enzyme sequence belongs to a family that exhibits diverse functional specificities (i.e., polyspecificity). Therefore, to direct sequence-based discovery and characterization of new enzyme activities, we have developed an automated in silico pipeline entitled: Sequence Analysis and Clustering of CarboHydrate Active enzymes for Rapid Informed prediction of Specificity (SACCHARIS). This pipeline streamlines the selection of uncharacterized sequences for discovery of new CAZyme or CBM specificity from families currently maintained on the CAZy website or within user-defined datasets. SACCHARIS was used to generate a phylogenetic tree of GH43, a CAZyme family with defined subfamily designations. This analysis confirmed that large datasets can be organized into sequence clusters of manageable sizes that possess related functions. Seeding this tree with a GH43 sequence from Bacteroides dorei DSM 17855 (BdGH43b) revealed that it partitioned as a single sequence within the tree. This pattern was consistent with it possessing a unique enzyme activity for GH43, as BdGH43b is the first α-glucanase described for this family. The capacity of SACCHARIS to extract and cluster characterized carbohydrate binding module sequences was demonstrated using family 6 CBMs (i.e., CBM6s). This CBM family displays a polyspecific ligand binding profile and contains many structurally determined members. Using SACCHARIS to identify a cluster of divergent sequences, a CBM6 sequence from a unique clade was demonstrated to bind yeast mannan, which represents the first description of an α-mannan binding CBM. Additionally, we have performed a CAZome analysis of an in-house sequenced bacterial genome and a comparative analysis of B. thetaiotaomicron VPI-5482 and B. thetaiotaomicron 7330, to demonstrate that SACCHARIS can generate "CAZome fingerprints", which differentiate between the saccharolytic potential of two related strains in silico. Establishing sequence-function and sequence-structure relationships in polyspecific CAZyme families is a promising approach for streamlining enzyme discovery. SACCHARIS facilitates this process by embedding CAZyme and CBM family trees generated from biochemically or structurally characterized sequences, with protein sequences that have unknown functions. In addition, these trees can be integrated with user-defined datasets (e.g., genomics, metagenomics, and transcriptomics) to inform experimental characterization of new CAZymes or CBMs not currently curated, and for researchers to compare differential sequence patterns between entire CAZomes. In this light, SACCHARIS provides an in silico tool that can be tailored for enzyme bioprospecting in datasets of increasing complexity and for diverse applications in glycobiotechnology.

  7. A General Framework for Discovery and Classification in Astronomy

    NASA Astrophysics Data System (ADS)

    Dick, Steven J.

    2012-09-01

    An analysis of the discovery of 82 classes of astronomical objects reveals an extended structure of discovery, consisting of detection, interpretation and understanding, each with its own nuances and a microstructure including conceptual, technological and social roles. This is true with a remarkable degree of consistency over the last 400 years of telescopic astronomy, ranging from Galileo's discovery of satellites, planetary rings and star clusters, to the discovery of quasars and pulsars. Telescopes have served as "engines of discovery" in several ways, ranging from telescope size and sensitivity (planetary nebulae and spiral nebulae), to specialized detectors (TNOs) and the opening of the electromagnetic spectrum for astronomy (pulsars, pulsar planets, and most active galaxies). A few classes (radiation belts, the solar wind and cosmic rays) were initially discovered without the telescope. Classification also plays an important role in discovery. While it might seem that classification marks the end of discovery, or a post-discovery phase, in fact it often marks the beginning, even a pre-discovery phase. Nowhere is this more clearly seen than in the classification of stellar spectra, long before dwarfs, giants and supergiants were known, or their evolutionary sequence recognized. Classification may also be part of a post-discovery phase, as in the MK system of stellar classification, constructed after the discovery of stellar luminosity classes. Some classes are declared rather than detected, as in the case of gas and ice giant planets, and, infamously, Pluto as a dwarf planet. Others are inferred rather than detected, including most classes of stars.

  8. Use of eQTL Analysis for the Discovery of Target Genes Identified by GWAS

    DTIC Science & Technology

    2013-04-01

    ...the biologic pathways affected by these inherited factors, and ultimately to identify targets for disease prediction, risk stratification and... quality using an Agilent chip technology. Cases having a RIN number of 7.0 or greater were considered good quality. Once completed, the optimum set of... (Award Number: W81XWH-11-1-0261)

  9. Use of eQTL Analysis for the Discovery of Target Genes Identified by GWAS

    DTIC Science & Technology

    2014-04-01

    ...technology. Cases having a RIN number of 7.0 or greater were considered good quality. Once completed, the optimum set of 500 samples were then selected for... (Award Number: W81XWH-11-1-0261; distribution unlimited.)

  10. New glycoproteomics software, GlycoPep Evaluator, generates decoy glycopeptides de novo and enables accurate false discovery rate analysis for small data sets.

    PubMed

    Zhu, Zhikai; Su, Xiaomeng; Go, Eden P; Desaire, Heather

    2014-09-16

    Glycoproteins are biologically significant large molecules that participate in numerous cellular activities. In order to obtain site-specific protein glycosylation information, intact glycopeptides, with the glycan attached to the peptide sequence, are characterized by tandem mass spectrometry (MS/MS) methods such as collision-induced dissociation (CID) and electron transfer dissociation (ETD). While several automated tools are emerging, there is no consensus in the field about the best way to determine the reliability of these tools and/or to report the false discovery rate (FDR). A common approach to calculate FDRs for glycopeptide analysis, adopted from the target-decoy strategy in proteomics, employs a decoy database that is created based on the target protein sequence database. Nonetheless, this approach is not optimal in measuring the confidence of N-linked glycopeptide matches, because the glycopeptide data set is considerably smaller than that of peptides, and the requirement of a consensus sequence for N-glycosylation further limits the number of possible decoy glycopeptides tested in a database search. To address the need to accurately determine FDRs for automated glycopeptide assignments, we developed GlycoPep Evaluator (GPE), a tool that helps to measure FDRs in identifying glycopeptides without using a decoy database. GPE generates decoy glycopeptides de novo for every target glycopeptide, in a 1:20 target-to-decoy ratio. The decoys, along with target glycopeptides, are scored against the ETD data, from which FDRs can be calculated accurately based on the number of decoy matches and the ratio of the number of targets to decoys, for small data sets. GPE is freely accessible for download and can work with any search engine that interprets ETD data of N-linked glycopeptides. The software is provided at https://desairegroup.ku.edu/research.
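
    With decoys generated in a fixed ratio per target, the FDR estimate is simply the decoy hit count scaled by that ratio, divided by the target hit count. A minimal sketch (the hit counts are invented):

      def decoy_fdr(n_target_hits, n_decoy_hits, decoys_per_target=20):
          # Scale decoy matches down by the decoy:target ratio before
          # dividing by the number of target matches above the cutoff.
          if n_target_hits == 0:
              return 0.0
          return (n_decoy_hits / decoys_per_target) / n_target_hits

      # e.g. 50 target matches and 30 decoy matches above a score cutoff
      print(f"estimated FDR = {decoy_fdr(50, 30):.3f}")   # 0.030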

  11. ICan: An Optimized Ion-Current-Based Quantification Procedure with Enhanced Quantitative Accuracy and Sensitivity in Biomarker Discovery

    PubMed Central

    2015-01-01

    The rapidly expanding availability of high-resolution mass spectrometry has substantially enhanced ion-current-based relative quantification techniques. Despite the increasing interest in ion-current-based methods, quantitative sensitivity, accuracy, and false discovery rate remain major concerns; consequently, comprehensive evaluation and development in these regards are urgently needed. Here we describe an integrated, new procedure for data normalization and protein ratio estimation, termed ICan, for improved ion-current-based analysis of data generated by high-resolution mass spectrometry (MS). ICan achieved significantly better accuracy and precision, and a lower false-positive rate for discovering altered proteins, than current popular pipelines. A spike-in experiment was used to evaluate the performance of ICan in detecting small changes. In this study E. coli extracts were spiked with moderate-abundance proteins from human plasma (MAP, enriched by the IgY14-SuperMix procedure) at two different levels to set a small change of 1.5-fold. Forty-five (92%, with an average ratio of 1.71 ± 0.13) of 49 identified MAP proteins (i.e., the true positives) and none of the reference proteins (1.0-fold) were determined to be significantly altered, with cutoff thresholds of ≥1.3-fold change and p ≤ 0.05. This is the first study to evaluate and prove competitive performance of the ion-current-based approach for assigning significance to proteins with small changes. By comparison, other methods showed remarkably inferior performance. ICan is broadly applicable to reliable and sensitive proteomic surveys of multiple biological samples with the use of high-resolution MS. Moreover, many key features evaluated and optimized here, such as normalization, protein ratio determination, and statistical analyses, are also valuable for data analysis by isotope-labeling methods. PMID:25285707
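
    The spike-in scoring reduces to a joint fold-change and p-value cutoff checked against the known truth. A sketch with invented (ratio, p-value, spiked?) triples:

      def call_altered(ratio, pval, fold=1.3, alpha=0.05):
          # "Altered" means the ratio clears the fold cutoff in either
          # direction and the p-value clears the significance cutoff.
          return (ratio >= fold or ratio <= 1 / fold) and pval <= alpha

      results = [(1.71, 0.001, True), (1.05, 0.40, False), (1.45, 0.03, True)]
      calls = [call_altered(r, p) for r, p, _ in results]
      tp = sum(c and s for c, (_, _, s) in zip(calls, results))
      fp = sum(c and not s for c, (_, _, s) in zip(calls, results))
      print(f"true positives {tp}, false positives {fp}")   # 2, 0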

  12. Inflammatory gene polymorphisms and risk of postoperative myocardial infarction after cardiac surgery.

    PubMed

    Podgoreanu, M V; White, W D; Morris, R W; Mathew, J P; Stafford-Smith, M; Welsby, I J; Grocott, H P; Milano, C A; Newman, M F; Schwinn, D A

    2006-07-04

    The inflammatory response triggered by cardiac surgery with cardiopulmonary bypass (CPB) is a primary mechanism in the pathogenesis of postoperative myocardial infarction (PMI), a multifactorial disorder with significant inter-patient variability poorly predicted by clinical and procedural factors. We tested the hypothesis that candidate gene polymorphisms in inflammatory pathways contribute to risk of PMI after cardiac surgery. We genotyped 48 polymorphisms from 23 candidate genes in a prospective cohort of 434 patients undergoing elective cardiac surgery with CPB. PMI was defined as creatine kinase-MB isoenzyme level ≥10× upper limit of normal at 24 hours postoperatively. A 2-step analysis strategy was used: marker selection, followed by model building. To minimize false-positive associations, we adjusted for multiple testing by permutation analysis, Bonferroni correction, and controlling the false discovery rate; 52 patients (12%) experienced PMI. After adjusting for multiple comparisons and clinical risk factors, 3 polymorphisms were found to be independent predictors of PMI (adjusted P<0.05; false discovery rate <10%). These gene variants encode the proinflammatory cytokine interleukin 6 (IL6 -572G>C; odds ratio [OR], 2.47), and 2 adhesion molecules: intercellular adhesion molecule-1 (ICAM1 Lys469Glu; OR, 1.88), and E-selectin (SELE 98G>T; OR, 0.16). The inclusion of genotypic information from these polymorphisms improved prediction models for PMI based on traditional risk factors alone (C-statistic 0.764 versus 0.703). Functional genetic variants in cytokine and leukocyte-endothelial interaction pathways are independently associated with severity of myonecrosis after cardiac surgery. This may aid in preoperative identification of high-risk cardiac surgical patients and development of novel cardioprotective strategies.
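
    Two of the three corrections named above are easy to state exactly. A sketch applied to a toy vector of 48 SNP p-values (permutation analysis is omitted; the p-values are invented):

      import numpy as np

      def bonferroni(p):
          return np.minimum(np.asarray(p) * len(p), 1.0)

      def bh_qvalues(p):
          # Benjamini-Hochberg adjusted p-values (q-values)
          p = np.asarray(p)
          m, order = len(p), np.argsort(p)
          q = p[order] * m / np.arange(1, m + 1)
          q = np.minimum.accumulate(q[::-1])[::-1]   # enforce monotonicity
          out = np.empty(m)
          out[order] = q
          return out

      rng = np.random.default_rng(7)
      pvals = np.append(rng.uniform(size=45), [1e-4, 3e-4, 8e-4])
      print((bonferroni(pvals) < 0.05).sum(), "Bonferroni hits;",
            (bh_qvalues(pvals) < 0.10).sum(), "hits at FDR < 10%")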

  13. RHSEG and Subdue: Background and Preliminary Approach for Combining these Technologies for Enhanced Image Data Analysis, Mining and Knowledge Discovery

    NASA Technical Reports Server (NTRS)

    Tilton, James C.; Cook, Diane J.

    2008-01-01

    Under a project recently selected for funding by NASA's Science Mission Directorate under the Applied Information Systems Research (AISR) program, Tilton and Cook will design and implement the integration of the Subdue graph based knowledge discovery system, developed at the University of Texas Arlington and Washington State University, with image segmentation hierarchies produced by the RHSEG software, developed at NASA GSFC, and perform pilot demonstration studies of data analysis, mining and knowledge discovery on NASA data. Subdue represents a method for discovering substructures in structural databases. Subdue is devised for general-purpose automated discovery, concept learning, and hierarchical clustering, with or without domain knowledge. Subdue was developed by Cook and her colleague, Lawrence B. Holder. For Subdue to be effective in finding patterns in imagery data, the data must be abstracted up from the pixel domain. An appropriate abstraction of imagery data is a segmentation hierarchy: a set of several segmentations of the same image at different levels of detail in which the segmentations at coarser levels of detail can be produced from simple merges of regions at finer levels of detail. The RHSEG program, a recursive approximation to a Hierarchical Segmentation approach (HSEG), can produce segmentation hierarchies quickly and effectively for a wide variety of images. RHSEG and HSEG were developed at NASA GSFC by Tilton. In this presentation we provide background on the RHSEG and Subdue technologies and present a preliminary analysis on how RHSEG and Subdue may be combined to enhance image data analysis, mining and knowledge discovery.

  14. A lead discovery strategy driven by a comprehensive analysis of proteases in the peptide substrate space

    PubMed Central

    Sukuru, Sai Chetan K; Nigsch, Florian; Quancard, Jean; Renatus, Martin; Chopra, Rajiv; Brooijmans, Natasja; Mikhailov, Dmitri; Deng, Zhan; Cornett, Allen; Jenkins, Jeremy L; Hommel, Ulrich; Davies, John W; Glick, Meir

    2010-01-01

    We present here a comprehensive analysis of proteases in the peptide substrate space and demonstrate its applicability for lead discovery. Aligned octapeptide substrates of 498 proteases taken from the MEROPS peptidase database were used for the in silico analysis. A multiple-category naïve Bayes model, trained on the two-dimensional chemical features of the substrates, was able to classify the substrates of 365 (73%) proteases and elucidate statistically significant chemical features for each of their specific substrate positions. The positional awareness of the method allows us to identify the most similar substrate positions between proteases. Our analysis reveals that proteases from different families, based on the traditional classification (aspartic, cysteine, serine, and metallo), could have substrates that differ at the cleavage site (P1–P1′) but are similar away from it. Caspase-3 (cysteine protease) and granzyme B (serine protease) are previously known examples of cross-family neighbors identified by this method. To assess whether peptide substrate similarity between unrelated proteases could reliably translate into the discovery of low molecular weight synthetic inhibitors, a lead discovery strategy was tested on two other cross-family neighbors—namely cathepsin L2 and matrix metalloproteinase 9, and calpain 1 and pepsin A. For both these pairs, a naïve Bayes classifier model trained on inhibitors of one protease could successfully enrich those of its neighbor from a different family and vice versa, indicating that this approach could be prospectively applied to lead discovery for a novel protease target with no known synthetic inhibitors. PMID:20799349
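
    A minimal sketch of the classification idea, under invented data: a naive Bayes model over position-tagged residue features of aligned octapeptides. The published model used two-dimensional chemical features and MEROPS substrates, neither of which is reproduced here; the sequences and protease labels below are hypothetical.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB

# hypothetical aligned octapeptide substrates (P4-P4') and protease labels
substrates = ["AVLKSGRT", "GVLKAGRS", "DEVDGSAA", "DEVDASGT", "IEPDSGVA"]
proteases  = ["protease_X", "protease_X", "caspase_3", "caspase_3", "granzyme_B"]

def positional_tokens(seq):
    # encode each residue as a position-tagged token, e.g. "P1_D"
    return " ".join(f"P{i}_{aa}" for i, aa in enumerate(seq, start=1))

vec = CountVectorizer(token_pattern=r"\S+", lowercase=False)
X = vec.fit_transform([positional_tokens(s) for s in substrates])

model = MultinomialNB().fit(X, proteases)
print(model.predict(vec.transform([positional_tokens("DEVDSSGA")])))
```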

  15. Forty years of secondhand smoke research: the gap between discovery and delivery.

    PubMed

    Harris, Jenine K; Luke, Douglas A; Zuckerman, Rachael B; Shelton, Sarah C

    2009-06-01

    Public health initiatives often focus on the discovery of risk factors associated with disease and death. Although this is an important step in protecting public health, recently the field has recognized that it is critical to move along the continuum from discovery of risk factors to delivery of interventions, and to improve the quality and speed of translating scientific discoveries into practice. To understand how public health problems move from discovery to delivery, citation network analysis was used to examine 1877 articles on secondhand smoke (SHS) published between 1965 and 2005. Data were collected and analyzed in 2006-2007. Citation patterns showed discovery and delivery to be distinct areas of SHS research. There was little cross-citation between discovery and delivery research, including only nine citation connections between the main paths. A discovery article was 83.5% less likely to cite a delivery article than to cite another discovery article (OR=0.165 [95% CI=0.139, 0.197]), and a delivery article was 64.3% less likely (OR=0.357 [95% CI=0.330, 0.386]) to cite a discovery article than to cite another delivery article. Research summaries, such as Surgeon General reports, were cited frequently and appear to bridge the discovery-delivery gap. There was a lack of cross-citation between discovery and delivery, even though they share the goal of understanding and reducing the impact of SHS. Reliance on research summaries, although they provide an important bridge between discovery and delivery, may slow the development of a field.
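
    The "less likely" percentages above are simply one minus the odds ratio; a quick check using the values reported in the abstract:

```python
# "X% less likely" is 1 - OR expressed as a percentage; both odds ratios
# below are taken directly from the abstract.
for label, oratio in [("discovery citing delivery", 0.165),
                      ("delivery citing discovery", 0.357)]:
    print(f"{label}: OR = {oratio} -> {1 - oratio:.1%} lower odds")
```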

  16. Sex-Specific Associations between Particulate Matter Exposure and Gene Expression in Independent Discovery and Validation Cohorts of Middle-Aged Men and Women.

    PubMed

    Vrijens, Karen; Winckelmans, Ellen; Tsamou, Maria; Baeyens, Willy; De Boever, Patrick; Jennen, Danyel; de Kok, Theo M; Den Hond, Elly; Lefebvre, Wouter; Plusquin, Michelle; Reynders, Hans; Schoeters, Greet; Van Larebeke, Nicolas; Vanpoucke, Charlotte; Kleinjans, Jos; Nawrot, Tim S

    2017-04-01

    Particulate matter (PM) exposure leads to premature death, mainly due to respiratory and cardiovascular diseases. Our objective was to identify transcriptomic biomarkers of air pollution exposure and effect in a healthy adult population. Microarray analyses were performed in 98 healthy volunteers (48 men, 50 women). The expression of eight sex-specific candidate biomarker genes (significantly associated with PM10 in the discovery cohort and with a reported link to air pollution-related disease) was measured with qPCR in an independent validation cohort (75 men, 94 women). Pathway analysis was performed using Gene Set Enrichment Analysis. Average daily PM2.5 and PM10 exposures over 2 years were estimated for each participant's residential address using spatiotemporal interpolation in combination with a dispersion model. Average long-term PM10 was 25.9 (± 5.4) and 23.7 (± 2.3) μg/m3 in the discovery and validation cohorts, respectively. In the discovery analysis, associations between PM10 and the expression of individual genes differed by sex. In the validation cohort, long-term PM10 was associated with the expression of DNAJB5 and EAPP in men and ARHGAP4 (p = 0.053) in women. AKAP6 and LIMK1 were significantly associated with PM10 in women, although associations differed in direction between the discovery and validation cohorts. Expression of the eight candidate genes in the discovery cohort differentiated between validation cohort participants with high versus low PM10 exposure (area under the receiver operating characteristic curve = 0.92; 95% CI: 0.85, 1.00; p = 0.0002 in men; 0.86; 95% CI: 0.76, 0.96; p = 0.004 in women). Expression of the sex-specific candidate genes identified in the discovery population predicted PM10 exposure in an independent cohort of adults from the same area. Confirmation in other populations may further support this as a new approach for exposure assessment, and may contribute to the discovery of molecular mechanisms for PM-induced health effects.
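
    A minimal sketch of the validation step described above, with simulated data: score participants by candidate-gene expression and summarize discrimination of high versus low PM10 exposure with the area under the ROC curve.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(1)
n = 169                                  # validation cohort size (75 men + 94 women)
expr = rng.normal(size=(n, 8))           # hypothetical expression of 8 candidate genes
high_pm10 = (expr[:, :2].sum(axis=1) + rng.normal(size=n)) > 0  # simulated class

score = LogisticRegression().fit(expr, high_pm10).predict_proba(expr)[:, 1]
print("AUC:", roc_auc_score(high_pm10, score))  # study reported 0.92 (men), 0.86 (women)
```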

  17. Discovery of 4 ms and 7 ms Pulsars in M15 (F & H)

    NASA Astrophysics Data System (ADS)

    Middleditch, J.

    1992-12-01

    Observations of M15 taken during Oct. 23-Nov. 1, 1991 with the Arecibo 305-m telescope at 430 MHz, which were analyzed using 2-billion-point Fourier transforms on supercomputers at Los Alamos National Laboratory, reveal two new ms pulsars in the globular cluster M15. The sixth and fastest yet discovered in this cluster, M15F, has a spin rate of 248.3 Hz (a period of about 4.0 ms), while the eighth and latest to be discovered in this cluster, M15H, has a spin rate of 148.3 Hz (about 6.7 ms), the only one known so far in the frequency interval of 100-200 Hz. Further details and implications of these discoveries will be discussed.

  18. Current NEO surveys

    NASA Astrophysics Data System (ADS)

    Larson, Stephen

    2007-05-01

    The state and discovery rate of current NEO surveys reflect incremental improvements in a number of areas, such as detector size and sensitivity, computing capacity, and the availability of larger apertures. The result has been an increased discovery rate even with the expected reduction in the number of objects left to discover. There are currently about 10 telescopes, ranging in size from 0.5 to 1.5 meters, carrying out full- or part-time regular surveying in both hemispheres. The sky is covered one to two times per lunation to V~19, with a band near the ecliptic covered to V~20.5. We review the current survey programs and their contribution towards the Spaceguard goal of discovering at least 90% of the NEOs larger than 1 km.

  19. Conditional robustness analysis for fragility discovery and target identification in biochemical networks and in cancer systems biology.

    PubMed

    Bianconi, Fortunato; Baldelli, Elisa; Ludovini, Vienna; Petricoin, Emanuel F; Crinò, Lucio; Valigi, Paolo

    2015-10-19

    The study of cancer therapy is a key issue in the field of oncology research, and the development of targeted therapies is one of the main problems currently under investigation. This is particularly relevant for tumor types in which traditional chemotherapy approaches often fail, such as lung cancer. We started from the general definition of robustness introduced by Kitano and applied it to the analysis of dynamical biochemical networks, proposing a new algorithm based on moment-independent analysis of input/output uncertainty. The framework utilizes novel computational methods that enable evaluation of model fragility with respect to quantitative performance measures and parameters, such as reaction rate constants and initial conditions. The algorithm generates a small subset of parameters that can be used to act on complex networks and to obtain the desired behaviors. We have applied the proposed framework to the EGFR-IGF1R signal transduction network, a crucial pathway in lung cancer, as an example of a Cancer Systems Biology application in drug discovery. Furthermore, we have tested our framework on a pulse generator network as an example of a Synthetic Biology application, thus proving the suitability of our methodology for the characterization of input/output synthetic circuits. The achieved results are of immediate practical application in computational biology, and while we demonstrate their use in two specific examples, they can in fact be used to study a wider class of biological systems.

  20. Advancements in Aptamer Discovery Technologies.

    PubMed

    Gotrik, Michael R; Feagin, Trevor A; Csordas, Andrew T; Nakamoto, Margaret A; Soh, H Tom

    2016-09-20

    Affinity reagents that specifically bind to their target molecules are invaluable tools in nearly every field of modern biomedicine. Nucleic acid-based aptamers offer many advantages in this domain, because they are chemically synthesized, stable, and economical. Despite these compelling features, aptamers are currently not widely used in comparison to antibodies. This is primarily because conventional aptamer-discovery techniques such as SELEX are time-consuming and labor-intensive and often fail to produce aptamers with binding performance comparable to that of antibodies. This Account describes a body of work from our laboratory in developing advanced methods for consistently producing high-performance aptamers with higher efficiency, fewer resources, and, most importantly, a greater probability of success. We describe our efforts in systematically transforming each major step of the aptamer discovery process: selection, analysis, and characterization. To improve selection, we have developed microfluidic devices (M-SELEX) that enable discovery of high-affinity aptamers after a minimal number of selection rounds by precisely controlling the target concentration and washing stringency. In terms of improving aptamer pool analysis, our group was the first to use high-throughput sequencing (HTS) for the discovery of new aptamers. We showed that tracking the enrichment trajectory of individual aptamer sequences enables the identification of high-performing aptamers without requiring full convergence of the selected aptamer pool. HTS is now widely used for aptamer discovery, and open-source software has become available to facilitate analysis. To improve binding characterization, we used HTS data to design custom aptamer arrays to measure the affinity and specificity of up to ~10^4 DNA aptamers in parallel as a means to rapidly discover high-quality aptamers. Most recently, our efforts have culminated in the invention of the "particle display" (PD) screening system, which transforms solution-phase aptamers into "aptamer particles" that can be individually screened at high throughput via fluorescence-activated cell sorting. Using PD, we have shown the feasibility of rapidly generating aptamers with exceptional affinities, even for proteins that have previously proven intractable to aptamer discovery. We are confident that these advanced aptamer-discovery methods will accelerate the discovery of aptamer reagents with excellent affinities and specificities, perhaps even exceeding those of the best monoclonal antibodies. Since aptamers are reproducible, renewable, stable, and can be distributed as sequence information, we anticipate that these affinity reagents will become even more valuable tools for both research and clinical applications.
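
    A minimal sketch of the enrichment-trajectory idea: rank candidates by the growth of their read fraction across selection rounds rather than waiting for pool convergence. The per-round counts below are hypothetical.

```python
round_counts = {                      # reads per sequence in rounds 1..3
    "seq_A": [10, 120, 900],
    "seq_B": [50, 60, 70],
    "seq_C": [5, 40, 400],
}
totals = [sum(c[i] for c in round_counts.values()) for i in range(3)]

def enrichment(counts):
    # fold-change in pool fraction between the first and last round
    fracs = [c / t for c, t in zip(counts, totals)]
    return fracs[-1] / fracs[0]

for seq, counts in sorted(round_counts.items(),
                          key=lambda kv: -enrichment(kv[1])):
    print(seq, f"enrichment = {enrichment(counts):.1f}x")
```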

  1. Quantifying enzymatic lysis: estimating the combined effects of chemistry, physiology and physics.

    PubMed

    Mitchell, Gabriel J; Nelson, Daniel C; Weitz, Joshua S

    2010-10-04

    The number of microbial pathogens resistant to antibiotics continues to increase even as the rate of discovery and approval of new antibiotic therapeutics steadily decreases. Many researchers have begun to investigate the therapeutic potential of naturally occurring lytic enzymes as an alternative to traditional antibiotics. However, direct characterization of lytic enzymes using techniques based on synthetic substrates is often difficult because lytic enzymes bind to the complex superstructure of intact cell walls. Here we present a new standard for the analysis of lytic enzymes based on turbidity assays which allow us to probe the dynamics of lysis without preparing a synthetic substrate. The challenge in the analysis of these assays is to infer the microscopic details of lysis from macroscopic turbidity data. We propose a model of enzymatic lysis that integrates the chemistry responsible for bond cleavage with the physical mechanisms leading to cell wall failure. We then present a solution to an inverse problem in which we estimate reaction rate constants and the heterogeneous susceptibility to lysis among target cells. We validate our model given simulated and experimental turbidity assays. The ability to estimate reaction rate constants for lytic enzymes will facilitate their biochemical characterization and development as antimicrobial therapeutics.
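
    A heavily simplified sketch of the inverse-problem step: fitting a lysis model to turbidity data. The authors' model couples bond-cleavage chemistry to cell-wall failure and heterogeneous susceptibility; a single-exponential decay stands in here purely to show the estimation machinery.

```python
import numpy as np
from scipy.optimize import curve_fit

def turbidity(t, od0, k):
    # optical density over time for a uniform population lysing at rate k
    # (a deliberate simplification of the published model)
    return od0 * np.exp(-k * t)

t = np.linspace(0, 30, 16)  # minutes
od = turbidity(t, 0.8, 0.12) + np.random.default_rng(2).normal(0, 0.01, t.size)

(od0_hat, k_hat), _ = curve_fit(turbidity, t, od, p0=(1.0, 0.05))
print(f"estimated lysis rate k = {k_hat:.3f} / min")
```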

  2. Appreciative Inquiry for quality improvement in primary care practices.

    PubMed

    Ruhe, Mary C; Bobiak, Sarah N; Litaker, David; Carter, Caroline A; Wu, Laura; Schroeder, Casey; Zyzanski, Stephen J; Weyer, Sharon M; Werner, James J; Fry, Ronald E; Stange, Kurt C

    2011-01-01

    To test the effect of an Appreciative Inquiry (AI) quality improvement strategy on clinical quality management and practice development outcomes. Appreciative inquiry enables the discovery of shared motivations, envisioning a transformed future, and learning around the implementation of a change process. Thirty diverse primary care practices were randomly assigned to receive an AI-based intervention focused on a practice-chosen topic and on improving preventive service delivery (PSD) rates. Medical-record review assessed change in PSD rates. Ethnographic field notes and observational checklist analysis used editing and immersion/crystallization methods to identify factors affecting intervention implementation and practice development outcomes. The PSD rates did not change. Field note analysis suggested that the intervention elicited core motivations, facilitated development of a shared vision, defined change objectives, and fostered respectful interactions. Practices most likely to implement the intervention or develop new practice capacities exhibited 1 or more of the following: support from key leader(s), a sense of urgency for change, a mission focused on serving patients, health care system and practice flexibility, and a history of constructive practice change. An AI approach and enabling practice conditions can lead to intervention implementation and practice development by connecting individual and practice strengths and motivations to the change objective.

  3. Appreciative Inquiry for Quality Improvement in Primary Care Practices

    PubMed Central

    Ruhe, Mary C.; Bobiak, Sarah N.; Litaker, David; Carter, Caroline A.; Wu, Laura; Schroeder, Casey; Zyzanski, Stephen; Weyer, Sharon M.; Werner, James J.; Fry, Ronald E.; Stange, Kurt C.

    2014-01-01

    Purpose: To test the effect of an Appreciative Inquiry (AI) quality improvement strategy on clinical quality management and practice development outcomes. AI enables discovery of shared motivations, envisioning of a transformed future, and learning around the implementation of a change process. Methods: Thirty diverse primary care practices were randomly assigned to receive an AI-based intervention focused on a practice-chosen topic and on improving preventive service delivery (PSD) rates. Medical record review assessed change in PSD rates. Ethnographic field notes and observational checklist analysis used editing and immersion/crystallization methods to identify factors affecting intervention implementation and practice development outcomes. Results: PSD rates did not change. Field note analysis suggested that the intervention elicited core motivations, facilitated development of a shared vision, defined change objectives and fostered respectful interactions. Practices most likely to implement the intervention or develop new practice capacities exhibited one or more of the following: support from key leader(s), a sense of urgency for change, a mission focused on serving patients, health care system and practice flexibility, and a history of constructive practice change. Conclusions: An AI approach and enabling practice conditions can lead to intervention implementation and practice development by connecting individual and practice strengths and motivations to the change objective. PMID:21192206

  4. Lysobacter species: a potential source of novel antibiotics.

    PubMed

    Panthee, Suresh; Hamamoto, Hiroshi; Paudel, Atmika; Sekimizu, Kazuhisa

    2016-11-01

    Infectious diseases threaten global health due to the ability of microbes to acquire resistance against clinically used antibiotics. Continuous discovery of antibiotics with a novel mode of action is thus required. Actinomycetes and fungi are currently the major sources of antibiotics, but the decreasing rate of discovery of novel antibiotics suggests that the focus should be changed to previously untapped groups of microbes. Lysobacter species have a genome size of ~6 Mb with a relatively high G + C content of 61-70% and are characterized by their ability to produce peptides that damage the cell walls or membranes of other microbes. Genome sequence analysis revealed that each Lysobacter species has gene clusters for the production of 12-16 secondary metabolites, most of which are peptides, thus making them 'peptide production specialists'. Given that the number of antibiotics isolated is much lower than the number of gene clusters harbored, further intensive studies of Lysobacter are likely to unearth novel antibiotics with profound biomedical applications. In this review, we summarize the structural diversity, activity and biosynthesis of lysobacterial antibiotics and highlight the importance of Lysobacter species for antibiotic production.

  5. Latin America Report

    DTIC Science & Technology

    1985-07-25

    renovation is not a recent discovery. In May 1984, I also rejected /Mrs Peron's/ offer to appoint me to the tactical command that she created, and I... been marked by emphasis placed on greater discoveries of reserves. For example, at present, the proven crude supplies will suffice to cover only 14... Cobre and Mid-Clarendon and provide other irrigation facilities where they are necessary throughout the country. ... Special rate of electricity

  6. Challenges in Biomarker Discovery: Combining Expert Insights with Statistical Analysis of Complex Omics Data

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    McDermott, Jason E.; Wang, Jing; Mitchell, Hugh D.

    2013-01-01

    The advent of high throughput technologies capable of comprehensive analysis of genes, transcripts, proteins and other significant biological molecules has provided an unprecedented opportunity for the identification of molecular markers of disease processes. However, it has simultaneously complicated the problem of extracting meaningful signatures of biological processes from these complex datasets. The process of biomarker discovery and characterization provides opportunities both for purely statistical and expert knowledge-based approaches and would benefit from improved integration of the two. Areas covered: In this review we will present examples of current practices for biomarker discovery from complex omic datasets and the challenges that have been encountered. We will then present a high-level review of data-driven (statistical) and knowledge-based methods applied to biomarker discovery, highlighting some current efforts to combine the two distinct approaches. Expert opinion: Effective, reproducible and objective tools for combining data-driven and knowledge-based approaches to biomarker discovery and characterization are key to future success in the biomarker field. We will describe our recommendations of possible approaches to this problem, including metrics for the evaluation of biomarkers.

  7. Applied metabolomics in drug discovery.

    PubMed

    Cuperlovic-Culf, M; Culf, A S

    2016-08-01

    The metabolic profile is a direct signature of phenotype and biochemical activity following any perturbation. Metabolites are small molecules present in a biological system, including natural products as well as drugs and their metabolism by-products, depending on the biological system studied. Metabolomics can provide activity information about possible novel drugs and drug scaffolds, indicate interesting targets for drug development, and suggest binding partners of compounds. Furthermore, metabolomics can be used for the discovery of novel natural products and in drug development. Metabolomics can enhance the discovery and testing of new drugs and provide insight into the on- and off-target effects of drugs. This review focuses primarily on the application of metabolomics in the discovery of active drugs from natural products, the analysis of chemical libraries, and the computational analysis of metabolic networks. Metabolomics methodology, both experimental and analytical, is fast developing. At the same time, databases of compounds are ever growing with the inclusion of more molecular and spectral information. An increasing number of systems are being represented by very detailed metabolic network models. Combining these experimental and computational tools with high throughput drug testing and drug discovery techniques can provide new promising compounds and leads.

  8. Citation Discovery Tools for Conducting Adaptive Meta-analyses to Update Systematic Reviews.

    PubMed

    Bae, Jong-Myon; Kim, Eun Hee

    2016-03-01

    The systematic review (SR) is a research methodology that aims to synthesize related evidence. Updating previously conducted SRs is necessary when new evidence has been produced, but no consensus has yet emerged on the appropriate update methodology. The authors have developed a new SR update method called 'adaptive meta-analysis' (AMA) using the 'cited by', 'similar articles', and 'related articles' citation discovery tools in the PubMed and Scopus databases. This study evaluates the usefulness of these citation discovery tools for updating SRs. Lists were constructed by applying the citation discovery tools in the two databases to the articles analyzed by a published SR. The degree of overlap between the lists and distribution of excluded results were evaluated. The articles ultimately selected for the SR update meta-analysis were found in the lists obtained from the 'cited by' and 'similar' tools in PubMed. Most of the selected articles appeared in both the 'cited by' lists in Scopus and PubMed. The Scopus 'related' tool did not identify the appropriate articles. The AMA, which involves using both citation discovery tools in PubMed, and optionally, the 'related' tool in Scopus, was found to be useful for updating an SR.
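
    A minimal sketch of pulling the two PubMed citation-discovery lists the AMA method relies on, via NCBI E-utilities. The linkname values ("pubmed_pubmed" for similar articles, "pubmed_pubmed_citedin" for cited-by) and the seed PMID are assumptions to be verified against current NCBI documentation.

```python
import requests

ELINK = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/elink.fcgi"

def linked_pmids(pmid, linkname):
    # one E-utilities call per citation-discovery tool; linkname values are
    # assumptions to check against current NCBI documentation
    params = {"dbfrom": "pubmed", "db": "pubmed", "id": pmid,
              "linkname": linkname, "retmode": "json"}
    data = requests.get(ELINK, params=params, timeout=30).json()
    for linkset in data.get("linksets", []):
        for db in linkset.get("linksetdbs", []):
            if db.get("linkname") == linkname:
                return db.get("links", [])
    return []

pmid = "12345678"  # hypothetical seed article from the SR being updated
print("cited by:", linked_pmids(pmid, "pubmed_pubmed_citedin"))
print("similar: ", linked_pmids(pmid, "pubmed_pubmed"))
```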

  9. A biological compression model and its applications.

    PubMed

    Cao, Minh Duc; Dix, Trevor I; Allison, Lloyd

    2011-01-01

    A biological compression model, the expert model, is presented that is superior to existing compression algorithms in both compression performance and speed. The model is able to compress whole eukaryotic genomes. Most importantly, the model provides a framework for knowledge discovery from biological data. It can be used for repeat element discovery, sequence alignment, and phylogenetic analysis. We demonstrate that the model can handle statistically biased sequences and distantly related sequences where conventional knowledge discovery tools often fail.
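
    A generic compression-based similarity measure (the normalized compression distance) illustrates the broader idea that compressibility can drive sequence comparison; note this sketch uses zlib, not the authors' expert model.

```python
import zlib

def ncd(x: bytes, y: bytes) -> float:
    # normalized compression distance: near 0 for very similar inputs,
    # approaching 1 for unrelated inputs
    cx, cy = len(zlib.compress(x)), len(zlib.compress(y))
    cxy = len(zlib.compress(x + y))
    return (cxy - min(cx, cy)) / max(cx, cy)

seq_a = b"ACGTACGTACGTACGT" * 20
seq_b = b"ACGTACGTACGAACGT" * 20          # one substitution per repeat unit
seq_c = b"GATTACACCGGTTAAGCCTTAGGA" * 13  # unrelated sequence

print("similar pair:  ", round(ncd(seq_a, seq_b), 3))
print("unrelated pair:", round(ncd(seq_a, seq_c), 3))
```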

  10. [Genetic mutation databases: stakes and perspectives for orphan genetic diseases].

    PubMed

    Humbertclaude, V; Tuffery-Giraud, S; Bareil, C; Thèze, C; Paulet, D; Desmet, F-O; Hamroun, D; Baux, D; Girardet, A; Collod-Béroud, G; Khau Van Kien, P; Roux, A-F; des Georges, M; Béroud, C; Claustres, M

    2010-10-01

    New technologies, which constantly become available for mutation detection and gene analysis, have contributed to an exponential rate of discovery of disease genes and variation in the human genome. The task of collecting and documenting this enormous amount of data in genetic databases represents a major challenge for the future of biological and medical science. The Locus Specific Databases (LSDBs) are so far the most efficient mutation databases. This review presents the main types of databases available for the analysis of mutations responsible for genetic disorders, as well as open perspectives for new therapeutic research or challenges for future medicine. Accurate and exhaustive collection of variations in human genomes will be crucial for research and personalized delivery of healthcare. Copyright © 2009 Elsevier Masson SAS. All rights reserved.

  11. Multi-parameter phenotypic profiling: using cellular effects to characterize small-molecule compounds.

    PubMed

    Feng, Yan; Mitchison, Timothy J; Bender, Andreas; Young, Daniel W; Tallarico, John A

    2009-07-01

    Multi-parameter phenotypic profiling of small molecules provides important insights into their mechanisms of action, as well as a systems level understanding of biological pathways and their responses to small molecule treatments. It therefore deserves more attention at an early step in the drug discovery pipeline. Here, we summarize the technologies that are currently in use for phenotypic profiling--including mRNA-, protein- and imaging-based multi-parameter profiling--in the drug discovery context. We think that an earlier integration of phenotypic profiling technologies, combined with effective experimental and in silico target identification approaches, can improve success rates of lead selection and optimization in the drug discovery process.

  12. Placental Proteomics: A Shortcut to Biological Insight

    PubMed Central

    Robinson, John M.; Vandré, Dale D.; Ackerman, William E.

    2012-01-01

    Proteomics analysis of biological samples has the potential to identify novel protein expression patterns and/or changes in protein expression patterns in different developmental or disease states. An important component of successful proteomics research, at least in its present form, is to reduce the complexity of the sample if it is derived from cells or tissues. One method to simplify complex tissues is to focus on a specific, highly purified sub-proteome. Using this approach we have developed methods to prepare highly enriched fractions of the apical plasma membrane of the syncytiotrophoblast. Through proteomics analysis of this fraction we have identified over five hundred proteins, several of which were previously not known to reside in the syncytiotrophoblast. Herein, we focus on two of these, dysferlin and myoferlin. These proteins, largely known from studies of skeletal muscle, may not have been found in the human placenta were it not for discovery-based proteomics analysis. This new knowledge, acquired through a discovery-driven approach, can now be applied for the generation of hypothesis-based experimentation. Thus discovery-based and hypothesis-based research are complementary approaches that, when coupled together, can hasten scientific discoveries. PMID:19070895

  13. Linkage effects between deposit discovery and postdiscovery exploratory drilling

    USGS Publications Warehouse

    Drew, Lawrence J.

    1975-01-01

    For the 1950-71 period of petroleum exploration in the Powder River Basin, northeastern Wyoming and southeastern Montana, three specific topics were investigated. First, the wildcat wells drilled during the ambient phases of exploration are estimated to have discovered 2.80 times as much petroleum per well as the wildcat wells drilled during the cyclical phases of exploration, periods when exploration plays were active. Second, the hypothesis was tested and verified that during ambient phases of exploration the discovery of deposits could be anticipated by a small but statistically significant rise in the ambient drilling rate during the year prior to the year of discovery. Closer examination of the data suggests that this anticipation effect decreases through time. Third, a regression model utilizing the two independent variables of (1) the volume of petroleum contained in each deposit discovered in a cell and the directly adjacent cells and (2) the respective depths of these deposits was constructed to predict the expected yearly cyclical wildcat drilling rate in four 30 by 30 min (approximately 860 mi2) sized cells. In two of these cells relatively large volumes of petroleum were discovered, whereas in the other two cells smaller volumes were discovered. The predicted and actual rates of wildcat drilling which occurred in each cell agreed rather closely.
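
    A minimal sketch, with invented numbers, of the two-variable regression described above: predicting the yearly cyclical wildcat drilling rate from discovered petroleum volume and deposit depth.

```python
import numpy as np

# columns: intercept, petroleum volume (MMbbl), deposit depth (kft); all hypothetical
X = np.array([[1, 120.0,  6.5],
              [1,  95.0,  7.0],
              [1,  20.0,  9.5],
              [1,  12.0, 10.0]])
y = np.array([34.0, 28.0, 6.0, 4.0])   # wildcat wells per year (hypothetical)

beta, *_ = np.linalg.lstsq(X, y, rcond=None)
print("coefficients (intercept, volume, depth):", np.round(beta, 3))
```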

  14. iPTF14yb: The First Discovery of a Gamma-Ray Burst Afterglow Independent of a High-Energy Trigger

    DOE PAGES

    Cenko, S. Bradley; Urban, Alex L.; Perley, Daniel A.; ...

    2015-04-20

    We report here the discovery by the Intermediate Palomar Transient Factory (iPTF) of iPTF14yb, a luminous (M_r ≈ -27.8 mag), cosmological (redshift 1.9733), rapidly fading optical transient. We demonstrate, based on probabilistic arguments and a comparison with the broader population, that iPTF14yb is the optical afterglow of the long-duration gamma-ray burst GRB140226A. This marks the first unambiguous discovery of a GRB afterglow prior to (and thus entirely independent of) an associated high-energy trigger. We estimate the rate of iPTF14yb-like sources (i.e., cosmologically distant relativistic explosions) based on iPTF observations, inferring an all-sky value of R_rel = 610 yr^-1 (68% confidence interval of 110-2000 yr^-1). Our derived rate is consistent (within the large uncertainty) with the all-sky rate of on-axis GRBs derived by the Swift satellite. Finally, we briefly discuss the implications of the nondetection to date of bona fide "orphan" afterglows (i.e., those lacking detectable high-energy emission) on GRB beaming and the degree of baryon loading in these relativistic jets.

  15. Long-Period Planets in Open Clusters and the Evolution of Planetary Systems

    NASA Astrophysics Data System (ADS)

    Quinn, Samuel N.; White, Russel; Latham, David W.; Stefanik, Robert

    2018-01-01

    Recent discoveries of giant planets in open clusters confirm that they do form and migrate in relatively dense stellar groups, though overall occurrence rates are not yet well constrained because the small sample of giant planets discovered thus far predominantly have short periods. Moreover, planet formation rates and the architectures of planetary systems in clusters may vary significantly -- e.g., due to intercluster differences in the chemical properties that regulate the growth of planetary embryos or in the stellar space density and binary populations, which can influence the dynamical evolution of planetary systems. Constraints on the population of long-period Jovian planets -- those representing the reservoir from which many hot Jupiters likely form, and which are most vulnerable to intracluster dynamical interactions -- can help quantify how the birth environment affects formation and evolution, particularly through comparison of populations possessing a range of ages and chemical and dynamical properties. From our ongoing RV survey of open clusters, we present the discovery of several long-period planets and candidate substellar companions in the Praesepe, Coma Berenices, and Hyades open clusters. From these discoveries, we improve estimates of giant planet occurrence rates in clusters, and we note that high eccentricities in several of these systems support the prediction that the birth environment helps shape planetary system architectures.

  16. Discovery radiomics via evolutionary deep radiomic sequencer discovery for pathologically proven lung cancer detection.

    PubMed

    Shafiee, Mohammad Javad; Chung, Audrey G; Khalvati, Farzad; Haider, Masoom A; Wong, Alexander

    2017-10-01

    While lung cancer is the second most diagnosed form of cancer in men and women, a sufficiently early diagnosis can be pivotal in patient survival rates. Imaging-based, or radiomics-driven, detection methods have been developed to aid diagnosticians, but largely rely on hand-crafted features that may not fully encapsulate the differences between cancerous and healthy tissue. Recently, the concept of discovery radiomics was introduced, where custom abstract features are discovered from readily available imaging data. We propose an evolutionary deep radiomic sequencer discovery approach based on evolutionary deep intelligence. Motivated by patient privacy concerns and the idea of operational artificial intelligence, the evolutionary deep radiomic sequencer discovery approach organically evolves increasingly more efficient deep radiomic sequencers that produce significantly more compact yet similarly descriptive radiomic sequences over multiple generations. As a result, this framework improves operational efficiency and enables diagnosis to be run locally at the radiologist's computer while maintaining detection accuracy. We evaluated the evolved deep radiomic sequencer (EDRS) discovered via the proposed evolutionary deep radiomic sequencer discovery framework against state-of-the-art radiomics-driven and discovery radiomics methods using clinical lung CT data with pathologically proven diagnostic data from the LIDC-IDRI dataset. The EDRS shows improved sensitivity (93.42%), specificity (82.39%), and diagnostic accuracy (88.78%) relative to previous radiomics approaches.
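
    For reference, the reported detection rates reduce to simple confusion-matrix ratios; the counts below are hypothetical and chosen only to show the definitions.

```python
tp, fn = 93, 7    # pathology-confirmed cancer: detected / missed (hypothetical)
tn, fp = 82, 18   # healthy tissue: correctly rejected / false alarms (hypothetical)

sensitivity = tp / (tp + fn)
specificity = tn / (tn + fp)
accuracy = (tp + tn) / (tp + fn + tn + fp)
print(f"sensitivity={sensitivity:.2%}  specificity={specificity:.2%}  "
      f"accuracy={accuracy:.2%}")
```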

  17. How Will We React to the Discovery of Extraterrestrial Life?

    PubMed

    Kwon, Jung Yul; Bercovici, Hannah L; Cunningham, Katja; Varnum, Michael E W

    2017-01-01

    How will humanity react to the discovery of extraterrestrial life? Speculation on this topic abounds, but empirical research is practically non-existent. We report the results of three empirical studies assessing psychological reactions to the discovery of extraterrestrial life using the Linguistic Inquiry and Word Count (LIWC) text analysis software. We examined language use in media coverage of past discovery announcements of this nature, with a focus on extraterrestrial microbial life (Pilot Study). A large online sample (N = 501) was asked to write about their own and humanity's reaction to a hypothetical announcement of such a discovery (Study 1), and an independent, large online sample (N = 256) was asked to read and respond to a newspaper story about the claim that fossilized extraterrestrial microbial life had been found in a meteorite of Martian origin (Study 2). Across these studies, we found that reactions were significantly more positive than negative, and more reward vs. risk oriented. A mini-meta-analysis revealed large overall effect sizes (positive vs. negative affect language: g = 0.98; reward vs. risk language: g = 0.81). We also found that people's forecasts of their own reactions showed a greater positivity bias than their forecasts of humanity's reactions (Study 1), and that responses to reading an actual announcement of the discovery of extraterrestrial microbial life showed a greater positivity bias than responses to reading an actual announcement of the creation of man-made synthetic life (Study 2). Taken together, this work suggests that our reactions to a future confirmed discovery of microbial extraterrestrial life are likely to be fairly positive.

  18. How Will We React to the Discovery of Extraterrestrial Life?

    PubMed Central

    Kwon, Jung Yul; Bercovici, Hannah L.; Cunningham, Katja; Varnum, Michael E. W.

    2018-01-01

    How will humanity react to the discovery of extraterrestrial life? Speculation on this topic abounds, but empirical research is practically non-existent. We report the results of three empirical studies assessing psychological reactions to the discovery of extraterrestrial life using the Linguistic Inquiry and Word Count (LIWC) text analysis software. We examined language use in media coverage of past discovery announcements of this nature, with a focus on extraterrestrial microbial life (Pilot Study). A large online sample (N = 501) was asked to write about their own and humanity’s reaction to a hypothetical announcement of such a discovery (Study 1), and an independent, large online sample (N = 256) was asked to read and respond to a newspaper story about the claim that fossilized extraterrestrial microbial life had been found in a meteorite of Martian origin (Study 2). Across these studies, we found that reactions were significantly more positive than negative, and more reward vs. risk oriented. A mini-meta-analysis revealed large overall effect sizes (positive vs. negative affect language: g = 0.98; reward vs. risk language: g = 0.81). We also found that people’s forecasts of their own reactions showed a greater positivity bias than their forecasts of humanity’s reactions (Study 1), and that responses to reading an actual announcement of the discovery of extraterrestrial microbial life showed a greater positivity bias than responses to reading an actual announcement of the creation of man-made synthetic life (Study 2). Taken together, this work suggests that our reactions to a future confirmed discovery of microbial extraterrestrial life are likely to be fairly positive. PMID:29367849

  19. Order Under Uncertainty: Robust Differential Expression Analysis Using Probabilistic Models for Pseudotime Inference

    PubMed Central

    Campbell, Kieran R.

    2016-01-01

    Single cell gene expression profiling can be used to quantify transcriptional dynamics in temporal processes, such as cell differentiation, using computational methods to label each cell with a 'pseudotime' where true time series experimentation is too difficult to perform. However, owing to the high variability in gene expression between individual cells, there is an inherent uncertainty in the precise temporal ordering of the cells. Pre-existing methods for pseudotime estimation have predominantly given point estimates, precluding a rigorous analysis of the implications of uncertainty. We use probabilistic modelling techniques to quantify pseudotime uncertainty and propagate this into downstream differential expression analysis. We demonstrate that reliance on a point estimate of pseudotime can lead to inflated false discovery rates and that probabilistic approaches provide greater robustness and measures of the temporal resolution that can be obtained from pseudotime inference. PMID:27870852
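
    A minimal sketch of the uncertainty-propagation idea with simulated data: instead of testing against one point-estimate ordering, draw many plausible pseudotime orderings and examine the spread of the resulting p-values. The noise model below is a stand-in, not the authors' probabilistic model.

```python
import numpy as np
from scipy.stats import pearsonr

rng = np.random.default_rng(3)
n_cells = 100
true_t = np.sort(rng.uniform(0, 1, n_cells))
expr = 2.0 * true_t + rng.normal(0, 1.0, n_cells)   # a genuinely dynamic gene

p_values = []
for _ in range(200):
    # posterior-like draws: noisy versions of the pseudotime estimate
    t_draw = true_t + rng.normal(0, 0.15, n_cells)
    p_values.append(pearsonr(t_draw, expr)[1])

print("p-value across orderings: median %.2e, 90th pct %.2e"
      % (np.median(p_values), np.percentile(p_values, 90)))
```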

  20. Discovery of four recessive developmental disorders using probabilistic genotype and phenotype matching among 4,125 families

    PubMed Central

    Ansari, Morad; Balasubramanian, Meena; Blyth, Moira; Brady, Angela F.; Clayton, Stephen; Cole, Trevor; Deshpande, Charu; Fitzgerald, Tomas W.; Foulds, Nicola; Francis, Richard; Gabriel, George; Gerety, Sebastian S.; Goodship, Judith; Hobson, Emma; Jones, Wendy D.; Joss, Shelagh; King, Daniel; Klena, Nikolai; Kumar, Ajith; Lees, Melissa; Lelliott, Chris; Lord, Jenny; McMullan, Dominic; O'Regan, Mary; Osio, Deborah; Piombo, Virginia; Prigmore, Elena; Rajan, Diana; Rosser, Elisabeth; Sifrim, Alejandro; Smith, Audrey; Swaminathan, Ganesh J.; Turnpenny, Peter; Whitworth, James; Wright, Caroline F.; Firth, Helen V.; Barrett, Jeffrey C.; Lo, Cecilia W.; FitzPatrick, David R.; Hurles, Matthew E.

    2018-01-01

    Discovery of most autosomal recessive disease genes has involved analysis of large, often consanguineous, multiplex families or small cohorts of unrelated individuals with a well-defined clinical condition. Discovery of novel dominant causes of rare, genetically heterogeneous developmental disorders has been revolutionized by exome analysis of large cohorts of phenotypically diverse parent-offspring trios [1,2]. Here we analysed 4,125 families with diverse, rare, genetically heterogeneous developmental disorders and identified four novel autosomal recessive disorders. These four disorders were identified by integrating Mendelian filtering (identifying probands with rare biallelic putatively damaging variants in the same gene) with statistical assessments of (i) the likelihood of sampling the observed genotypes from the general population, and (ii) the phenotypic similarity of patients with the same recessive candidate gene. This new paradigm promises to catalyse discovery of novel recessive disorders, especially those with less consistent or nonspecific clinical presentations, and those caused predominantly by compound heterozygous genotypes. PMID:26437029
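
    A minimal sketch of statistical assessment (i) above, under Hardy-Weinberg assumptions and an invented damaging-allele frequency: the chance of observing several probands with biallelic damaging genotypes in the same gene purely by sampling from the general population.

```python
from math import comb

q = 1e-3                     # hypothetical combined damaging-allele frequency
p_biallelic = q ** 2         # homozygote probability under Hardy-Weinberg
                             # (compound heterozygotes would add further terms)

n, k = 4125, 3               # families screened; probands sharing the genotype
p_fewer = sum(comb(n, i) * p_biallelic**i * (1 - p_biallelic)**(n - i)
              for i in range(k))
print(f"P(>= {k} probands by chance) = {1 - p_fewer:.2e}")
```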

  1. Discovery of four recessive developmental disorders using probabilistic genotype and phenotype matching among 4,125 families.

    PubMed

    Akawi, Nadia; McRae, Jeremy; Ansari, Morad; Balasubramanian, Meena; Blyth, Moira; Brady, Angela F; Clayton, Stephen; Cole, Trevor; Deshpande, Charu; Fitzgerald, Tomas W; Foulds, Nicola; Francis, Richard; Gabriel, George; Gerety, Sebastian S; Goodship, Judith; Hobson, Emma; Jones, Wendy D; Joss, Shelagh; King, Daniel; Klena, Nikolai; Kumar, Ajith; Lees, Melissa; Lelliott, Chris; Lord, Jenny; McMullan, Dominic; O'Regan, Mary; Osio, Deborah; Piombo, Virginia; Prigmore, Elena; Rajan, Diana; Rosser, Elisabeth; Sifrim, Alejandro; Smith, Audrey; Swaminathan, Ganesh J; Turnpenny, Peter; Whitworth, James; Wright, Caroline F; Firth, Helen V; Barrett, Jeffrey C; Lo, Cecilia W; FitzPatrick, David R; Hurles, Matthew E

    2015-11-01

    Discovery of most autosomal recessive disease-associated genes has involved analysis of large, often consanguineous multiplex families or small cohorts of unrelated individuals with a well-defined clinical condition. Discovery of new dominant causes of rare, genetically heterogeneous developmental disorders has been revolutionized by exome analysis of large cohorts of phenotypically diverse parent-offspring trios. Here we analyzed 4,125 families with diverse, rare and genetically heterogeneous developmental disorders and identified four new autosomal recessive disorders. These four disorders were identified by integrating Mendelian filtering (selecting probands with rare, biallelic and putatively damaging variants in the same gene) with statistical assessments of (i) the likelihood of sampling the observed genotypes from the general population and (ii) the phenotypic similarity of patients with recessive variants in the same candidate gene. This new paradigm promises to catalyze the discovery of novel recessive disorders, especially those with less consistent or nonspecific clinical presentations and those caused predominantly by compound heterozygous genotypes.

  2. Simultaneous Proteomic Discovery and Targeted Monitoring using Liquid Chromatography, Ion Mobility Spectrometry, and Mass Spectrometry*

    PubMed Central

    Burnum-Johnson, Kristin E.; Nie, Song; Casey, Cameron P.; Monroe, Matthew E.; Orton, Daniel J.; Ibrahim, Yehia M.; Gritsenko, Marina A.; Clauss, Therese R. W.; Shukla, Anil K.; Moore, Ronald J.; Purvine, Samuel O.; Shi, Tujin; Qian, Weijun; Liu, Tao; Baker, Erin S.; Smith, Richard D.

    2016-01-01

    Current proteomic approaches include both broad discovery measurements and quantitative targeted analyses. In many cases, discovery measurements are initially used to identify potentially important proteins (e.g. candidate biomarkers) and then targeted studies are employed to quantify a limited number of selected proteins. Both approaches, however, suffer from limitations. Discovery measurements aim to sample the whole proteome but have lower sensitivity, accuracy, and quantitation precision than targeted approaches, whereas targeted measurements are significantly more sensitive but only sample a limited portion of the proteome. Herein, we describe a new approach that performs both discovery and targeted monitoring (DTM) in a single analysis by combining liquid chromatography, ion mobility spectrometry and mass spectrometry (LC-IMS-MS). In DTM, heavy labeled target peptides are spiked into tryptic digests and both the labeled and unlabeled peptides are detected using LC-IMS-MS instrumentation. Compared with the broad LC-MS discovery measurements, DTM yields greater peptide/protein coverage and detects lower abundance species. DTM also achieved detection limits similar to selected reaction monitoring (SRM) indicating its potential for combined high quality discovery and targeted analyses, which is a significant step toward the convergence of discovery and targeted approaches. PMID:27670688
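
    A minimal sketch of the spike-in matching step: pairing each detected light peptide with its heavy-labeled counterpart by an expected mass shift. The +8.0142 Da shift assumed here corresponds to a 13C6,15N2 heavy lysine; all masses are hypothetical.

```python
HEAVY_SHIFT = 8.0142   # Da; assumed heavy-lysine label, arginine labels differ
TOL = 0.01             # Da matching tolerance

light_masses = [1045.532, 1380.711, 2210.902]   # hypothetical detections
heavy_masses = [1053.546, 1388.726, 1525.330]

for m in light_masses:
    # a light feature pairs with any heavy feature at m + HEAVY_SHIFT
    partners = [h for h in heavy_masses if abs(h - (m + HEAVY_SHIFT)) < TOL]
    print(f"{m:.3f} -> {partners or 'no heavy partner found'}")
```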

  3. Lessons from hot spot analysis for fragment-based drug discovery

    PubMed Central

    Hall, David R.; Vajda, Sandor

    2015-01-01

    Analysis of binding energy hot spots at protein surfaces can provide crucial insights into the prospects for successful application of fragment-based drug discovery (FBDD), and whether a fragment hit can be advanced into a high affinity, druglike ligand. The key factor is the strength of the top ranking hot spot, and how well a given fragment complements it. We show that published data are sufficient to provide a sophisticated and quantitative understanding of how hot spots derive from protein three-dimensional structure, and how their strength, number and spatial arrangement govern the potential for a surface site to bind to fragment-sized and larger ligands. This improved understanding provides important guidance for the effective application of FBDD in drug discovery. PMID:26538314

  4. Getting physical to fix pharma

    NASA Astrophysics Data System (ADS)

    Connelly, Patrick R.; Vuong, T. Minh; Murcko, Mark A.

    2011-09-01

    Powerful technologies allow the synthesis and testing of large numbers of new compounds, but the failure rate of pharmaceutical R&D remains very high. Greater understanding of the fundamental physical chemical behaviour of molecules could be the key to greatly enhancing the success rate of drug discovery.

  5. Subjective Cognitive Decline Is Associated With Altered Default Mode Network Connectivity in Individuals With a Family History of Alzheimer's Disease.

    PubMed

    Verfaillie, Sander C J; Pichet Binette, Alexa; Vachon-Presseau, Etienne; Tabrizi, Shirin; Savard, Mélissa; Bellec, Pierre; Ossenkoppele, Rik; Scheltens, Philip; van der Flier, Wiesje M; Breitner, John C S; Villeneuve, Sylvia

    2018-05-01

    Both subjective cognitive decline (SCD) and a family history of Alzheimer's disease (AD) portend risk of brain abnormalities and progression to dementia. Posterior default mode network (pDMN) connectivity is altered early in the course of AD. It is unclear whether SCD predicts similar outcomes in cognitively normal individuals with a family history of AD. We studied 124 asymptomatic individuals with a family history of AD (age 64 ± 5 years). Participants were categorized as having SCD if they reported that their memory was becoming worse (SCD+). We used extensive neuropsychological assessment to investigate five different cognitive domain performances at baseline (n = 124) and 1 year later (n = 59). We assessed interconnectivity among three a priori defined ROIs: pDMN, anterior ventral DMN, medial temporal memory system (MTMS), and the connectivity of each with the rest of brain. Sixty-eight (55%) participants reported SCD. Baseline cognitive performance was comparable between groups (all false discovery rate-adjusted p values > .05). At follow-up, immediate and delayed memory improved across groups, but the improvement in immediate memory was reduced in SCD+ compared with SCD- (all false discovery rate-adjusted p values < .05). When compared with SCD-, SCD+ subjects showed increased pDMN-MTMS connectivity (false discovery rate-adjusted p < .05). Higher connectivity between the MTMS and the rest of the brain was associated with better baseline immediate memory, attention, and global cognition, whereas higher MTMS and pDMN-MTMS connectivity were associated with lower immediate memory over time (all false discovery rate-adjusted p values < .05). SCD in cognitively normal individuals is associated with diminished immediate memory practice effects and a brain connectivity pattern that mirrors early AD-related connectivity failure. Copyright © 2017 Society of Biological Psychiatry. Published by Elsevier Inc. All rights reserved.

  6. Statistical testing and power analysis for brain-wide association study.

    PubMed

    Gong, Weikang; Wan, Lin; Lu, Wenlian; Ma, Liang; Cheng, Fan; Cheng, Wei; Grünewald, Stefan; Feng, Jianfeng

    2018-04-05

    The identification of connexel-wise associations, which involves examining functional connectivities between pairwise voxels across the whole brain, is both statistically and computationally challenging. Although such a connexel-wise methodology has recently been adopted by brain-wide association studies (BWAS) to identify connectivity changes in several mental disorders, such as schizophrenia, autism and depression, the multiple correction and power analysis methods designed specifically for connexel-wise analysis are still lacking. Therefore, we herein report the development of a rigorous statistical framework for connexel-wise significance testing based on the Gaussian random field theory. It includes controlling the family-wise error rate (FWER) of multiple hypothesis testings using topological inference methods, and calculating power and sample size for a connexel-wise study. Our theoretical framework can control the false-positive rate accurately, as validated empirically using two resting-state fMRI datasets. Compared with Bonferroni correction and false discovery rate (FDR), it can reduce false-positive rate and increase statistical power by appropriately utilizing the spatial information of fMRI data. Importantly, our method bypasses the need of non-parametric permutation to correct for multiple comparison, thus, it can efficiently tackle large datasets with high resolution fMRI images. The utility of our method is shown in a case-control study. Our approach can identify altered functional connectivities in a major depression disorder dataset, whereas existing methods fail. A software package is available at https://github.com/weikanggong/BWAS. Copyright © 2018 Elsevier B.V. All rights reserved.
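
    A simulated sketch of why spatially informed FWER control can beat Bonferroni on correlated statistics, the situation the random-field framework above exploits; the empirical max-statistic threshold here is a stand-in for the theory-based threshold.

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(4)
n_tests, n_sims, alpha = 5000, 1000, 0.05

# correlated null statistics: a shared component mimics spatial smoothness
shared = rng.normal(size=(n_sims, 1))
z = 0.7 * shared + 0.3 * rng.normal(size=(n_sims, n_tests))
z /= np.sqrt(0.7**2 + 0.3**2)                          # unit-variance nulls

t_max = np.quantile(np.abs(z).max(axis=1), 1 - alpha)  # empirical FWER threshold
t_bonf = norm.ppf(1 - alpha / (2 * n_tests))           # Bonferroni threshold

# with correlated tests the max-statistic threshold is lower, i.e. more powerful
print(f"max-statistic threshold: {t_max:.2f}   Bonferroni: {t_bonf:.2f}")
```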

  7. Analysis student self efficacy in terms of using Discovery Learning model with SAVI approach

    NASA Astrophysics Data System (ADS)

    Sahara, Rifki; Mardiyana, S., Dewi Retno Sari

    2017-12-01

    Often students are unable to achieve their academic potential, in part because they feel unsure that they are capable of completing the tasks assigned to them. For students, such beliefs are necessary. This kind of belief is called self efficacy. Self efficacy is not something present from birth or a permanent quality of an individual; it is the result of cognitive processes, meaning that one's self efficacy can be stimulated through learning activities. Self efficacy can be developed and enhanced by a learning model that stimulates students to foster confidence in their capabilities. One such model is the Discovery Learning model with the SAVI approach, a learning model that involves the active participation of students in exploring and discovering their own knowledge and using it in problem solving by utilizing all the sensory devices they have. This naturalistic qualitative research aims to analyze student self efficacy in terms of the use of the Discovery Learning model with the SAVI approach. The subjects of this study were 30 students, focused on eight students having high, medium, and low self efficacy, obtained through a purposive sampling technique. The data analysis used three stages: data reduction, data display, and conclusion drawing. Based on the results of the data analysis, it was concluded that the dimension of self efficacy that appeared most dominantly in learning with the Discovery Learning model with the SAVI approach is the magnitude dimension.

  8. Genomic prediction unifies animal and plant breeding programs to form platforms for biological discovery.

    PubMed

    Hickey, John M; Chiurugwi, Tinashe; Mackay, Ian; Powell, Wayne

    2017-08-30

    The rate of annual yield increases for major staple crops must more than double relative to current levels in order to feed a predicted global population of 9 billion by 2050. Controlled hybridization and selective breeding have been used for centuries to adapt plant and animal species for human use. However, achieving higher, sustainable rates of improvement in yields in various species will require renewed genetic interventions and dramatic improvement of agricultural practices. Genomic prediction of breeding values has the potential to improve selection, reduce costs and provide a platform that unifies breeding approaches, biological discovery, and tools and methods. Here we compare and contrast some animal and plant breeding approaches to make a case for bringing the two together through the application of genomic selection. We propose a strategy for the use of genomic selection as a unifying approach to deliver innovative 'step changes' in the rate of genetic gain at scale.
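
    A minimal sketch of genomic prediction on simulated data: ridge regression of phenotypes on genome-wide markers (a SNP-BLUP-style model), then prediction of breeding values for unphenotyped selection candidates.

```python
import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.default_rng(5)
n_train, n_cand, n_snps = 500, 100, 2000
M = rng.integers(0, 3, size=(n_train + n_cand, n_snps)).astype(float)  # 0/1/2 genotypes
effects = rng.normal(0, 0.05, n_snps)                 # simulated true SNP effects
y = M[:n_train] @ effects + rng.normal(0, 1.0, n_train)  # training phenotypes

model = Ridge(alpha=100.0).fit(M[:n_train], y)
gebv = model.predict(M[n_train:])        # genomic estimated breeding values
true_bv = M[n_train:] @ effects
print("prediction accuracy (corr):", np.corrcoef(gebv, true_bv)[0, 1].round(2))
```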

  9. SNP discovery through de novo deep sequencing using the next generation of DNA sequencers

    USDA-ARS?s Scientific Manuscript database

    The production of high volumes of DNA sequence data using new technologies has permitted more efficient identification of single nucleotide polymorphisms in vertebrate genomes. This chapter presented practical methodology for production and analysis of DNA sequence data for SNP discovery....

  10. The Discovery-Oriented Approach to Organic Chemistry. 6. Selective Reduction in Organic Chemistry: Reduction of Aldehydes in the Presence of Esters Using Sodium Borohydride

    ERIC Educational Resources Information Center

    Baru, Ashvin R.; Mohan, Ram S.

    2005-01-01

    A discovery-oriented lab experiment is developed that illustrates the chemoselective nature of reductions using sodium borohydride. Products are of sufficient purity to allow analysis by spectroscopy without further purification.

  11. The Gulf of Guinea and its Strategic Center Point: How Nigeria Will Bridge American and African Cooperation

    DTIC Science & Technology

    2009-04-01

    Within four years, there were 43 additional discoveries, the highest rate of any location in the world. Deep-water oil fields provide the region and... for continued discoveries of high quality crude oil is extremely likely, spurring interest and development in the region. The geographic

  12. Metabolic characterization of cultured mammalian cells by mass balance analysis, tracer labeling experiments and computer-aided simulations.

    PubMed

    Okahashi, Nobuyuki; Kohno, Susumu; Kitajima, Shunsuke; Matsuda, Fumio; Takahashi, Chiaki; Shimizu, Hiroshi

    2015-12-01

    Studying metabolic directions and flow rates in cultured mammalian cells can provide key information for understanding metabolic function in the fields of cancer research, drug discovery, stem cell biology, and antibody production. In this work, metabolic engineering methodologies including medium component analysis, 13C-labeling experiments, and computer-aided simulation analysis were applied to characterize the metabolic phenotype of soft tissue sarcoma cells derived from p53-null mice. Cells were cultured in medium containing [1-13C]glutamine to assess the level of reductive glutamine metabolism via the reverse reaction of isocitrate dehydrogenase (IDH). The specific uptake and production rates of glucose, organic acids, and the 20 amino acids were determined by time-course analysis of cultured media. Gas chromatography-mass spectrometry analysis of the 13C-labeling of citrate, succinate, fumarate, malate, and aspartate confirmed an isotopically steady state of the cultured cells. After removing the effect of naturally occurring isotopes, the direction of the IDH reaction was determined by computer-aided analysis. The results validated that metabolic engineering methodologies are applicable to soft tissue sarcoma cells derived from p53-null mice, and also demonstrated that reductive glutamine metabolism is active in p53-null soft tissue sarcoma cells under normoxia. Copyright © 2015 The Society for Biotechnology, Japan. Published by Elsevier B.V. All rights reserved.
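
    A simplified sketch of the natural-isotope correction mentioned above: model the measured mass-isotopomer distribution (MID) as the true labeling pattern convolved with binomial natural 13C incorporation, then invert the linear system. Only carbon is corrected, and all numbers are hypothetical.

```python
import numpy as np
from scipy.stats import binom

P13C = 0.0107   # natural abundance of 13C
n_c = 4         # carbons in the fragment (e.g. an aspartate backbone)

# C[j, i] = P(measured M+j | i carbons tracer-labeled): the remaining
# n_c - i carbons pick up j - i extra mass units from natural 13C.
C = np.zeros((n_c + 1, n_c + 1))
for i in range(n_c + 1):
    for j in range(i, n_c + 1):
        C[j, i] = binom.pmf(j - i, n_c - i, P13C)

measured = np.array([0.55, 0.14, 0.25, 0.04, 0.02])   # hypothetical MID
corrected = np.linalg.solve(C, measured)
corrected /= corrected.sum()                          # renormalize
print(np.round(corrected, 3))
```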

  13. The Production of 3D Tumor Spheroids for Cancer Drug Discovery

    PubMed Central

    Sant, Shilpa; Johnston, Paul A.

    2017-01-01

    New cancer drug approval rates are ≤ 5% despite significant investments in cancer research, drug discovery and development. One strategy to improve the rate of success of new cancer drugs transitioning into the clinic would be to more closely align the cellular models used in early lead discovery with pre-clinical animal models and patient tumors. For solid tumors, this would mandate the development and implementation of three-dimensional (3D) in vitro tumor models that more accurately recapitulate human solid tumor architecture and biology. Recent advances in tissue engineering and regenerative medicine have provided new techniques for 3D spheroid generation, and a variety of in vitro 3D cancer models are being explored for cancer drug discovery. Although homogeneous assay methods and high-content imaging approaches to assess tumor spheroid morphology, growth and viability have been developed, the implementation of 3D models in high-throughput screening (HTS) remains challenging for reasons that we discuss in this review. Perhaps the biggest obstacle to achieving acceptable HTS assay performance metrics occurs in 3D tumor models that produce spheroids with highly variable morphologies and/or sizes. We highlight two methods that produce uniform, size-controlled 3D multicellular tumor spheroids compatible with cancer drug research and HTS: tumor spheroids formed in ultra-low attachment microplates, or in polyethylene glycol dimethacrylate hydrogel microwell arrays. PMID:28647083

  14. Concepts of formal concept analysis

    NASA Astrophysics Data System (ADS)

    Žáček, Martin; Homola, Dan; Miarka, Rostislav

    2017-07-01

    The aim of this article is to apply Formal Concept Analysis to the concept of the world. Formal concept analysis (FCA), as a methodology of data analysis, information management, and knowledge representation, has the potential to be applied to a variety of linguistic problems. FCA is a mathematical theory of concepts and concept hierarchies that reflects an understanding of 'concept'. Formal concept analysis explicitly formalizes the extension and intension of a concept and their mutual relationships. A distinguishing feature of FCA is an inherent integration of three components of conceptual processing of data and knowledge, namely the discovery of and reasoning with concepts in data, the discovery of and reasoning with dependencies in data, and the visualization of data, concepts, and dependencies with folding/unfolding capabilities.
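    To make the extent/intent formalism concrete, here is a minimal sketch (in Python; the toy water-bodies context is illustrative and not from the article) that enumerates all formal concepts of a small binary object-attribute context by closing object sets under the two derivation maps:

```python
from itertools import combinations

# Toy object-attribute context (illustrative only).
context = {
    "lake":  {"natural", "stagnant", "inland"},
    "river": {"natural", "running", "inland"},
    "canal": {"artificial", "running", "inland"},
    "sea":   {"natural", "stagnant"},
}

def extent(attrs):
    """All objects that have every attribute in attrs."""
    return {o for o, a in context.items() if attrs <= a}

def intent(objs):
    """All attributes shared by every object in objs."""
    sets = [context[o] for o in objs]
    return set.intersection(*sets) if sets else set.union(*context.values())

# A formal concept is a pair (extent, intent) closed under both maps;
# enumerate them naively by closing every subset of objects.
concepts = set()
objects = list(context)
for r in range(len(objects) + 1):
    for combo in combinations(objects, r):
        e = extent(intent(set(combo)))          # closure of the object set
        concepts.add((frozenset(e), frozenset(intent(e))))

for e, i in sorted(concepts, key=lambda c: len(c[0])):
    print(sorted(e), "<->", sorted(i))
```

    Ordering these concepts by extent inclusion yields the concept hierarchy (lattice) that FCA visualizes with folding/unfolding.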

  15. Advances in synthetic peptides reagent discovery

    NASA Astrophysics Data System (ADS)

    Adams, Bryn L.; Sarkes, Deborah A.; Finch, Amethist S.; Stratis-Cullum, Dimitra N.

    2013-05-01

    Bacterial display technology offers a number of advantages over competing display technologies (e.g., phage) for the rapid discovery and development of peptides whose binding is targeted to materials ranging from biological hazards to inorganic metals. We have previously shown that discovery of synthetic peptide reagents utilizing bacterial display technology is relatively simple and rapid, making laboratory automation possible. This included extensive study of the protective antigen system of Bacillus anthracis, including development of discovery, characterization, and computational biology capabilities for in-silico optimization. Although the benefits towards CBD goals are evident, the impact is far-reaching due to our ability to understand and harness peptide interactions that are ultimately extendable to the hybrid biomaterials of the future. In this paper, we describe advances in peptide discovery including new target systems (e.g., non-biological materials), advanced library development, and clone analysis including integrated reporting.

  16. Application of Combination High-Throughput Phenotypic Screening and Target Identification Methods for the Discovery of Natural Product-Based Combination Drugs.

    PubMed

    Isgut, Monica; Rao, Mukkavilli; Yang, Chunhua; Subrahmanyam, Vangala; Rida, Padmashree C G; Aneja, Ritu

    2018-03-01

    Modern drug discovery efforts have had mediocre success rates with increasing developmental costs, and this has encouraged pharmaceutical scientists to seek innovative approaches. Recently with the rise of the fields of systems biology and metabolomics, network pharmacology (NP) has begun to emerge as a new paradigm in drug discovery, with a focus on multiple targets and drug combinations for treating disease. Studies on the benefits of drug combinations lay the groundwork for a renewed focus on natural products in drug discovery. Natural products consist of a multitude of constituents that can act on a variety of targets in the body to induce pharmacodynamic responses that may together culminate in an additive or synergistic therapeutic effect. Although natural products cannot be patented, they can be used as starting points in the discovery of potent combination therapeutics. The optimal mix of bioactive ingredients in natural products can be determined via phenotypic screening. The targets and molecular mechanisms of action of these active ingredients can then be determined using chemical proteomics, and by implementing a reverse pharmacokinetics approach. This review article provides evidence supporting the potential benefits of natural product-based combination drugs, and summarizes drug discovery methods that can be applied to this class of drugs. © 2017 Wiley Periodicals, Inc.

  17. Benchmarking quantitative label-free LC-MS data processing workflows using a complex spiked proteomic standard dataset.

    PubMed

    Ramus, Claire; Hovasse, Agnès; Marcellin, Marlène; Hesse, Anne-Marie; Mouton-Barbosa, Emmanuelle; Bouyssié, David; Vaca, Sebastian; Carapito, Christine; Chaoui, Karima; Bruley, Christophe; Garin, Jérôme; Cianférani, Sarah; Ferro, Myriam; Van Dorsselaer, Alain; Burlet-Schiltz, Odile; Schaeffer, Christine; Couté, Yohann; Gonzalez de Peredo, Anne

    2016-01-30

    Proteomic workflows based on nanoLC-MS/MS data-dependent-acquisition analysis have progressed tremendously in recent years. High-resolution and fast sequencing instruments have enabled the use of label-free quantitative methods, based either on spectral counting or on MS signal analysis, which appear as an attractive way to analyze differential protein expression in complex biological samples. However, the computational processing of the data for label-free quantification still remains a challenge. Here, we used a proteomic standard composed of an equimolar mixture of 48 human proteins (Sigma UPS1) spiked at different concentrations into a background of yeast cell lysate to benchmark several label-free quantitative workflows, involving different software packages developed in recent years. This experimental design allowed us to finely assess their performance in terms of sensitivity and false discovery rate, by measuring the numbers of true and false positives (UPS1 and yeast background proteins, respectively, found as differential). The spiked standard dataset has been deposited to the ProteomeXchange repository with the identifier PXD001819 and can be used to benchmark other label-free workflows, adjust software parameter settings, improve algorithms for extraction of the quantitative metrics from raw MS data, or evaluate downstream statistical methods. Bioinformatic pipelines for label-free quantitative analysis must be objectively evaluated in their ability to detect variant proteins with good sensitivity and low false discovery rate in large-scale proteomic studies. This can be done through the use of complex spiked samples, for which the "ground truth" of variant proteins is known, allowing a statistical evaluation of the performance of the data processing workflow. We provide here such a controlled standard dataset and used it to evaluate the performance of several label-free bioinformatics tools (including MaxQuant, Skyline, MFPaQ, IRMa-hEIDI and Scaffold) in different workflows, for detection of variant proteins with different absolute expression levels and fold change values. The dataset presented here can be useful for tuning software tool parameters, and also for testing new algorithms for label-free quantitative analysis, or for evaluation of downstream statistical methods. Copyright © 2015 Elsevier B.V. All rights reserved.
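    Because the ground truth of such a spiked design is known, any workflow's output can be scored directly. The sketch below is a hedged illustration (the accession prefixes and the reported set are hypothetical, not the dataset's actual nomenclature) of computing sensitivity and the empirical false discovery rate for the proteins a tool reports as differential:

```python
# Score a label-free workflow against a spiked ground truth: proteins marked
# as UPS1 are true variants, yeast background proteins are not. The
# `is_variant` predicate below is an illustrative assumption.

def score_workflow(reported, is_variant=lambda p: p.startswith("UPS1_")):
    tp = sum(1 for p in reported if is_variant(p))   # spiked proteins found
    fp = len(reported) - tp                          # background called differential
    n_spiked = 48                                    # UPS1 proteins in the mixture
    sensitivity = tp / n_spiked
    fdr = fp / len(reported) if reported else 0.0    # empirical FDR
    return sensitivity, fdr

sens, fdr = score_workflow({"UPS1_P06732", "UPS1_P02768", "YEAST_ENO1"})
print(f"sensitivity = {sens:.2f}, empirical FDR = {fdr:.2f}")
```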

  18. WebMOTIFS: automated discovery, filtering and scoring of DNA sequence motifs using multiple programs and Bayesian approaches

    PubMed Central

    Romer, Katherine A.; Kayombya, Guy-Richard; Fraenkel, Ernest

    2007-01-01

    WebMOTIFS provides a web interface that facilitates the discovery and analysis of DNA-sequence motifs. Several studies have shown that the accuracy of motif discovery can be significantly improved by using multiple de novo motif discovery programs and using randomized control calculations to identify the most significant motifs or by using Bayesian approaches. WebMOTIFS makes it easy to apply these strategies. Using a single submission form, users can run several motif discovery programs and score, cluster and visualize the results. In addition, the Bayesian motif discovery program THEME can be used to determine the class of transcription factors that is most likely to regulate a set of sequences. Input can be provided as a list of gene or probe identifiers. Used with the default settings, WebMOTIFS accurately identifies biologically relevant motifs from diverse data in several species. WebMOTIFS is freely available at http://fraenkel.mit.edu/webmotifs. PMID:17584794

  19. Human Disease Models in Drosophila melanogaster and the Role of the Fly in Therapeutic Drug Discovery

    PubMed Central

    Pandey, Udai Bhan

    2011-01-01

    The common fruit fly, Drosophila melanogaster, is a well studied and highly tractable genetic model organism for understanding molecular mechanisms of human diseases. Many basic biological, physiological, and neurological properties are conserved between mammals and D. melanogaster, and nearly 75% of human disease-causing genes are believed to have a functional homolog in the fly. In the discovery process for therapeutics, traditional approaches employ high-throughput screening for small molecules that is based primarily on in vitro cell culture, enzymatic assays, or receptor binding assays. The majority of positive hits identified through these types of in vitro screens, unfortunately, are found to be ineffective and/or toxic in subsequent validation experiments in whole-animal models. New tools and platforms are needed in the discovery arena to overcome these limitations. The incorporation of D. melanogaster into the therapeutic discovery process holds tremendous promise for an enhanced rate of discovery of higher quality leads. D. melanogaster models of human diseases provide several unique features such as powerful genetics, highly conserved disease pathways, and very low comparative costs. The fly can effectively be used for low- to high-throughput drug screens as well as in target discovery. Here, we review the basic biology of the fly and discuss models of human diseases and opportunities for therapeutic discovery for central nervous system disorders, inflammatory disorders, cardiovascular disease, cancer, and diabetes. We also provide information and resources for those interested in pursuing fly models of human disease, as well as those interested in using D. melanogaster in the drug discovery process. PMID:21415126

  20. Genomics and transcriptomics in drug discovery.

    PubMed

    Dopazo, Joaquin

    2014-02-01

    The popularization of genomic high-throughput technologies is causing a revolution in biomedical research and, particularly, is transforming the field of drug discovery. Systems biology offers a framework to understand the extensive human genetic heterogeneity revealed by genomic sequencing in the context of the network of functional, regulatory and physical protein-drug interactions. Thus, approaches to find biomarkers and therapeutic targets will have to take into account the complex system nature of the relationships of the proteins with the disease. Pharmaceutical companies will have to reorient their drug discovery strategies considering the human genetic heterogeneity. Consequently, modeling and computational data analysis will have an increasingly important role in drug discovery. Copyright © 2013 Elsevier Ltd. All rights reserved.

  1. Sports Stars: Analyzing the Performance of Astronomers at Visualization-based Discovery

    NASA Astrophysics Data System (ADS)

    Fluke, C. J.; Parrington, L.; Hegarty, S.; MacMahon, C.; Morgan, S.; Hassan, A. H.; Kilborn, V. A.

    2017-05-01

    In this data-rich era of astronomy, there is a growing reliance on automated techniques to discover new knowledge. The role of the astronomer may change from being a discoverer to being a confirmer. But what do astronomers actually look at when they distinguish between “sources” and “noise?” What are the differences between novice and expert astronomers when it comes to visual-based discovery? Can we identify elite talent or coach astronomers to maximize their potential for discovery? By looking to the field of sports performance analysis, we consider an established, domain-wide approach, where the expertise of the viewer (i.e., a member of the coaching team) plays a crucial role in identifying and determining the subtle features of gameplay that provide a winning advantage. As an initial case study, we investigate whether the SportsCode performance analysis software can be used to understand and document how an experienced H I astronomer makes discoveries in spectral data cubes. We find that the process of timeline-based coding can be applied to spectral cube data by mapping spectral channels to frames within a movie. SportsCode provides a range of easy-to-use methods for annotation, including feature-based codes and labels, text annotations associated with codes, and image-based drawing. The outputs, including instance movies that are uniquely associated with coded events, provide the basis for a training program or team-based analysis that could be used in unison with discipline-specific analysis software. In this coordinated approach to visualization and analysis, SportsCode can act as a visual notebook, recording the insight and decisions in partnership with established analysis methods. Alternatively, in situ annotation and coding of features would be a valuable addition to existing and future visualization and analysis packages.

  2. Statistics of Petroleum Exploration in the World Outside the United States and Canada Through 2001

    USGS Publications Warehouse

    Attanasi, E.D.; Freeman, P.A.; Glovier, Jennifer A.

    2007-01-01

    Future oil and gas supplies depend, in part, on the reserves that are expected to be added through exploration and new discoveries. This Circular presents a summary of the statistics and an analysis of petroleum exploration in the world outside the United States and Canada (the study area) through 2001. It updates U.S. Geological Survey Circular 1096 (by E.D. Attanasi and D.H. Root, 1993) and expands coverage of the statistics to areas where drilling and discovery data have recently become available. These new areas include China, the formerly Communist countries of Eastern Europe, and the countries that once were part of the former Soviet Union in Europe and Asia. Data are presented by country but are organized by petroleum provinces delineated by the U.S. Geological Survey World Energy Assessment Team (USGS Digital Data Series DDS-60, published in 2000). The data and analysis are presented in maps and graphs, providing a visual summary of the exploration maturity of an area. The maps show the delineated prospective areas and explored areas through 2001; explored areas have a drilling density that would rule out the occurrence of undetected large petroleum accumulations. Graphs summarize the exploration yields in terms of cumulative recoverable discovered oil and gas by delineated prospective area. From 1992 through 2001 in areas outside the United States and Canada, the delineated prospective area expanded at a rate of about 50,000 square miles per year while the explored area grew at the rate of about 11,000 square miles per year. The delineated prospective area established by 1970 contains about 75 percent of the oil discovered to date in the study area. This area is slightly less than 40 percent of the delineated prospective area established through 2001. Maps and graphs show the extension of the delineated prospective area to deepwater areas offshore of Brazil and West Africa. From 1991 through 2000, offshore discoveries accounted for 59 percent of the oil and 77 percent of the gas discovered in the study area. The petroleum industry's decision to incur the greater costs of moving offshore and into deeper waters appears to be a response to the absence of onshore prospects of comparable quality. Where natural gas can be commercially developed and marketed, data show an expansion of exploration to target gas-prone areas.

  3. Enhanced HTS hit selection via a local hit rate analysis.

    PubMed

    Posner, Bruce A; Xi, Hualin; Mills, James E J

    2009-10-01

    The postprocessing of high-throughput screening (HTS) results is complicated by the occurrence of false positives (inactive compounds misidentified as active by the primary screen) and false negatives (active compounds misidentified as inactive by the primary screen). An activity cutoff is frequently used to select "active" compounds from HTS data; however, this approach is insensitive to both false positives and false negatives. An alternative method that can minimize the occurrence of these artifacts will increase the efficiency of hit selection and therefore lead discovery. In this work, rather than merely using the activity of a given compound, we look at the presence and absence of activity among all compounds in its "chemical space neighborhood" to give a degree of confidence in its activity. We demonstrate that this local hit rate (LHR) analysis method outperforms hit selection based on ranking by primary screen activity values across ten diverse high-throughput screens, spanning both cell-based and biochemical assay formats of varying biology and robustness. On average, the local hit rate analysis method was approximately 2.3-fold and approximately 1.3-fold more effective in identifying active compounds and active chemical series, respectively, than selection based on primary activity alone. Moreover, when applied to finding false negatives, this method was 2.3-fold better than ranking by primary activity alone. In most cases, novel hit series were identified that would have otherwise been missed. Additional uses of and observations regarding this HTS analysis approach are also discussed.
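    The idea lends itself to a compact sketch: score each compound by the hit rate among its nearest neighbors in chemical space, so that singleton actives in otherwise inactive neighborhoods rank below actives drawn from consistent chemical series. The fingerprints, similarity measure, and neighborhood size below are illustrative choices, not the published protocol:

```python
import numpy as np

def local_hit_rate(fingerprints, is_hit, k=10):
    """Fraction of primary-screen hits among each compound's k nearest
    neighbors (Tanimoto similarity on binary fingerprints)."""
    X = np.asarray(fingerprints, dtype=float)
    hits = np.asarray(is_hit, dtype=float)
    inter = X @ X.T                                  # shared on-bits
    counts = X.sum(axis=1)
    union = counts[:, None] + counts[None, :] - inter
    sim = np.where(union > 0, inter / np.maximum(union, 1), 0.0)
    np.fill_diagonal(sim, -1.0)                      # exclude self-matches
    nbrs = np.argsort(-sim, axis=1)[:, :k]           # k most similar compounds
    return hits[nbrs].mean(axis=1)

rng = np.random.default_rng(0)
fps = rng.integers(0, 2, size=(100, 64))             # toy 64-bit fingerprints
active = rng.random(100) < 0.05                      # ~5% primary hit rate
print(local_hit_rate(fps, active)[:5])
```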

  4. ADDME – Avoiding Drug Development Mistakes Early: central nervous system drug discovery perspective

    PubMed Central

    Tsaioun, Katya; Bottlaender, Michel; Mabondzo, Aloise

    2009-01-01

    The advent of early absorption, distribution, metabolism, excretion, and toxicity (ADMET) screening has increased the attrition rate of weak drug candidates early in the drug-discovery process, and decreased the proportion of compounds failing in clinical trials for ADMET reasons. This paper reviews the history of ADMET screening and its place in pharmaceutical development, and central nervous system drug discovery in particular. Assays that have been developed in response to specific needs and improvements in technology that result in higher throughput and greater accuracy of prediction of human mechanisms of absorption and toxicity are discussed. The paper concludes with the authors' forecast of new models that will better predict human efficacy and toxicity. PMID:19534730

  5. How to revive breakthrough innovation in the pharmaceutical industry.

    PubMed

    Munos, Bernard H; Chin, William W

    2011-06-29

    Over the past 20 years, pharmaceutical companies have implemented conservative management practices to improve the predictability of therapeutics discovery and success rates of drug candidates. This approach has often yielded compounds that are only marginally better than existing therapies, yet require larger, longer, and more complex trials. To fund them, companies have shifted resources away from drug discovery to late clinical development; this has hurt innovation and amplified the crisis brought by the expiration of patents on many best-selling drugs. Here, we argue that more breakthrough therapeutics will reach patients only if the industry ceases to pursue "safe" incremental innovation, re-engages in high-risk discovery research, and adopts collaborative innovation models that allow sharing of knowledge and costs among collaborators.

  6. Increasing body mass index predicts increasing difficulty, failure rate, and time to discovery of failure of epidural anesthesia in laboring patients.

    PubMed

    Kula, Ayse O; Riess, Matthias L; Ellinas, Elizabeth H

    2017-02-01

    Obese parturients both greatly benefit from neuraxial techniques and may represent a technical challenge to obstetric anesthesiologists. Several studies address the topic of obesity and neuraxial analgesia in general, but few offer well described definitions or rates of "difficulty" and "failure" of labor epidural analgesia. Providing those definitions, we hypothesized that increasing body mass index (BMI) is associated with negative outcomes in both categories and increased time needed for epidural placement. Single center retrospective chart review. Labor and Delivery Unit of an inner city academic teaching hospital. 2485 parturients, ASA status 2 to 4, receiving labor epidural analgesia for anticipated vaginal delivery. None. We reviewed quality assurance and anesthesia records over a 12-month period. "Failure" was defined as either inadequate analgesia or a positive test dose, requiring replacement, and/or when the anesthesia record stated they failed. "Difficulty" was defined as six or more needle redirections or a note indicating difficulty in the anesthesia record. Overall epidural failure and difficulty rates were 4.3% and 3.0%, respectively. Patients with a BMI of 30 kg/m² or higher had a higher chance of both failure and difficulty, with two- and almost three-fold increases, respectively. Regression analysis indicated that failure was best predicted by BMI and less provider training while difficulty was best predicted by BMI. Additionally, increased BMI was associated with increased time to discovery of epidural catheter failure. Obesity is associated with increasing technical difficulty and failure of neuraxial analgesia for labor. Practitioners should consider allotting extra time for obese parturients in order to manage potential problems. Copyright © 2016 Elsevier Inc. All rights reserved.

  7. Identifying significant gene‐environment interactions using a combination of screening testing and hierarchical false discovery rate control

    PubMed Central

    Shen, Li; Saykin, Andrew J.; Williams, Scott M.; Moore, Jason H.

    2016-01-01

    Although gene-environment (G×E) interactions play an important role in many biological systems, detecting these interactions within genome-wide data can be challenging due to the loss in statistical power incurred by multiple hypothesis correction. To address the challenge of poor power and the limitations of existing multistage methods, we recently developed a screening-testing approach for G×E interaction detection that combines elastic net penalized regression with joint estimation to support a single omnibus test for the presence of G×E interactions. In our original work on this technique, however, we did not assess type I error control or power and evaluated the method using just a single, small bladder cancer data set. In this paper, we extend the original method in two important directions and provide a more rigorous performance evaluation. First, we introduce a hierarchical false discovery rate approach to formally assess the significance of individual G×E interactions. Second, to support the analysis of truly genome-wide data sets, we incorporate a score statistic-based prescreening step to reduce the number of single nucleotide polymorphisms prior to fitting the first stage penalized regression model. To assess the statistical properties of our method, we compare the type I error rate and statistical power of our approach with competing techniques using both simple simulation designs as well as designs based on real disease architectures. Finally, we demonstrate the ability of our approach to identify biologically plausible SNP-education interactions relative to Alzheimer's disease status using genome-wide association study data from the Alzheimer's Disease Neuroimaging Initiative (ADNI). PMID:27578615

  8. Lessons from Hot Spot Analysis for Fragment-Based Drug Discovery.

    PubMed

    Hall, David R; Kozakov, Dima; Whitty, Adrian; Vajda, Sandor

    2015-11-01

    Analysis of binding energy hot spots at protein surfaces can provide crucial insights into the prospects for successful application of fragment-based drug discovery (FBDD), and whether a fragment hit can be advanced into a high-affinity, drug-like ligand. The key factor is the strength of the top ranking hot spot, and how well a given fragment complements it. We show that published data are sufficient to provide a sophisticated and quantitative understanding of how hot spots derive from a protein 3D structure, and how their strength, number, and spatial arrangement govern the potential for a surface site to bind to fragment-sized and larger ligands. This improved understanding provides important guidance for the effective application of FBDD in drug discovery. Copyright © 2015 Elsevier Ltd. All rights reserved.

  9. IMG-ABC: An Atlas of Biosynthetic Gene Clusters to Fuel the Discovery of Novel Secondary Metabolites

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Chen, I-Min; Chu, Ken; Ratner, Anna

    2014-10-28

    In the discovery of secondary metabolites (SMs), large-scale analysis of sequence data is a promising exploration path that remains largely underutilized due to the lack of relevant computational resources. We present IMG-ABC (https://img.jgi.doe.gov/abc/) -- An Atlas of Biosynthetic gene Clusters within the Integrated Microbial Genomes (IMG) system. IMG-ABC is a rich repository of both validated and predicted biosynthetic clusters (BCs) in cultured isolates, single-cells and metagenomes, linked with the SM chemicals they produce and enhanced with focused analysis tools within IMG. The underlying scalable framework enables traversal of phylogenetic dark matter and chemical structure space -- serving as a doorway to a new era in the discovery of novel molecules.

  10. Multiplicity Control in Structural Equation Modeling

    ERIC Educational Resources Information Center

    Cribbie, Robert A.

    2007-01-01

    Researchers conducting structural equation modeling analyses rarely, if ever, control for the inflated probability of Type I errors when evaluating the statistical significance of multiple parameters in a model. In this study, the Type I error control, power and true model rates of familywise and false discovery rate controlling procedures were…
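    For reference, the best-known false discovery rate controlling procedure evaluated in studies of this kind is the Benjamini-Hochberg step-up rule; a minimal sketch:

```python
import numpy as np

def benjamini_hochberg(pvals, q=0.05):
    """Return a boolean mask of hypotheses rejected at FDR level q."""
    p = np.asarray(pvals)
    m = len(p)
    order = np.argsort(p)
    thresholds = q * np.arange(1, m + 1) / m         # i/m * q for ranked p-values
    below = p[order] <= thresholds
    k = np.max(np.nonzero(below)[0]) + 1 if below.any() else 0
    rejected = np.zeros(m, dtype=bool)
    rejected[order[:k]] = True                       # reject the k smallest p-values
    return rejected

print(benjamini_hochberg([0.001, 0.008, 0.039, 0.041, 0.09, 0.20, 0.74]))
# -> [ True  True False False False False False]
```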

  11. 37 CFR 351.5 - Discovery in royalty rate proceedings.

    Code of Federal Regulations, 2010 CFR

    2010-07-01

    ... proceedings. 351.5 Section 351.5 Patents, Trademarks, and Copyrights COPYRIGHT ROYALTY BOARD, LIBRARY OF... matter, not privileged, that is relevant to the claim or defense of any party. Relevant information need... information and materials. (1) In any royalty rate proceeding scheduled to commence prior to January 1, 2011...

  12. Targeted proteomics guided by label-free global proteome analysis in saliva reveal transition signatures from health to periodontal disease.

    PubMed

    Bostanci, Nagihan; Selevsek, Nathalie; Wolski, Witold; Grossmann, Jonas; Bao, Kai; Wahlander, Asa; Trachsel, Christian; Schlapbach, Ralph; Özturk, Veli Özgen; Afacan, Beral; Emingil, Gulnur; Belibasakis, Georgios N

    2018-04-02

    Periodontal diseases are among the most prevalent worldwide, but largely silent, chronic diseases. They affect the tooth-supporting tissues with multiple ramifications on life quality. Their early diagnosis is still challenging, due to lack of appropriate molecular diagnostic methods. Saliva offers a non-invasively collectable reservoir of clinically relevant biomarkers, which, if utilized efficiently, could facilitate early diagnosis and monitoring of ongoing disease. Despite several novel protein markers being recently enlisted by discovery proteomics, their routine diagnostic application is hampered by the lack of validation platforms that allow for rapid, accurate and simultaneous quantification of multiple proteins in large cohorts. We carried out a pipeline of two proteomic platforms: first, we applied open-ended label-free quantitative (LFQ) proteomics for discovery in saliva (n=67; health, gingivitis, and periodontitis), followed by selected-reaction monitoring (SRM)-targeted proteomics for validation in an independent cohort (n=82). The LFQ platform led to the discovery of 119 proteins with at least two-fold significant difference between health and disease. The 65 proteins chosen for the subsequent SRM platform included 50 related proteins derived from the significantly enriched processes of the LFQ data, 11 from literature-mining, and four housekeeping ones. Among those, 60 were reproducibly quantifiable proteins (92% success rate), represented by a total of 143 peptides. Machine-learning modeling led to a narrowed-down panel of five proteins of high predictive value for periodontal diseases (higher in disease: Matrix metalloproteinase-9, Ras-related protein-1, Actin-related protein 2/3 complex subunit 5; lower in disease: Clusterin, Deleted in Malignant Brain Tumors 1), with maximum area under the receiver operating curve >0.97. This panel enriches the pool of credible clinical biomarker candidates for diagnostic assay development. Yet, the quantum leap brought to periodontal diagnostics by this study lies in the introduction of the well-established discovery-through-verification pipeline for periodontal biomarker discovery and validation in further periodontal patient cohorts. Published under license by The American Society for Biochemistry and Molecular Biology, Inc.

  13. Sex-Specific Associations between Particulate Matter Exposure and Gene Expression in Independent Discovery and Validation Cohorts of Middle-Aged Men and Women

    PubMed Central

    Vrijens, Karen; Winckelmans, Ellen; Tsamou, Maria; Baeyens, Willy; De Boever, Patrick; Jennen, Danyel; de Kok, Theo M.; Den Hond, Elly; Lefebvre, Wouter; Plusquin, Michelle; Reynders, Hans; Schoeters, Greet; Van Larebeke, Nicolas; Vanpoucke, Charlotte; Kleinjans, Jos; Nawrot, Tim S.

    2016-01-01

    Background: Particulate matter (PM) exposure leads to premature death, mainly due to respiratory and cardiovascular diseases. Objectives: Identification of transcriptomic biomarkers of air pollution exposure and effect in a healthy adult population. Methods: Microarray analyses were performed in 98 healthy volunteers (48 men, 50 women). The expression of eight sex-specific candidate biomarker genes (significantly associated with PM10 in the discovery cohort and with a reported link to air pollution-related disease) was measured with qPCR in an independent validation cohort (75 men, 94 women). Pathway analysis was performed using Gene Set Enrichment Analysis. Average daily PM2.5 and PM10 exposures over 2 years were estimated for each participant’s residential address using spatiotemporal interpolation in combination with a dispersion model. Results: Average long-term PM10 was 25.9 (± 5.4) and 23.7 (± 2.3) μg/m³ in the discovery and validation cohorts, respectively. In discovery analysis, associations between PM10 and the expression of individual genes differed by sex. In the validation cohort, long-term PM10 was associated with the expression of DNAJB5 and EAPP in men and ARHGAP4 (p = 0.053) in women. AKAP6 and LIMK1 were significantly associated with PM10 in women, although associations differed in direction between the discovery and validation cohorts. Expression of the eight candidate genes in the discovery cohort differentiated between validation cohort participants with high versus low PM10 exposure (area under the receiver operating curve = 0.92; 95% CI: 0.85, 1.00; p = 0.0002 in men; 0.86; 95% CI: 0.76, 0.96; p = 0.004 in women). Conclusions: Expression of the sex-specific candidate genes identified in the discovery population predicted PM10 exposure in an independent cohort of adults from the same area. Confirmation in other populations may further support this as a new approach for exposure assessment, and may contribute to the discovery of molecular mechanisms for PM-induced health effects. PMID:27740511

  14. Discovery and development of new antibacterial drugs: learning from experience?

    PubMed

    Jackson, Nicole; Czaplewski, Lloyd; Piddock, Laura J V

    2018-06-01

    Antibiotic (antibacterial) resistance is a serious global problem and the need for new treatments is urgent. The current antibiotic discovery model is not delivering new agents at a rate that is sufficient to combat present levels of antibiotic resistance. This has led to fears of the arrival of a 'post-antibiotic era'. Scientific difficulties, an unfavourable regulatory climate, multiple company mergers and the low financial returns associated with antibiotic drug development have led to the withdrawal of many pharmaceutical companies from the field. The regulatory climate has now begun to improve, but major scientific hurdles still impede the discovery and development of novel antibacterial agents. To facilitate discovery activities there must be increased understanding of the scientific problems experienced by pharmaceutical companies. This must be coupled with addressing the current antibiotic resistance crisis so that compounds and ultimately drugs are delivered to treat the most urgent clinical challenges. By understanding the causes of the failures and successes of the pharmaceutical industry's research history, duplication of discovery programmes will be reduced, increasing the productivity of the antibiotic drug discovery pipeline by academia and small companies. The most important scientific issues to address are getting molecules into the Gram-negative bacterial cell and avoiding their efflux. Hence screening programmes should focus their efforts on whole bacterial cells rather than cell-free systems. Despite falling out of favour with pharmaceutical companies, natural product research still holds promise for providing new molecules as a basis for discovery.

  15. Simultaneous Proteomic Discovery and Targeted Monitoring using Liquid Chromatography, Ion Mobility Spectrometry, and Mass Spectrometry.

    PubMed

    Burnum-Johnson, Kristin E; Nie, Song; Casey, Cameron P; Monroe, Matthew E; Orton, Daniel J; Ibrahim, Yehia M; Gritsenko, Marina A; Clauss, Therese R W; Shukla, Anil K; Moore, Ronald J; Purvine, Samuel O; Shi, Tujin; Qian, Weijun; Liu, Tao; Baker, Erin S; Smith, Richard D

    2016-12-01

    Current proteomic approaches include both broad discovery measurements and quantitative targeted analyses. In many cases, discovery measurements are initially used to identify potentially important proteins (e.g. candidate biomarkers) and then targeted studies are employed to quantify a limited number of selected proteins. Both approaches, however, suffer from limitations. Discovery measurements aim to sample the whole proteome but have lower sensitivity, accuracy, and quantitation precision than targeted approaches, whereas targeted measurements are significantly more sensitive but only sample a limited portion of the proteome. Herein, we describe a new approach that performs both discovery and targeted monitoring (DTM) in a single analysis by combining liquid chromatography, ion mobility spectrometry and mass spectrometry (LC-IMS-MS). In DTM, heavy labeled target peptides are spiked into tryptic digests and both the labeled and unlabeled peptides are detected using LC-IMS-MS instrumentation. Compared with the broad LC-MS discovery measurements, DTM yields greater peptide/protein coverage and detects lower abundance species. DTM also achieved detection limits similar to selected reaction monitoring (SRM) indicating its potential for combined high quality discovery and targeted analyses, which is a significant step toward the convergence of discovery and targeted approaches. © 2016 by The American Society for Biochemistry and Molecular Biology, Inc.

  16. Robustness of disaggregate oil and gas discovery forecasting models

    USGS Publications Warehouse

    Attanasi, E.D.; Schuenemeyer, J.H.

    1989-01-01

    The trend in forecasting oil and gas discoveries has been to develop and use models that allow forecasts of the size distribution of future discoveries. From such forecasts, exploration and development costs can more readily be computed. Two classes of these forecasting models are the Arps-Roberts type models and the 'creaming method' models. This paper examines the robustness of the forecasts made by these models when the historical data on which the models are based have been subject to economic upheavals or when historical discovery data are aggregated from areas having widely differing economic structures. Model performance is examined in the context of forecasting discoveries for offshore Texas State and Federal areas. The analysis shows how the model forecasts are limited by information contained in the historical discovery data. Because the Arps-Roberts type models require more regularity in discovery sequence than the creaming models, prior information had to be introduced into the Arps-Roberts models to accommodate the influence of economic changes. The creaming methods captured the overall decline in discovery size but did not easily allow introduction of exogenous information to compensate for incomplete historical data. Moreover, the predictive lognormal distribution associated with the creaming model methods appears to understate the importance of the potential contribution of small fields. © 1989.

  17. Discovery of Novel Mammary Developmental and Cancer Genes Using ENU Mutagenesis

    DTIC Science & Technology

    2002-10-01

    ...death rates we need new therapeutic targets, currently a major challenge facing cancer researchers. This requires an understanding of the undiscovered pathways that operate to drive breast cancer cell proliferation, cell survival and cell differentiation, pathways which are also likely to operate during normal mammary development, and which go awry in cancer. The discovery of signalling pathways operative in breast cancer has utilised examination of mammary gland development following systemic endocrine ablation or viral insertion, positional cloning in affected families and...

  18. Single nucleotide polymorphism discovery in rainbow trout by deep sequencing of a reduced representation library.

    PubMed

    Sánchez, Cecilia Castaño; Smith, Timothy P L; Wiedmann, Ralph T; Vallejo, Roger L; Salem, Mohamed; Yao, Jianbo; Rexroad, Caird E

    2009-11-25

    To enhance capabilities for genomic analyses in rainbow trout, such as genomic selection, a large suite of polymorphic markers that are amenable to high-throughput genotyping protocols must be identified. Expressed Sequence Tags (ESTs) have been used for single nucleotide polymorphism (SNP) discovery in salmonids. In those strategies, the salmonid semi-tetraploid genomes often led to assemblies of paralogous sequences and therefore resulted in a high rate of false positive SNP identification. Sequencing genomic DNA using primers identified from ESTs proved to be an effective but time-consuming methodology of SNP identification in rainbow trout, therefore not suitable for high-throughput SNP discovery. In this study, we employed a high-throughput strategy that used pyrosequencing technology to generate data from a reduced representation library constructed with genomic DNA pooled from 96 unrelated rainbow trout that represent the National Center for Cool and Cold Water Aquaculture (NCCCWA) broodstock population. The reduced representation library consisted of 440 bp fragments resulting from complete digestion with the restriction enzyme HaeIII; sequencing produced 2,000,000 reads providing an average 6-fold coverage of the estimated 150,000 unique genomic restriction fragments (300,000 fragment ends). Three independent data analyses identified 22,022 to 47,128 putative SNPs on 13,140 to 24,627 independent contigs. A set of 384 putative SNPs, randomly selected from the sets produced by the three analyses, were genotyped on individual fish to determine the validation rate of putative SNPs among analyses, distinguish apparent SNPs that actually represent paralogous loci in the tetraploid genome, examine Mendelian segregation, and place the validated SNPs on the rainbow trout linkage map. Approximately 48% (183) of the putative SNPs were validated; 167 markers were successfully incorporated into the rainbow trout linkage map. In addition, 2% of the sequences from the validated markers were associated with rainbow trout transcripts. The use of reduced representation libraries and pyrosequencing technology proved to be an effective strategy for the discovery of a high number of putative SNPs in rainbow trout; however, modifications to the technique to decrease the false discovery rate resulting from the evolutionarily recent genome duplication would be desirable.

  19. GWATCH: a web platform for automated gene association discovery analysis.

    PubMed

    Svitin, Anton; Malov, Sergey; Cherkasov, Nikolay; Geerts, Paul; Rotkevich, Mikhail; Dobrynin, Pavel; Shevchenko, Andrey; Guan, Li; Troyer, Jennifer; Hendrickson, Sher; Dilks, Holli Hutcheson; Oleksyk, Taras K; Donfield, Sharyne; Gomperts, Edward; Jabs, Douglas A; Sezgin, Efe; Van Natta, Mark; Harrigan, P Richard; Brumme, Zabrina L; O'Brien, Stephen J

    2014-01-01

    As genome-wide sequence analyses for complex human disease determinants are expanding, it is increasingly necessary to develop strategies to promote discovery and validation of potential disease-gene associations. Here we present a dynamic web-based platform - GWATCH - that automates and facilitates four steps in genetic epidemiological discovery: 1) Rapid gene association search and discovery analysis of large genome-wide datasets; 2) Expanded visual display of gene associations for genome-wide variants (SNPs, indels, CNVs), including Manhattan plots, 2D and 3D snapshots of any gene region, and a dynamic genome browser illustrating gene association chromosomal regions; 3) Real-time validation/replication of candidate or putative genes suggested from other sources, limiting Bonferroni genome-wide association study (GWAS) penalties; 4) Open data release and sharing by eliminating privacy constraints (The National Human Genome Research Institute (NHGRI) Institutional Review Board (IRB), informed consent, The Health Insurance Portability and Accountability Act (HIPAA) of 1996 etc.) on unabridged results, which allows for open access comparative and meta-analysis. GWATCH is suitable for both GWAS and whole genome sequence association datasets. We illustrate the utility of GWATCH with three large genome-wide association studies for HIV-AIDS resistance genes screened in large multicenter cohorts; however, association datasets from any study can be uploaded and analyzed by GWATCH.

  20. Genes-environment interactions in obesity- and diabetes-associated pancreatic cancer: a GWAS data analysis.

    PubMed

    Tang, Hongwei; Wei, Peng; Duell, Eric J; Risch, Harvey A; Olson, Sara H; Bueno-de-Mesquita, H Bas; Gallinger, Steven; Holly, Elizabeth A; Petersen, Gloria M; Bracci, Paige M; McWilliams, Robert R; Jenab, Mazda; Riboli, Elio; Tjønneland, Anne; Boutron-Ruault, Marie Christine; Kaaks, Rudolf; Trichopoulos, Dimitrios; Panico, Salvatore; Sund, Malin; Peeters, Petra H M; Khaw, Kay-Tee; Amos, Christopher I; Li, Donghui

    2014-01-01

    Obesity and diabetes are potentially alterable risk factors for pancreatic cancer. Genetic factors that modify the associations of obesity and diabetes with pancreatic cancer have previously not been examined at the genome-wide level. Using genome-wide association study (GWAS) genotype and risk factor data from the Pancreatic Cancer Case Control Consortium, we conducted a discovery study of 2,028 cases and 2,109 controls to examine gene-obesity and gene-diabetes interactions in relation to pancreatic cancer risk by using the likelihood-ratio test nested in logistic regression models and Ingenuity Pathway Analysis (IPA). After adjusting for multiple comparisons, a significant interaction of the chemokine signaling pathway with obesity (P = 3.29 × 10⁻⁶) and a near-significant interaction of the calcium signaling pathway with diabetes (P = 1.57 × 10⁻⁴) in modifying the risk of pancreatic cancer were observed. These findings were supported by results from IPA analysis of the top genes with nominal interactions. The major contributing genes to the two top pathways include GNGT2, RELA, TIAM1, and GNAS. None of the individual genes or single-nucleotide polymorphisms (SNPs), except one SNP, remained significant after adjusting for multiple testing. Notably, SNP rs10818684 of the PTGS1 gene showed an interaction with diabetes (P = 7.91 × 10⁻⁷) at a false discovery rate of 6%. Genetic variations in inflammatory response and insulin resistance may affect the risk of obesity- and diabetes-related pancreatic cancer. These observations should be replicated in additional large datasets. A gene-environment interaction analysis may provide new insights into the genetic susceptibility and molecular mechanisms of obesity- and diabetes-related pancreatic cancer.
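    The core test here, a likelihood-ratio test for interaction nested in logistic regression, can be sketched as follows (simulated genotype and exposure data, purely illustrative of the design rather than the consortium data):

```python
import numpy as np
import statsmodels.api as sm
from scipy.stats import chi2

rng = np.random.default_rng(1)
n = 2000
snp = rng.integers(0, 3, n).astype(float)        # additive genotype coding 0/1/2
diabetes = rng.integers(0, 2, n).astype(float)   # binary environmental exposure
logit = -1.0 + 0.2 * snp + 0.3 * diabetes + 0.4 * snp * diabetes
case = (rng.random(n) < 1 / (1 + np.exp(-logit))).astype(float)

# Null model: main effects only; full model adds the G x E interaction term.
X_null = sm.add_constant(np.column_stack([snp, diabetes]))
X_full = sm.add_constant(np.column_stack([snp, diabetes, snp * diabetes]))
fit_null = sm.Logit(case, X_null).fit(disp=0)
fit_full = sm.Logit(case, X_full).fit(disp=0)

lrt = 2 * (fit_full.llf - fit_null.llf)          # likelihood-ratio statistic
p_value = chi2.sf(lrt, df=1)                     # one extra parameter
print(f"LRT = {lrt:.2f}, p = {p_value:.2e}")
```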

  1. Serum metabolites are associated with all-cause mortality in chronic kidney disease.

    PubMed

    Hu, Jiun-Ruey; Coresh, Josef; Inker, Lesley A; Levey, Andrew S; Zheng, Zihe; Rebholz, Casey M; Tin, Adrienne; Appel, Lawrence J; Chen, Jingsha; Sarnak, Mark J; Grams, Morgan E

    2018-06-02

    Chronic kidney disease (CKD) involves significant metabolic abnormalities and has a high mortality rate. Because the levels of serum metabolites in patients with CKD might provide insight into subclinical disease states and risk for future mortality, we determined which serum metabolites reproducibly associate with mortality in CKD using a discovery and replication design. Metabolite levels were quantified via untargeted liquid chromatography and mass spectrometry from serum samples of 299 patients with CKD in the Modification of Diet in Renal Disease (MDRD) study as a discovery cohort. Six among 622 metabolites were significantly associated with mortality over a median follow-up of 17 years after adjustment for demographic and clinical covariates, including urine protein and measured glomerular filtration rate. We then replicated associations with mortality in 963 patients with CKD from the African American Study of Kidney Disease and Hypertension (AASK) cohort over a median follow-up of ten years. Three of the six metabolites identified in the MDRD cohort replicated in the AASK cohort: fumarate, allantoin, and ribonate, belonging to energy, nucleotide, and carbohydrate pathways, respectively. Point estimates were similar in both studies and in meta-analysis (adjusted hazard ratios 1.63, 1.59, and 1.61, respectively, per doubling of the metabolite). Thus, selected serum metabolites were reproducibly associated with long-term mortality in CKD beyond markers of kidney function in two well-characterized cohorts, providing targets for investigation. Copyright © 2018 International Society of Nephrology. Published by Elsevier Inc. All rights reserved.

  2. Cultivation of an obligate acidophilic ammonia oxidizer from a nitrifying acid soil

    PubMed Central

    Lehtovirta-Morley, Laura E.; Stoecker, Kilian; Vilcinskas, Andreas; Prosser, James I.; Nicol, Graeme W.

    2011-01-01

    Nitrification is a fundamental component of the global nitrogen cycle and leads to significant fertilizer loss and atmospheric and groundwater pollution. Nitrification rates in acidic soils (pH < 5.5), which comprise 30% of the world's soils, equal or exceed those of neutral soils. Paradoxically, autotrophic ammonia oxidizing bacteria and archaea, which perform the first stage in nitrification, demonstrate little or no growth in suspended liquid culture below pH 6.5, at which ammonia availability is reduced by ionization. Here we report the discovery and cultivation of a chemolithotrophic, obligately acidophilic thaumarchaeal ammonia oxidizer, “Candidatus Nitrosotalea devanaterra,” from an acidic agricultural soil. Phylogenetic analysis places the organism within a previously uncultivated thaumarchaeal lineage that has been observed in acidic soils. Growth of the organism is optimal in the pH range 4 to 5 and is restricted to the pH range 4 to 5.5, unlike all previously cultivated ammonia oxidizers. Growth of this organism and associated ammonia oxidation and autotrophy also occur during nitrification in soil at pH 4.5. The discovery of Nitrosotalea devanaterra provides a previously unsuspected explanation for high rates of nitrification in acidic soils, and confirms the vital role that thaumarchaea play in terrestrial nitrogen cycling. Growth at extremely low ammonia concentration (0.18 nM) also challenges accepted views on ammonia uptake and metabolism and indicates novel mechanisms for ammonia oxidation at low pH. PMID:21896746

  3. Cancer epigenetics drug discovery and development: the challenge of hitting the mark

    PubMed Central

    Campbell, Robert M.; Tummino, Peter J.

    2014-01-01

    Over the past several years, there has been rapidly expanding evidence of epigenetic dysregulation in cancer, in which histone and DNA modification play a critical role in tumor growth and survival. These findings have gained the attention of the drug discovery and development community, and offer the potential for a second generation of cancer epigenetic agents for patients following the approved “first generation” of DNA methylation (e.g., Dacogen, Vidaza) and broad-spectrum HDAC inhibitors (e.g., Vorinostat, Romidepsin). This Review provides an analysis of prospects for discovery and development of novel cancer agents that target epigenetic proteins. We will examine key examples of epigenetic dysregulation in tumors as well as challenges to epigenetic drug discovery with emerging biology and novel classes of drug targets. We will also highlight recent successes in cancer epigenetics drug discovery and consider important factors for clinical success in this burgeoning area. PMID:24382391

  4. Studies of a Next-Generation Silicon-Photomultiplier-Based Time-of-Flight PET/CT System.

    PubMed

    Hsu, David F C; Ilan, Ezgi; Peterson, William T; Uribe, Jorge; Lubberink, Mark; Levin, Craig S

    2017-09-01

    This article presents system performance studies for the Discovery MI PET/CT system, a new time-of-flight system based on silicon photomultipliers. System performance and clinical imaging were compared between this next-generation system and other commercially available PET/CT and PET/MR systems, as well as between different reconstruction algorithms. Methods: Spatial resolution, sensitivity, noise-equivalent counting rate, scatter fraction, counting rate accuracy, and image quality were characterized with the National Electrical Manufacturers Association NU-2 2012 standards. Energy resolution and coincidence time resolution were measured. Tests were conducted independently on two Discovery MI scanners installed at Stanford University and Uppsala University, and the results were averaged. Back-to-back patient scans were also performed between the Discovery MI, Discovery 690 PET/CT, and SIGNA PET/MR systems. Clinical images were reconstructed using both ordered-subset expectation maximization and Q.Clear (block-sequential regularized expectation maximization with point-spread function modeling) and were examined qualitatively. Results: The averaged full widths at half maximum (FWHMs) of the radial/tangential/axial spatial resolution reconstructed with filtered backprojection at 1, 10, and 20 cm from the system center were, respectively, 4.10/4.19/4.48 mm, 5.47/4.49/6.01 mm, and 7.53/4.90/6.10 mm. The averaged sensitivity was 13.7 cps/kBq at the center of the field of view. The averaged peak noise-equivalent counting rate was 193.4 kcps at 21.9 kBq/mL, with a scatter fraction of 40.6%. The averaged contrast recovery coefficients for the image-quality phantom were 53.7, 64.0, 73.1, 82.7, 86.8, and 90.7 for the 10-, 13-, 17-, 22-, 28-, and 37-mm-diameter spheres, respectively. The average photopeak energy resolution was 9.40% FWHM, and the average coincidence time resolution was 375.4 ps FWHM. Clinical image comparisons between the PET/CT systems demonstrated the high quality of the Discovery MI. Comparisons between the Discovery MI and SIGNA showed a similar spatial resolution and overall imaging performance. Lastly, the results indicated significantly enhanced image quality and contrast-to-noise performance for Q.Clear, compared with ordered-subset expectation maximization. Conclusion: Excellent performance was achieved with the Discovery MI, including 375 ps FWHM coincidence time resolution and sensitivity of 14 cps/kBq. Comparisons between reconstruction algorithms and other multimodal silicon photomultiplier and non-silicon photomultiplier PET detector system designs indicated that performance can be substantially enhanced with this next-generation system. © 2017 by the Society of Nuclear Medicine and Molecular Imaging.

  5. Adaptation of Decoy Fusion Strategy for Existing Multi-Stage Search Workflows

    NASA Astrophysics Data System (ADS)

    Ivanov, Mark V.; Levitsky, Lev I.; Gorshkov, Mikhail V.

    2016-09-01

    A number of proteomic database search engines implement multi-stage strategies aiming at increasing the sensitivity of proteome analysis. These approaches often employ a subset of the original database for the secondary stage of analysis. However, if the target-decoy approach (TDA) is used for false discovery rate (FDR) estimation, the multi-stage strategies may violate the underlying assumption of TDA that false matches are distributed uniformly across the target and decoy databases. This violation occurs if the numbers of target and decoy proteins selected for the second search are not equal. Here, we propose a method of decoy database generation based on the previously reported decoy fusion strategy. This method allows unbiased TDA-based FDR estimation in multi-stage searches and can be easily integrated into existing workflows utilizing popular search engines and post-search algorithms.
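    For context, standard TDA-based FDR estimation, the quantity that decoy fusion is designed to keep unbiased across search stages, can be sketched as follows (PSM scores and decoy flags are illustrative):

```python
def tda_fdr_threshold(psms, fdr_max=0.01):
    """psms: iterable of (score, is_decoy) pairs. Walk the score-sorted list
    and return the lowest score threshold whose estimated FDR, the ratio of
    decoy to target matches above it, stays within fdr_max."""
    best = None
    targets = decoys = 0
    for score, is_decoy in sorted(psms, key=lambda x: -x[0]):
        if is_decoy:
            decoys += 1
        else:
            targets += 1
        if targets and decoys / targets <= fdr_max:
            best = score              # every PSM scoring >= best is accepted
    return best

psms = [(9.1, False), (8.7, False), (8.2, True), (7.9, False), (7.5, False)]
print(tda_fdr_threshold(psms, fdr_max=0.34))     # -> 7.5
```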

  6. Mass univariate analysis of event-related brain potentials/fields I: a critical tutorial review.

    PubMed

    Groppe, David M; Urbach, Thomas P; Kutas, Marta

    2011-12-01

    Event-related potentials (ERPs) and magnetic fields (ERFs) are typically analyzed via ANOVAs on mean activity in a priori windows. Advances in computing power and statistics have produced an alternative: mass univariate analyses consisting of thousands of statistical tests and powerful corrections for multiple comparisons. Such analyses are most useful when one has little a priori knowledge of effect locations or latencies, and for delineating effect boundaries. Mass univariate analyses complement and, at times, obviate traditional analyses. Here we review this approach as applied to ERP/ERF data and four methods for multiple comparison correction: strong control of the familywise error rate (FWER) via permutation tests, weak control of FWER via cluster-based permutation tests, false discovery rate control, and control of the generalized FWER. We end with recommendations for their use and introduce free MATLAB software for their implementation. Copyright © 2011 Society for Psychophysiological Research.
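    As an illustration of the first of these corrections, here is a hedged sketch of strong FWER control via a max-statistic sign-flip permutation test on simulated subjects-by-timepoints data (shapes, seed, and the injected effect are all illustrative):

```python
import numpy as np

def max_stat_permutation(erp, n_perm=2000, alpha=0.05, seed=0):
    """One-sample t-tests at every timepoint; the null distribution of the
    maximum |t| over timepoints gives a FWER-controlling critical value."""
    rng = np.random.default_rng(seed)
    n = erp.shape[0]
    t_obs = erp.mean(0) / (erp.std(0, ddof=1) / np.sqrt(n))
    max_null = np.empty(n_perm)
    for i in range(n_perm):
        flipped = erp * rng.choice([-1.0, 1.0], size=(n, 1))  # random sign flips
        t_perm = flipped.mean(0) / (flipped.std(0, ddof=1) / np.sqrt(n))
        max_null[i] = np.abs(t_perm).max()
    return np.abs(t_obs) > np.quantile(max_null, 1 - alpha)

erp = np.random.default_rng(3).normal(size=(20, 100))  # 20 subjects, 100 timepoints
erp[:, 40:45] += 1.0                                   # injected effect
print(np.nonzero(max_stat_permutation(erp))[0])        # expected near 40-44
```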

  7. MAVTgsa: An R Package for Gene Set (Enrichment) Analysis

    DOE PAGES

    Chien, Chih-Yi; Chang, Ching-Wei; Tsai, Chen-An; ...

    2014-01-01

    Gene set analysis methods aim to determine whether an a priori defined set of genes shows statistically significant differences in expression on either categorical or continuous outcomes. Although many methods for gene set analysis have been proposed, a systematic analysis tool for identification of different types of gene set significance modules has not been developed previously. This work presents an R package, called MAVTgsa, which includes three different methods for integrated gene set enrichment analysis. (1) The one-sided OLS (ordinary least squares) test detects coordinated changes of genes in a gene set in one direction, either up- or downregulation. (2) The two-sided MANOVA (multivariate analysis of variance) detects changes in both directions for studying two or more experimental conditions. (3) A random-forests-based procedure identifies gene sets that can accurately predict samples from different experimental conditions or that are associated with continuous phenotypes. MAVTgsa computes the P values and FDR (false discovery rate) q-values for all gene sets in the study. Furthermore, MAVTgsa provides several visualization outputs to support and interpret the enrichment results. This package is available online.

  8. Unconfirmed Near-Earth Objects

    NASA Astrophysics Data System (ADS)

    Vereš, Peter; Payne, Matthew J.; Holman, Matthew J.; Farnocchia, Davide; Williams, Gareth V.; Keys, Sonia; Boardman, Ian

    2018-07-01

    We studied the Near-Earth Asteroid (NEA) candidates posted on the Minor Planet Center’s Near-Earth Object Confirmation Page (NEOCP) between the years 2013 and 2016. Of more than 17,000 NEA candidates, the majority became either new discoveries or were associated with previously known objects, but about 11% could not be followed up or confirmed. We further demonstrate that of the unconfirmed candidates, 926 ± 50 are likely to be NEAs, representing 18% of discovered NEAs in that period. Only 11% (∼93) of the unconfirmed NEA candidates were large (having absolute magnitude H < 22). To identify the reasons why these NEAs were not recovered, we analyzed those from the most prolific asteroid surveys: Pan-STARRS, the Catalina Sky Survey, the Dark Energy Survey, and the Space Surveillance Telescope. We examined the influence of plane-of-sky positions and rates of motion, brightnesses, submission delays, and computed absolute magnitudes, as well as correlations with the phase of the moon and seasonal effects. We find that delayed submission of newly discovered NEA candidates to the NEOCP drove a large fraction of the unconfirmed NEA candidates. A high rate of motion was another significant contributing factor. We suggest that prompt submission of suspected NEA discoveries and rapid response to fast-moving targets and targets with fast-growing ephemeris uncertainty would allow better coordination among dedicated follow-up observers, decrease the number of unconfirmed NEA candidates, and increase the discovery rate of NEAs.

  9. Early detection surveillance for an emerging plant pathogen: a rule of thumb to predict prevalence at first discovery.

    PubMed

    Parnell, S; Gottwald, T R; Cunniffe, N J; Alonso Chavez, V; van den Bosch, F

    2015-09-07

    Emerging plant pathogens are a significant problem for conservation and food security. Surveillance is often instigated in an attempt to detect an invading epidemic before it gets out of control. Yet in practice many epidemics are not discovered until already at a high prevalence, partly due to a lack of quantitative understanding of how surveillance effort and the dynamics of an invading epidemic relate. We test a simple rule of thumb to determine, for a surveillance programme taking a fixed number of samples at regular intervals, the distribution of the prevalence an epidemic will have reached on first discovery (discovery-prevalence) and its expectation E(q*). We show that E(q*) = r/(N/Δ), i.e. simply the rate of epidemic growth divided by the rate of sampling; where r is the epidemic growth rate, N is the sample size and Δ is the time between sampling rounds. We demonstrate the robustness of this rule of thumb using spatio-temporal epidemic models as well as data from real epidemics. Our work supports the view that, for the purposes of early detection surveillance, simple models can provide useful insights in apparently complex systems. The insight can inform decisions on surveillance resource allocation in plant health and has potential applicability to invasive species generally. © 2015 The Author(s).
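
    A worked example makes the rule of thumb concrete (all numbers hypothetical):

```python
# E(q*) = r / (N / Delta): epidemic growth rate divided by sampling rate.
r = 0.02        # epidemic growth rate, per day (hypothetical)
N = 100         # samples taken per surveillance round
delta = 30.0    # days between sampling rounds
expected_prevalence_at_discovery = r / (N / delta)   # = r * delta / N
print(f"{expected_prevalence_at_discovery:.1%}")     # 0.6%
```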

  10. Early detection surveillance for an emerging plant pathogen: a rule of thumb to predict prevalence at first discovery

    PubMed Central

    Parnell, S.; Gottwald, T. R.; Cunniffe, N. J.; Alonso Chavez, V.; van den Bosch, F.

    2015-01-01

    Emerging plant pathogens are a significant problem for conservation and food security. Surveillance is often instigated in an attempt to detect an invading epidemic before it gets out of control. Yet in practice many epidemics are not discovered until already at a high prevalence, partly due to a lack of quantitative understanding of how surveillance effort and the dynamics of an invading epidemic relate. We test a simple rule of thumb to determine, for a surveillance programme taking a fixed number of samples at regular intervals, the distribution of the prevalence an epidemic will have reached on first discovery (discovery-prevalence) and its expectation E(q*). We show that E(q*) = r/(N/Δ), i.e. simply the rate of epidemic growth divided by the rate of sampling; where r is the epidemic growth rate, N is the sample size and Δ is the time between sampling rounds. We demonstrate the robustness of this rule of thumb using spatio-temporal epidemic models as well as data from real epidemics. Our work supports the view that, for the purposes of early detection surveillance, simple models can provide useful insights in apparently complex systems. The insight can inform decisions on surveillance resource allocation in plant health and has potential applicability to invasive species generally. PMID:26336177

  11. Poisson Statistics of Combinatorial Library Sampling Predict False Discovery Rates of Screening

    PubMed Central

    2017-01-01

    Microfluidic droplet-based screening of DNA-encoded one-bead-one-compound combinatorial libraries is a miniaturized, potentially widely distributable approach to small molecule discovery. In these screens, a microfluidic circuit distributes library beads into droplets of activity assay reagent, photochemically cleaves the compound from the bead, then incubates and sorts the droplets based on assay result for subsequent DNA sequencing-based hit compound structure elucidation. Pilot experimental studies revealed that Poisson statistics describe nearly all aspects of such screens, prompting the development of simulations to understand system behavior. Monte Carlo screening simulation data showed that increasing mean library sampling (ε), mean droplet occupancy, or library hit rate all increase the false discovery rate (FDR). Compounds identified as hits on k > 1 beads (the replicate k class) were much more likely to be authentic hits than singletons (k = 1), in agreement with previous findings. Here, we explain this observation by deriving an equation for authenticity, which reduces to the product of a library sampling bias term (exponential in k) and a sampling saturation term (exponential in ε) setting a threshold that the k-dependent bias must overcome. The equation thus quantitatively describes why each hit structure’s FDR is based on its k class, and further predicts the feasibility of intentionally populating droplets with multiple library beads, assaying the micromixtures for function, and identifying the active members by statistical deconvolution. PMID:28682059
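
    The following Monte Carlo sketch is in the spirit of the screening simulations described above (all parameters hypothetical, and the assay model idealized so that every bead of an authentic hit is sorted as a hit); it reproduces the qualitative finding that the FDR falls sharply with the replicate class k.

```python
import numpy as np

rng = np.random.default_rng(0)
library_size = 10_000
true_hit_rate = 0.02    # fraction of compounds that are authentic hits
epsilon = 3.0           # mean library sampling: beads per compound
assay_fpr = 0.001       # chance a non-hit bead is falsely sorted as a hit

authentic = rng.random(library_size) < true_hit_rate
beads = rng.poisson(epsilon, library_size)   # Poisson bead counts
# Authentic hits: every sampled bead is called a hit (idealized).
# Non-hits: each bead is a false positive independently with prob assay_fpr.
k_calls = np.where(authentic, beads, rng.binomial(beads, assay_fpr))

for k in (1, 2, 3):
    in_class = k_calls == k
    if in_class.any():
        fdr = np.mean(~authentic[in_class])
        print(f"k = {k}: {in_class.sum():4d} hit structures, FDR = {fdr:.2f}")
```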

  12. Enabling drug discovery project decisions with integrated computational chemistry and informatics

    NASA Astrophysics Data System (ADS)

    Tsui, Vickie; Ortwine, Daniel F.; Blaney, Jeffrey M.

    2017-03-01

    Computational chemistry/informatics scientists and software engineers in Genentech Small Molecule Drug Discovery collaborate with experimental scientists in a therapeutic project-centric environment. Our mission is to enable and improve pre-clinical drug discovery design and decisions. Our goal is to deliver timely data, analysis, and modeling to our therapeutic project teams using best-in-class software tools. We describe our strategy, the organization of our group, and our approaches to reach this goal. We conclude with a summary of the interdisciplinary skills required for computational scientists and recommendations for their training.

  13. Avoiding false discoveries in association studies.

    PubMed

    Sabatti, Chiara

    2007-01-01

    We consider the problem of controlling false discoveries in association studies. We assume that the design of the study is adequate, so that "false discoveries" are potentially due only to random chance, not to confounding or other flaws. Under this premise, we review the statistical framework for hypothesis testing and correction for multiple comparisons. We consider in detail the currently accepted strategies in linkage analysis. We then examine the underlying similarities and differences between linkage and association studies and document some of the most recent methodological developments for association mapping.

  14. Natural-product-derived fragments for fragment-based ligand discovery

    NASA Astrophysics Data System (ADS)

    Over, Björn; Wetzel, Stefan; Grütter, Christian; Nakai, Yasushi; Renner, Steffen; Rauh, Daniel; Waldmann, Herbert

    2013-01-01

    Fragment-based ligand and drug discovery predominantly employs sp2-rich compounds covering well-explored regions of chemical space. Despite the ease with which such fragments can be coupled, this focus on flat compounds is widely cited as contributing to the attrition rate of the drug discovery process. In contrast, biologically validated natural products are rich in stereogenic centres and populate areas of chemical space not occupied by average synthetic molecules. Here, we have analysed more than 180,000 natural product structures to arrive at 2,000 clusters of natural-product-derived fragments with high structural diversity, which resemble natural scaffolds and are rich in sp3-configured centres. The structures of the cluster centres differ from previously explored fragment libraries, but for nearly half of the clusters representative members are commercially available. We validate their usefulness for the discovery of novel ligand and inhibitor types by means of protein X-ray crystallography and the identification of novel stabilizers of inactive conformations of p38α MAP kinase and of inhibitors of several phosphatases.

  15. Organic synthesis provides opportunities to transform drug discovery

    NASA Astrophysics Data System (ADS)

    Blakemore, David C.; Castro, Luis; Churcher, Ian; Rees, David C.; Thomas, Andrew W.; Wilson, David M.; Wood, Anthony

    2018-04-01

    Despite decades of ground-breaking research in academia, organic synthesis is still a rate-limiting factor in drug-discovery projects. Here we present some current challenges in synthetic organic chemistry from the perspective of the pharmaceutical industry and highlight problematic steps that, if overcome, would find extensive application in the discovery of transformational medicines. Significant synthesis challenges arise from the fact that drug molecules typically contain amines and N-heterocycles, as well as unprotected polar groups. There is also a need for new reactions that enable non-traditional disconnections, more C-H bond activation and late-stage functionalization, as well as stereoselectively substituted aliphatic heterocyclic ring synthesis, C-X or C-C bond formation. We also emphasize that syntheses compatible with biomacromolecules will find increasing use, while new technologies such as machine-assisted approaches and artificial intelligence for synthesis planning have the potential to dramatically accelerate the drug-discovery process. We believe that increasing collaboration between academic and industrial chemists is crucial to address the challenges outlined here.

  16. Identification of ganglioside GM2 activator playing a role in cancer cell migration through proteomic analysis of breast cancer secretomes.

    PubMed

    Shin, Jihye; Kim, Gamin; Lee, Jong Won; Lee, Ji Eun; Kim, Yoo Seok; Yu, Jong-Han; Lee, Seung-Taek; Ahn, Sei Hyun; Kim, Hoguen; Lee, Cheolju

    2016-06-01

    Cancer cell secretomes are considered a potential source for the discovery of cancer markers. In this study, the secretomes of four breast cancer (BC) cell lines (Hs578T, MCF-7, MDA-MB-231, and SK-BR-3) were profiled with liquid chromatography-tandem mass spectrometry analysis. A total of 1410 proteins were identified with less than 1% false discovery rate, of which approximately 55% (796 proteins) were predicted to be secreted from cells. To find BC-specific proteins among the secreted proteins, data of immunohistochemical staining compiled in the Human Protein Atlas were investigated by comparing the data of BC tissues with those of normal tissues. By applying various criteria, including higher expression level in BC tissues, higher predicted potential of secretion, and sufficient number of tandem mass spectra, 12 biomarker candidate proteins including ganglioside GM2 activator (GM2A) were selected for confirmation. Western blot analysis and ELISA for plasma samples of healthy controls and BC patients revealed elevation of GM2A in BC patients, especially those who were estrogen receptor-negative. Additionally, siRNA-mediated knockdown of GM2A in BC cells decreased migration in vitro, whereas the overexpression of GM2A led to an increase in cell migration. Although the value of GM2A as a diagnostic and prognostic marker in BC requires further verification, this study has established the potential role of GM2A in BC progression. © 2016 The Authors. Cancer Science published by John Wiley & Sons Australia, Ltd on behalf of Japanese Cancer Association.

  17. Regulation of gene expression in the mammalian eye and its relevance to eye disease.

    PubMed

    Scheetz, Todd E; Kim, Kwang-Youn A; Swiderski, Ruth E; Philp, Alisdair R; Braun, Terry A; Knudtson, Kevin L; Dorrance, Anne M; DiBona, Gerald F; Huang, Jian; Casavant, Thomas L; Sheffield, Val C; Stone, Edwin M

    2006-09-26

    We used expression quantitative trait locus mapping in the laboratory rat (Rattus norvegicus) to gain a broad perspective of gene regulation in the mammalian eye and to identify genetic variation relevant to human eye disease. Of >31,000 gene probes represented on an Affymetrix expression microarray, 18,976 exhibited sufficient signal for reliable analysis and at least 2-fold variation in expression among 120 F(2) rats generated from an SR/JrHsd x SHRSP intercross. Genome-wide linkage analysis with 399 genetic markers revealed significant linkage with at least one marker for 1,300 probes (alpha = 0.001; estimated empirical false discovery rate = 2%). Both contiguous and noncontiguous loci were found to be important in regulating mammalian eye gene expression. We investigated one locus of each type in greater detail and identified putative transcription-altering variations in both cases. We found an inserted cREL binding sequence in the 5' flanking sequence of the Abca4 gene associated with an increased expression level of that gene, and we found a mutation of the gene encoding thyroid hormone receptor beta2 associated with a decreased expression level of the gene encoding short-wavelength sensitive opsin (Opn1sw). In addition to these positional studies, we performed a pairwise analysis of gene expression to identify genes that are regulated in a coordinated manner and used this approach to validate two previously undescribed genes involved in the human disease Bardet-Biedl syndrome. These data and analytical approaches can be used to facilitate the discovery of additional genes and regulatory elements involved in human eye disease.
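
    As a rough consistency check on the reported empirical FDR (treating alpha = 0.001 as the per-probe genome-wide significance level, an assumption on our part), the expected number of chance linkages can be compared with the observed count:

```python
# Naive FDR estimate: expected false positives / observed positives.
n_probes = 18_976
alpha = 0.001          # per-probe genome-wide significance level (assumed)
n_observed = 1_300
expected_false = n_probes * alpha          # ~19 probes by chance
print(f"naive FDR ~ {expected_false / n_observed:.1%}")   # ~1.5%, close to
# the reported ~2%, which was estimated empirically (e.g., by permutation).
```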

  18. Clinical decision support alert malfunctions: analysis and empirically derived taxonomy.

    PubMed

    Wright, Adam; Ai, Angela; Ash, Joan; Wiesen, Jane F; Hickman, Thu-Trang T; Aaron, Skye; McEvoy, Dustin; Borkowsky, Shane; Dissanayake, Pavithra I; Embi, Peter; Galanter, William; Harper, Jeremy; Kassakian, Steve Z; Ramoni, Rachel; Schreiber, Richard; Sirajuddin, Anwar; Bates, David W; Sittig, Dean F

    2018-05-01

    To develop an empirically derived taxonomy of clinical decision support (CDS) alert malfunctions. We identified CDS alert malfunctions using a mix of qualitative and quantitative methods: (1) site visits with interviews of chief medical informatics officers, CDS developers, clinical leaders, and CDS end users; (2) surveys of chief medical informatics officers; (3) analysis of CDS firing rates; and (4) analysis of CDS overrides. We used a multi-round, manual, iterative card sort to develop a multi-axial, empirically derived taxonomy of CDS malfunctions. We analyzed 68 CDS alert malfunction cases from 14 sites across the United States with diverse electronic health record systems. Four primary axes emerged: the cause of the malfunction, its mode of discovery, when it began, and how it affected rule firing. Build errors, conceptualization errors, and the introduction of new concepts or terms were the most frequent causes. User reports were the predominant mode of discovery. Many malfunctions within our database caused rules to fire for patients for whom they should not have (false positives), but the reverse (false negatives) was also common. Across organizations and electronic health record systems, similar malfunction patterns recurred. Challenges included updates to code sets and values, software issues at the time of system upgrades, difficulties with migration of CDS content between computing environments, and the challenge of correctly conceptualizing and building CDS. CDS alert malfunctions are frequent. The empirically derived taxonomy formalizes the common recurring issues that cause these malfunctions, helping CDS developers anticipate and prevent CDS malfunctions before they occur or detect and resolve them expediently.

  19. MetaCoMET: a web platform for discovery and visualization of the core microbiome

    USDA-ARS?s Scientific Manuscript database

    A key component of the analysis of microbiome datasets is the identification of OTUs shared between multiple experimental conditions, commonly referred to as the core microbiome. Results: We present a web platform named MetaCoMET that enables the discovery and visualization of the core microbiome an...

  20. Teaching Tip: Using Rapid Game Prototyping for Exploring Requirements Discovery and Modeling

    ERIC Educational Resources Information Center

    Dalal, Nikunj

    2012-01-01

    We describe the use of rapid game prototyping as a pedagogic technique to experientially explore and learn requirements discovery, modeling, and specification in systems analysis and design courses. Students have a natural interest in gaming that transcends age, gender, and background. Rapid digital game creation is used to build computer games…

  1. Service-based analysis of biological pathways

    PubMed Central

    Zheng, George; Bouguettaya, Athman

    2009-01-01

    Background Computer-based pathway discovery is concerned with two important objectives: pathway identification and analysis. Conventional mining and modeling approaches aimed at pathway discovery are often effective at achieving either objective, but not both. Such limitations can be effectively tackled by leveraging a Web service-based modeling and mining approach. Results Inspired by molecular recognition and drug discovery processes, we developed a Web service mining tool, named PathExplorer, to discover potentially interesting biological pathways linking service models of biological processes. The tool uses an innovative approach to identify useful pathways based on graph-based hints and service-based simulation to verify the user's hypotheses. Conclusion Web service modeling of biological processes allows the easy access and invocation of these processes on the Web. Web service mining techniques described in this paper enable the discovery of biological pathways linking these process service models. Algorithms presented in this paper for automatically highlighting interesting subgraphs within an identified pathway network enable the user to formulate hypotheses, which can be tested using the simulation algorithm that is also described in this paper. PMID:19796403

  2. Apparently low reproducibility of true differential expression discoveries in microarray studies.

    PubMed

    Zhang, Min; Yao, Chen; Guo, Zheng; Zou, Jinfeng; Zhang, Lin; Xiao, Hui; Wang, Dong; Yang, Da; Gong, Xue; Zhu, Jing; Li, Yanhui; Li, Xia

    2008-09-15

    Differentially expressed gene (DEG) lists detected from different microarray studies for the same disease are often highly inconsistent. Even in technical replicate tests using identical samples, DEG detection still shows very low reproducibility. It is often believed that current small microarray studies will largely introduce false discoveries. Based on a statistical model, we show that even in technical replicate tests using identical samples, it is highly likely that the selected DEG lists will be very inconsistent in the presence of small measurement variations. Therefore, the apparently low reproducibility of DEG detection from current technical replicate tests does not indicate low quality of microarray technology. We also demonstrate that heterogeneous biological variations existing in real cancer data will further reduce the overall reproducibility of DEG detection. Nevertheless, in small subsamples from both simulated and real data, the actual false discovery rate (FDR) for each DEG list tends to be low, suggesting that each separately determined list may comprise mostly true DEGs. Rather than simply counting the overlaps of the discovery lists from different studies for a complex disease, novel metrics are needed for evaluating the reproducibility of discoveries characterized by correlated molecular changes. Supplementary information: Supplementary data are available at Bioinformatics online.

  3. FDR doesn't Tell the Whole Story: Joint Influence of Effect Size and Covariance Structure on the Distribution of the False Discovery Proportions

    NASA Technical Reports Server (NTRS)

    Feiveson, Alan H.; Ploutz-Snyder, Robert; Fiedler, James

    2011-01-01

    As part of a 2009 Annals of Statistics paper, Gavrilov, Benjamini, and Sarkar report results of simulations that estimated the false discovery rate (FDR) for equally correlated test statistics using a well-known multiple-test procedure. In our study we estimate the distribution of the false discovery proportion (FDP) for the same procedure under a variety of correlation structures among multiple dependent variables in a MANOVA context. Specifically, we study the mean (the FDR), skewness, kurtosis, and percentiles of the FDP distribution in the case of multiple comparisons that give rise to correlated non-central t-statistics when results at several time periods are being compared to baseline. Even if the FDR achieves its nominal value, other aspects of the distribution of the FDP depend on the interaction between signed effect sizes and correlations among variables, proportion of true nulls, and number of dependent variables. We show examples where the mean FDP (the FDR) is 10% as designed, yet there is a surprising probability of having 30% or more false discoveries. Thus, in a real experiment, the proportion of false discoveries could be quite different from the stipulated FDR.
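
    A small simulation (parameters hypothetical, using the Benjamini-Hochberg procedure rather than the exact procedure studied in the paper) reproduces the phenomenon: the mean FDP sits near its nominal level while strong correlation makes the realized FDP occasionally far larger.

```python
import numpy as np
from scipy.stats import norm

def bh_reject(p, q):
    order = np.argsort(p)
    below = p[order] <= q * np.arange(1, p.size + 1) / p.size
    mask = np.zeros(p.size, dtype=bool)
    if below.any():
        mask[order[:np.nonzero(below)[0].max() + 1]] = True
    return mask

rng = np.random.default_rng(1)
m, m1, rho, q = 1000, 100, 0.8, 0.10    # tests, true effects, correlation
fdps = []
for _ in range(2000):
    common = np.sqrt(rho) * rng.standard_normal()        # shared factor
    z = common + np.sqrt(1 - rho) * rng.standard_normal(m)
    z[:m1] += 3.0                                        # true signals
    rej = bh_reject(norm.sf(z), q)                       # one-sided p-values
    if rej.any():
        fdps.append(rej[m1:].sum() / rej.sum())          # realized FDP
fdps = np.array(fdps)
print(f"mean FDP ~ {fdps.mean():.2f}, P(FDP >= 0.3) ~ {(fdps >= 0.3).mean():.2f}")
```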

  4. Evaluation of solar electric propulsion technologies for discovery class missions

    NASA Technical Reports Server (NTRS)

    Oh, David Y.

    2005-01-01

    A detailed study examines the potential benefits that advanced electric propulsion (EP) technologies offer to the cost-capped missions in NASA's Discovery program. The study looks at potential cost and performance benefits provided by three EP technologies that are currently in development: NASA's Evolutionary Xenon Thruster (NEXT), an Enhanced NSTAR system, and a Low Power Hall effect thruster. These systems are analyzed on three strawman Discovery-class missions, and their performance is compared to a state-of-the-art system using the NSTAR ion thruster. An electric propulsion subsystem cost model is used to conduct a cost-benefit analysis for each option. The results show that each proposed technology offers a different degree of performance and/or cost benefit for Discovery-class missions.

  5. Assessment of CFD Hypersonic Turbulent Heating Rates for Space Shuttle Orbiter

    NASA Technical Reports Server (NTRS)

    Wood, William A.; Oliver, A. Brandon

    2011-01-01

    Turbulent CFD codes are assessed for the prediction of convective heat transfer rates at turbulent, hypersonic conditions. Algebraic turbulence models are used within the DPLR and LAURA CFD codes. The benchmark heat transfer rates are derived from thermocouple measurements of the Space Shuttle orbiter Discovery windward tiles during the STS-119 and STS-128 entries. The thermocouples were located underneath the reaction-cured glass coating on the thermal protection tiles. Boundary layer transition flight experiments conducted during both of those entries promoted turbulent flow at unusually high Mach numbers, with the present analysis considering Mach 10-15. Similar prior comparisons of CFD predictions directly to the flight temperature measurements were unsatisfactory, showing diverging trends between prediction and measurement for Mach numbers greater than 11. In the prior work, surface temperatures and convective heat transfer rates had been assumed to be in radiative equilibrium. The present work employs a one-dimensional time-accurate conduction analysis to relate measured temperatures to surface heat transfer rates, removing heat soak lag from the flight data, in order to better assess the predictive accuracy of the numerical models. The turbulent CFD shows good agreement for turbulent fuselage flow up to Mach 13. But on the wing in the wake of the boundary layer trip, the inclusion of tile conduction effects does not explain the prior observed discrepancy in trends between simulation and experiment; the flight heat transfer measurements are roughly constant over Mach 11-15, versus an increasing trend with Mach number from the CFD.

  6. Combining data from multiple sources using the CUAHSI Hydrologic Information System

    NASA Astrophysics Data System (ADS)

    Tarboton, D. G.; Ames, D. P.; Horsburgh, J. S.; Goodall, J. L.

    2012-12-01

    The Consortium of Universities for the Advancement of Hydrologic Science, Inc. (CUAHSI) has developed a Hydrologic Information System (HIS) to provide better access to data by enabling the publication, cataloging, discovery, retrieval, and analysis of hydrologic data using web services. The CUAHSI HIS is an Internet-based system composed of hydrologic databases and servers connected through web services, as well as software for data publication, discovery and access. The HIS metadata catalog lists close to 100 web services registered to provide data through this system, ranging from large federal agency data sets to experimental watersheds managed by University investigators. The system's flexibility in storing and enabling public access to similarly formatted data and metadata has created a community data resource from governmental and academic data that might otherwise remain private or be analyzed only in isolation. Comprehensive understanding of hydrology requires integration of this information from multiple sources. HydroDesktop is the client application developed as part of HIS to support data discovery and access through this system. HydroDesktop is founded on an open source GIS client and has a plug-in architecture that has enabled the integration of modeling and analysis capability with the functionality for data discovery and access. Model integration is possible through a plug-in built on the OpenMI standard, and data visualization and analysis are supported by an R plug-in. This presentation will demonstrate HydroDesktop, showing how it provides an analysis environment within which data from multiple sources can be discovered, accessed, and integrated.

  7. Discovery: Under the Microscope at Kennedy Space Center

    NASA Technical Reports Server (NTRS)

    Howard, Philip M.

    2013-01-01

    The National Aeronautics & Space Administration (NASA) is known for discovery, exploration, and advancement of knowledge. Since the days of Leeuwenhoek, microscopy has been at the forefront of discovery and knowledge. No truer is that statement than today at Kennedy Space Center (KSC), where microscopy plays a major role in contamination identification and is an integral part of failure analysis. Space exploration involves flight hardware undergoing rigorous "visually clean" inspections at every step of processing. The unknown contaminants that are discovered in these inspections can directly impact the mission by decreasing the performance of sensors and scientific detectors on spacecraft and satellites, acting as micrometeorites, damaging critical sealing surfaces, and causing hazards to the crew of manned missions. This talk will discuss how microscopy has played a major role in all aspects of spaceport operations at KSC. Case studies will highlight years of analysis at the Materials Science Division, including facility and payload contamination for the Navigation Signal Timing and Ranging Global Positioning Satellite (NAVSTAR GPS) missions, quality control monitoring of monomethyl hydrazine fuel procurement for launch vehicle operations, Shuttle Solid Rocket Booster (SRB) foam processing failure analysis, and Space Shuttle Main Engine Cut-off (ECO) flight sensor anomaly analysis. What I hope to share with my fellow microscopists is some of the excitement of microscopy and how its discoveries have informed hardware processing and helped enable the successful launch of vehicles and space flight missions here at Kennedy Space Center.

  8. Hydrogen storage materials discovery via high throughput ball milling and gas sorption.

    PubMed

    Li, Bin; Kaye, Steven S; Riley, Conor; Greenberg, Doron; Galang, Daniel; Bailey, Mark S

    2012-06-11

    The lack of a high capacity hydrogen storage material is a major barrier to the implementation of the hydrogen economy. To accelerate discovery of such materials, we have developed a high-throughput workflow for screening of hydrogen storage materials in which candidate materials are synthesized and characterized via highly parallel ball mills and volumetric gas sorption instruments, respectively. The workflow was used to identify mixed imides with significantly enhanced absorption rates relative to Li2Mg(NH)2. The most promising material, 2LiNH2:MgH2 + 5 atom % LiBH4 + 0.5 atom % La, exhibits the best balance of absorption rate, capacity, and cycle-life, absorbing >4 wt % H2 in 1 h at 120 °C after 11 absorption-desorption cycles.

  9. Discovery and Orbital Determination of the Transient X-Ray Pulsar GRO J1750-27

    NASA Technical Reports Server (NTRS)

    Scott, D. M.; Finger, M. H.; Wilson, R. B.; Koh, D. T.; Prince, T. A.; Vaughan, B. A.; Chakrabarty, D.

    1997-01-01

    We report on the discovery and hard X-ray (20-70 keV) observations of the 4.45 s period transient X-ray pulsar GRO J1750-27 with the BATSE all-sky monitor on board CGRO. A relatively faint outburst (less than 30 mCrab peak) lasting at least 60 days was observed, during which the spin-up rate peaked at 38 pHz/s and was correlated with the pulsed intensity. An orbit with a period of 29.8 days was found. The large spin-up rate, spin period, and orbital period together suggest that accretion is occurring from a disk and that the outburst is a "giant" outburst typical of a Be/X-ray transient system. No optical counterpart has yet been reported.

  10. Discovery and Orbital Determination of the Transient X-Ray Pulsar GRO J1750-27

    NASA Technical Reports Server (NTRS)

    Scott, D. M.; Finger, M. H.; Wilson, R. B.; Koh, D. T.; Prince, T. A.; Vaughan, B. A.; Chakrabarty, D.

    1997-01-01

    We report on the discovery and hard X-ray (20-70 keV) observations of the 4.45 second period transient X-ray pulsar GRO J1750-27 with the BATSE all-sky monitor on board CGRO. A relatively faint outburst (< 30 mCrab peak) lasting at least 60 days was observed, during which the spin-up rate peaked at 38 pHz/sec and was correlated with the pulsed intensity. An orbit with a period of 29.8 days was found. The large spin-up rate, spin period and orbital period together suggest that accretion is occurring from a disk and that the outburst is a 'giant' outburst typical of a Be/X-ray transient system. No optical counterpart has been reported yet.

  11. Financing drug discovery for orphan diseases.

    PubMed

    Fagnan, David E; Gromatzky, Austin A; Stein, Roger M; Fernandez, Jose-Maria; Lo, Andrew W

    2014-05-01

    Recently proposed 'megafund' financing methods for funding translational medicine and drug development require billions of dollars in capital per megafund to de-risk the drug discovery process enough to issue long-term bonds. Here, we demonstrate that the same financing methods can be applied to orphan drug development but, because of the unique nature of orphan diseases and therapeutics (lower development costs, faster FDA approval times, lower failure rates and lower correlation of failures among disease targets), the amount of capital needed to de-risk such portfolios is much lower in this field. Numerical simulations suggest that an orphan disease megafund of only US$575 million can yield double-digit expected rates of return with only 10-20 projects in the portfolio. Copyright © 2013 The Authors. Published by Elsevier Ltd. All rights reserved.
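
    A toy Monte Carlo conveys the mechanism (all figures are hypothetical placeholders, not the paper's calibrated parameters): with modest failure correlation, a portfolio of 15 projects already yields roughly double-digit annualized returns in expectation under these placeholder numbers.

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(42)
n_projects, cost, payoff = 15, 35e6, 350e6   # per-project cost / payoff
p_success, rho, years = 0.25, 0.10, 8        # success prob, correlation
n_sim = 100_000

# Correlated project successes via a one-factor Gaussian copula.
z_common = rng.standard_normal((n_sim, 1))
z = np.sqrt(rho) * z_common + np.sqrt(1 - rho) * rng.standard_normal((n_sim, n_projects))
success = z < norm.ppf(p_success)

value = success.sum(axis=1) * payoff
invested = n_projects * cost
annualized = (value / invested) ** (1 / years) - 1
print(f"mean annualized return: {annualized.mean():.1%}, "
      f"P(total loss): {(value == 0).mean():.3%}")
```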

  12. Lung neuroendocrine tumours: deep sequencing of the four World Health Organization histotypes reveals chromatin-remodelling genes as major players and a prognostic role for TERT, RB1, MEN1 and KMT2D.

    PubMed

    Simbolo, Michele; Mafficini, Andrea; Sikora, Katarzyna O; Fassan, Matteo; Barbi, Stefano; Corbo, Vincenzo; Mastracci, Luca; Rusev, Borislav; Grillo, Federica; Vicentini, Caterina; Ferrara, Roberto; Pilotto, Sara; Davini, Federico; Pelosi, Giuseppe; Lawlor, Rita T; Chilosi, Marco; Tortora, Giampaolo; Bria, Emilio; Fontanini, Gabriella; Volante, Marco; Scarpa, Aldo

    2017-03-01

    Next-generation sequencing (NGS) was applied to 148 lung neuroendocrine tumours (LNETs) comprising the four World Health Organization classification categories: 53 typical carcinoids (TCs), 35 atypical carcinoids (ACs), 27 large-cell neuroendocrine carcinomas, and 33 small-cell lung carcinomas. A discovery screen was conducted on 46 samples by the use of whole-exome sequencing and high-coverage targeted sequencing of 418 genes. Eighty-eight recurrently mutated genes from both the discovery screen and current literature were verified in the 46 cases of the discovery screen, and validated on an additional 102 LNETs by targeted NGS; their prevalence was then evaluated on the whole series. Thirteen of these 88 genes were also evaluated for copy number alterations (CNAs). Carcinoids and carcinomas shared most of the altered genes but with different prevalence rates. When mutations and copy number changes were combined, MEN1 alterations were almost exclusive to carcinoids, whereas alterations of TP53 and RB1 cell cycle regulation genes and PI3K/AKT/mTOR pathway genes were significantly enriched in carcinomas. Conversely, mutations in chromatin-remodelling genes, including those encoding histone modifiers and members of SWI-SNF complexes, were found at similar rates in carcinoids (45.5%) and carcinomas (55.0%), suggesting a major role in LNET pathogenesis. One AC and one TC showed a hypermutated profile associated with a POLQ damaging mutation. There were fewer CNAs in carcinoids than in carcinomas; however, ACs showed a hybrid pattern, whereby gains of TERT, SDHA, RICTOR, PIK3CA, MYCL and SRC were found at rates similar to those in carcinomas, whereas the MEN1 loss rate mirrored that of TCs. Multivariate survival analysis revealed RB1 mutation (p = 0.0005) and TERT copy gain (p = 0.016) as independent predictors of poorer prognosis. MEN1 mutation was associated with poor prognosis in AC (p = 0.0045), whereas KMT2D mutation correlated with longer survival in SCLC (p = 0.0022). In conclusion, molecular profiling may complement histology for better diagnostic definition and prognostic stratification of LNETs. © 2016 The Authors. The Journal of Pathology published by John Wiley & Sons Ltd on behalf of the Pathological Society of Great Britain and Ireland.

  13. Evaluation of Second-Level Inference in fMRI Analysis

    PubMed Central

    Roels, Sanne P.; Loeys, Tom; Moerkerke, Beatrijs

    2016-01-01

    We investigate the impact of decisions in the second-level (i.e., over subjects) inferential process in functional magnetic resonance imaging on (1) the balance between false positives and false negatives and on (2) the data-analytical stability, both proxies for the reproducibility of results. Second-level analysis based on a mass univariate approach typically consists of 3 phases. First, one proceeds via a general linear model for a test image that consists of pooled information from different subjects. We evaluate models that take into account first-level (within-subjects) variability and models that do not take into account this variability. Second, one proceeds via inference based on parametrical assumptions or via permutation-based inference. Third, we evaluate 3 commonly used procedures to address the multiple testing problem: familywise error rate correction, False Discovery Rate (FDR) correction, and a two-step procedure with minimal cluster size. Based on a simulation study and real data we find that the two-step procedure with minimal cluster size results in most stable results, followed by the familywise error rate correction. The FDR results in most variable results, for both permutation-based inference and parametrical inference. Modeling the subject-specific variability yields a better balance between false positives and false negatives when using parametric inference. PMID:26819578

  14. Two Long-Term Intermittent Pulsars Discovered in the PALFA Survey

    NASA Astrophysics Data System (ADS)

    Lyne, A. G.; Stappers, B. W.; Freire, P. C. C.; Hessels, J. W. T.; Kaspi, V. M.; Allen, B.; Bogdanov, S.; Brazier, A.; Camilo, F.; Cardoso, F.; Chatterjee, S.; Cordes, J. M.; Crawford, F.; Deneva, J. S.; Ferdman, R. D.; Jenet, F. A.; Knispel, B.; Lazarus, P.; van Leeuwen, J.; Lynch, R.; Madsen, E.; McLaughlin, M. A.; Parent, E.; Patel, C.; Ransom, S. M.; Scholz, P.; Seymour, A.; Siemens, X.; Spitler, L. G.; Stairs, I. H.; Stovall, K.; Swiggum, J.; Wharton, R. S.; Zhu, W. W.

    2017-01-01

    We report the discovery of two long-term intermittent radio pulsars in the ongoing Pulsar Arecibo L-Band Feed Array survey. Following discovery with the Arecibo Telescope, extended observations of these pulsars over several years at Jodrell Bank Observatory have revealed the details of their rotation and radiation properties. PSRs J1910+0517 and J1929+1357 show long-term extreme bimodal intermittency, switching between active (ON) and inactive (OFF) emission states and indicating the presence of a large, hitherto unrecognized underlying population of such objects. For PSR J1929+1357, the initial duty cycle was f_ON = 0.008, but two years later, this changed quite abruptly to f_ON = 0.16. This is the first time that a significant evolution in the activity of an intermittent pulsar has been seen, and we show that the spin-down rate of the pulsar is proportional to the activity. The spin-down rate of PSR J1929+1357 is increased by a factor of 1.8 when it is in active mode, similar to the increase seen in the other three known long-term intermittent pulsars. These discoveries increase the number of known pulsars displaying long-term intermittency to five. These five objects display a remarkably narrow range of spin-down power (Ė ∼ 10³² erg s⁻¹) and accelerating potential above their polar caps. If confirmed by further discoveries, this trend might be important for understanding the physical mechanisms that cause intermittency.

  15. DataHub: Knowledge-based data management for data discovery

    NASA Astrophysics Data System (ADS)

    Handley, Thomas H.; Li, Y. Philip

    1993-08-01

    Currently available database technology is largely designed for business data-processing applications and seems inadequate for scientific applications. The research described in this paper, the DataHub, will address the issues associated with this shortfall in technology utilization and development. The DataHub development is addressing the key issues in scientific data management of scientific database models and resource sharing in a geographically distributed, multi-disciplinary science research environment. Thus, the DataHub will be a server between the data suppliers and data consumers to facilitate data exchanges, to assist science data analysis, and to provide a systematic approach for science data management. More specifically, the DataHub's objectives are to provide support for (1) exploratory data analysis (i.e., data-driven analysis); (2) data transformations; (3) data semantics capture and usage; (4) analysis-related knowledge capture and usage; and (5) data discovery, ingestion, and extraction. Applying technologies that range from deductive databases, semantic data models, data discovery, knowledge representation and inferencing, and exploratory data analysis techniques to modern man-machine interfaces, DataHub will provide a prototype, integrated environment to support research scientists' needs in multiple disciplines (i.e., oceanography, geology, and atmospheric science) while addressing the more general science data management issues. Additionally, the DataHub will provide data management services to exploratory data analysis applications such as LinkWinds and NCSA's XIMAGE.

  16. [Recent advances in metabonomics].

    PubMed

    Xu, Guo-Wang; Lu, Xin; Yang, Sheng-Li

    2007-12-01

    Metabonomics (or metabolomics) aims at the comprehensive and quantitative analysis of the wide arrays of metabolites in biological samples. Metabonomics has been labeled as one of the new "-omics" sciences, joining genomics, transcriptomics, and proteomics as a science employed toward the understanding of global systems biology. It has been widely applied in many research areas, including drug toxicology, biomarker discovery, functional genomics, and molecular pathology. The comprehensive analysis of the metabonome is particularly challenging due to the diverse chemical natures of metabolites. Metabonomics investigations require special approaches for sample preparation, data-rich analytical chemical measurements, and information mining. The outputs from a metabonomics study allow sample classification, biomarker discovery, and interpretation of the reasons for classification information. This review focuses on current advances in the various technical platforms of metabonomics and its applications in drug discovery and development, disease biomarker identification, and plant- and microbe-related fields.

  17. FLIM FRET Technology for Drug Discovery: Automated Multiwell-Plate High-Content Analysis, Multiplexed Readouts and Application in Situ

    PubMed Central

    Kumar, Sunil; Alibhai, Dominic; Margineanu, Anca; Laine, Romain; Kennedy, Gordon; McGinty, James; Warren, Sean; Kelly, Douglas; Alexandrov, Yuriy; Munro, Ian; Talbot, Clifford; Stuckey, Daniel W; Kimberly, Christopher; Viellerobe, Bertrand; Lacombe, Francois; Lam, Eric W-F; Taylor, Harriet; Dallman, Margaret J; Stamp, Gordon; Murray, Edward J; Stuhmeier, Frank; Sardini, Alessandro; Katan, Matilda; Elson, Daniel S; Neil, Mark A A; Dunsby, Chris; French, Paul M W

    2011-01-01

    A fluorescence lifetime imaging (FLIM) technology platform intended to read out changes in Förster resonance energy transfer (FRET) efficiency is presented for the study of protein interactions across the drug-discovery pipeline. FLIM provides a robust, inherently ratiometric imaging modality for drug discovery that could allow the same sensor constructs to be translated from automated cell-based assays through small transparent organisms such as zebrafish to mammals. To this end, an automated FLIM multiwell-plate reader is described for high content analysis of fixed and live cells, tomographic FLIM in zebrafish and FLIM FRET of live cells via confocal endomicroscopy. For cell-based assays, an exemplar application reading out protein aggregation using FLIM FRET is presented, and the potential for multiple simultaneous FLIM (FRET) readouts in microscopy is illustrated. PMID:21337485

  18. Real-Time Analysis of Binding Events between Different Aβ1-42 Species and Human Lilrb2 by Dual Polarization Interferometry.

    PubMed

    Hu, Tao; Wang, Shuang; Chen, Chuanxia; Sun, Jian; Yang, Xiurong

    2017-02-21

    Abnormal accumulation of 42-residue amyloid-β (Aβ1-42) within the brain triggers the pathogenesis of Alzheimer's disease (AD). In this paper, we use a dual polarization interferometry (DPI) tool to evaluate the binding events of various Aβ1-42 species, such as monomeric Aβ1-42, low molecular weight Aβ1-42 oligomer (LMW Aβ1-42), and high molecular weight Aβ1-42 oligomer (HMW Aβ1-42), with the extracellular D1D2 domain of the lilrb2 (ED1D2L) receptor, which has been shown to be associated with AD. Based on the real-time binding information provided by DPI, the association rates (k_a) of the ED1D2L receptor with monomeric Aβ1-42, LMW Aβ1-42, and HMW Aβ1-42 are individually determined to be 2.85 × 10⁴, 4.52 × 10⁴, and 1.34 × 10⁵ M⁻¹·s⁻¹, while the corresponding dissociation rates (k_d) are 1.79 × 10⁻², 2.09 × 10⁻², and 5.34 × 10⁻⁴ s⁻¹, respectively. By analysis of these kinetic parameters, we find that HMW Aβ1-42 exhibits the fastest association with the ED1D2L receptor and likewise shows the highest affinity for the ED1D2L receptor during the dissociation period, in contrast to LMW Aβ1-42 and monomeric Aβ1-42. Our findings reveal the different binding behaviors among these species from a kinetic perspective, by which we can indirectly elucidate the deleterious impacts of HMW Aβ1-42 in the progression of AD. Strikingly, this work offers a new clue for exploring the dynamic properties associated with interactions of various Aβ1-42 species with other targets and will hopefully contribute to drug discovery and screening in the future.
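
    From the reported rates, the equilibrium dissociation constant K_D = k_d/k_a follows directly and makes the affinity ranking explicit (rate values transcribed from the abstract):

```python
# K_D = k_d / k_a for each Abeta1-42 species (rates from the abstract).
species = {                        # (k_a [1/(M*s)], k_d [1/s])
    "monomeric Abeta1-42": (2.85e4, 1.79e-2),
    "LMW Abeta1-42":       (4.52e4, 2.09e-2),
    "HMW Abeta1-42":       (1.34e5, 5.34e-4),
}
for name, (ka, kd) in species.items():
    print(f"{name:22s} K_D = {kd / ka:.2e} M")
# HMW (~4.0e-09 M) binds roughly two orders of magnitude more tightly than
# the monomeric (~6.3e-07 M) and LMW (~4.6e-07 M) forms.
```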

  19. Basics of Antibody Phage Display Technology.

    PubMed

    Ledsgaard, Line; Kilstrup, Mogens; Karatt-Vellatt, Aneesh; McCafferty, John; Laustsen, Andreas H

    2018-06-09

    Antibody discovery has become increasingly important in almost all areas of modern medicine. Different antibody discovery approaches exist, but one that has gained increasing interest in the field of toxinology and antivenom research is phage display technology. In this review, the lifecycle of the M13 phage and the basics of phage display technology are presented together with important factors influencing the success rates of phage display experiments. Moreover, the pros and cons of different antigen display methods and the use of naïve versus immunized phage display antibody libraries is discussed, and selected examples from the field of antivenom research are highlighted. This review thus provides in-depth knowledge on the principles and use of phage display technology with a special focus on discovery of antibodies that target animal toxins.

  20. Research of Ad Hoc Networks Access Algorithm

    NASA Astrophysics Data System (ADS)

    Xiang, Ma

    With the continuous development of mobile communication technology, the Ad Hoc access network has become a hot research topic. Ad Hoc access network nodes can be used to expand the capacity and multi-hop communication range of a mobile communication system and improve edge data rates. When an ad hoc network serves as an access network to the Internet, the gateway discovery protocol is very important for choosing the most appropriate gateway to guarantee connectivity between the ad hoc network and IP-based fixed networks. The paper proposes a QoS gateway discovery protocol that uses time delay and route stability as the gateway selection criteria. Based on this gateway discovery protocol, it also proposes a fast handover scheme that decreases the handover time and improves handover efficiency.
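
    A minimal selection sketch under the stated criteria (the advert fields, delay bound, and tie-breaking rule here are illustrative, not the paper's protocol):

```python
from dataclasses import dataclass

@dataclass
class GatewayAdvert:
    gateway_id: str
    delay_ms: float          # measured delay over the multi-hop route
    route_lifetime_s: float  # predicted stability of the route

def select_gateway(adverts, max_delay_ms=100.0):
    """Prefer the most stable route among those meeting the delay bound."""
    eligible = [a for a in adverts if a.delay_ms <= max_delay_ms]
    if not eligible:                           # fall back to lowest delay
        return min(adverts, key=lambda a: a.delay_ms)
    return max(eligible, key=lambda a: a.route_lifetime_s)

adverts = [GatewayAdvert("gw1", 40.0, 12.0),
           GatewayAdvert("gw2", 85.0, 60.0),
           GatewayAdvert("gw3", 120.0, 300.0)]
print(select_gateway(adverts).gateway_id)      # gw2: in bound, most stable
```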

  1. Systems biology-embedded target validation: improving efficacy in drug discovery.

    PubMed

    Vandamme, Drieke; Minke, Benedikt A; Fitzmaurice, William; Kholodenko, Boris N; Kolch, Walter

    2014-01-01

    The pharmaceutical industry is faced with a range of challenges with the ever-escalating costs of drug development and a drying out of drug pipelines. By harnessing advances in -omics technologies and moving away from the standard, reductionist model of drug discovery, there is significant potential to reduce costs and improve efficacy. Embedding systems biology approaches in drug discovery, which seek to investigate underlying molecular mechanisms of potential drug targets in a network context, will reduce attrition rates by earlier target validation and the introduction of novel targets into the currently stagnant market. Systems biology approaches also have the potential to assist in the design of multidrug treatments and repositioning of existing drugs, while stratifying patients to give a greater personalization of medical treatment. © 2013 Wiley Periodicals, Inc.

  2. Improving the dissolution and bioavailability of 6-mercaptopurine via co-crystallization with isonicotinamide.

    PubMed

    Wang, Jian-Rong; Yu, Xueping; Zhou, Chun; Lin, Yunfei; Chen, Chen; Pan, Guoyu; Mei, Xuefeng

    2015-03-01

    6-Mercaptopurine (6-MP) is a clinically important antitumor drug. The commercially available form is the monohydrate, which belongs to the BCS class II category. Co-crystallization screening by the reaction crystallization method (RCM), monitored by powder X-ray diffraction, led to the discovery of a new co-crystal formed between 6-MP and isonicotinamide (co-crystal 1). Co-crystal 1 was thoroughly characterized by X-ray diffraction, FT-IR and Raman spectroscopy, and thermal analysis. Notably, the in vitro and in vivo studies revealed that co-crystal 1 possesses an improved dissolution rate and superior bioavailability in an animal model. Copyright © 2015 Elsevier Ltd. All rights reserved.

  3. Discovery of replicating circular RNAs by RNA-seq and computational algorithms.

    PubMed

    Zhang, Zhixiang; Qi, Shuishui; Tang, Nan; Zhang, Xinxin; Chen, Shanshan; Zhu, Pengfei; Ma, Lin; Cheng, Jinping; Xu, Yun; Lu, Meiguang; Wang, Hongqing; Ding, Shou-Wei; Li, Shifang; Wu, Qingfa

    2014-12-01

    Replicating circular RNAs are independent plant pathogens known as viroids, or act to modulate the pathogenesis of plant and animal viruses as their satellite RNAs. The rate of discovery of these subviral pathogens was low over the past 40 years because the classical approaches are technically demanding and time-consuming. We previously described an approach for homology-independent discovery of replicating circular RNAs by analysing the total small RNA populations from samples of diseased tissues with a computational program known as progressive filtering of overlapping small RNAs (PFOR). However, PFOR, written in Perl, is extremely slow and is unable to discover those subviral pathogens that do not trigger in vivo accumulation of extensively overlapping small RNAs. Moreover, PFOR had yet to identify a new viroid capable of initiating independent infection. Here we report the development of PFOR2, which adopted parallel programming in C++ and was 3 to 8 times faster than PFOR. A new computational program was further developed and incorporated into PFOR2 to allow the identification of circular RNAs by deep sequencing of long RNAs instead of small RNAs. PFOR2 analysis of the small RNA libraries from grapevine and apple plants led to the discovery of Grapevine latent viroid (GLVd) and Apple hammerhead viroid-like RNA (AHVd-like RNA), respectively. GLVd was proposed as a new species in the genus Apscaviroid, because it contained the typical structural elements found in this group of viroids and initiated independent infection in grapevine seedlings. AHVd-like RNA encoded a biologically active hammerhead ribozyme in both polarities, and was not specifically associated with any of the viruses found in apple plants. We propose that these computational algorithms have the potential to discover novel circular RNAs in plants, invertebrates and vertebrates regardless of whether they replicate and/or induce the in vivo accumulation of small RNAs.
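
    The core idea of assembling a circular RNA from overlapping small reads can be sketched in a few lines (a drastic simplification of PFOR/PFOR2, which progressively filter terminal reads and tolerate sequencing noise; the toy data are illustrative):

```python
def assemble_circle(reads, k):
    """Grow a contig by unique (k-1)-overlaps; stop when it wraps around."""
    contig, pool = reads[0], set(reads)
    while True:
        suffix = contig[-(k - 1):]
        nxt = [r for r in pool if r.startswith(suffix)]
        if len(nxt) != 1:
            return None                    # dead end or ambiguous branch
        contig += nxt[0][-1]
        if len(contig) > k and contig[-(k - 1):] == contig[:k - 1]:
            return contig[:-(k - 1)]       # trim the wrapped duplicate

# Toy circular sequence read as all 4-mers with wraparound:
circ = "GATTACA"
reads = [(circ + circ)[i:i + 4] for i in range(len(circ))]
print(assemble_circle(reads, 4))           # GATTACA (up to rotation)
```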

  4. Discovery of Replicating Circular RNAs by RNA-Seq and Computational Algorithms

    PubMed Central

    Tang, Nan; Zhang, Xinxin; Chen, Shanshan; Zhu, Pengfei; Ma, Lin; Cheng, Jinping; Xu, Yun; Lu, Meiguang; Wang, Hongqing; Ding, Shou-Wei; Li, Shifang; Wu, Qingfa

    2014-01-01

    Replicating circular RNAs are independent plant pathogens known as viroids, or act to modulate the pathogenesis of plant and animal viruses as their satellite RNAs. The rate of discovery of these subviral pathogens was low over the past 40 years because the classical approaches are technically demanding and time-consuming. We previously described an approach for homology-independent discovery of replicating circular RNAs by analysing the total small RNA populations from samples of diseased tissues with a computational program known as progressive filtering of overlapping small RNAs (PFOR). However, PFOR, written in Perl, is extremely slow and is unable to discover those subviral pathogens that do not trigger in vivo accumulation of extensively overlapping small RNAs. Moreover, PFOR had yet to identify a new viroid capable of initiating independent infection. Here we report the development of PFOR2, which adopted parallel programming in C++ and was 3 to 8 times faster than PFOR. A new computational program was further developed and incorporated into PFOR2 to allow the identification of circular RNAs by deep sequencing of long RNAs instead of small RNAs. PFOR2 analysis of the small RNA libraries from grapevine and apple plants led to the discovery of Grapevine latent viroid (GLVd) and Apple hammerhead viroid-like RNA (AHVd-like RNA), respectively. GLVd was proposed as a new species in the genus Apscaviroid, because it contained the typical structural elements found in this group of viroids and initiated independent infection in grapevine seedlings. AHVd-like RNA encoded a biologically active hammerhead ribozyme in both polarities, and was not specifically associated with any of the viruses found in apple plants. We propose that these computational algorithms have the potential to discover novel circular RNAs in plants, invertebrates and vertebrates regardless of whether they replicate and/or induce the in vivo accumulation of small RNAs. PMID:25503469

  5. Corra: Computational framework and tools for LC-MS discovery and targeted mass spectrometry-based proteomics

    PubMed Central

    Brusniak, Mi-Youn; Bodenmiller, Bernd; Campbell, David; Cooke, Kelly; Eddes, James; Garbutt, Andrew; Lau, Hollis; Letarte, Simon; Mueller, Lukas N; Sharma, Vagisha; Vitek, Olga; Zhang, Ning; Aebersold, Ruedi; Watts, Julian D

    2008-01-01

    Background Quantitative proteomics holds great promise for identifying proteins that are differentially abundant between populations representing different physiological or disease states. A range of computational tools is now available for both isotopically labeled and label-free liquid chromatography mass spectrometry (LC-MS) based quantitative proteomics. However, they are generally not comparable to each other in terms of functionality, user interfaces, information input/output, and do not readily facilitate appropriate statistical data analysis. These limitations, along with the array of choices, present a daunting prospect for biologists, and other researchers not trained in bioinformatics, who wish to use LC-MS-based quantitative proteomics. Results We have developed Corra, a computational framework and tools for discovery-based LC-MS proteomics. Corra extends and adapts existing algorithms used for LC-MS-based proteomics, and statistical algorithms, originally developed for microarray data analyses, appropriate for LC-MS data analysis. Corra also adapts software engineering technologies (e.g. Google Web Toolkit, distributed processing) so that computationally intense data processing and statistical analyses can run on a remote server, while the user controls and manages the process from their own computer via a simple web interface. Corra also allows the user to output significantly differentially abundant LC-MS-detected peptide features in a form compatible with subsequent sequence identification via tandem mass spectrometry (MS/MS). We present two case studies to illustrate the application of Corra to commonly performed LC-MS-based biological workflows: a pilot biomarker discovery study of glycoproteins isolated from human plasma samples relevant to type 2 diabetes, and a study in yeast to identify in vivo targets of the protein kinase Ark1 via phosphopeptide profiling. Conclusion The Corra computational framework leverages computational innovation to enable biologists or other researchers to process, analyze and visualize LC-MS data with what would otherwise be a complex and not user-friendly suite of tools. Corra enables appropriate statistical analyses, with controlled false-discovery rates, ultimately to inform subsequent targeted identification of differentially abundant peptides by MS/MS. For the user not trained in bioinformatics, Corra represents a complete, customizable, free and open source computational platform enabling LC-MS-based proteomic workflows, and as such, addresses an unmet need in the LC-MS proteomics field. PMID:19087345

  6. The threshold vs LNT showdown: Dose rate findings exposed flaws in the LNT model part 1. The Russell-Muller debate

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Calabrese, Edward J., E-mail: edwardc@schoolph.uma

    This paper assesses the discovery of the dose-rate effect in radiation genetics and how it challenged fundamental tenets of the linear non-threshold (LNT) dose response model, including the assumptions that all mutational damage is cumulative and irreversible and that the dose-response is linear at low doses. Newly uncovered historical information also describes how a key 1964 report by the International Commission for Radiological Protection (ICRP) addressed the effects of dose rate in the assessment of genetic risk. This unique story involves assessments by two leading radiation geneticists, Hermann J. Muller and William L. Russell, who independently argued that the report's Genetic Summary Section on dose rate was incorrect while simultaneously offering vastly different views as to what the report's summary should have contained. This paper reveals occurrences of scientific disagreements, how conflicts were resolved, which view(s) prevailed and why. During this process the Nobel Laureate, Muller, provided incorrect information to the ICRP in what appears to have been an attempt to manipulate the decision-making process and to prevent the dose-rate concept from being adopted into risk assessment practices. - Highlights: • The discovery of radiation dose rate challenged the scientific basis of LNT. • Radiation dose rate occurred in males and females. • The dose rate concept supported a threshold dose-response for radiation.

  7. Developments in SPR Fragment Screening.

    PubMed

    Chavanieu, Alain; Pugnière, Martine

    2016-01-01

    Fragment-based approaches have played an increasing role alongside high-throughput screening in drug discovery for 15 years. The label-free biosensor technology based on surface plasmon resonance (SPR) is now sensitive and informative enough to serve during primary screens and validation steps. In this review, the authors discuss the role of SPR in fragment screening. After a brief description of the underlying principles of the technique and the main device developments, they evaluate the advantages and adaptations of SPR for fragment-based drug discovery. SPR can also be applied to challenging targets such as membrane receptors and enzymes. A high level of immobilization of the protein target and its stability are key points for a relevant screen, which can be optimized using oriented immobilized proteins and regenerable sensors. Furthermore, to decrease the rate of false negatives, a selectivity test may be performed in parallel on a version of the main target whose binding site has been mutated or blocked with a low off-rate ligand. Fragment-based drug design, integrated in a rational workflow led by SPR, will thus have a predominant role in the next wave of drug discovery, which could be greatly enhanced by new improvements in SPR devices.

  8. cn.FARMS: a latent variable model to detect copy number variations in microarray data with a low false discovery rate.

    PubMed

    Clevert, Djork-Arné; Mitterecker, Andreas; Mayr, Andreas; Klambauer, Günter; Tuefferd, Marianne; De Bondt, An; Talloen, Willem; Göhlmann, Hinrich; Hochreiter, Sepp

    2011-07-01

    Cost-effective oligonucleotide genotyping arrays like the Affymetrix SNP 6.0 are still the predominant technique for measuring DNA copy number variations (CNVs). However, CNV detection methods for microarrays overestimate both the number and the size of CNV regions and, consequently, suffer from a high false discovery rate (FDR). A high FDR means that many wrongly detected CNVs enter a clinical study; they will not be associated with a disease, yet correction for multiple testing must take them into account, which decreases the study's discovery power. For controlling the FDR, we propose a probabilistic latent variable model, 'cn.FARMS', which is optimized by a Bayesian maximum a posteriori approach. cn.FARMS controls the FDR through the information gain of the posterior over the prior. The prior represents the null hypothesis of copy number 2 for all samples, from which the posterior can only deviate given strong and consistent signals in the data. On HapMap data, cn.FARMS clearly outperformed the two most prevalent methods with respect to sensitivity and FDR. The software cn.FARMS is publicly available as an R package at http://www.bioinf.jku.at/software/cnfarms/cnfarms.html.
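
    The phrase "information gain of the posterior over the prior" can be illustrated with a deliberately simplified conjugate-Gaussian stand-in for the cn.FARMS factor model; everything below, including the variance values, is an assumption made for illustration. A latent copy number has prior mean 2 (the null hypothesis), the posterior is updated from probe intensities, and a region is scored by the KL divergence of posterior from prior.

```python
# Simplified conjugate-Gaussian illustration of scoring CNV regions by the
# information gain (KL divergence) of the posterior over a copy-number-2
# prior; NOT the published cn.FARMS factor-analysis model.
import numpy as np

def posterior_gain(intensities, prior_mu=2.0, prior_var=0.25, noise_var=0.5):
    x = np.asarray(intensities, dtype=float)
    n = x.size
    post_var = 1.0 / (1.0 / prior_var + n / noise_var)
    post_mu = post_var * (prior_mu / prior_var + x.sum() / noise_var)
    # KL( N(post_mu, post_var) || N(prior_mu, prior_var) ) in nats
    kl = 0.5 * (post_var / prior_var
                + (post_mu - prior_mu) ** 2 / prior_var
                - 1.0 + np.log(prior_var / post_var))
    return post_mu, kl

print(posterior_gain([2.0, 1.9, 2.1, 2.0]))  # near copy number 2: low gain
print(posterior_gain([3.1, 2.9, 3.0, 3.2]))  # consistent shift: high gain
```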

  9. Mathematical Modeling and Analysis of Mass Spectrometry Data in Workflows for the Discovery of Biomarkers in Breast Cancer

    DTIC Science & Technology

    2008-07-01

    Principal Investigator: Vladimir Fokin, Ph.D. Grant Number: W81XWH-07-1-0447.

  10. Regulatory sequence analysis tools.

    PubMed

    van Helden, Jacques

    2003-07-01

    The web resource Regulatory Sequence Analysis Tools (RSAT) (http://rsat.ulb.ac.be/rsat) offers a collection of software tools dedicated to the prediction of regulatory sites in non-coding DNA sequences. These tools include sequence retrieval, pattern discovery, pattern matching, genome-scale pattern matching, feature-map drawing, random sequence generation and other utilities. Alternative formats are supported for the representation of regulatory motifs (strings or position-specific scoring matrices) and several algorithms are proposed for pattern discovery. RSAT currently holds >100 fully sequenced genomes and these data are regularly updated from GenBank.

  11. Phage display biopanning and isolation of target-unrelated peptides: in search of nonspecific binders hidden in a combinatorial library.

    PubMed

    Bakhshinejad, Babak; Zade, Hesam Motaleb; Shekarabi, Hosna Sadat Zahed; Neman, Sara

    2016-12-01

    Phage display is known as a powerful methodology for the identification of targeting ligands that specifically bind to a variety of targets. The high-throughput screening of phage display combinatorial peptide libraries is performed through the affinity selection method of biopanning. Although phage display selection has proven very successful in the discovery of numerous high-affinity target-binding peptides with potential application in drug discovery and delivery, the enrichment of false-positive target-unrelated peptides (TUPs) without any actual affinity towards the target remains a major problem of library screening. Selection-related TUPs may emerge because of binding to the components of the screening system rather than the target. Propagation-related TUPs may arise as a result of faster growth rate of some phage clones enabling them to outcompete slow-propagating clones. Amplification of the library between rounds of biopanning makes a significant contribution to the selection of phage clones with propagation advantage. Distinguishing nonspecific TUPs from true target binders is of particular importance for the translation of biopanning findings from basic research to clinical applications. Different experimental and in silico approaches are applied to assess the specificity of phage display-derived peptides towards the target. Bioinformatic tools are playing a rapidly growing role in the analysis of biopanning data and identification of target-irrelevant TUPs. Recent progress in the introduction of efficient strategies for TUP detection holds enormous promise for the discovery of clinically relevant cell- and tissue-homing peptides and paves the way for the development of novel targeted diagnostic and therapeutic platforms in pharmaceutical areas.

  12. The Discovery of the Electromagnetic Counterpart of GW170817: Kilonova AT 2017gfo/DLT17ck

    NASA Astrophysics Data System (ADS)

    Valenti, Stefano; Sand, David J.; Yang, Sheng; Cappellaro, Enrico; Tartaglia, Leonardo; Corsi, Alessandra; Jha, Saurabh W.; Reichart, Daniel E.; Haislip, Joshua; Kouprianov, Vladimir

    2017-10-01

    During the second observing run of the Laser Interferometer Gravitational-wave Observatory (LIGO) and Virgo Interferometer, a gravitational-wave signal consistent with a binary neutron star coalescence was detected on 2017 August 17 (GW170817), quickly followed by a coincident short gamma-ray burst trigger detected by the Fermi satellite. The Distance Less Than 40 Mpc (DLT40) supernova search performed pointed follow-up observations of a sample of galaxies regularly monitored by the survey that fell within the combined LIGO+Virgo localization region and the larger Fermi gamma-ray burst error box. Here we report the discovery of a new optical transient (DLT17ck, also known as SSS17a; it has also been registered as AT 2017gfo) spatially and temporally coincident with GW170817. The photometric and spectroscopic evolution of DLT17ck is unique, with an absolute peak magnitude of M_r = -15.8 ± 0.1 and an r-band decline rate of 1.1 mag day^-1. This fast evolution is generically consistent with kilonova models, which have been predicted as the optical counterpart to binary neutron star coalescences. Analysis of archival DLT40 data does not show any sign of transient activity at the location of DLT17ck down to r ≈ 19 mag in the time period between 8 months and 21 days prior to GW170817. This discovery represents the beginning of a new era for multi-messenger astronomy, opening a new path by which to study and understand binary neutron star coalescences, short gamma-ray bursts, and their optical counterparts.

  13. Personal discovery in diabetes self-management: Discovering cause and effect using self-monitoring data

    PubMed Central

    Mamykina, Lena; Heitkemper, Elizabeth M.; Smaldone, Arlene M.; Kukafka, Rita; Cole-Lewis, Heather J.; Davidson, Patricia G.; Mynatt, Elizabeth D.; Cassells, Andrea; Tobin, Jonathan N.; Hripcsak, George

    2017-01-01

    Objective To outline new design directions for informatics solutions that facilitate personal discovery with self-monitoring data. We investigate this question in the context of chronic disease self-management with the focus on type 2 diabetes. Materials and methods We conducted an observational qualitative study of discovery with personal data among adults attending a diabetes self-management education (DSME) program that utilized a discovery-based curriculum. The study included observations of class sessions, and interviews and focus groups with the educator and attendees of the program (n = 14). Results The main discovery in diabetes self-management evolved around discovering patterns of association between characteristics of individuals’ activities and changes in their blood glucose levels that the participants referred to as “cause and effect”. This discovery empowered individuals to actively engage in self-management and provided a desired flexibility in selection of personalized self-management strategies. We show that discovery of cause and effect involves four essential phases: (1) feature selection, (2) hypothesis generation, (3) feature evaluation, and (4) goal specification. Further, we identify opportunities to support discovery at each stage with informatics and data visualization solutions by providing assistance with: (1) active manipulation of collected data (e.g., grouping, filtering and side-by-side inspection), (2) hypotheses formulation (e.g., using natural language statements or constructing visual queries), (3) inference evaluation (e.g., through aggregation and visual comparison, and statistical analysis of associations), and (4) translation of discoveries into actionable goals (e.g., tailored selection from computable knowledge sources of effective diabetes self-management behaviors). Discussion The study suggests that discovery of cause and effect in diabetes can be a powerful approach to helping individuals to improve their self-management strategies, and that self-monitoring data can serve as a driving engine for personal discovery that may lead to sustainable behavior changes. Conclusions Enabling personal discovery is a promising new approach to enhancing chronic disease self-management with informatics interventions. PMID:28974460

  15. Using glycome databases for drug discovery.

    PubMed

    Aoki-Kinoshita, Kiyoko F

    2008-08-01

    The glycomics field has made great advancements in the last decade thanks to technologies for glycan synthesis and analysis, including carbohydrate microarrays. Accordingly, databases for glycomics research have also emerged and been made publicly available by many major institutions worldwide. This review introduces these and other useful databases on which new methods for drug discovery can be developed. The scope of this review covers currently documented and accessible databases and resources pertaining to glycomics, selected with the expectation that they may be useful for drug discovery research. There is a plethora of glycomics databases with much potential for drug discovery. This may seem daunting at first, but this review helps to put some of these resources into perspective. Additionally, some thoughts on how to integrate these resources to allow more efficient research are presented.

  16. The clinical impact of recent advances in LC-MS for cancer biomarker discovery and verification

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Wang, Hui; Shi, Tujin; Qian, Wei-Jun

    2015-12-04

    Mass spectrometry-based proteomics has become an indispensable tool in biomedical research, with broad applications spanning fundamental biology, systems biology, and biomarker discovery. Recent advances in LC-MS have made it a major technology in clinical applications, especially in cancer biomarker discovery and verification. To overcome the challenges associated with the analysis of clinical samples, such as the extremely wide dynamic range of protein concentrations in biofluids and the need to perform high-throughput and accurate quantification, significant efforts have been devoted to improving the overall performance of LC-MS-based clinical proteomics. In this review, we summarize the recent advances in LC-MS for cancer biomarker discovery and quantification, and discuss its potential, limitations, and future perspectives.

  17. Hydrostatic Stress Effects Incorporated Into the Analysis of the High-Strain-Rate Deformation of Polymer Matrix Composites

    NASA Technical Reports Server (NTRS)

    Goldberg, Robert K.; Roberts, Gary D.

    2003-01-01

    Procedures for modeling the effect of high strain rate on composite materials are needed for designing reliable composite engine cases that are lighter than the metal cases in current use. The types of polymer matrix composites that are likely to be used in such an application have a deformation response that is nonlinear and that varies with strain rate. The nonlinearity and strain rate dependence of the composite response are primarily due to the matrix constituent. Therefore, in developing material models to be used in the design of impact-resistant composite engine cases, the deformation of the polymer matrix must be correctly analyzed. However, unlike in metals, the nonlinear response of polymers depends on the hydrostatic stresses, which must be accounted for within an analytical model. An experimental program was carried out through a university grant with the Ohio State University to obtain tensile and shear deformation data for a representative polymer at strain rates ranging from quasi-static to several hundred per second. This information has been used at the NASA Glenn Research Center to develop, characterize, and correlate a material model in which the strain rate dependence and nonlinearity (including hydrostatic stress effects) of the polymer are correctly analyzed. To obtain the material data, Glenn's researchers designed and fabricated test specimens of a representative toughened epoxy resin. Quasi-static tests at low strain rates and split Hopkinson bar tests at high strain rates were then conducted at the Ohio State University. The experimental data confirmed the strong effects of strain rate on both the tensile and shear deformation of the polymer. For the analytical model, Glenn's researchers modified state variable constitutive equations previously used for the viscoplastic analysis of metals to allow for the analysis of the nonlinear, strain-rate-dependent polymer deformation. Specifically, the effects of hydrostatic stresses were accounted for. An important discovery in the course of this work was that the hydrostatic stress effects varied during the loading process, which needed to be accounted for within the constitutive equations. The model is characterized primarily by shear data, with tensile data used to characterize the hydrostatic stress effects.

  18. Trinity | Informatics Technology for Cancer Research (ITCR)

    Cancer.gov

    The Trinity Cancer Transcriptome Analysis Toolkit (CTAT) provides de novo transcriptome assembly with downstream support for expression analysis and focused analyses of cancer transcriptomes, incorporating mutation and fusion transcript discovery as well as single-cell analysis.

  19. Discovery of specific ligands for oral squamous carcinoma to develop anti-cancer drug loaded precise targeting nanotherapeutics.

    PubMed

    Yang, Fan; Liu, Ruiwu; Kramer, Randall; Xiao, Wenwu; Jordan, Richard; Lam, Kit S

    2012-12-01

    Oral squamous cell carcinoma has a low five-year survival rate, which may be due to late detection and a lack of effective tumor-specific therapies. Using a high-throughput drug discovery strategy termed the one-bead one-compound combinatorial library method, the authors identified six compounds with high binding affinity to different human oral squamous cell carcinoma cell lines but not to normal cells. Current work is under way to develop these ligands into oral squamous cell carcinoma-specific imaging probes or therapeutic agents.

  20. Translational aspects of blood-brain barrier transport and central nervous system effects of drugs: from discovery to patients.

    PubMed

    de Lange, E C M; Hammarlund-Udenaes, M

    2015-04-01

    The development of CNS drugs is associated with high failure rates. It is postulated that too much focus has been put on BBB permeability and too little on understanding BBB transport, which is the main limiting factor in drug delivery to the brain. An integrated approach to collecting, understanding, and handling pharmacokinetic-pharmacodynamic information from early discovery stages to the clinic is therefore recommended in order to improve translation to human drug treatment. © 2015 American Society for Clinical Pharmacology and Therapeutics.

  1. Statistical physics and physiology: monofractal and multifractal approaches

    NASA Technical Reports Server (NTRS)

    Stanley, H. E.; Amaral, L. A.; Goldberger, A. L.; Havlin, S.; Peng, C. K.

    1999-01-01

    Even under healthy, basal conditions, physiologic systems show erratic fluctuations resembling those found in dynamical systems driven away from a single equilibrium state. Do such "nonequilibrium" fluctuations simply reflect the fact that physiologic systems are being constantly perturbed by external and intrinsic noise? Or do these fluctuations actually contain useful, "hidden" information about the underlying nonequilibrium control mechanisms? We report some recent attempts to understand the dynamics of complex physiologic fluctuations by adapting and extending concepts and methods developed very recently in statistical physics. Specifically, we focus on interbeat interval variability as an important quantity to help elucidate possibly non-homeostatic physiologic variability because (i) the heart rate is under direct neuroautonomic control, (ii) interbeat interval variability is readily measured by noninvasive means, and (iii) analysis of these heart rate dynamics may provide important practical diagnostic and prognostic information not obtainable with current approaches. The analytic tools we discuss may be used on a wider range of physiologic signals. We first review recent progress using two analysis methods--detrended fluctuation analysis and wavelets--sufficient for quantifying monofractal structures. We then describe recent work that quantifies multifractal features of interbeat interval series, and the discovery that the multifractal structure of healthy subjects differs from that of diseased subjects.

  2. Using RSAT oligo-analysis and dyad-analysis tools to discover regulatory signals in nucleic sequences.

    PubMed

    Defrance, Matthieu; Janky, Rekin's; Sand, Olivier; van Helden, Jacques

    2008-01-01

    This protocol explains how to discover functional signals in genomic sequences by detecting over- or under-represented oligonucleotides (words) or spaced pairs thereof (dyads) with the Regulatory Sequence Analysis Tools (http://rsat.ulb.ac.be/rsat/). Two typical applications are presented: (i) predicting transcription factor-binding motifs in promoters of coregulated genes and (ii) discovering phylogenetic footprints in promoters of orthologous genes. The steps of this protocol include purging genomic sequences to discard redundant fragments, discovering over-represented patterns and assembling them to obtain degenerate motifs, scanning sequences and drawing feature maps. The main strength of the method is its statistical grounding: the binomial significance provides an efficient control on the rate of false positives. In contrast with optimization-based pattern discovery algorithms, the method supports the detection of under- as well as over-represented motifs. Computation times vary from seconds (gene clusters) to minutes (whole genomes). The execution of the whole protocol should take approximately 1 h.
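
    The binomial significance at the heart of oligo-analysis fits in a few lines; the counts and background probability below are illustrative, and RSAT's actual background models are more elaborate.

```python
# Binomial over-representation score for a word (generic sketch): with k
# occurrences over n scanned positions and background probability p per
# position, significance is the upper tail P(X >= k) of Binomial(n, p).
from scipy.stats import binom

def word_significance(k, n, p):
    return binom.sf(k - 1, n, p)   # P(X >= k)

# e.g. 12 hits of an 8-mer over 10,000 promoter positions with an assumed
# background rate of 2e-4 per position (illustrative numbers)
print(word_significance(k=12, n=10_000, p=2e-4))
```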

  3. Biomarker MicroRNAs for Diagnosis of Oral Squamous Cell Carcinoma Identified Based on Gene Expression Data and MicroRNA-mRNA Network Analysis

    PubMed Central

    Zhang, Hui; Li, Tangxin; Zheng, Linqing

    2017-01-01

    Oral squamous cell carcinoma is one of the most malignant tumors, with a high mortality rate worldwide. Biomarker discovery is critical for the early diagnosis and precision treatment of this disease. MicroRNAs are small noncoding RNA molecules which often regulate essential biological processes and are good candidates for biomarkers. By integrative analysis of both the cancer-associated gene expression data and the microRNA-mRNA network, miR-148b-3p, miR-629-3p, miR-27a-3p, and miR-142-3p were screened as novel diagnostic biomarkers for oral squamous cell carcinoma based on their unique regulatory abilities in the structure of the conditional microRNA-mRNA network and their important functions. These findings were confirmed by literature verification and functional enrichment analysis. Future experimental validation is expected for further investigation of their molecular mechanisms. PMID:29098014

  4. A scientometric prediction of the discovery of the first potentially habitable planet with a mass similar to Earth.

    PubMed

    Arbesman, Samuel; Laughlin, Gregory

    2010-10-04

    The search for a habitable extrasolar planet has long interested scientists, but only recently have the tools become available to search for such planets. In the past decades, the number of known extrasolar planets has ballooned into the hundreds, and with it the expectation that the discovery of the first Earth-like extrasolar planet is not far off. Here, we develop a novel metric of habitability for discovered planets and use it to arrive at a prediction for when the first habitable planet will be discovered. Using a bootstrap analysis of currently discovered exoplanets, we predict the discovery of the first Earth-like planet to be announced in the first half of 2011, with the likeliest date being early May 2011. Our predictions, using only the properties of previously discovered exoplanets, accord well with external estimates for the discovery of the first potentially habitable extrasolar planet and highlight the usefulness of predictive scientometric techniques for understanding the pace of scientific discovery in many fields.
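
    The bootstrap step is generic enough to sketch. The habitability values below are invented and the authors' metric is not reproduced; the sketch only shows the resample-fit-extrapolate pattern such a prediction rests on.

```python
# Bootstrap extrapolation sketch: resample (year, metric) pairs, fit a
# linear trend, and record the year the trend reaches metric = 1.0.
import numpy as np

rng = np.random.default_rng(0)
years = np.array([1995.0, 2000.0, 2004.0, 2007.0, 2009.0, 2010.0])
metric = np.array([0.01, 0.05, 0.15, 0.30, 0.45, 0.60])  # invented values

def predicted_year(yr, h):
    slope, intercept = np.polyfit(yr, h, 1)
    return (1.0 - intercept) / slope if slope > 0 else np.nan

boot = []
for _ in range(2000):
    idx = rng.integers(0, years.size, years.size)
    if np.unique(years[idx]).size > 1:        # skip degenerate resamples
        boot.append(predicted_year(years[idx], metric[idx]))
print(np.nanpercentile(boot, [2.5, 50, 97.5]))  # interval for the date
```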

  5. Towards Discovery and Targeted Peptide Biomarker Detection Using nanoESI-TIMS-TOF MS

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Garabedian, Alyssa; Benigni, Paolo; Ramirez, Cesar E.

    In the present work, the potential of trapped ion mobility spectrometry coupled to TOF mass spectrometry (TIMS-TOF MS) for discovery and targeted monitoring of peptide biomarkers from human-in-mouse xenograft tumor tissue was evaluated. In particular, a TIMS-MS workflow was developed for the detection and quantification of peptide biomarkers using internal heavy analogs, taking advantage of the high mobility resolution (R = 150–250) prior to mass analysis. Five peptide biomarkers were separated, identified, and quantified using offline nanoESI-TIMS-CID-TOF MS; the results were in good agreement with measurements using a traditional LC-ESI-MS/MS proteomics workflow. The TIMS-TOF MS analysis permitted peptide biomarker detection based on accurate mobility, mass measurements, and high sequence coverage for concentrations in the 10–200 nM range, while simultaneously achieving discovery measurements.

  6. International Space Station (ISS) Low Pressure Intramodule Quick Disconnect Failures

    NASA Technical Reports Server (NTRS)

    Lewis, John F.; Harris, Danny; Link, Dwight; Morrison, Russel

    2004-01-01

    A failure of an ISS intermodule Quick Disconnect (QD) during protoflight vibration testing of ISS regenerative Environmental Control and Life Support (ECLS) hardware led to the discovery of QD design, manufacturing, and test flaws that can leave the male QD susceptible to failure of the secondary housing seal and to inadequate housing assembly locking mechanisms. Discovery of this failure had large implications considering that there are currently 399 similar units on orbit and approximately 1100 units on the ground integrated into flight hardware. Determining the nature of the failure required testing and analysis, and implementation of a recovery plan requiring part screening and review of element-level and project hazard analyses to determine whether secondary seals are required. Implementation also involves coordination with the Nodes and MPLM project offices, the Regenerative ECLS Project, ISS Payloads, JAXA, ESA, and ISS Logistics and Maintenance.

  7. Using in Vitro Evolution and Whole Genome Analysis To Discover Next Generation Targets for Antimalarial Drug Discovery

    PubMed Central

    2018-01-01

    Although many new anti-infectives have been discovered and developed solely using phenotypic cellular screening and assay optimization, most researchers recognize that structure-guided drug design is more practical and less costly. In addition, a greater chemical space can be interrogated with structure-guided drug design. The practicality of structure-guided drug design has launched a search for the targets of compounds discovered in phenotypic screens. One method that has been used extensively in malaria parasites for target discovery and chemical validation is in vitro evolution and whole genome analysis (IVIEWGA). Here, small molecules from phenotypic screens with demonstrated antiparasitic activity are used in genome-based target discovery methods. In this Review, we discuss the newest, most promising druggable targets discovered or further validated by evolution-based methods, as well as some exceptions. PMID:29451780

  8. Bioinformatics and peptidomics approaches to the discovery and analysis of food-derived bioactive peptides.

    PubMed

    Agyei, Dominic; Tsopmo, Apollinaire; Udenigwe, Chibuike C

    2018-06-01

    There are emerging advancements in the strategies used for the discovery and development of food-derived bioactive peptides because of their multiple food and health applications. Bioinformatics and peptidomics are two computational and analytical techniques that have the potential to speed up the development of bioactive peptides from bench to market. Structure-activity relationships observed in peptides form the basis for bioinformatics and in silico prediction of bioactive sequences encrypted in food proteins. Peptidomics, on the other hand, relies on "hyphenated" (liquid chromatography-mass spectrometry-based) techniques for the detection, profiling, and quantitation of peptides. Together, bioinformatics and peptidomics approaches provide a low-cost and effective means of predicting, profiling, and screening bioactive protein hydrolysates and peptides from food. This article discusses the basis, strengths, and limitations of bioinformatics and peptidomics approaches currently used for the discovery and analysis of food-derived bioactive peptides.

  9. High-throughput metabolic stability studies in drug discovery by orthogonal acceleration time-of-flight (OATOF) with analogue-to-digital signal capture (ADC).

    PubMed

    Temesi, David G; Martin, Scott; Smith, Robin; Jones, Christopher; Middleton, Brian

    2010-06-30

    Screening assays capable of performing quantitative analysis on hundreds of compounds per week are used to measure metabolic stability during early drug discovery. Modern orthogonal acceleration time-of-flight (OATOF) mass spectrometers equipped with analogue-to-digital signal capture (ADC) now offer performance levels suitable for many applications normally supported by triple quadrupole instruments operated in multiple reaction monitoring (MRM) mode. Herein the merits of MRM and OATOF with ADC detection are compared for more than 1000 compounds screened in rat and/or cryopreserved human hepatocytes over a period of 3 months. Statistical comparison of a structurally diverse subset indicated good agreement for the two detection methods. The overall success rate was higher using OATOF detection, and data acquisition time was reduced by around 20%. Targeted metabolites of diazepam were detected in samples from a CLint determination performed at 1 microM. Data acquisition by positive and negative ion mode switching can be achieved on high-performance liquid chromatography (HPLC) peak widths as narrow as 0.2 min (at base), thus enabling a more comprehensive first-pass analysis with fast HPLC gradients. Unfortunately, most existing OATOF instruments lack the software tools necessary to rapidly convert the huge amounts of raw data into quantified results. Software with functionality similar to open-access triple quadrupole systems is needed for OATOF to truly compete in a high-throughput screening environment. Copyright 2010 John Wiley & Sons, Ltd.

  10. Selection of Therapeutic H5N1 Monoclonal Antibodies Following IgVH Repertoire Analysis in Mice

    PubMed Central

    Gray, Sean A.; Moore, Margaret; VandenEkart, Emily J.; Roque, Richard P.; Bowen, Richard A.; Van Hoeven, Neal; Wiley, Steven R.; Clegg, Christopher H.

    2016-01-01

    The rapid rate of influenza virus mutation drives the emergence of new strains that inflict serious seasonal epidemics and less frequent, but more deadly, pandemics. While vaccination provides the best protection against influenza, its utility is often diminished by the unpredictability of new pathogenic strains. Consequently, efforts are underway to identify new antiviral drugs and monoclonal antibodies that can be used to treat recently infected individuals and prevent disease in vulnerable populations. Next Generation Sequencing (NGS) combined with the analysis of antibody gene repertoires is a valuable tool for Ab discovery. Here, we describe a technology platform for isolating therapeutic monoclonal antibodies (MAbs) by analyzing the IgVH repertoires of mice immunized with recombinant H5N1 hemagglutinin (rH5). As an initial proof of concept, 35 IgVH genes were selected using a CDRH3 search algorithm, co-expressed in a murine IgG2a expression vector with a panel of germline murine kappa genes, and culture supernatants were screened for antigen binding. Seventeen of the 35 IgVH MAbs (49%) bound rH5VN1203 in preliminary screens, 8 of 9 purified MAbs inhibited 3 heterosubtypic strains of H5N1 virus when assayed by HI, and 2 MAbs demonstrated prophylactic and therapeutic activity in virus-challenged mice. This is the first example in which an NGS discovery platform has been used to isolate anti-influenza MAbs with relevant therapeutic activity. PMID:27109194

  11. Pathway-based analyses.

    PubMed

    Kent, Jack W

    2016-02-03

    New technologies for acquisition of genomic data, while offering unprecedented opportunities for genetic discovery, also impose severe burdens of interpretation and penalties for multiple testing. The Pathway-based Analyses Group of the Genetic Analysis Workshop 19 (GAW19) sought to reduce the multiple-testing burden through various approaches to aggregation of high-dimensional data in pathways informed by prior biological knowledge. Experimental methods tested included the use of "synthetic pathways" (random sets of genes) to estimate power and false-positive error rates of methods applied to simulated data; data reduction via independent components analysis, single-nucleotide polymorphism (SNP)-SNP interaction, and use of gene sets to estimate genetic similarity; and general assessment of the efficacy of prior biological knowledge to reduce the dimensionality of complex genomic data. The work of this group explored several promising approaches to managing high-dimensional data, with the caveat that these methods are necessarily constrained by the quality of external bioinformatic annotation.

  12. Asymptotics of empirical eigenstructure for high dimensional spiked covariance.

    PubMed

    Wang, Weichen; Fan, Jianqing

    2017-06-01

    We derive the asymptotic distributions of the spiked eigenvalues and eigenvectors under a generalized and unified asymptotic regime, which takes into account the magnitude of spiked eigenvalues, sample size, and dimensionality. This regime allows high dimensionality and diverging eigenvalues and provides new insights into the roles that the leading eigenvalues, sample size, and dimensionality play in principal component analysis. Our results are a natural extension of those in Paul (2007) to a more general setting and solve the rates of convergence problems in Shen et al. (2013). They also reveal the biases of estimating leading eigenvalues and eigenvectors by using principal component analysis, and lead to a new covariance estimator for the approximate factor model, called shrinkage principal orthogonal complement thresholding (S-POET), that corrects the biases. Our results are successfully applied to outstanding problems in estimation of risks of large portfolios and false discovery proportions for dependent test statistics and are illustrated by simulation studies.

  14. New insights into old methods for identifying causal rare variants.

    PubMed

    Wang, Haitian; Huang, Chien-Hsun; Lo, Shaw-Hwa; Zheng, Tian; Hu, Inchi

    2011-11-29

    The advance of high-throughput next-generation sequencing technology makes possible the analysis of rare variants. However, the investigation of rare variants in unrelated-individuals data sets faces the challenge of low power, and most methods circumvent the difficulty by using various collapsing procedures based on genes, pathways, or gene clusters. We suggest a new way to identify causal rare variants using the F-statistic and sliced inverse regression. The procedure is tested on the data set provided by the Genetic Analysis Workshop 17 (GAW17). After preliminary data reduction, we ranked markers according to their F-statistic values. Top-ranked markers were then subjected to sliced inverse regression, and those with higher absolute coefficients in the most significant sliced inverse regression direction were selected. The procedure yields good false discovery rates for the GAW17 data and thus is a promising method for future study on rare variants.
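
    A compact sketch of this two-stage screen follows (illustrative, not the GAW17 submission): rank each variant by a one-way ANOVA F statistic of the trait across genotype groups, then run sliced inverse regression on the top-ranked variants and keep those with large absolute loadings on the leading direction. All names and the slice count are assumptions.

```python
# Two-stage screen sketch: F-statistic ranking followed by sliced inverse
# regression (SIR); illustrative re-creation, not the GAW17 code.
import numpy as np
from scipy.stats import f_oneway

def f_rank(X, y):
    """Per-variant one-way ANOVA F of trait y across genotype groups."""
    stats = []
    for j in range(X.shape[1]):
        groups = [y[X[:, j] == g] for g in np.unique(X[:, j])]
        groups = [g for g in groups if g.size > 1]
        stats.append(f_oneway(*groups).statistic if len(groups) > 1 else 0.0)
    return np.asarray(stats)

def sir_first_direction(X, y, n_slices=10):
    """Leading SIR direction (assumes every column of X is polymorphic)."""
    Xs = (X - X.mean(axis=0)) / X.std(axis=0)
    order = np.argsort(y)
    M = np.zeros((X.shape[1], X.shape[1]))
    for s in np.array_split(order, n_slices):
        m = Xs[s].mean(axis=0)                 # slice mean of standardized X
        M += (s.size / y.size) * np.outer(m, m)
    _, vecs = np.linalg.eigh(M)
    return vecs[:, -1]                         # top eigenvector of M

# usage sketch: keep the 50 top-F variants, then inspect |loadings|
# top = np.argsort(f_rank(X, y))[-50:]
# loadings = sir_first_direction(X[:, top], y)
```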

  15. Comparison of normalization methods for the analysis of metagenomic gene abundance data.

    PubMed

    Pereira, Mariana Buongermino; Wallroth, Mikael; Jonsson, Viktor; Kristiansson, Erik

    2018-04-20

    In shotgun metagenomics, microbial communities are studied through direct sequencing of DNA without any prior cultivation. By comparing gene abundances estimated from the generated sequencing reads, functional differences between the communities can be identified. However, gene abundance data are affected by high levels of systematic variability, which can greatly reduce statistical power and introduce false positives. Normalization, the process by which systematic variability is identified and removed, is therefore a vital part of the data analysis. A wide range of normalization methods for high-dimensional count data has been proposed, but their performance on the analysis of shotgun metagenomic data has not been evaluated. Here, we present a systematic evaluation of nine normalization methods for gene abundance data. The methods were evaluated through resampling of three comprehensive datasets, creating a realistic setting that preserved the unique characteristics of metagenomic data. Performance was measured in terms of each method's ability to identify differentially abundant genes (DAGs), calculate unbiased p-values, and control the false discovery rate (FDR). Our results showed that the choice of normalization method has a large impact on the end results. When the DAGs were asymmetrically present between the experimental conditions, many normalization methods had a reduced true positive rate (TPR) and a high false positive rate (FPR). The methods trimmed mean of M-values (TMM) and relative log expression (RLE) had the overall highest performance and are therefore recommended for the analysis of gene abundance data. For larger sample sizes, CSS also showed satisfactory performance. This study emphasizes the importance of selecting a suitable normalization method in the analysis of data from shotgun metagenomics. Our results also demonstrate that improper methods may result in unacceptably high levels of false positives, which in turn may lead to incorrect or obfuscated biological interpretation.
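
    Of the two recommended methods, RLE (the median-of-ratios scheme) is simple enough to state directly; this is a generic sketch of the estimator with made-up counts, not the evaluated R implementation.

```python
# RLE / median-of-ratios size factors (generic sketch): each sample's
# factor is the median ratio of its counts to the per-gene geometric mean.
import numpy as np

def rle_size_factors(counts):
    """counts: genes x samples array of raw counts."""
    counts = np.asarray(counts, dtype=float)
    ok = (counts > 0).all(axis=1)          # genes observed in every sample
    logc = np.log(counts[ok])
    log_geo_mean = logc.mean(axis=1)       # per-gene log geometric mean
    return np.exp(np.median(logc - log_geo_mean[:, None], axis=0))

raw = np.array([[10, 20], [100, 210], [5, 9], [0, 3]])
sf = rle_size_factors(raw)
print(sf)                  # sample 2 is roughly twice the depth of sample 1
normalized = raw / sf      # counts rescaled to a common depth
```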

  16. West Chalkley, Cameron Parish, Louisiana - A case for continued exploration in mature producing provinces

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Klefstad, G.E.

    A potential giant gas field has been discovered in the very mature exploration province of south Louisiana by Transco Exploration Partners (TXP) and Exxon Company USA. The West Chalkley prospect is located in Cameron Parish, Louisiana, and is productive in the upper Oligocene Miogypsinoides (Miogyp) sandstones. The discovery is in the same producing trend as the prolific South Lake Arthur field, where the Miogyp sandstones have gas reserves on the order of 1.0 tcf. The prospect was generated by a combination of trend analysis, subsurface well control, and reflection seismic data. The feature appears to be a faulted anticline separate from the nearest production in the area, Chalkley field, which is located about 1 mi east and was discovered in 1938. Both TXP and Exxon, working independently, recognized the potential prospect and pursued leasing activities in the area. TXP initiated discussions with the landowner in February 1988 and acquired a 960-ac lease in June. Exxon leased approximately 2,100 ac surrounding the TXP lease about one month later. TXP subsequently sold the prospect to Exxon on October 12, 1988. The Exxon 1 Sweet Lake Land and Oil Company was spudded on March 16, 1989, and reached a total depth of 15,600 ft on July 4, 1989. Log analysis indicated nearly 500 net ft of gas pay in the 805-ft gross productive interval. Testing through perforations near the base of the pay zone yielded flow rates as high as 21.28 MMCFGPD and 338 BOPD. The discovery well is expected to be on production by early 1990 at rates approaching 50 MMCFGPD. Two delineation wells are currently drilling, and a deeper-pool wildcat is planned to spud around mid-1990 to determine the areal extent and ultimate size of this important new find.

  17. Exploring relation types for literature-based discovery.

    PubMed

    Preiss, Judita; Stevenson, Mark; Gaizauskas, Robert

    2015-09-01

    Literature-based discovery (LBD) aims to identify "hidden knowledge" in the medical literature by: (1) analyzing documents to identify pairs of explicitly related concepts (terms), then (2) hypothesizing novel relations between pairs of unrelated concepts that are implicitly related via a shared concept to which both are explicitly related. Many LBD approaches use simple techniques to identify semantically weak relations between concepts, for example, document co-occurrence. These generate huge numbers of hypotheses, difficult for humans to assess. More complex techniques rely on linguistic analysis, for example, shallow parsing, to identify semantically stronger relations. Such approaches generate fewer hypotheses, but may miss hidden knowledge. The authors investigate this trade-off in detail, comparing techniques for identifying related concepts to discover which are most suitable for LBD. A generic LBD system that can utilize a range of relation types was developed. Experiments were carried out comparing a number of techniques for identifying relations. Two approaches were used for evaluation: replication of existing discoveries and the "time slicing" approach. Previous LBD discoveries could be replicated using relations based either on document co-occurrence or on linguistic analysis. Using relations based on linguistic analysis generated many fewer hypotheses, but a significantly greater proportion of them were candidates for hidden knowledge. The use of linguistic analysis-based relations improves the accuracy of LBD without overly damaging coverage. LBD systems often generate huge numbers of hypotheses, which are infeasible to review manually. Improving their accuracy has the potential to make these systems significantly more usable. © The Author 2015. Published by Oxford University Press on behalf of the American Medical Informatics Association.
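
    The co-occurrence end of the spectrum evaluated here is essentially Swanson's ABC model, which fits in a few lines; the sketch below uses toy concept identifiers echoing Swanson's fish-oil example and is not the authors' system.

```python
# Minimal ABC literature-based discovery sketch: propose C terms that
# share an intermediate B with the start term A but never co-occur with A.
from collections import defaultdict

def build_index(docs):
    """docs: iterable of sets of concept identifiers."""
    cooc = defaultdict(set)
    for d in docs:
        for t in d:
            cooc[t] |= d - {t}
    return cooc

def abc_hypotheses(cooc, a):
    known = cooc[a]
    hypotheses = set()
    for b in known:
        hypotheses |= cooc[b] - known - {a}   # linked to B, unseen with A
    return hypotheses

docs = [{"fish_oil", "blood_viscosity"},
        {"blood_viscosity", "raynaud_disease"},
        {"fish_oil", "platelet_aggregation"},
        {"platelet_aggregation", "raynaud_disease"}]
print(abc_hypotheses(build_index(docs), "fish_oil"))  # {'raynaud_disease'}
```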

  18. Computational Systems Biology Approach Predicts Regulators and Targets of microRNAs and Their Genomic Hotspots in Apoptosis Process.

    PubMed

    Alanazi, Ibrahim O; Ebrahimie, Esmaeil

    2016-07-01

    Novel computational systems biology tools such as common targets analysis, common regulators analysis, pathway discovery, and transcriptomic-based hotspot discovery provide new opportunities for understanding the molecular mechanisms of apoptosis. In this study, after measuring the global contribution of microRNAs in the course of apoptosis on an Affymetrix platform, systems biology tools were utilized to obtain a comprehensive view of the role of microRNAs in the apoptosis process. Network analysis and pathway discovery highlighted the crosstalk between transcription factors and microRNAs in apoptosis. Among the transcription factors, PRDM1 showed the highest upregulation during the course of apoptosis, with a more than 9-fold expression increase compared to the non-apoptotic condition. Among the microRNAs, MIR1208 showed the highest expression in the non-apoptotic condition and was downregulated by more than 6-fold during apoptosis. The common regulators algorithm showed that the TNF receptor is the key upstream regulator, with a high number of regulatory interactions with the differentially expressed microRNAs. BCL2 and AKT1 were the key downstream targets of differentially expressed microRNAs. Enrichment analysis of the genomic locations of differentially expressed microRNAs led to the discovery of chromosome bands that were highly enriched (p < 0.01) with apoptosis-related microRNAs, such as 13q31.3, 19p13.13, and Xq27.3. This study opens a new avenue in understanding regulatory mechanisms and downstream functions in the course of apoptosis, as well as in distinguishing genome-enriched hotspots for the apoptosis process.
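
    Enrichment claims of this kind typically rest on a hypergeometric tail test, sketched below with invented counts; the paper's exact procedure may differ.

```python
# Hypergeometric band enrichment (generic sketch): with M annotated
# microRNAs, K mapping to a band, n differentially expressed, and k of
# those on the band, the p-value is P(X >= k), X ~ Hypergeom(M, K, n).
from scipy.stats import hypergeom

def band_enrichment_p(M, K, n, k):
    return hypergeom.sf(k - 1, M, K, n)

# illustrative numbers: 1000 microRNAs, 12 on the band, 80 differentially
# expressed, 5 of which fall on the band
print(band_enrichment_p(M=1000, K=12, n=80, k=5))
```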

  19. Bioenergy Knowledge Discovery Framework Fact Sheet

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    None

    The Bioenergy Knowledge Discovery Framework (KDF) supports the development of a sustainable bioenergy industry by providing access to a variety of data sets, publications, and collaboration and mapping tools that support bioenergy research, analysis, and decision making. In the KDF, users can search for information, contribute data, and use the tools and map interface to synthesize, analyze, and visualize information in a spatially integrated manner.

  20. C-band radar pulse Doppler error: Its discovery, modeling, and elimination

    NASA Technical Reports Server (NTRS)

    Krabill, W. B.; Dempsey, D. J.

    1978-01-01

    The discovery of a C-band radar pulse Doppler error is discussed, and the use of the GEOS 3 satellite's coherent transponder to isolate the error source is described. An analysis of the pulse Doppler tracking loop is presented, and a mathematical model for the error was developed. Error-correction techniques were developed and are described, including implementation details.

  1. From scientific discovery to cures: bright stars within a galaxy.

    PubMed

    Williams, R Sanders; Lotia, Samad; Holloway, Alisha K; Pico, Alexander R

    2015-09-24

    We propose that data mining and network analysis utilizing public databases can identify and quantify relationships between scientific discoveries and major advances in medicine (cures). Further development of such approaches could help to increase public understanding and governmental support for life science research and could enhance decision making in the quest for cures. Copyright © 2015 Elsevier Inc. All rights reserved.

  2. On Line Instruction: An Opportunity to Re-Examine and Re-Invent Pedagogy

    ERIC Educational Resources Information Center

    Rosenthal, Irene

    2010-01-01

    The author recounts ten discoveries she made about on-line instruction that were beyond her field of vision when she was still viewing it through the lens of traditional classroom instruction. The discoveries include what she learned by reviewing the research in effective course design and a discourse analysis she conducted of the number and types of…

  3. Challenges in Biomarker Discovery: Combining Expert Insights with Statistical Analysis of Complex Omics Data

    PubMed Central

    McDermott, Jason E.; Wang, Jing; Mitchell, Hugh; Webb-Robertson, Bobbie-Jo; Hafen, Ryan; Ramey, John; Rodland, Karin D.

    2012-01-01

    Introduction The advent of high throughput technologies capable of comprehensive analysis of genes, transcripts, proteins and other significant biological molecules has provided an unprecedented opportunity for the identification of molecular markers of disease processes. However, it has simultaneously complicated the problem of extracting meaningful molecular signatures of biological processes from these complex datasets. The process of biomarker discovery and characterization provides opportunities for more sophisticated approaches to integrating purely statistical and expert knowledge-based approaches. Areas covered In this review we will present examples of current practices for biomarker discovery from complex omic datasets and the challenges that have been encountered in deriving valid and useful signatures of disease. We will then present a high-level review of data-driven (statistical) and knowledge-based methods applied to biomarker discovery, highlighting some current efforts to combine the two distinct approaches. Expert opinion Effective, reproducible and objective tools for combining data-driven and knowledge-based approaches to identify predictive signatures of disease are key to future success in the biomarker field. We will describe our recommendations for possible approaches to this problem including metrics for the evaluation of biomarkers. PMID:23335946

  4. Three-Dimensional in Vitro Cell Culture Models in Drug Discovery and Drug Repositioning

    PubMed Central

    Langhans, Sigrid A.

    2018-01-01

    Drug development is a lengthy and costly process that proceeds through several stages from target identification to lead discovery and optimization, preclinical validation and clinical trials culminating in approval for clinical use. An important step in this process is high-throughput screening (HTS) of small compound libraries for lead identification. Currently, the majority of cell-based HTS is being carried out on cultured cells propagated in two-dimensions (2D) on plastic surfaces optimized for tissue culture. At the same time, compelling evidence suggests that cells cultured in these non-physiological conditions are not representative of cells residing in the complex microenvironment of a tissue. This discrepancy is thought to be a significant contributor to the high failure rate in drug discovery, where only a low percentage of drugs investigated ever make it through the gamut of testing and approval to the market. Thus, three-dimensional (3D) cell culture technologies that more closely resemble in vivo cell environments are now being pursued with intensity as they are expected to accommodate better precision in drug discovery. Here we will review common approaches to 3D culture, discuss the significance of 3D cultures in drug resistance and drug repositioning and address some of the challenges of applying 3D cell cultures to high-throughput drug discovery. PMID:29410625

  5. 4-Hydroxyphenylpyruvate Dioxygenase Inhibitors: From Chemical Biology to Agrochemicals.

    PubMed

    Ndikuryayo, Ferdinand; Moosavi, Behrooz; Yang, Wen-Chao; Yang, Guang-Fu

    2017-10-04

    The development of new herbicides is receiving considerable attention to control weed biotypes resistant to current herbicides. Consequently, new enzymes are always desired as targets for herbicide discovery. 4-Hydroxyphenylpyruvate dioxygenase (HPPD, EC 1.13.11.27) is an enzyme engaged in photosynthetic activity that catalyzes the transformation of 4-hydroxyphenylpyruvic acid (HPPA) into homogentisic acid (HGA). HPPD inhibitors constitute a promising area of discovery and development of innovative herbicides, with advantages including excellent crop selectivity, low application rates, and broad-spectrum weed control. HPPD inhibitors have been investigated for agrochemical interest, and some of them have already been commercialized as herbicides. In this review, we mainly focus on the chemical biology of HPPD, the discovery of new potential inhibitors, and strategies for engineering transgenic crops resistant to current HPPD-inhibiting herbicides. The conclusion highlights relevant gaps as directions for future research.

  6. Bayesian Models Leveraging Bioactivity and Cytotoxicity Information for Drug Discovery

    PubMed Central

    Ekins, Sean; Reynolds, Robert C.; Kim, Hiyun; Koo, Mi-Sun; Ekonomidis, Marilyn; Talaue, Meliza; Paget, Steve D.; Woolhiser, Lisa K.; Lenaerts, Anne J.; Bunin, Barry A.; Connell, Nancy; Freundlich, Joel S.

    2013-01-01

    Identification of unique leads represents a significant challenge in drug discovery. This hurdle is magnified in neglected diseases such as tuberculosis. We have leveraged public high-throughput screening (HTS) data to experimentally validate a virtual screening approach employing Bayesian models built with bioactivity information (single-event models) as well as with bioactivity and cytotoxicity information (dual-event models). We virtually screened a commercial library and experimentally confirmed actives with hit rates exceeding typical HTS results by 1-2 orders of magnitude. The first dual-event Bayesian model identified compounds with antitubercular whole-cell activity and low mammalian cell cytotoxicity from a published set of antimalarials. The most potent hit exhibits the in vitro activity and in vitro/in vivo safety profile of a drug lead. These Bayesian models offer significant economies in time and cost to drug discovery. PMID:23521795
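
    The dual-event idea can be sketched with a generic naive Bayes stand-in. The random fingerprints and labels below are placeholders, and the paper's models were built on cheminformatics descriptors in dedicated software; the sketch only shows the ranking logic of combining an activity model with a cytotoxicity model.

```python
# Dual-event ranking sketch: one model scores bioactivity, a second scores
# cytotoxicity, and compounds are ranked by P(active) * (1 - P(cytotoxic)).
# All data here are random placeholders.
import numpy as np
from sklearn.naive_bayes import BernoulliNB

rng = np.random.default_rng(1)
X = rng.integers(0, 2, size=(500, 64))   # stand-in binary fingerprints
y_active = rng.integers(0, 2, 500)       # stand-in whole-cell HTS labels
y_toxic = rng.integers(0, 2, 500)        # stand-in cytotoxicity labels

act = BernoulliNB().fit(X, y_active)
tox = BernoulliNB().fit(X, y_toxic)

candidates = rng.integers(0, 2, size=(10, 64))
score = (act.predict_proba(candidates)[:, 1]
         * (1 - tox.predict_proba(candidates)[:, 1]))
print(np.argsort(score)[::-1])           # best-ranked candidates first
```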

  7. Fast radio burst event rate counts - I. Interpreting the observations

    NASA Astrophysics Data System (ADS)

    Macquart, J.-P.; Ekers, R. D.

    2018-02-01

    The fluence distribution of the fast radio burst (FRB) population (the 'source count' distribution, N(>F) ∝ F^α) is a crucial diagnostic of its distance distribution, and hence of the progenitor evolutionary history. We critically reanalyse current estimates of the FRB source count distribution. We demonstrate that the Lorimer burst (FRB 010724) is subject to discovery bias and should be excluded from all statistical studies of the population. We re-examine the evidence for flat (α > -1) source count estimates based on the ratio of single-beam to multiple-beam detections with the Parkes multibeam receiver, and show that current data imply only a very weak constraint of α ≲ -1.3. A maximum-likelihood analysis applied to the portion of the Parkes FRB population detected above the observational completeness fluence of 2 Jy ms yields α = -2.6 (+0.7, -1.3). Uncertainties in the location of each FRB within the Parkes beam render estimates of the Parkes event rate uncertain both in the normalizing survey area and in the estimated post-beam-corrected completeness fluence; this uncertainty needs to be accounted for when comparing the event rate against event rates measured at other telescopes.
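
    Above a completeness fluence, the maximum-likelihood slope estimate reduces to the standard Pareto form, sketched below with invented fluences; the paper's full likelihood treatment of beam corrections is more involved.

```python
# MLE of the cumulative source-count slope alpha in N(>F) ∝ F^alpha for a
# sample complete above F_min (standard Pareto estimator; generic sketch).
import numpy as np

def mle_alpha(fluences, f_min):
    f = np.asarray(fluences, dtype=float)
    f = f[f >= f_min]                    # keep only the complete sample
    return -f.size / np.log(f / f_min).sum()

fluences_jy_ms = [2.3, 2.8, 3.1, 4.5, 7.2, 12.0]   # invented values
print(mle_alpha(fluences_jy_ms, f_min=2.0))
```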

  8. Genome-Wide Methylation Analyses in Glioblastoma Multiforme

    PubMed Central

    Lai, Rose K.; Chen, Yanwen; Guan, Xiaowei; Nousome, Darryl; Sharma, Charu; Canoll, Peter; Bruce, Jeffrey; Sloan, Andrew E.; Cortes, Etty; Vonsattel, Jean-Paul; Su, Tao; Delgado-Cruzata, Lissette; Gurvich, Irina; Santella, Regina M.; Ostrom, Quinn; Lee, Annette; Gregersen, Peter; Barnholtz-Sloan, Jill

    2014-01-01

    Few studies have investigated genome-wide methylation in glioblastoma multiforme (GBM). Our goals were to study differential methylation across the genome in gene promoters using an array-based method, as well as in repetitive elements using surrogate global methylation markers. The discovery sample set for this study consisted of 54 GBM from Columbia University and Case Western Reserve University, and 24 brain controls from the New York Brain Bank. We assembled a validation dataset using methylation data of 162 TCGA GBM and 140 brain controls from dbGaP. HumanMethylation27 Analysis Bead-Chips (Illumina) were used to interrogate 26,486 informative CpG sites in both the discovery and validation datasets. Global methylation levels were assessed by analysis of the L1 retrotransposon (LINE1), 5-methyl-deoxycytidine (5m-dC) and 5-hydroxymethyl-deoxycytidine (5hm-dC) in the discovery dataset. We validated a total of 1548 CpG sites (1307 genes) that were differentially methylated in GBM compared to controls. There were more than twice as many hypomethylated genes as hypermethylated ones. Both the discovery and validation datasets yielded 5 tumor methylation classes. Pathway analyses showed that the top ten pathways in hypomethylated genes were all related to functions of innate and acquired immunity. Among hypermethylated pathways, the transcriptional regulatory network in embryonic stem cells was the most significant. In the study of global methylation markers, the 5m-dC level was the best discriminant among methylation classes, whereas in survival analyses, a high level of LINE1 methylation was an independent, favorable prognostic factor in the discovery dataset. Based on a pathway approach, hypermethylation in genes that control stem cell differentiation was a significant, poor prognostic factor for overall survival in both the discovery and validation datasets. Approaches that target these methylated genes may be a future therapeutic goal. PMID:24586730
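
    Differential-methylation calls like these hinge on multiple-testing control across tens of thousands of CpG sites. As a generic illustration (not the study's exact pipeline), the sketch below applies the classic Benjamini-Hochberg procedure to per-site p-values; the p-values and the number of truly shifted sites are synthetic.

    ```python
    import numpy as np

    def benjamini_hochberg(pvals, q=0.05):
        """Boolean mask of discoveries controlling the FDR at level q."""
        p = np.asarray(pvals)
        m = p.size
        order = np.argsort(p)
        # reject the k smallest p-values, where k is the largest i with p_(i) <= q*i/m
        below = p[order] <= q * np.arange(1, m + 1) / m
        k = below.nonzero()[0].max() + 1 if below.any() else 0
        mask = np.zeros(m, dtype=bool)
        mask[order[:k]] = True
        return mask

    rng = np.random.default_rng(2)
    pvals = rng.uniform(size=26486)              # one p-value per CpG site (null)
    pvals[:1500] = rng.uniform(0, 1e-5, 1500)    # synthetic truly shifted sites
    print(benjamini_hochberg(pvals, q=0.05).sum(), "sites called significant")
    ```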

  9. User needs analysis and usability assessment of DataMed - a biomedical data discovery index.

    PubMed

    Dixit, Ram; Rogith, Deevakar; Narayana, Vidya; Salimi, Mandana; Gururaj, Anupama; Ohno-Machado, Lucila; Xu, Hua; Johnson, Todd R

    2017-11-30

    To present user needs and usability evaluations of DataMed, a Data Discovery Index (DDI) that allows searching for biomedical data from multiple sources. We conducted 2 phases of user studies. Phase 1 was a user needs analysis conducted before the development of DataMed, consisting of interviews with researchers. Phase 2 involved iterative usability evaluations of DataMed prototypes. We analyzed data qualitatively to document researchers' information and user interface needs. Biomedical researchers' information needs in data discovery are complex, multidimensional, and shaped by their context, domain knowledge, and technical experience. User needs analyses validate the need for a DDI, while usability evaluations of DataMed show that even though aggregating metadata into a common search engine and applying traditional information retrieval tools are promising first steps, there remain challenges for DataMed due to incomplete metadata and the complexity of data discovery. Biomedical data poses distinct problems for search when compared to websites or publications. Making data available is not enough to facilitate biomedical data discovery: new retrieval techniques and user interfaces are necessary for dataset exploration. Consistent, complete, and high-quality metadata are vital to enable this process. While available data and researchers' information needs are complex and heterogeneous, a successful DDI must meet those needs and fit into the processes of biomedical researchers. Research directions include formalizing researchers' information needs, standardizing overviews of data to facilitate relevance judgments, implementing user interfaces for concept-based searching, and developing evaluation methods for open-ended discovery systems such as DDIs.

  10. An Analysis of Citizen Science Based Research: Usage and Publication Patterns

    PubMed Central

    Follett, Ria; Strezov, Vladimir

    2015-01-01

    The use of citizen science for scientific discovery relies on the acceptance of this method by the scientific community. Using the Web of Science and Scopus as the sources of peer-reviewed articles, an analysis of all published articles on “citizen science” confirmed its growth, and found that significant research on methodology and validation techniques preceded the rapid rise of publications on research outcomes based on citizen science methods. Of considerable interest is the growing number of studies relying on the re-use of datasets collected by past citizen science research projects, which used data from either individual or multiple citizen science projects for new discoveries, such as climate change research. The extent to which citizen science has been used in scientific discovery demonstrates its importance as a research approach. This broad analysis of peer-reviewed papers on citizen science, which included not only citizen science projects but also the theory and methods developed to underpin the research, highlights the breadth and depth of the citizen science approach and encourages cross-fertilization between the different disciplines. PMID:26600041

  11. Chromatogram-Bioactivity Correlation-Based Discovery and Identification of Three Bioactive Compounds Affecting Endothelial Function in Ginkgo Biloba Extract.

    PubMed

    Liu, Hong; Tan, Li-Ping; Huang, Xin; Liao, Yi-Qiu; Zhang, Wei-Jian; Li, Pei-Bo; Wang, Yong-Gang; Peng, Wei; Wu, Zhong; Su, Wei-Wei; Yao, Hong-Liang

    2018-05-03

    We report the discovery and identification of three bioactive compounds affecting endothelial function in Ginkgo biloba extract (GBE), based on chromatogram-bioactivity correlation analysis. Three portions were separated from GBE via D101 macroporous resin and then re-combined to prepare nine GBE samples. Twenty-one compounds in the GBE samples were identified through UFLC-DAD-Q-TOF-MS/MS. Correlation analysis between compound-content differences across the nine GBE samples and in vivo endothelin-1 (ET-1) levels was conducted. The results indicated that three bioactive compounds had close relevance to ET-1: kaempferol-3-O-α-L-glucoside, 3-O-{2-O-{6-O-[p-OH-trans-cinnamoyl]-β-D-glucosyl}-α-rhamnosyl} quercetin isomers, and 3-O-{2-O-{6-O-[p-OH-trans-cinnamoyl]-β-D-glucosyl}-α-rhamnosyl} kaempferide. The discovery of these bioactive compounds could provide references for quality control and the development of novel pharmaceuticals from GBE. The present work proposes a feasible chromatogram-bioactivity correlation-based approach to discover compounds and define their bioactivities in complex multi-component systems.

  12. An Analysis of Citizen Science Based Research: Usage and Publication Patterns.

    PubMed

    Follett, Ria; Strezov, Vladimir

    2015-01-01

    The use of citizen science for scientific discovery relies on the acceptance of this method by the scientific community. Using the Web of Science and Scopus as the sources of peer-reviewed articles, an analysis of all published articles on "citizen science" confirmed its growth, and found that significant research on methodology and validation techniques preceded the rapid rise of publications on research outcomes based on citizen science methods. Of considerable interest is the growing number of studies relying on the re-use of datasets collected by past citizen science research projects, which used data from either individual or multiple citizen science projects for new discoveries, such as climate change research. The extent to which citizen science has been used in scientific discovery demonstrates its importance as a research approach. This broad analysis of peer-reviewed papers on citizen science, which included not only citizen science projects but also the theory and methods developed to underpin the research, highlights the breadth and depth of the citizen science approach and encourages cross-fertilization between the different disciplines.

  13. Concept of operations for knowledge discovery from Big Data across enterprise data warehouses

    NASA Astrophysics Data System (ADS)

    Sukumar, Sreenivas R.; Olama, Mohammed M.; McNair, Allen W.; Nutaro, James J.

    2013-05-01

    The success of data-driven business in government, science, and private industry is driving the need for seamless integration of intra- and inter-enterprise data sources to extract knowledge nuggets in the form of correlations, trends, patterns and behaviors previously not discovered due to the physical and logical separation of datasets. Today, as the volume, velocity, variety and complexity of enterprise data keep increasing, next-generation analysts are facing several challenges in the knowledge extraction process. To address these challenges, data-driven organizations that rely on the success of their analysts have to make investment decisions for sustainable data/information systems and knowledge discovery. Options that organizations are considering include newer storage/analysis architectures, better analysis machines, redesigned analysis algorithms, collaborative knowledge management tools, and query builders, among many others. In this paper, we present a concept of operations for enabling knowledge discovery that data-driven organizations can leverage towards making their investment decisions. We base our recommendations on the experience gained from integrating multi-agency enterprise data warehouses at the Oak Ridge National Laboratory to design the foundation of future knowledge-nurturing data-system architectures.

  14. Culture-independent discovery of natural products from soil metagenomes.

    PubMed

    Katz, Micah; Hover, Bradley M; Brady, Sean F

    2016-03-01

    Bacterial natural products have proven to be invaluable starting points in the development of many currently used therapeutic agents. Unfortunately, traditional culture-based methods for natural product discovery have been deemphasized by pharmaceutical companies due in large part to high rediscovery rates. Culture-independent, or "metagenomic," methods, which rely on the heterologous expression of DNA extracted directly from environmental samples (eDNA), have the potential to provide access to metabolites encoded by a large fraction of the earth's microbial biosynthetic diversity. As soil is both ubiquitous and rich in bacterial diversity, it is an appealing starting point for culture-independent natural product discovery efforts. This review provides an overview of the history of soil metagenome-driven natural product discovery studies and elaborates on the recent development of new tools for sequence-based, high-throughput profiling of environmental samples used in discovering novel natural product biosynthetic gene clusters. We conclude with several examples of these new tools being employed to facilitate the recovery of novel secondary metabolite encoding gene clusters from soil metagenomes and the subsequent heterologous expression of these clusters to produce bioactive small molecules.

  15. The early bird gets the worm: foraging strategies of wild songbirds lead to the early discovery of food sources

    PubMed Central

    Farine, Damien R.; Lang, Stephen D. J.

    2013-01-01

    Animals need to manage the combined risks of predation and starvation in order to survive. Theoretical and empirical studies have shown that individuals can reduce predation risk by delaying feeding (and hence fat storage) until late afternoon. However, little is known about how individuals manage the opposing pressures of resource uncertainty and predation risks. We suggest that individuals should follow a two-part strategy: prioritizing the discovery of food early in the day and exploiting the best patch late in the day. Using automated data loggers, we tested whether a temporal component exists in the discovery of novel foraging locations by individuals in a mixed-species foraging guild. We found that food deployed in the morning was discovered significantly more often than food deployed in the afternoon. Based on the diurnal activity patterns in this population, overall rates of new arrivals were also significantly higher than expected in the morning and significantly lower than expected in the afternoon. These results align with our predictions of a shift from patch discovery to exploitation over the course of the day. PMID:24108676

  16. A renaissance of neural networks in drug discovery.

    PubMed

    Baskin, Igor I; Winkler, David; Tetko, Igor V

    2016-08-01

    Neural networks are becoming a very popular method for solving machine learning and artificial intelligence problems. The variety of neural network types and their application to drug discovery requires expert knowledge to choose the most appropriate approach. In this review, the authors discuss traditional and newly emerging neural network approaches to drug discovery. Their focus is on backpropagation neural networks and their variants, self-organizing maps and associated methods, and a relatively new technique, deep learning. The most important technical issues are discussed, including overfitting and its prevention through regularization, ensemble and multitask modeling, model interpretation, and estimation of the applicability domain. Different aspects of using neural networks in drug discovery are considered: building structure-activity models with respect to various targets; predicting drug selectivity, toxicity profiles, ADMET and physicochemical properties; modeling the characteristics of drug-delivery systems; and virtual screening. Neural networks continue to grow in importance for drug discovery. Recent developments in deep learning suggest further improvements may be gained in the analysis of large chemical data sets. It is anticipated that neural networks will be more widely used in drug discovery in the future, and applied in non-traditional areas such as drug-delivery systems, biologically compatible materials, and regenerative medicine.

  17. Potential biological targets for bioassay development in drug discovery of Sturge-Weber syndrome.

    PubMed

    Mohammadipanah, Fatemeh; Salimi, Fatemeh

    2018-02-01

    Sturge-Weber Syndrome (SWS) is a neurocutaneous disease with clinical manifestations including ocular (glaucoma), cutaneous (port-wine birthmark), neurologic (seizures), and vascular problems. The molecular mechanisms of SWS pathogenesis are initiated by a somatic mutation in GNAQ. To date, no definitive treatments exist for SWS, and treatment options only mitigate the intensity of its clinical manifestations. Biological assay design for drug discovery against this syndrome demands comprehensive knowledge of the mechanisms involved in its pathogenesis. By analysis of the interrelated molecular targets of SWS, some in vitro bioassay systems can be allotted for drug screening against its progression. The development of such bioassay platforms can enable the implementation of high-throughput screening of natural or synthetic compounds in drug discovery programs. Given that the study of molecular targets and their integration into biological assay design can facilitate effective drug discovery, some potential biological targets and their respective biological assays for SWS drug discovery are propounded in this review. For this purpose, biological targets for SWS drug discovery such as acetylcholinesterase, alkaline phosphatase, GABAergic receptors, and Hypoxia-Inducible Factors (HIF)-1α and -2α are suggested.

  18. Discovery informatics in biological and biomedical sciences: research challenges and opportunities.

    PubMed

    Honavar, Vasant

    2015-01-01

    New discoveries in biological, biomedical and health sciences are increasingly being driven by our ability to acquire, share, integrate and analyze data, and to construct and simulate predictive models of biological systems. While much attention has focused on automating routine aspects of the management and analysis of "big data", realizing the full potential of "big data" to accelerate discovery calls for automating many other aspects of the scientific process that have so far largely resisted automation: identifying gaps in the current state of knowledge; generating and prioritizing questions; designing studies; designing, prioritizing, planning, and executing experiments; interpreting results; forming hypotheses; drawing conclusions; replicating studies; validating claims; documenting studies; communicating results; reviewing results; and integrating results into the larger body of knowledge in a discipline. Against this background, the PSB workshop on Discovery Informatics in Biological and Biomedical Sciences explores the opportunities and challenges of automating discovery, or assisting humans in discovery, through advances in (i) understanding, formalizing, and building information-processing accounts of the entire scientific process; (ii) the design, development, and evaluation of computational artifacts (representations, processes) that embody such understanding; and (iii) the application of the resulting artifacts and systems to advance science (by augmenting individual or collective human efforts, or by fully automating science).

  19. Simulated JWST/NIRISS Transit Spectroscopy of Anticipated TESS Planets Compared to Select Discoveries from Space-Based and Ground-Based Surveys

    NASA Astrophysics Data System (ADS)

    Louie, Dana; Deming, Drake; Albert, Loic; Bouma, Luke; Bean, Jacob; Lopez-Morales, Mercedes

    2018-01-01

    The Transiting Exoplanet Survey Satellite (TESS) will embark in 2018 on a 2-year wide-field survey mission of most of the celestial sky, discovering over a thousand super-Earth and sub-Neptune-sized exoplanets potentially suitable for follow-up observations using the James Webb Space Telescope (JWST). Bouma et al. (2017) and Sullivan et al. (2015) used Monte Carlo simulations to predict the properties of the planetary systems that TESS is likely to detect, basing their simulations upon Kepler-derived planet occurrence rates and photometric performance models for the TESS cameras. We employed a JWST Near InfraRed Imager and Slitless Spectrograph (NIRISS) simulation tool to estimate the signal-to-noise (S/N) that JWST/NIRISS will attain in transmission spectroscopy of these anticipated TESS discoveries, and we then compared the S/N for anticipated TESS discoveries to our estimates of S/N for 18 known exoplanets. We analyzed the sensitivity of our results to planetary composition, cloud cover, and presence of an observational noise floor. We find that only a few anticipated TESS discoveries in the terrestrial planet regime will result in better JWST/NIRISS S/N than currently known exoplanets, such as the TRAPPIST-1 planets, GJ1132b, or LHS1140b. However, we emphasize that this outcome is based upon Kepler-derived occurrence rates, and that co-planar compact systems (e.g. TRAPPIST-1) were not included in predicting the anticipated TESS planet yield. Furthermore, our results show that several hundred anticipated TESS discoveries in the super-Earth and sub-Neptune regime will produce S/N higher than currently known exoplanets such as K2-3b or K2-3c. We apply our results to estimate the scope of a JWST follow-up observation program devoted to mapping the transition region between high molecular weight and primordial planetary atmospheres.

  20. Data mining for better material synthesis: The case of pulsed laser deposition of complex oxides

    NASA Astrophysics Data System (ADS)

    Young, Steven R.; Maksov, Artem; Ziatdinov, Maxim; Cao, Ye; Burch, Matthew; Balachandran, Janakiraman; Li, Linglong; Somnath, Suhas; Patton, Robert M.; Kalinin, Sergei V.; Vasudevan, Rama K.

    2018-03-01

    The pursuit of more advanced electronics, and finding solutions to energy needs often hinges upon the discovery and optimization of new functional materials. However, the discovery rate of these materials is alarmingly low. Much of the information that could drive this rate higher is scattered across tens of thousands of papers in the extant literature published over several decades but is not in an indexed form, and cannot be used in entirety without substantial effort. Many of these limitations can be circumvented if the experimentalist has access to systematized collections of prior experimental procedures and results. Here, we investigate the property-processing relationship during growth of oxide films by pulsed laser deposition. To do so, we develop an enabling software tool to (1) mine the literature of relevant papers for synthesis parameters and functional properties of previously studied materials, (2) enhance the accuracy of this mining through crowd sourcing approaches, (3) create a searchable repository that will be a community-wide resource enabling material scientists to leverage this information, and (4) provide through the Jupyter notebook platform, simple machine-learning-based analysis to learn the complex interactions between growth parameters and functional properties (all data/codes available on https://github.com/ORNL-DataMatls). The results allow visualization of growth windows, trends and outliers, which can serve as a template for analyzing the distribution of growth conditions, provide starting points for related compounds and act as a feedback for first-principles calculations. Such tools will comprise an integral part of the materials design schema in the coming decade.
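
    In the spirit of the paper's literature-mining tool, the toy sketch below pulls growth parameters out of free text with regular expressions. The example sentence and the patterns are illustrative assumptions, not the authors' actual extraction rules or data.

    ```python
    import re

    # Hypothetical sentence of the kind found in a PLD growth-methods section
    sentence = ("Films were grown by PLD at a substrate temperature of 750 C "
                "under an oxygen pressure of 100 mTorr with a fluence of 1.5 J/cm2.")

    # Illustrative patterns for three common synthesis parameters
    patterns = {
        "temperature_C": r"(\d+(?:\.\d+)?)\s*C\b",
        "pressure_mTorr": r"(\d+(?:\.\d+)?)\s*mTorr",
        "fluence_J_cm2": r"(\d+(?:\.\d+)?)\s*J/cm2",
    }

    params = {name: (m.group(1) if (m := re.search(rx, sentence)) else None)
              for name, rx in patterns.items()}
    print(params)  # {'temperature_C': '750', 'pressure_mTorr': '100', 'fluence_J_cm2': '1.5'}
    ```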

  1. Automated discovery systems and the inductivist controversy

    NASA Astrophysics Data System (ADS)

    Giza, Piotr

    2017-09-01

    The paper explores possible influences that developments in the branches of AI called automated discovery and machine learning systems might have upon some aspects of the old debate between Francis Bacon's inductivism and Karl Popper's falsificationism. Donald Gillies facetiously calls this controversy 'the duel of two English knights' and claims, after some analysis of historical cases of discovery, that Baconian induction had been used in science very rarely, or not at all, although he argues that the situation has changed with the advent of machine learning systems. (Some clarification of the terms 'machine learning' and 'automated discovery' is required here. The key idea of machine learning is that, given data with associated outcomes, software can be trained to make those associations in future cases, which typically amounts to inducing rules from individual cases classified by experts. Automated discovery (also called machine discovery) deals with uncovering new knowledge that is valuable for human beings; its key idea is that discovery is like other intellectual tasks, and that the general idea of heuristic search in problem spaces applies to discovery tasks as well. However, since machine learning systems discover (very low-level) regularities in data, throughout this paper I use the generic term automated discovery for both kinds of systems. I will elaborate on this later on.) Gillies's line of argument can be generalised: thanks to automated discovery systems, philosophers of science have at their disposal a new tool for empirically testing their philosophical hypotheses. Accordingly, in the paper I address the question of which of the two philosophical conceptions of scientific method is better vindicated in view of the successes and failures of systems developed within three major research programmes in the field: machine learning systems in the Turing tradition, the normative theory of scientific discovery formulated by Herbert Simon's group, and the programme called HHNT, proposed by J. Holland, K. Holyoak, R. Nisbett and P. Thagard.

  2. Recent trends in spin-resolved photoelectron spectroscopy

    NASA Astrophysics Data System (ADS)

    Okuda, Taichi

    2017-12-01

    Since the discovery of the Rashba effect on crystal surfaces and the discovery of topological insulators, spin- and angle-resolved photoelectron spectroscopy (SARPES) has become more and more important, as the technique can directly measure the electronic band structure of materials with spin resolution. In the same way that the discovery of high-Tc superconductors promoted the development of high-resolution angle-resolved photoelectron spectroscopy, the discovery of this new class of materials has stimulated the development of new SARPES apparatus with new functions and higher resolution, such as spin vector analysis, energy and angular resolution ten times higher than conventional SARPES, multichannel spin detection, and so on. In addition, the utilization of vacuum ultraviolet lasers opens a pathway to the realization of novel SARPES measurements. In this review, such recent trends in SARPES techniques and measurements are overviewed.

  3. Variability in nest survival rates and implications to nesting studies

    USGS Publications Warehouse

    Klett, A.T.; Johnson, D.H.

    1982-01-01

    We used four reasonably large samples (83-213) of Mallard (Anas platyrhynchos) and Blue-winged Teal (A. discors) nests on an interstate highway right-of-way in southcentral North Dakota to evaluate potential biases in hatch-rate estimates. Twelve consecutive, weekly searches for nests were conducted with a cable-chain drag in 1976 and 1977. Nests were revisited at weekly intervals. Four methods were used to estimate hatch rates for the four data sets: the Traditional Method, the Mayfield Method, and two modifications of the Mayfield Method that are sometimes appropriate when daily mortality rates of nests are not constant. Hatch rates and the average age of nests at discovery declined as the interval between searches decreased, suggesting that mortality rates were not constant in our samples. An analysis of variance indicated that daily mortality rates varied with the age of nests in all four samples. Mortality was generally highest during the early laying period, moderately high during the late laying period, and lowest during incubation. We speculate that this relationship of mortality to nest age might be due to the presence of hens at nests or to differences in the vulnerability of nest sites to predation. A modification of the Mayfield Method that accounts for age-related variation in nest mortality was most appropriate for our samples. We suggest methods for conducting nesting studies and estimating nest success for species possessing similar nesting habits.
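
    The core of the Mayfield logic is simple: daily mortality is estimated as losses per nest-exposure-day, and nest success is the daily survival rate compounded over the nesting period. The sketch below shows the constant-survival version with illustrative numbers, not the study's data; the modification the authors favor additionally fits separate daily rates per nest-age class.

    ```python
    def mayfield_nest_success(losses, exposure_days, nest_period_days):
        """Daily survival rate (DSR) and nest success over the full nesting period."""
        dsr = 1.0 - losses / exposure_days   # daily mortality = losses per exposure-day
        return dsr, dsr ** nest_period_days  # success = DSR compounded over the period

    # e.g. 40 failed nests over 1000 nest-exposure-days, 35-day laying+incubation period
    dsr, success = mayfield_nest_success(losses=40, exposure_days=1000,
                                         nest_period_days=35)
    print(f"DSR = {dsr:.3f}, estimated hatch rate = {success:.2f}")  # 0.960, ~0.24
    ```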

  4. Regulation of gene expression in the mammalian eye and its relevance to eye disease

    PubMed Central

    Scheetz, Todd E.; Kim, Kwang-Youn A.; Swiderski, Ruth E.; Philp, Alisdair R.; Braun, Terry A.; Knudtson, Kevin L.; Dorrance, Anne M.; DiBona, Gerald F.; Huang, Jian; Casavant, Thomas L.; Sheffield, Val C.; Stone, Edwin M.

    2006-01-01

    We used expression quantitative trait locus mapping in the laboratory rat (Rattus norvegicus) to gain a broad perspective of gene regulation in the mammalian eye and to identify genetic variation relevant to human eye disease. Of >31,000 gene probes represented on an Affymetrix expression microarray, 18,976 exhibited sufficient signal for reliable analysis and at least 2-fold variation in expression among 120 F2 rats generated from an SR/JrHsd × SHRSP intercross. Genome-wide linkage analysis with 399 genetic markers revealed significant linkage with at least one marker for 1,300 probes (α = 0.001; estimated empirical false discovery rate = 2%). Both contiguous and noncontiguous loci were found to be important in regulating mammalian eye gene expression. We investigated one locus of each type in greater detail and identified putative transcription-altering variations in both cases. We found an inserted cREL binding sequence in the 5′ flanking sequence of the Abca4 gene associated with an increased expression level of that gene, and we found a mutation of the gene encoding thyroid hormone receptor β2 associated with a decreased expression level of the gene encoding short-wavelength sensitive opsin (Opn1sw). In addition to these positional studies, we performed a pairwise analysis of gene expression to identify genes that are regulated in a coordinated manner and used this approach to validate two previously undescribed genes involved in the human disease Bardet–Biedl syndrome. These data and analytical approaches can be used to facilitate the discovery of additional genes and regulatory elements involved in human eye disease. PMID:16983098
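
    An empirical false discovery rate of the kind quoted above can be estimated by permutation: shuffle animals to break genotype-expression links and compare the count of "significant" probes in permuted data to the observed count. The sketch below is a generic illustration on synthetic data; the correlation statistic, threshold, and planted signal are assumptions, not the study's linkage analysis.

    ```python
    import numpy as np

    rng = np.random.default_rng(3)
    n_probes, n_rats = 500, 120
    expr = rng.normal(size=(n_probes, n_rats))                # probe x animal matrix
    genotype = rng.integers(0, 3, size=n_rats).astype(float)  # marker coded 0/1/2
    expr[:50] += 0.5 * (genotype - genotype.mean())           # plant 50 true eQTL probes

    def abs_corr(expr, g):
        """Per-probe |Pearson correlation| between expression and genotype."""
        gc = (g - g.mean()) / g.std()
        ec = (expr - expr.mean(axis=1, keepdims=True)) / expr.std(axis=1, keepdims=True)
        return np.abs(ec @ gc) / len(g)

    threshold = 0.25
    n_sig = (abs_corr(expr, genotype) > threshold).sum()

    # Permute animals to break genotype-expression links, re-count significant probes
    null_counts = [(abs_corr(expr, rng.permutation(genotype)) > threshold).sum()
                   for _ in range(100)]
    fdr = np.mean(null_counts) / max(n_sig, 1)
    print(f"{n_sig} significant probes, empirical FDR ≈ {fdr:.1%}")
    ```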

  5. Gene signatures of postoperative atrial fibrillation in atrial tissue after coronary artery bypass grafting surgery in patients receiving β-blockers.

    PubMed

    Kertai, Miklos D; Qi, Wenjing; Li, Yi-Ju; Lombard, Frederick W; Liu, Yutao; Smith, Michael P; Stafford-Smith, Mark; Newman, Mark F; Milano, Carmelo A; Mathew, Joseph P; Podgoreanu, Mihai V

    2016-03-01

    Atrial tissue gene expression profiling may help to determine how differentially expressed genes in the human atrium before cardiopulmonary bypass (CPB) are related to subsequent biologic pathway activation patterns, and whether specific expression profiles are associated with an increased risk for postoperative atrial fibrillation (AF) or an altered response to β-blocker (BB) therapy after coronary artery bypass grafting (CABG) surgery. Right atrial appendage (RAA) samples were collected from 45 patients who were receiving perioperative BB treatment and underwent CABG surgery. The isolated RNA samples were used for microarray gene expression analysis, to identify probes that were expressed differently in patients with and without postoperative AF. Gene set enrichment analysis (GSEA) was performed to determine how sets of genes might be systematically altered in patients with postoperative AF. Of the 45 patients studied, genomic DNA from 42 patients was used for target sequencing of 66 candidate genes potentially associated with AF, and 2,144 single-nucleotide polymorphisms (SNPs) were identified. We then performed expression quantitative trait loci (eQTL) analysis to determine the correlation between SNPs identified in the genotyped patients and RAA expression. Probes that met a false discovery rate < 0.25 were selected for eQTL analysis. Of the 17,678 gene expression probes analyzed, 2 probes met our prespecified significance threshold of false discovery rate < 0.25. The most significant probe corresponded to the vesicular overexpressed in cancer - prosurvival protein 1 gene (VOPP1; 1.83-fold change; P=3.47×10^-7), and was up-regulated in patients with postoperative AF, whereas the second most significant probe, which corresponded to the LOC389286 gene (0.49-fold change; P=1.54×10^-5), was down-regulated in patients with postoperative AF. GSEA highlighted the role of VOPP1 in pathways with biologic relevance to myocardial homeostasis, and to oxidative stress and redox modulation. Candidate gene eQTL analysis showed a trans-acting association between variants of the G protein-coupled receptor kinase 5 gene, previously linked to altered BB response, and high expression of VOPP1. In patients undergoing CABG surgery, RAA gene expression profiling, together with pathway and eQTL analysis, suggested that VOPP1 plays a novel etiological role in postoperative AF despite perioperative BB therapy.

  6. S&TR Preview: Groundbreaking Laser Set to Energize Science

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Haefner, Constantin

    The High-Repetition-Rate Advanced Petawatt Laser System (HAPLS) is designed to fire 10 times per second, which represents a major advancement over existing petawatt lasers and opens the door to new scientific discoveries.

  7. Migrating and Static Sand Ripples on Mars

    NASA Image and Video Library

    2013-08-28

    This observation from NASA's Mars Reconnaissance Orbiter is one of many that highlight new discoveries; one such discovery is that many sand dunes and ripples are moving, some at rates of several meters per year.

  8. Template occluded SBA-15: An effective dissolution enhancer for poorly water-soluble drug

    NASA Astrophysics Data System (ADS)

    Tingming, Fu; Liwei, Guo; Kang, Le; Tianyao, Wang; Jin, Lu

    2010-09-01

    The aim of the present work was to improve the dissolution rate of piroxicam by inclusion into template-occluded SBA-15. Our strategy involves directly introducing piroxicam into as-prepared SBA-15 occluded with P123 (EO20PO70EO20) by a self-assembly method in an acetonitrile/methylene chloride mixture solution. Ultraviolet spectrometry experiments and thermogravimetric analysis-differential scanning calorimetry (TG-DSC) profiles show that the piroxicam and P123 contents in the inclusion compound are 12 wt% and 28 wt%, respectively. X-ray powder diffraction and DSC analysis reveal that the included piroxicam is arranged in amorphous form. An N2 adsorption-desorption experiment indicates that the piroxicam has been introduced into the mesopores instead of precipitating on the outside of the silica material. The inclusion compound was submitted to in vitro dissolution tests; the results show that piroxicam dissolves from the template-occluded inclusion compound more rapidly than from the crystalline piroxicam and template-removed samples under all tested conditions. Thus a facile method to improve the dissolution rate of a poorly water-soluble drug was established, and this discovery opens a new avenue for the utilization of the templates used in the synthesis of mesoporous materials.

  9. Discovery of Information Diffusion Process in Social Networks

    NASA Astrophysics Data System (ADS)

    Kim, Kwanho; Jung, Jae-Yoon; Park, Jonghun

    Information diffusion analysis in social networks is significant because it enables us to deeply understand dynamic social interactions among users. In this paper, we introduce approaches to discovering the information diffusion process in social networks based on process mining. Process mining techniques are applied from three perspectives: social network analysis, process discovery, and community recognition. We then present experimental results using real-life social network data. The proposed techniques are expected to serve as new analytical tools in online social networks such as blogs and wikis for company marketers, politicians, news reporters, and online writers.

  10. Julian Davies and the discovery of kanamycin resistance transposon Tn5.

    PubMed

    Berg, Douglas E

    2017-04-01

    This paper recounts some of my fond memories of a collaboration between Julian Davies and myself that started in 1974 in Geneva and that led to our serendipitous discovery of the bacterial kanamycin resistance transposon Tn5, and aspects of the lasting positive impact of our interaction and discovery on me and the community. Tn5 was one of the first antibiotic resistance transposons to be found. Its analysis over the ensuing decades provided valuable insights into mechanisms and control of transposition, and led to its use as a much-valued tool in diverse areas of molecular genetics, as also will be discussed here.

  11. WebArray: an online platform for microarray data analysis

    PubMed Central

    Xia, Xiaoqin; McClelland, Michael; Wang, Yipeng

    2005-01-01

    Background Many cutting-edge microarray analysis tools and algorithms, including the commonly used limma and affy packages in Bioconductor, need sophisticated knowledge of mathematics, statistics and computer skills for implementation. Commercially available software can provide a user-friendly interface at considerable cost. To facilitate the use of these tools for microarray data analysis on an open platform, we developed an online microarray data analysis platform, WebArray, for bench biologists to use to explore data from single/dual color microarray experiments. Results The currently implemented functions are based on the limma and affy packages from Bioconductor, the spacings LOESS histogram (SPLOSH) method, a PCA-assisted normalization method and a genome mapping method. WebArray incorporates these packages and provides a user-friendly interface for accessing a wide range of key functions of limma and others, such as spot quality weighting, background correction, graphical plotting, normalization, linear modeling, empirical Bayes statistical analysis, false discovery rate (FDR) estimation, and chromosomal mapping for genome comparison. Conclusion WebArray offers a convenient platform for bench biologists to access several cutting-edge microarray data analysis tools. The website is freely available at . It runs on a Linux server with Apache and MySQL. PMID:16371165

  12. Getting There "Cuando No Hay Camino" (When There Is No Path): Paths to Discovery "Testimonios" by Chicanas in STEM

    ERIC Educational Resources Information Center

    Cantu, Norma

    2012-01-01

    This essay outlines how the book, "Paths to Discovery: Autobiographies from Chicanas with Careers in Science, Mathematics, and Engineering" (Cantu, 2008) came about. I then use "testimonio" theory to analyze the narratives in this book as the data of a qualitative study, and I describe the general themes that the analysis highlights. I scrutinize…

  13. Utilizing Social Bookmarking Tag Space for Web Content Discovery: A Social Network Analysis Approach

    ERIC Educational Resources Information Center

    Wei, Wei

    2010-01-01

    Social bookmarking has gained popularity since the advent of Web 2.0. Keywords known as tags are created to annotate web content, and the resulting tag space composed of the tags, the resources, and the users arises as a new platform for web content discovery. Useful and interesting web resources can be located through searching and browsing based…

  14. Simultaneous Proteomic Discovery and Targeted Monitoring using Liquid Chromatography, Ion Mobility Spectrometry, and Mass Spectrometry

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Burnum-Johnson, Kristin E.; Nie, Song; Casey, Cameron P.

    Current proteomics approaches are comprised of both broad discovery measurements as well as more quantitative targeted measurements. These two different measurement types are used to initially identify potentially important proteins (e.g., candidate biomarkers) and then enable improved quantification for a limited number of selected proteins. However, both approaches suffer from limitations, particularly the lower sensitivity, accuracy, and quantitation precision of discovery approaches compared to targeted approaches, and the limited proteome coverage provided by targeted approaches. Herein, we describe a new proteomics approach that allows both discovery and targeted monitoring (DTM) in a single analysis using liquid chromatography, ion mobility spectrometry and mass spectrometry (LC-IMS-MS). In DTM, heavy labeled peptides for target ions are spiked into tryptic digests and both the labeled and unlabeled peptides are broadly detected using LC-IMS-MS instrumentation, allowing the benefits of discovery and targeted approaches. To understand the possible improvement of the DTM approach, it was compared to LC-MS broad measurements using an accurate mass and time tag database and selected reaction monitoring (SRM) targeted measurements. The DTM results yielded greater peptide/protein coverage and a significant improvement in the detection of lower abundance species compared to LC-MS discovery measurements. DTM was also observed to have similar detection limits as SRM for the targeted measurements, indicating its potential for combining the discovery and targeted approaches.

  15. Citizen Science Initiatives: Engaging the Public and Demystifying Science

    PubMed Central

    Van Vliet, Kim; Moore, Claybourne

    2016-01-01

    The Internet and smart phone technologies have opened up new avenues for collaboration among scientists around the world. These technologies have also expanded citizen science opportunities and public participation in scientific research (PPSR). Here we discuss citizen science, what it is, who does it, and the variety of projects and methods used to increase scientific knowledge and scientific literacy. We describe a number of different types of citizen-science projects. These greatly increase the number of people involved, helping to speed the pace of data analysis and allowing science to advance more rapidly. As a result of the numerous advantages of citizen-science projects, these opportunities are likely to expand in the future and increase the rate of novel discoveries. PMID:27047582

  16. MICROSCOPE Mission: First Results of a Space Test of the Equivalence Principle.

    PubMed

    Touboul, Pierre; Métris, Gilles; Rodrigues, Manuel; André, Yves; Baghi, Quentin; Bergé, Joël; Boulanger, Damien; Bremer, Stefanie; Carle, Patrice; Chhun, Ratana; Christophe, Bruno; Cipolla, Valerio; Damour, Thibault; Danto, Pascale; Dittus, Hansjoerg; Fayet, Pierre; Foulon, Bernard; Gageant, Claude; Guidotti, Pierre-Yves; Hagedorn, Daniel; Hardy, Emilie; Huynh, Phuong-Anh; Inchauspe, Henri; Kayser, Patrick; Lala, Stéphanie; Lämmerzahl, Claus; Lebat, Vincent; Leseur, Pierre; Liorzou, Françoise; List, Meike; Löffler, Frank; Panet, Isabelle; Pouilloux, Benjamin; Prieur, Pascal; Rebray, Alexandre; Reynaud, Serge; Rievers, Benny; Robert, Alain; Selig, Hanns; Serron, Laura; Sumner, Timothy; Tanguy, Nicolas; Visser, Pieter

    2017-12-08

    According to the weak equivalence principle, all bodies should fall at the same rate in a gravitational field. The MICROSCOPE satellite, launched in April 2016, aims to test its validity at the 10^{-15} precision level, by measuring the force required to maintain two test masses (of titanium and platinum alloys) exactly in the same orbit. A nonvanishing result would correspond to a violation of the equivalence principle, or to the discovery of a new long-range force. Analysis of the first data gives δ(Ti,Pt)=[-1±9(stat)±9(syst)]×10^{-15} (1σ statistical uncertainty) for the titanium-platinum Eötvös parameter characterizing the relative difference in their free-fall accelerations.

  17. Citizen Science Initiatives: Engaging the Public and Demystifying Science.

    PubMed

    Van Vliet, Kim; Moore, Claybourne

    2016-03-01

    The Internet and smart phone technologies have opened up new avenues for collaboration among scientists around the world. These technologies have also expanded citizen science opportunities and public participation in scientific research (PPSR). Here we discuss citizen science, what it is, who does it, and the variety of projects and methods used to increase scientific knowledge and scientific literacy. We describe a number of different types of citizen-science projects. These greatly increase the number of people involved, helping to speed the pace of data analysis and allowing science to advance more rapidly. As a result of the numerous advantages of citizen-science projects, these opportunities are likely to expand in the future and increase the rate of novel discoveries.

  18. MICROSCOPE Mission: First Results of a Space Test of the Equivalence Principle

    NASA Astrophysics Data System (ADS)

    Touboul, Pierre; Métris, Gilles; Rodrigues, Manuel; André, Yves; Baghi, Quentin; Bergé, Joël; Boulanger, Damien; Bremer, Stefanie; Carle, Patrice; Chhun, Ratana; Christophe, Bruno; Cipolla, Valerio; Damour, Thibault; Danto, Pascale; Dittus, Hansjoerg; Fayet, Pierre; Foulon, Bernard; Gageant, Claude; Guidotti, Pierre-Yves; Hagedorn, Daniel; Hardy, Emilie; Huynh, Phuong-Anh; Inchauspe, Henri; Kayser, Patrick; Lala, Stéphanie; Lämmerzahl, Claus; Lebat, Vincent; Leseur, Pierre; Liorzou, Françoise; List, Meike; Löffler, Frank; Panet, Isabelle; Pouilloux, Benjamin; Prieur, Pascal; Rebray, Alexandre; Reynaud, Serge; Rievers, Benny; Robert, Alain; Selig, Hanns; Serron, Laura; Sumner, Timothy; Tanguy, Nicolas; Visser, Pieter

    2017-12-01

    According to the weak equivalence principle, all bodies should fall at the same rate in a gravitational field. The MICROSCOPE satellite, launched in April 2016, aims to test its validity at the 10^{-15} precision level, by measuring the force required to maintain two test masses (of titanium and platinum alloys) exactly in the same orbit. A nonvanishing result would correspond to a violation of the equivalence principle, or to the discovery of a new long-range force. Analysis of the first data gives δ(Ti,Pt) = [-1 ± 9(stat) ± 9(syst)] × 10^{-15} (1σ statistical uncertainty) for the titanium-platinum Eötvös parameter characterizing the relative difference in their free-fall accelerations.

  19. Integrated Computational Analysis of Genes Associated with Human Hereditary Insensitivity to Pain. A Drug Repurposing Perspective

    PubMed Central

    Lötsch, Jörn; Lippmann, Catharina; Kringel, Dario; Ultsch, Alfred

    2017-01-01

    Genes causally involved in human insensitivity to pain provide a unique molecular source for studying the pathophysiology of pain and the development of novel analgesic drugs. The increasing availability of “big data” enables novel research approaches to chronic pain while also requiring novel techniques for data mining and knowledge discovery. We used machine learning to combine the knowledge about n = 20 genes causally involved in human hereditary insensitivity to pain with the knowledge about the functions of thousands of genes. An integrated computational analysis proposed that, among the functions of this set of genes, processes related to nervous system development and to ceramide and sphingosine signaling pathways are particularly important. This is in line with earlier suggestions to use these pathways as therapeutic targets in pain. Following identification of the biological processes characterizing hereditary insensitivity to pain, these processes were used for a similarity analysis with the functions of n = 4,834 database-queried drugs. Using emergent self-organizing maps, a cluster of n = 22 drugs was identified that shares important functional features with hereditary insensitivity to pain. Several members of this cluster had been implicated in pain in preclinical experiments. Thus, the present concept of machine-learned knowledge discovery for pain research provides biologically plausible results and seems to be suitable for drug discovery by identifying a narrow choice of repurposing candidates, demonstrating that contemporary machine-learned methods offer innovative approaches to knowledge discovery from available evidence. PMID:28848388

  20. Statistical correction of the Winner’s Curse explains replication variability in quantitative trait genome-wide association studies

    PubMed Central

    Pe’er, Itsik

    2017-01-01

    Genome-wide association studies (GWAS) have identified hundreds of SNPs responsible for variation in human quantitative traits. However, genome-wide-significant associations often fail to replicate across independent cohorts, in apparent inconsistency with their strong effects in discovery cohorts. This limited success of replication raises pervasive questions about the utility of the GWAS field. We identify all 332 studies of quantitative traits from the NHGRI-EBI GWAS Database with attempted replication. We find that the majority of studies provide insufficient data to evaluate replication rates. The remaining papers replicate significantly worse than expected (p < 10^-14), even when adjusting for the regression-to-the-mean of effect size between discovery and replication cohorts termed the Winner’s Curse (p < 10^-16). We show this is due in part to misreporting the replication cohort size as a maximum number, rather than a per-locus one. In 39 studies accurately reporting per-locus cohort size for attempted replication of 707 loci in samples with similar ancestry, the replication rate matched expectation (predicted 458, observed 457, p = 0.94). In contrast, ancestry differences between replication and discovery (13 studies, 385 loci) cause the most highly-powered decile of loci to replicate worse than expected, due to differences in linkage disequilibrium. PMID:28715421
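
    A small simulation makes the Winner's Curse concrete: loci selected for genome-wide significance in a discovery cohort carry inflated effect estimates, so replication expectations computed from those estimates overshoot what the true effects deliver. The sketch below uses stylized effect sizes, standard errors, and cohort sizes, all illustrative assumptions rather than the paper's data.

    ```python
    import numpy as np
    from scipy.stats import norm

    rng = np.random.default_rng(4)
    n_snps, n_disc, n_rep = 10_000, 5_000, 5_000
    beta = rng.normal(0, 0.02, size=n_snps)         # true per-allele effects (stylized)
    se_disc = 1 / np.sqrt(n_disc)                   # stylized standard errors
    se_rep = 1 / np.sqrt(n_rep)

    z_disc = rng.normal(beta / se_disc, 1)          # discovery z-scores
    hits = np.abs(z_disc) > norm.isf(2.5e-8)        # two-sided 5e-8 significance
    beta_hat = z_disc[hits] * se_disc               # observed (inflated) effect sizes

    # Naive replication power computed from the inflated discovery estimates...
    power_naive = norm.sf(norm.isf(0.025) - np.abs(beta_hat) / se_rep)
    # ...versus power computed from the true effects of the same loci
    power_true = norm.sf(norm.isf(0.025) - np.abs(beta[hits]) / se_rep)
    print(f"expected replications, naive: {power_naive.sum():.1f}, "
          f"true: {power_true.sum():.1f}")
    ```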

  1. Lawson's Shoehorn, or Should the Philosophy of Science Be Rated 'X'?

    ERIC Educational Resources Information Center

    Allchin, Douglas

    2003-01-01

    Addresses Lawson's (2002) interpretations of Galileo's discovery of the moons of Jupiter and other cases that exhibit historical errors. Suggests that such cases can distort history and lessons about the nature of science. (SOE)

  2. Period and amplitude of non-volcanic tremors and repeaters: a dimensional analysis

    NASA Astrophysics Data System (ADS)

    Nielsen, Stefan

    2017-04-01

    Since its relatively recent discovery, the origin of non-volcanic tremor has been a source of great curiosity and debate. Two main interpretations have been proposed, one based on fluid migration, the other on slow slip events on a plate boundary (the latter hypothesis has recently gained considerable ground). Here I define the conditions of slip of one or more small asperities embedded within a larger creeping fault patch. The radiation-damping equation coupled with rate-and-state friction evolution equations results in a system of ordinary differential equations. For a finite-size asperity, the system equates to a peculiar non-linear damped oscillator converging to a limit cycle. Dimensional analysis shows that the period and amplitude of the oscillations depend on dimensional parameter combinations formed from a limited set of parameters: asperity dimension Γ, rate-and-state friction parameters (a, b, L), shear stiffness of the medium G, mass density ρ, background creep rate V̇ and normal stress σ. Under realistic parameter ranges, the asperity may show (1) tremor-like short-period oscillations, accelerating to radiate just enough energy to be barely detectable, with a periodicity of the order of one to ten hertz, as observed for non-volcanic tremor activity at the base of large inter-plate faults; or (2) isolated stick-slip events with intervals of the order of days to months, as observed in repeater events of modest magnitude within creeping fault sections.
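
    A quasi-dynamic version of such a system is straightforward to integrate numerically. The sketch below couples aging-law rate-and-state friction to a spring-slider with radiation damping, a standard simplified stand-in for the asperity model; all parameter values are illustrative assumptions, not those of the paper.

    ```python
    import numpy as np
    from scipy.integrate import solve_ivp

    a, b, L = 0.005, 0.01, 1e-5        # rate-and-state parameters; L in metres
    sigma = 50e6                        # effective normal stress (Pa)
    G, cs = 30e9, 3000.0                # shear modulus (Pa), shear-wave speed (m/s)
    eta = G / (2 * cs)                  # radiation-damping coefficient (Pa s/m)
    k_c = sigma * (b - a) / L           # critical stiffness for instability
    k = 0.9 * k_c                       # slightly compliant -> sustained oscillations
    v_pl = 1e-9                         # background creep rate (m/s)

    def rhs(t, y):
        """State y = [ln V, theta]: aging law + quasi-dynamic stress balance."""
        v, theta = np.exp(y[0]), y[1]
        dtheta = 1.0 - v * theta / L
        # from k*(v_pl - v) - eta*dv = sigma*(a/v*dv + b/theta*dtheta)
        dv = (k * (v_pl - v) - sigma * b * dtheta / theta) / (sigma * a / v + eta)
        return [dv / v, dtheta]

    y0 = [np.log(1.5 * v_pl), L / v_pl]            # small kick off steady creep
    sol = solve_ivp(rhs, (0.0, 3e6), y0, method="LSODA", rtol=1e-8, atol=1e-8)
    slip_rate = np.exp(sol.y[0])
    # Period and amplitude of the limit cycle can be read off the slip-rate peaks.
    ```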

  3. Compound prioritization methods increase rates of chemical probe discovery in model organisms

    PubMed Central

    Wallace, Iain M; Urbanus, Malene L; Luciani, Genna M; Burns, Andrew R; Han, Mitchell KL; Wang, Hao; Arora, Kriti; Heisler, Lawrence E; Proctor, Michael; St. Onge, Robert P; Roemer, Terry; Roy, Peter J; Cummins, Carolyn L; Bader, Gary D; Nislow, Corey; Giaever, Guri

    2011-01-01

    Pre-selection of compounds that are more likely to induce a phenotype can increase the efficiency and reduce the costs of model organism screening. To identify such molecules, we screened ~81,000 compounds in S. cerevisiae and identified ~7,500 that inhibit cell growth. Screening these growth-inhibitory molecules across a diverse panel of model organisms resulted in an increased phenotypic hit rate. These data were used to build a model to predict compounds that inhibit yeast growth. Empirical and in silico application of the model enriched the discovery of bioactive compounds in diverse model organisms. To demonstrate the potential of these molecules as lead chemical probes, we used chemogenomic profiling in yeast and identified specific inhibitors of lanosterol synthase and of stearoyl-CoA 9-desaturase. As community resources, the ~7,500 growth-inhibitory molecules have been made commercially available, and the computational model and filter used are provided. PMID:22035796

  4. From Discovery to Justification: Outline of an Ideal Research Program in Empirical Psychology

    PubMed Central

    Witte, Erich H.; Zenker, Frank

    2017-01-01

    The gold standard for an empirical science is the replicability of its research results. But the estimated average replicability rate of key-effects that top-tier psychology journals report falls between 36 and 39% (objective vs. subjective rate; Open Science Collaboration, 2015). So the standard mode of applying null-hypothesis significance testing (NHST) fails to adequately separate stable from random effects. Therefore, NHST does not fully convince as a statistical inference strategy. We argue that the replicability crisis is “home-made” because more sophisticated strategies can deliver results the successful replication of which is sufficiently probable. Thus, we can overcome the replicability crisis by integrating empirical results into genuine research programs. Instead of continuing to narrowly evaluate only the stability of data against random fluctuations (discovery context), such programs evaluate rival hypotheses against stable data (justification context). PMID:29163256

  5. Assessing differential gene expression with small sample sizes in oligonucleotide arrays using a mean-variance model.

    PubMed

    Hu, Jianhua; Wright, Fred A

    2007-03-01

    The identification of the genes that are differentially expressed in two-sample microarray experiments remains a difficult problem when the number of arrays is very small. We discuss the implications of using ordinary t-statistics and examine other commonly used variants. For oligonucleotide arrays with multiple probes per gene, we introduce a simple model relating the mean and variance of expression, possibly with gene-specific random effects. Parameter estimates from the model have natural shrinkage properties that guard against inappropriately small variance estimates, and the model is used to obtain a differential expression statistic. A limiting value to the positive false discovery rate (pFDR) for ordinary t-tests provides motivation for our use of the data structure to improve variance estimates. Our approach performs well compared to other proposed approaches in terms of the false discovery rate.
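
    The shrinkage idea can be illustrated with a simplified moderated t-statistic in which gene-wise pooled variances are shrunk toward a common prior value, guarding against spuriously small variance estimates at very small n. This is a generic sketch with synthetic data and an assumed prior, not the paper's exact mean-variance model.

    ```python
    import numpy as np

    rng = np.random.default_rng(5)
    n_genes, n1, n2 = 2000, 3, 3                       # very small sample sizes
    x = rng.normal(size=(n_genes, n1))                 # group 1 expression
    y = rng.normal(size=(n_genes, n2))                 # group 2 expression

    # Pooled gene-wise variance with n1 + n2 - 2 degrees of freedom
    s2 = ((x.var(axis=1, ddof=1) * (n1 - 1) + y.var(axis=1, ddof=1) * (n2 - 1))
          / (n1 + n2 - 2))
    d0, s0_2 = 4.0, s2.mean()                          # assumed prior df and variance
    s2_post = (d0 * s0_2 + (n1 + n2 - 2) * s2) / (d0 + n1 + n2 - 2)

    t_mod = (x.mean(axis=1) - y.mean(axis=1)) / np.sqrt(s2_post * (1/n1 + 1/n2))
    # Under the usual hierarchical-model assumptions, t_mod is referred to a
    # t distribution with d0 + n1 + n2 - 2 degrees of freedom.
    ```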

  6. What we know and don't know about Earth's missing biodiversity.

    PubMed

    Scheffers, Brett R; Joppa, Lucas N; Pimm, Stuart L; Laurance, William F

    2012-09-01

    Estimates of non-microbial diversity on Earth range from 2 million to over 50 million species, with great uncertainties in numbers of insects, fungi, nematodes, and deep-sea organisms. We summarize estimates for major taxa, the methods used to obtain them, and prospects for further discoveries. Major challenges include frequent synonymy, the difficulty of discriminating certain species by morphology alone, and the fact that many undiscovered species are small, difficult to find, or have small geographic ranges. Cryptic species could be numerous in some taxa. Novel techniques, such as DNA barcoding, new databases, and crowd-sourcing, could greatly accelerate the rate of species discovery. Such advances are timely. Most missing species probably live in biodiversity hotspots, where habitat destruction is rife, and so current estimates of extinction rates from known species are too low.

  7. Matrix- and tensor-based recommender systems for the discovery of currently unknown inorganic compounds

    NASA Astrophysics Data System (ADS)

    Seko, Atsuto; Hayashi, Hiroyuki; Kashima, Hisashi; Tanaka, Isao

    2018-01-01

    Chemically relevant compositions (CRCs) and atomic arrangements of inorganic compounds have been collected as inorganic crystal structure databases. Machine learning is a unique approach to search for currently unknown CRCs from vast candidates. Herein we propose matrix- and tensor-based recommender system approaches to predict currently unknown CRCs from database entries of CRCs. Firstly, the performance of the recommender system approaches to discover currently unknown CRCs is examined. A Tucker decomposition recommender system shows the best discovery rate of CRCs as the majority of the top 100 recommended ternary and quaternary compositions correspond to CRCs. Secondly, systematic density functional theory (DFT) calculations are performed to investigate the phase stability of the recommended compositions. The phase stability of the 27 compositions reveals that 23 currently unknown compounds are newly found to be stable. These results indicate that the recommender system has great potential to accelerate the discovery of new compounds.
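
    As a lower-order analogue of the tensor approach, the sketch below factorizes a binary composition-occurrence matrix with a truncated SVD and ranks unobserved entries by their reconstructed scores. The data are synthetic and the factorization is a simple stand-in; the paper itself uses Tucker decomposition on higher-order composition tensors.

    ```python
    import numpy as np

    rng = np.random.default_rng(6)
    n_a, n_b, rank = 80, 80, 8
    # Plant a low-rank structure, then observe only its binarized version
    latent = rng.normal(size=(n_a, rank)) @ rng.normal(size=(rank, n_b))
    known = (latent > 1.0).astype(float)            # 1 = composition known to exist

    u, s, vt = np.linalg.svd(known, full_matrices=False)
    scores = (u[:, :rank] * s[:rank]) @ vt[:rank]   # rank-8 reconstruction

    scores[known == 1] = -np.inf                    # mask entries already in the database
    flat = np.argsort(scores, axis=None)[::-1][:10]
    top = np.column_stack(np.unravel_index(flat, scores.shape))
    print("top-10 candidate (row, column) composition slots:\n", top)
    ```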

  8. A review of human pluripotent stem cell-derived cardiomyocytes for high-throughput drug discovery, cardiotoxicity screening, and publication standards.

    PubMed

    Mordwinkin, Nicholas M; Burridge, Paul W; Wu, Joseph C

    2013-02-01

    Drug attrition rates have increased in past years, resulting in growing costs for the pharmaceutical industry and consumers. The reasons for this include the lack of in vitro models that correlate with clinical results and poor preclinical toxicity screening assays. The in vitro production of human cardiac progenitor cells and cardiomyocytes from human pluripotent stem cells provides an amenable source of cells for applications in drug discovery, disease modeling, regenerative medicine, and cardiotoxicity screening. In addition, the ability to derive human-induced pluripotent stem cells from somatic tissues, combined with current high-throughput screening and pharmacogenomics, may help realize the use of these cells to fulfill the potential of personalized medicine. In this review, we discuss the use of pluripotent stem cell-derived cardiomyocytes for drug discovery and cardiotoxicity screening, as well as current hurdles that must be overcome for wider clinical applications of this promising approach.

  9. Ranking metrics in gene set enrichment analysis: do they matter?

    PubMed

    Zyla, Joanna; Marczyk, Michal; Weiner, January; Polanska, Joanna

    2017-05-12

    There exist many methods for describing the complex relation between changes of gene expression in molecular pathways or gene ontologies under different experimental conditions. Among them, Gene Set Enrichment Analysis seems to be one of the most commonly used (over 10,000 citations). An important parameter, which can affect the final result, is the choice of a metric for the ranking of genes. Applying a default ranking metric may lead to poor results. In this work 28 benchmark data sets were used to evaluate the sensitivity and false positive rate of gene set analysis for 16 different ranking metrics, including new proposals. Furthermore, the robustness of the chosen methods to sample size was tested. Using the k-means clustering algorithm, a group of four metrics with the highest performance in terms of overall sensitivity, overall false positive rate, and computational load was established: the absolute value of the Moderated Welch Test statistic, the Minimum Significant Difference, the absolute value of the Signal-to-Noise ratio, and the Baumgartner-Weiss-Schindler test statistic. For false positive rate estimation, all selected ranking metrics were robust with respect to sample size. For sensitivity, the absolute value of the Moderated Welch Test statistic and the absolute value of the Signal-to-Noise ratio gave stable results, while the Baumgartner-Weiss-Schindler test and the Minimum Significant Difference performed better for larger sample sizes. Finally, the Gene Set Enrichment Analysis method with all tested ranking metrics was parallelised, implemented in MATLAB, and made available at https://github.com/ZAEDPolSl/MrGSEA . Choosing a ranking metric in Gene Set Enrichment Analysis has a critical impact on the results of pathway enrichment analysis. The absolute value of the Moderated Welch Test has the best overall sensitivity and the Minimum Significant Difference has the best overall specificity of gene set analysis. When the number of non-normally distributed genes is high, the Baumgartner-Weiss-Schindler test statistic gives better outcomes. It also finds more enriched pathways than the other tested metrics, which may lead to new biological discoveries.
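    Two of the well-performing metrics are easy to state precisely. A small sketch, assuming expression matrices with genes in rows and samples in columns (the moderated Welch test additionally shrinks the variances, which is omitted here):

    ```python
    import numpy as np
    from scipy import stats

    def ranking_metrics(x, y):
        """Per-gene ranking metrics for GSEA-style analyses: absolute
        signal-to-noise ratio and absolute Welch t statistic. x, y:
        arrays of shape (genes, samples) for the two conditions."""
        mx, my = x.mean(axis=1), y.mean(axis=1)
        sx, sy = x.std(axis=1, ddof=1), y.std(axis=1, ddof=1)
        snr = np.abs(mx - my) / (sx + sy)   # signal-to-noise ratio
        t_welch, _ = stats.ttest_ind(x, y, axis=1, equal_var=False)
        return snr, np.abs(t_welch)
    ```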

  10. Recent lab-on-chip developments for novel drug discovery.

    PubMed

    Khalid, Nauman; Kobayashi, Isao; Nakajima, Mitsutoshi

    2017-07-01

    Microelectromechanical systems (MEMS) and micro total analysis systems (μTAS) revolutionized the biochemical and electronic industries, and this miniaturization process became a key driver for many markets. Now, it is a driving force for innovations in life sciences, diagnostics, analytical sciences, and chemistry, realized in what are called 'lab-on-a-chip' (LOC) devices. The use of these devices allows the development of fast, portable, and easy-to-use systems with a high level of functional integration for applications such as point-of-care diagnostics, forensics, the analysis of biomolecules, environmental or food analysis, and drug development. In this review, we report on the latest developments in fabrication methods and production methodologies to tailor LOC devices. A brief overview of scale-up strategies is also presented together with their potential applications in drug delivery and discovery. The impact of LOC devices on drug development and discovery has been extensively reviewed in the past. Current research focuses on fast and accurate detection in genomics, cell mutation analysis, drug delivery, and discovery. Current research has also extended LOC devices into new microengineered forms, such as organ-on-a-chip, stem-cells-on-a-chip, human-on-a-chip, and body-on-a-chip. Key challenges will be the transfer of fabricated LOC devices from lab-scale to industrial large-scale production. Moreover, extensive toxicological studies are needed to justify the use of microfabricated drug delivery vehicles in biological systems. It will also be challenging to transfer the in vitro findings to suitable and promising in vivo models. WIREs Syst Biol Med 2017, 9:e1381. doi: 10.1002/wsbm.1381 For further resources related to this article, please visit the WIREs website. © 2017 Wiley Periodicals, Inc.

  11. Churchill: an ultra-fast, deterministic, highly scalable and balanced parallelization strategy for the discovery of human genetic variation in clinical and population-scale genomics.

    PubMed

    Kelly, Benjamin J; Fitch, James R; Hu, Yangqiu; Corsmeier, Donald J; Zhong, Huachun; Wetzel, Amy N; Nordquist, Russell D; Newsom, David L; White, Peter

    2015-01-20

    While advances in genome sequencing technology make population-scale genomics a possibility, current approaches for analysis of these data rely upon parallelization strategies that have limited scalability, complex implementation and lack reproducibility. Churchill, a balanced regional parallelization strategy, overcomes these challenges, fully automating the multiple steps required to go from raw sequencing reads to variant discovery. Through implementation of novel deterministic parallelization techniques, Churchill allows computationally efficient analysis of a high-depth whole genome sample in less than two hours. The method is highly scalable, enabling full analysis of the 1000 Genomes raw sequence dataset in a week using cloud resources. http://churchill.nchri.org/.
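    The core load-balancing idea, dividing the genome into contiguous, similar-sized regions that independent workers can process, can be sketched as follows. This is a simplification under stated assumptions: Churchill's actual strategy also handles reads spanning region boundaries deterministically, which this sketch ignores.

    ```python
    def balanced_regions(chrom_lengths, n_workers):
        """Split a genome into contiguous, roughly equal-sized regions so
        each worker receives a similar number of bases. chrom_lengths:
        dict mapping chromosome name to length in bases."""
        total = sum(chrom_lengths.values())
        target = total // n_workers + 1     # bases per region, roughly
        regions = []
        for chrom, length in chrom_lengths.items():
            start = 0
            while start < length:
                end = min(start + target, length)
                regions.append((chrom, start, end))
                start = end
        return regions

    # e.g. balanced_regions({"chr1": 248_956_422, "chr2": 242_193_529}, 8)
    ```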

  12. A comprehensive company database analysis of biological assay variability.

    PubMed

    Kramer, Christian; Dahl, Göran; Tyrchan, Christian; Ulander, Johan

    2016-08-01

    Analysis of data from various compounds measured in diverse biological assays is a central part of drug discovery research projects. However, no systematic overview of the variability in biological assays has been published and judgments on assay quality and robustness of data are often based on personal belief and experience within the drug discovery community. To address this we performed a reproducibility analysis of all biological assays at AstraZeneca between 2005 and 2014. We found an average experimental uncertainty of less than a twofold difference and no technologies or assay types had higher variability than others. This work suggests that robust data can be obtained from the most commonly applied biological assays. Copyright © 2016. Published by Elsevier Ltd.
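    One standard way to quantify such assay variability from repeated measurements is the minimum significant ratio (MSR). The sketch below, with hypothetical inputs, estimates the smallest potency fold-difference an assay can reliably resolve from two independent runs of the same compounds; an MSR near or below 2 corresponds to the "less than a twofold difference" uncertainty reported above.

    ```python
    import numpy as np

    def minimum_significant_ratio(log10_potency_run1, log10_potency_run2):
        """Minimum significant ratio (MSR) from paired repeat measurements
        of the same compounds in two independent runs. Inputs are log10
        potencies; returns the smallest reliably detectable fold-change."""
        diffs = np.asarray(log10_potency_run1) - np.asarray(log10_potency_run2)
        # Var(run1 - run2) = 2 * s^2, so the per-measurement SD is:
        s = diffs.std(ddof=1) / np.sqrt(2)
        return 10 ** (2 * np.sqrt(2) * s)
    ```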

  13. Comparison between project-based learning and discovery learning toward students' metacognitive strategies on global warming concept

    NASA Astrophysics Data System (ADS)

    Tumewu, Widya Anjelia; Wulan, Ana Ratna; Sanjaya, Yayan

    2017-05-01

    The purpose of this study was to compare the effectiveness of project-based learning (PjBL) and discovery learning (DL) with respect to students' metacognitive strategies on the global warming concept. A quasi-experimental design with a matching-only pretest-posttest control group was used in this study. The subjects were students in two seventh-grade classes at a junior high school in Bandung City, West Java, in the 2015/2016 academic year. The study was conducted on two experimental classes: project-based learning was applied in experimental class I and discovery learning in experimental class II. Data were collected through a questionnaire on students' metacognitive strategies. The statistical analysis showed statistically significant differences in students' metacognitive strategies between project-based learning and discovery learning.

  14. Classic fungal natural products in the genomic age: the molecular legacy of Harold Raistrick.

    PubMed

    Schor, Raissa; Cox, Russell

    2018-03-01

    Covering: 1893 to 2017. Harold Raistrick was involved in the discovery of many of the most important classes of fungal metabolites during the 20th century. This review focusses on how these discoveries led to developments in isotopic labelling, biomimetic chemistry and the discovery, analysis and exploitation of biosynthetic gene clusters for major classes of fungal metabolites including: alternariol; geodin and metabolites of the emodin pathway; maleidrides; citrinin and the azaphilones; dehydrocurvularin; mycophenolic acid; and the tropolones. Key recent advances in the molecular understanding of these important pathways, including the discovery of biosynthetic gene clusters, the investigation of the molecular and chemical aspects of key biosynthetic steps, and the reengineering of key components of the pathways are reviewed and compared. Finally, discussion of key relationships between metabolites and pathways and the most important recent advances and opportunities for future research directions are given.

  15. Coming In: Queer Narratives of Sexual Self-Discovery.

    PubMed

    Rosenberg, Shoshana

    2017-10-09

    Many models of queer sexuality continue to depict a linear narrative of sexual development, beginning in repression/concealment and eventuating in coming out. The present study sought to challenge this by engaging in a hermeneutically informed thematic analysis of interviews with eight queer people living in Western Australia. Four themes were identified: "searching for identity," "society, stigma, and self," "sexual self-discovery," and "coming in." Interviewees discussed internalized homophobia and its impact on their life; experiences and implications of finding a community and achieving a sense of belonging; the concept of sexual self-discovery being a lifelong process; and sexuality as fluid, dynamic, and situational rather than static. The article concludes by suggesting that the idea of "coming in"-arriving at a place of acceptance of one's sexuality, regardless of its fluidity or how it is viewed by society-offers considerable analytic leverage for understanding the journeys of sexual self-discovery of queer-identified people.

  16. 400mm Mapping Sequence performed during the STS-119 R-Bar Pitch Maneuver

    NASA Image and Video Library

    2009-03-17

    ISS018-E-040791 (17 March 2009) --- Backdropped by a blanket of clouds, Space Shuttle Discovery is featured in this image photographed by an Expedition 18 crewmember on the International Space Station during rendezvous and docking operations. Before docking with the station, astronaut Lee Archambault, STS-119 commander, flew the shuttle through a Rendezvous Pitch Maneuver or basically a backflip to allow the space station crew a good view of Discovery's heat shield. Using digital still cameras equipped with both 400 and 800 millimeter lenses, the ISS crewmembers took a number of photos of the shuttle's thermal protection system and sent them down to teams on the ground for analysis. A 400 millimeter lens was used for this image. Docking occurred at 4:20 p.m. (CDT) on March 17, 2009. The final pair of power-generating solar array wings and the S6 truss segment are visible in Discovery's cargo bay.

  17. 400mm Mapping Sequence performed during the STS-119 R-Bar Pitch Maneuver

    NASA Image and Video Library

    2009-03-17

    ISS018-E-040792 (17 March 2009) --- Backdropped by a blanket of clouds, Space Shuttle Discovery is featured in this image photographed by an Expedition 18 crewmember on the International Space Station during rendezvous and docking operations. Before docking with the station, astronaut Lee Archambault, STS-119 commander, flew the shuttle through a Rendezvous Pitch Maneuver or basically a backflip to allow the space station crew a good view of Discovery's heat shield. Using digital still cameras equipped with both 400 and 800 millimeter lenses, the ISS crewmembers took a number of photos of the shuttle's thermal protection system and sent them down to teams on the ground for analysis. A 400 millimeter lens was used for this image. Docking occurred at 4:20 p.m. (CDT) on March 17, 2009. The final pair of power-generating solar array wings and the S6 truss segment are visible in Discovery's cargo bay.

  18. 400mm Mapping Sequence performed during the STS-119 R-Bar Pitch Maneuver

    NASA Image and Video Library

    2009-03-17

    ISS018-E-040790 (17 March 2009) --- Backdropped by the blackness of space, Space Shuttle Discovery is featured in this image photographed by an Expedition 18 crewmember on the International Space Station during rendezvous and docking operations. Before docking with the station, astronaut Lee Archambault, STS-119 commander, flew the shuttle through a Rendezvous Pitch Maneuver or basically a backflip to allow the space station crew a good view of Discovery's heat shield. Using digital still cameras equipped with both 400 and 800 millimeter lenses, the ISS crewmembers took a number of photos of the shuttle's thermal protection system and sent them down to teams on the ground for analysis. A 400 millimeter lens was used for this image. Docking occurred at 4:20 p.m. (CDT) on March 17, 2009. The final pair of power-generating solar array wings and the S6 truss segment are visible in Discovery's cargo bay.

  19. Real-Time Discovery Services over Large, Heterogeneous and Complex Healthcare Datasets Using Schema-Less, Column-Oriented Methods

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Begoli, Edmon; Dunning, Ted; Frasure, Charlie

    We present a service platform for schema-less exploration of data and discovery of patient-related statistics from healthcare data sets. The architecture of this platform is motivated by the need for fast, schema-less, and flexible approaches to SQL-based exploration and discovery of information embedded in the common, heterogeneously structured healthcare data sets and supporting components (electronic health records, practice management systems, etc.). The motivating use cases described in the paper are clinical trials candidate discovery and a treatment effectiveness analysis. Following the use cases, we discuss the key features and software architecture of the platform, the underlying core components (Apache Parquet, Drill, the web services server), and the runtime profiles and performance characteristics of the platform. We conclude by showing dramatic speedup with some approaches, and the performance tradeoffs and limitations of others.

  20. Genetic correlation between amyotrophic lateral sclerosis and schizophrenia

    NASA Astrophysics Data System (ADS)

    McLaughlin, Russell L.; Schijven, Dick; van Rheenen, Wouter; van Eijk, Kristel R.; O'Brien, Margaret; Kahn, René S.; Ophoff, Roel A.; Goris, An; Bradley, Daniel G.; Al-Chalabi, Ammar; van den Berg, Leonard H.; Luykx, Jurjen J.; Hardiman, Orla; Veldink, Jan H.; Shatunov, Aleksey; Dekker, Annelot M.; Diekstra, Frank P.; Pulit, Sara L.; van der Spek, Rick A. A.; van Doormaal, Perry T. C.; Sproviero, William; Jones, Ashley R.; Nicholson, Garth A.; Rowe, Dominic B.; Pamphlett, Roger; Kiernan, Matthew C.; Bauer, Denis; Kahlke, Tim; Williams, Kelly; Eftimov, Filip; Fogh, Isabella; Ticozzi, Nicola; Lin, Kuang; Millecamps, Stéphanie; Salachas, François; Meininger, Vincent; de Carvalho, Mamede; Pinto, Susana; Mora, Jesus S.; Rojas-García, Ricardo; Polak, Meraida; Chandran, Siddharthan; Colville, Shuna; Swingler, Robert; Morrison, Karen E.; Shaw, Pamela J.; Hardy, John; Orrell, Richard W.; Pittman, Alan; Sidle, Katie; Fratta, Pietro; Malaspina, Andrea; Petri, Susanne; Abdulla, Susanna; Drepper, Carsten; Sendtner, Michael; Meyer, Thomas; Wiedau-Pazos, Martina; Lomen-Hoerth, Catherine; van Deerlin, Vivianna M.; Trojanowski, John Q.; Elman, Lauren; McCluskey, Leo; Basak, Nazli; Meitinger, Thomas; Lichtner, Peter; Blagojevic-Radivojkov, Milena; Andres, Christian R.; Maurel, Cindy; Bensimon, Gilbert; Landwehrmeyer, Bernhard; Brice, Alexis; Payan, Christine A. M.; Saker-Delye, Safa; Dürr, Alexandra; Wood, Nicholas; Tittmann, Lukas; Lieb, Wolfgang; Franke, Andre; Rietschel, Marcella; Cichon, Sven; Nöuthen, Markus M.; Amouyel, Philippe; Tzourio, Christophe; Dartigues, Jean-François; Uitterlinden, Andre G.; Rivadeneira, Fernando; Estrada, Karol; Hofman, Albert; Curtis, Charles; van der Kooi, Anneke J.; de Visser, Marianne; Weber, Markus; Shaw, Christopher E.; Smith, Bradley N.; Pansarasa, Orietta; Cereda, Cristina; Del Bo, Roberto; Comi, Giacomo P.; D'Alfonso, Sandra; Bertolin, Cinzia; Sorarù, Gianni; Mazzini, Letizia; Pensato, Viviana; Gellera, Cinzia; Tiloca, Cinzia; Ratti, Antonia; Calvo, Andrea; Moglia, Cristina; Brunetti, Maura; Arcuti, Simon; Capozzo, Rosa; Zecca, Chiara; Lunetta, Christian; Penco, Silvana; Riva, Nilo; Padovani, Alessandro; Filosto, Massimiliano; Blair, Ian; Leigh, P. Nigel; Casale, Federico; Chio, Adriano; Beghi, Ettore; Pupillo, Elisabetta; Tortelli, Rosanna; Logroscino, Giancarlo; Powell, John; Ludolph, Albert C.; Weishaupt, Jochen H.; Robberecht, Wim; van Damme, Philip; Brown, Robert H.; Glass, Jonathan; Landers, John E.; Andersen, Peter M.; Corcia, Philippe; Vourc'h, Patrick; Silani, Vincenzo; van Es, Michael A.; Pasterkamp, R. Jeroen; Lewis, Cathryn M.; Breen, Gerome; Ripke, Stephan; Neale, Benjamin M.; Corvin, Aiden; Walters, James T. R.; Farh, Kai-How; Holmans, Peter A.; Lee, Phil; Bulik-Sullivan, Brendan; Collier, David A.; Huang, Hailiang; Pers, Tune H.; Agartz, Ingrid; Agerbo, Esben; Albus, Margot; Alexander, Madeline; Amin, Farooq; Bacanu, Silviu A.; Begemann, Martin; Belliveau, Richard A.; Bene, Judit; Bergen, Sarah E.; Bevilacqua, Elizabeth; Bigdeli, Tim B.; Black, Donald W.; Bruggeman, Richard; Buccola, Nancy G.; Buckner, Randy L.; Byerley, William; Cahn, Wiepke; Cai, Guiqing; Campion, Dominique; Cantor, Rita M.; Carr, Vaughan J.; Carrera, Noa; Catts, Stanley V.; Chambert, Kimberley D.; Chan, Raymond C. K.; Chan, Ronald Y. L.; Chen, Eric Y. H.; Cheng, Wei; Cheung, Eric F. C.; Chong, Siow Ann; Cloninger, C. 
Robert; Cohen, David; Cohen, Nadine; Cormican, Paul; Craddock, Nick; Crowley, James J.; Curtis, David; Davidson, Michael; Davis, Kenneth L.; Degenhardt, Franziska; Del Favero, Jurgen; Demontis, Ditte; Dikeos, Dimitris; Dinan, Timothy; Djurovic, Srdjan; Donohoe, Gary; Drapeau, Elodie; Duan, Jubao; Dudbridge, Frank; Durmishi, Naser; Eichhammer, Peter; Eriksson, Johan; Escott-Price, Valentina; Essioux, Laurent; Fanous, Ayman H.; Farrell, Martilias S.; Frank, Josef; Franke, Lude; Freedman, Robert; Freimer, Nelson B.; Friedl, Marion; Friedman, Joseph I.; Fromer, Menachem; Genovese, Giulio; Georgieva, Lyudmila; Giegling, Ina; Giusti-Rodríguez, Paola; Godard, Stephanie; Goldstein, Jacqueline I.; Golimbet, Vera; Gopal, Srihari; Gratten, Jacob; de Haan, Lieuwe; Hammer, Christian; Hamshere, Marian L.; Hansen, Mark; Hansen, Thomas; Haroutunian, Vahram; Hartmann, Annette M.; Henskens, Frans A.; Herms, Stefan; Hirschhorn, Joel N.; Hoffmann, Per; Hofman, Andrea; Hollegaard, Mads V.; Hougaard, David M.; Ikeda, Masashi; Joa, Inge; Julià, Antonio; Kalaydjieva, Luba; Karachanak-Yankova, Sena; Karjalainen, Juha; Kavanagh, David; Keller, Matthew C.; Kennedy, James L.; Khrunin, Andrey; Kim, Yunjung; Klovins, Janis; Knowles, James A.; Konte, Bettina; Kucinskas, Vaidutis; Kucinskiene, Zita Ausrele; Kuzelova-Ptackova, Hana; Kähler, Anna K.; Laurent, Claudine; Lee, Jimmy; Lee, S. Hong; Legge, Sophie E.; Lerer, Bernard; Li, Miaoxin; Li, Tao; Liang, Kung-Yee; Lieberman, Jeffrey; Limborska, Svetlana; Loughland, Carmel M.; Lubinski, Jan; Lönnqvist, Jouko; Macek, Milan; Magnusson, Patrik K. E.; Maher, Brion S.; Maier, Wolfgang; Mallet, Jacques; Marsal, Sara; Mattheisen, Manuel; Mattingsdal, Morten; McCarley, Robert W.; McDonald, Colm; McIntosh, Andrew M.; Meier, Sandra; Meijer, Carin J.; Melegh, Bela; Melle, Ingrid; Mesholam-Gately, Raquelle I.; Metspalu, Andres; Michie, Patricia T.; Milani, Lili; Milanova, Vihra; Mokrab, Younes; Morris, Derek W.; Mors, Ole; Murphy, Kieran C.; Murray, Robin M.; Myin-Germeys, Inez; Müller-Myhsok, Bertram; Nelis, Mari; Nenadic, Igor; Nertney, Deborah A.; Nestadt, Gerald; Nicodemus, Kristin K.; Nikitina-Zake, Liene; Nisenbaum, Laura; Nordin, Annelie; O'Callaghan, Eadbhard; O'Dushlaine, Colm; O'Neill, F. Anthony; Oh, Sang-Yun; Olincy, Ann; Olsen, Line; van Os, Jim; Pantelis, Christos; Papadimitriou, George N.; Papiol, Sergi; Parkhomenko, Elena; Pato, Michele T.; Paunio, Tiina; Pejovic-Milovancevic, Milica; Perkins, Diana O.; Pietiläinen, Olli; Pimm, Jonathan; Pocklington, Andrew J.; Price, Alkes; Pulver, Ann E.; Purcell, Shaun M.; Quested, Digby; Rasmussen, Henrik B.; Reichenberg, Abraham; Reimers, Mark A.; Richards, Alexander L.; Roffman, Joshua L.; Roussos, Panos; Ruderfer, Douglas M.; Salomaa, Veikko; Sanders, Alan R.; Schall, Ulrich; Schubert, Christian R.; Schulze, Thomas G.; Schwab, Sibylle G.; Scolnick, Edward M.; Scott, Rodney J.; Seidman, Larry J.; Shi, Jianxin; Sigurdsson, Engilbert; Silagadze, Teimuraz; Silverman, Jeremy M.; Sim, Kang; Slominsky, Petr; Smoller, Jordan W.; So, Hon-Cheong; Spencer, Chris C. A.; Stahl, Eli A.; Stefansson, Hreinn; Steinberg, Stacy; Stogmann, Elisabeth; Straub, Richard E.; Strengman, Eric; Strohmaier, Jana; Stroup, T. 
Scott; Subramaniam, Mythily; Suvisaari, Jaana; Svrakic, Dragan M.; Szatkiewicz, Jin P.; Söderman, Erik; Thirumalai, Srinivas; Toncheva, Draga; Tosato, Sarah; Veijola, Juha; Waddington, John; Walsh, Dermot; Wang, Dai; Wang, Qiang; Webb, Bradley T.; Weiser, Mark; Wildenauer, Dieter B.; Williams, Nigel M.; Williams, Stephanie; Witt, Stephanie H.; Wolen, Aaron R.; Wong, Emily H. M.; Wormley, Brandon K.; Xi, Hualin Simon; Zai, Clement C.; Zheng, Xuebin; Zimprich, Fritz; Wray, Naomi R.; Stefansson, Kari; Visscher, Peter M.; Adolfsson, Rolf; Andreassen, Ole A.; Blackwood, Douglas H. R.; Bramon, Elvira; Buxbaum, Joseph D.; Børglum, Anders D.; Darvasi, Ariel; Domenici, Enrico; Ehrenreich, Hannelore; Esko, Tõnu; Gejman, Pablo V.; Gill, Michael; Gurling, Hugh; Hultman, Christina M.; Iwata, Nakao; Jablensky, Assen V.; Jönsson, Erik G.; Kendler, Kenneth S.; Kirov, George; Knight, Jo; Lencz, Todd; Levinson, Douglas F.; Li, Qingqin S.; Liu, Jianjun; Malhotra, Anil K.; McCarroll, Steven A.; McQuillin, Andrew; Moran, Jennifer L.; Mortensen, Preben B.; Mowry, Bryan J.; Owen, Michael J.; Palotie, Aarno; Pato, Carlos N.; Petryshen, Tracey L.; Posthuma, Danielle; Riley, Brien P.; Rujescu, Dan; Sham, Pak C.; Sklar, Pamela; St Clair, David; Weinberger, Daniel R.; Wendland, Jens R.; Werge, Thomas; Daly, Mark J.; Sullivan, Patrick F.; O'Donovan, Michael C.

    2017-03-01

    We have previously shown higher-than-expected rates of schizophrenia in relatives of patients with amyotrophic lateral sclerosis (ALS), suggesting an aetiological relationship between the diseases. Here, we investigate the genetic relationship between ALS and schizophrenia using genome-wide association study data from over 100,000 unique individuals. Using linkage disequilibrium score regression, we estimate the genetic correlation between ALS and schizophrenia to be 14.3% (7.05-21.6; P = 1 × 10⁻⁴), with schizophrenia polygenic risk scores explaining up to 0.12% of the variance in ALS (P = 8.4 × 10⁻⁷). A modest increase in comorbidity of ALS and schizophrenia is expected given these findings (odds ratio 1.08-1.26) but this would require very large studies to observe epidemiologically. We identify five potential novel ALS-associated loci using conditional false discovery rate analysis. It is likely that shared neurobiological mechanisms between these two disorders will engender novel hypotheses in future preclinical and clinical studies.

  1. Genetic correlation between amyotrophic lateral sclerosis and schizophrenia.

    PubMed

    McLaughlin, Russell L; Schijven, Dick; van Rheenen, Wouter; van Eijk, Kristel R; O'Brien, Margaret; Kahn, René S; Ophoff, Roel A; Goris, An; Bradley, Daniel G; Al-Chalabi, Ammar; van den Berg, Leonard H; Luykx, Jurjen J; Hardiman, Orla; Veldink, Jan H

    2017-03-21

    We have previously shown higher-than-expected rates of schizophrenia in relatives of patients with amyotrophic lateral sclerosis (ALS), suggesting an aetiological relationship between the diseases. Here, we investigate the genetic relationship between ALS and schizophrenia using genome-wide association study data from over 100,000 unique individuals. Using linkage disequilibrium score regression, we estimate the genetic correlation between ALS and schizophrenia to be 14.3% (7.05-21.6; P = 1 × 10⁻⁴), with schizophrenia polygenic risk scores explaining up to 0.12% of the variance in ALS (P = 8.4 × 10⁻⁷). A modest increase in comorbidity of ALS and schizophrenia is expected given these findings (odds ratio 1.08-1.26) but this would require very large studies to observe epidemiologically. We identify five potential novel ALS-associated loci using conditional false discovery rate analysis. It is likely that shared neurobiological mechanisms between these two disorders will engender novel hypotheses in future preclinical and clinical studies.
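    The conditional FDR idea, re-ranking one trait's test statistics using a genetically correlated trait, can be illustrated with a minimal empirical sketch. This is not the exact procedure of the paper; the thresholding scheme and names are assumptions:

    ```python
    import numpy as np

    def conditional_fdr(p_primary, p_conditioning, threshold=0.1):
        """Conditional FDR sketch: cFDR(p1 | p2 <= t) = p1 / F_hat(p1),
        where F_hat is the empirical CDF of primary-trait p-values among
        SNPs whose conditioning-trait p-value passes `threshold`.
        Illustrates borrowing power from a correlated trait only."""
        p1 = np.asarray(p_primary, dtype=float)
        subset = np.sort(p1[np.asarray(p_conditioning) <= threshold])
        # Empirical CDF of p1 evaluated within the conditioned subset.
        ecdf = np.searchsorted(subset, p1, side="right") / max(len(subset), 1)
        cfdr = np.where(ecdf > 0, p1 / np.maximum(ecdf, 1e-12), 1.0)
        return np.minimum(cfdr, 1.0)
    ```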

  2. MicroRNA Expression Profiling of the Armed Forces Health Surveillance Branch Cohort for Identification of "Enviro-miRs" Associated With Deployment-Based Environmental Exposure.

    PubMed

    Dalgard, Clifton L; Polston, Keith F; Sukumar, Gauthaman; Mallon, Col Timothy M; Wilkerson, Matthew D; Pollard, Harvey B

    2016-08-01

    The aim of this study was to identify serum microRNA (miRNA) biomarkers that indicate deployment-associated exposures in service members at military installations with open burn pits. Another objective was to determine detection rates of miRNAs in Department of Defense Serum Repository (DoDSR) samples with a high-throughput methodology. Low-volume serum samples (n = 800) were profiled by miRNA-capture isolation, pre-amplification, and measurement by a quantitative PCR-based OpenArray platform. Normalized quantitative cycle values were used for differential expression analysis between groups. Assay specificity, dynamic range, reproducibility, and detection rates by OpenArray passed target desired specifications. Serum abundant miRNAs were consistently measured in study specimens. Four miRNAs were differentially expressed in the case deployment group subjects. miRNAs are suitable RNA species for biomarker discovery in the DoDSR serum specimens. Serum miRNAs are candidate biomarkers for deployment and environmental exposure in military service members.
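    The group comparison described here reduces, in its simplest form, to per-miRNA tests on normalized quantitative-cycle (Cq) values with multiple-testing control. A generic sketch, not the study's exact pipeline, using a two-sample t-test and Benjamini-Hochberg FDR:

    ```python
    from scipy import stats
    from statsmodels.stats.multitest import multipletests

    def differential_mirnas(cq_cases, cq_controls, alpha=0.05):
        """Per-miRNA two-sample t-tests on normalized Cq values with
        Benjamini-Hochberg FDR control. Inputs: arrays of shape
        (miRNAs, samples) for the two groups."""
        _, pvals = stats.ttest_ind(cq_cases, cq_controls, axis=1)
        reject, qvals, _, _ = multipletests(pvals, alpha=alpha, method="fdr_bh")
        return reject, qvals
    ```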

  3. Hot-spot analysis for drug discovery targeting protein-protein interactions.

    PubMed

    Rosell, Mireia; Fernández-Recio, Juan

    2018-04-01

    Protein-protein interactions are important for biological processes and pathological situations, and are attractive targets for drug discovery. However, rational drug design targeting protein-protein interactions is still highly challenging. Hot-spot residues are seen as the best option to target such interactions, but their identification requires detailed structural and energetic characterization, which is only available for a tiny fraction of protein interactions. Areas covered: In this review, the authors cover a variety of computational methods that have been reported for the energetic analysis of protein-protein interfaces in search of hot-spots, and the structural modeling of protein-protein complexes by docking. This can help to rationalize the discovery of small-molecule inhibitors of protein-protein interfaces of therapeutic interest. Computational analysis and docking can help to locate the interface, molecular dynamics can be used to find suitable cavities, and hot-spot predictions can focus the search for inhibitors of protein-protein interactions. Expert opinion: A major difficulty for applying rational drug design methods to protein-protein interactions is that in the majority of cases the complex structure is not available. Fortunately, computational docking can complement experimental data. An interesting aspect to explore in the future is the integration of these strategies for targeting PPIs with large-scale mutational analysis.

  4. Application of industrial scale genomics to discovery of therapeutic targets in heart failure.

    PubMed

    Mehraban, F; Tomlinson, J E

    2001-12-01

    In recent years intense activity in both academic and industrial sectors has provided a wealth of information on the human genome with an associated impressive increase in the number of novel gene sequences deposited in sequence data repositories and patent applications. This genomic industrial revolution has transformed the way in which drug target discovery is now approached. In this article we discuss how various differential gene expression (DGE) technologies are being utilized for cardiovascular disease (CVD) drug target discovery. Other approaches such as sequencing cDNA from cardiovascular derived tissues and cells coupled with bioinformatic sequence analysis are used with the aim of identifying novel gene sequences that may be exploited towards target discovery. Additional leverage from gene sequence information is obtained through identification of polymorphisms that may confer disease susceptibility and/or affect drug responsiveness. Pharmacogenomic studies are described wherein gene expression-based techniques are used to evaluate drug response and/or efficacy. Industrial-scale genomics supports and addresses not only novel target gene discovery but also the burgeoning issues in pharmaceutical and clinical cardiovascular medicine relative to polymorphic gene responses.

  5. A Scientometric Prediction of the Discovery of the First Potentially Habitable Planet with a Mass Similar to Earth

    PubMed Central

    Arbesman, Samuel; Laughlin, Gregory

    2010-01-01

    Background The search for a habitable extrasolar planet has long interested scientists, but only recently have the tools become available to search for such planets. In the past decades, the number of known extrasolar planets has ballooned into the hundreds, and with it, the expectation that the discovery of the first Earth-like extrasolar planet is not far off. Methodology/Principal Findings Here, we develop a novel metric of habitability for discovered planets and use this to arrive at a prediction for when the first habitable planet will be discovered. Using a bootstrap analysis of currently discovered exoplanets, we predict the discovery of the first Earth-like planet to be announced in the first half of 2011, with the likeliest date being early May 2011. Conclusions/Significance Our predictions, using only the properties of previously discovered exoplanets, accord well with external estimates for the discovery of the first potentially habitable extrasolar planet and highlight the usefulness of predictive scientometric techniques to understand the pace of scientific discovery in many fields. PMID:20957226
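    The bootstrap extrapolation can be sketched generically: resample the (discovery date, habitability metric) pairs, refit a trend, and collect the dates at which each trend reaches an Earth-like value. The linear trend and threshold below are simplifying assumptions; the paper's metric and model differ in detail.

    ```python
    import numpy as np

    def bootstrap_crossing_dates(dates, metric, threshold=1.0, n_boot=2000):
        """Resample (date, metric) pairs, fit a linear trend, and record
        when each trend crosses `threshold`. Percentiles of the crossing
        dates give a prediction interval for the discovery date."""
        rng = np.random.default_rng(0)
        dates = np.asarray(dates, float)
        metric = np.asarray(metric, float)
        crossings = []
        for _ in range(n_boot):
            idx = rng.integers(0, len(dates), len(dates))
            slope, intercept = np.polyfit(dates[idx], metric[idx], 1)
            if slope > 0:                    # only rising trends cross
                crossings.append((threshold - intercept) / slope)
        return np.percentile(crossings, [2.5, 50, 97.5])
    ```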

  6. Cancer drug discovery: recent innovative approaches to tumor modeling.

    PubMed

    Lovitt, Carrie J; Shelper, Todd B; Avery, Vicky M

    2016-09-01

    Cell culture models have been at the heart of anti-cancer drug discovery programs for over half a century. Advancements in cell culture techniques have seen the rapid evolution of more complex in vitro cell culture models investigated for use in drug discovery. Three-dimensional (3D) cell culture research has become a strong focal point, as this technique permits the recapitulation of the tumor microenvironment. Biologically relevant 3D cellular models have demonstrated significant promise in advancing cancer drug discovery, and will continue to play an increasing role in the future. In this review, recent advances in 3D cell culture techniques and their application in tumor modeling and anti-cancer drug discovery programs are discussed. The topics include selection of cancer cells, 3D cell culture assays (associated endpoint measurements and analysis), 3D microfluidic systems and 3D bio-printing. Although advanced cancer cell culture models and techniques are becoming commonplace in many research groups, the use of these approaches has yet to be fully embraced in anti-cancer drug applications. Furthermore, limitations associated with analyzing information-rich biological data remain unaddressed.

  7. The Discovery of the Electromagnetic Counterpart of GW170817: Kilonova AT 2017gfo/DLT17ck

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Valenti, Stefano; Yang, Sheng; Tartaglia, Leonardo

    During the second observing run of the Laser Interferometer Gravitational-wave Observatory (LIGO) and Virgo Interferometer, a gravitational-wave signal consistent with a binary neutron star coalescence was detected on 2017 August 17th (GW170817), quickly followed by a coincident short gamma-ray burst trigger detected by the Fermi satellite. The Distance Less Than 40 (DLT40) Mpc supernova search performed pointed follow-up observations of a sample of galaxies regularly monitored by the survey that fell within the combined LIGO+Virgo localization region and the larger Fermi gamma-ray burst error box. Here we report the discovery of a new optical transient (DLT17ck, also known as SSS17a; it has also been registered as AT 2017gfo) spatially and temporally coincident with GW170817. The photometric and spectroscopic evolution of DLT17ck is unique, with an absolute peak magnitude of M_r = −15.8 ± 0.1 and an r-band decline rate of 1.1 mag day⁻¹. This fast evolution is generically consistent with kilonova models, which have been predicted as the optical counterpart to binary neutron star coalescences. Analysis of archival DLT40 data does not show any sign of transient activity at the location of DLT17ck down to r ∼ 19 mag in the time period between 8 months and 21 days prior to GW170817. This discovery represents the beginning of a new era for multi-messenger astronomy, opening a new path by which to study and understand binary neutron star coalescences, short gamma-ray bursts, and their optical counterparts.

  8. Searches for new Milky Way satellites from the first two years of data of the Subaru/Hyper Suprime-Cam survey: Discovery of Cetus III

    NASA Astrophysics Data System (ADS)

    Homma, Daisuke; Chiba, Masashi; Okamoto, Sakurako; Komiyama, Yutaka; Tanaka, Masayuki; Tanaka, Mikito; Ishigaki, Miho N.; Hayashi, Kohei; Arimoto, Nobuo; Garmilla, José A.; Lupton, Robert H.; Strauss, Michael A.; Miyazaki, Satoshi; Wang, Shiang-Yu; Murayama, Hitoshi

    2018-01-01

    We present the results from a search for new Milky Way (MW) satellites from the first two years of data from the Hyper Suprime-Cam (HSC) Subaru Strategic Program (SSP) covering ~300 deg² and report the discovery of a highly compelling ultra-faint dwarf galaxy candidate in Cetus. This is the second ultra-faint dwarf we have discovered after Virgo I, reported in our previous paper. This satellite, Cetus III, has been identified as a statistically significant (10.7σ) spatial overdensity of star-like objects, which are selected from a relevant isochrone filter designed for a metal-poor and old stellar population. This stellar system is located at a heliocentric distance of 251^{+24}_{-11} kpc with a most likely absolute magnitude of M_V = -2.4 ± 0.6 mag estimated from a Monte Carlo analysis. Cetus III is extended with a half-light radius of r_h = 90^{+42}_{-17} pc, suggesting that this is a faint dwarf satellite in the MW located beyond the detection limit of the Sloan Digital Sky Survey. Further spectroscopic studies are needed to assess the nature of this stellar system. We also revisit and update the parameters for Virgo I, finding M_V = -0.33^{+0.75}_{-0.87} mag and r_h = 47^{+19}_{-13} pc. Using simulations of Λ-dominated cold dark matter models, we predict that we should find one or two new MW satellites from ~300 deg² of HSC-SSP data, in rough agreement with the discovery rate so far. The further survey and completion of HSC-SSP over ~1400 deg² will provide robust insights into the missing satellites problem.

  9. Integrative analysis of micro-RNA, gene expression, and survival of glioblastoma multiforme.

    PubMed

    Huang, Yen-Tsung; Hsu, Thomas; Kelsey, Karl T; Lin, Chien-Ling

    2015-02-01

    Glioblastoma multiforme (GBM), the most common type of malignant brain tumor, is highly fatal. Limited understanding of its rapid progression necessitates additional approaches that integrate what is known about the genomics of this cancer. Using a discovery set (n = 348) and a validation set (n = 174) of GBM patients, we performed genome-wide analyses that integrated mRNA and micro-RNA expression data from GBM as well as associated survival information, assessing coordinated variability in each as this reflects their known mechanistic functions. Cox proportional hazards models were used for the survival analyses, and nonparametric permutation tests were performed for the micro-RNAs to investigate the association between the number of associated genes and its prognostication. We also utilized mediation analyses for micro-RNA-gene pairs to identify their mediation effects. Genome-wide analyses revealed a novel pattern: micro-RNAs related to more gene expressions are more likely to be associated with GBM survival (P = 4.8 × 10⁻⁵). Genome-wide mediation analyses for the 32,660 micro-RNA-gene pairs with strong association (false discovery rate [FDR] < 0.01%) identified 51 validated pairs with significant mediation effect. Of the 51 pairs, miR-223 had 16 mediation genes. These 16 mediation genes of miR-223 were also highly associated with various other micro-RNAs and mediated their prognostic effects as well. We further constructed a gene signature using the 16 genes, which was highly associated with GBM survival in both the discovery and validation sets (P = 9.8 × 10⁻⁶). This comprehensive study discovered mediation effects of micro-RNA to gene expression and GBM survival and provided a new analytic framework for integrative genomics. © 2014 WILEY PERIODICALS, INC.
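    The mediation step can be illustrated with a product-of-coefficients sketch for a single miRNA-gene pair: regress the gene on the miRNA, fit a Cox model of survival on both, and bootstrap the product of the two path coefficients. Column names are placeholders and the paper's mediation analysis is more elaborate; this assumes the lifelines package and a pandas DataFrame input.

    ```python
    import numpy as np
    from lifelines import CoxPHFitter  # assumed dependency

    def mediation_effect(df, n_boot=500):
        """Product-of-coefficients mediation sketch for one miRNA -> gene
        -> survival path: (a) slope of gene expression on the miRNA,
        (b) Cox log-hazard coefficient of the gene adjusted for the miRNA;
        indirect effect = a * b with a bootstrap percentile interval.
        Columns 'mirna', 'gene', 'time', 'event' are placeholders."""
        rng = np.random.default_rng(0)
        effects = []
        for _ in range(n_boot):
            boot = df.sample(len(df), replace=True,
                             random_state=int(rng.integers(1 << 31)))
            a = np.polyfit(boot["mirna"], boot["gene"], 1)[0]
            cph = CoxPHFitter().fit(boot[["mirna", "gene", "time", "event"]],
                                    duration_col="time", event_col="event")
            effects.append(a * cph.params_["gene"])
        return np.percentile(effects, [2.5, 50, 97.5])
    ```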

  10. Discovery of phosphonic acid natural products by mining the genomes of 10,000 actinomycetes.

    PubMed

    Ju, Kou-San; Gao, Jiangtao; Doroghazi, James R; Wang, Kwo-Kwang A; Thibodeaux, Christopher J; Li, Steven; Metzger, Emily; Fudala, John; Su, Joleen; Zhang, Jun Kai; Lee, Jaeheon; Cioni, Joel P; Evans, Bradley S; Hirota, Ryuichi; Labeda, David P; van der Donk, Wilfred A; Metcalf, William W

    2015-09-29

    Although natural products have been a particularly rich source of human medicines, activity-based screening results in a very high rate of rediscovery of known molecules. Based on the large number of natural product biosynthetic genes in microbial genomes, many have proposed "genome mining" as an alternative approach for discovery efforts; however, this idea has yet to be performed experimentally on a large scale. Here, we demonstrate the feasibility of large-scale, high-throughput genome mining by screening a collection of over 10,000 actinomycetes for the genetic potential to make phosphonic acids, a class of natural products with diverse and useful bioactivities. Genome sequencing identified a diverse collection of phosphonate biosynthetic gene clusters within 278 strains. These clusters were classified into 64 distinct groups, of which 55 are likely to direct the synthesis of unknown compounds. Characterization of strains within five of these groups resulted in the discovery of a new archetypical pathway for phosphonate biosynthesis, the first (to our knowledge) dedicated pathway for H-phosphinates, and 11 previously undescribed phosphonic acid natural products. Among these compounds are argolaphos, a broad-spectrum antibacterial phosphonopeptide composed of aminomethylphosphonate in peptide linkage to a rare amino acid N(5)-hydroxyarginine; valinophos, an N-acetyl l-Val ester of 2,3-dihydroxypropylphosphonate; and phosphonocystoximate, an unusual thiohydroximate-containing molecule representing a new chemotype of sulfur-containing phosphonate natural products. Analysis of the genome sequences from the remaining strains suggests that the majority of the phosphonate biosynthetic repertoire of Actinobacteria has been captured at the gene level. This dereplicated strain collection now provides a reservoir of numerous, as yet undiscovered, phosphonate natural products.

  11. Discovery of phosphonic acid natural products by mining the genomes of 10,000 actinomycetes

    PubMed Central

    Ju, Kou-San; Gao, Jiangtao; Doroghazi, James R.; Wang, Kwo-Kwang A.; Thibodeaux, Christopher J.; Li, Steven; Metzger, Emily; Fudala, John; Su, Joleen; Zhang, Jun Kai; Lee, Jaeheon; Cioni, Joel P.; Evans, Bradley S.; Hirota, Ryuichi; Labeda, David P.; van der Donk, Wilfred A.; Metcalf, William W.

    2015-01-01

    Although natural products have been a particularly rich source of human medicines, activity-based screening results in a very high rate of rediscovery of known molecules. Based on the large number of natural product biosynthetic genes in microbial genomes, many have proposed “genome mining” as an alternative approach for discovery efforts; however, this idea has yet to be performed experimentally on a large scale. Here, we demonstrate the feasibility of large-scale, high-throughput genome mining by screening a collection of over 10,000 actinomycetes for the genetic potential to make phosphonic acids, a class of natural products with diverse and useful bioactivities. Genome sequencing identified a diverse collection of phosphonate biosynthetic gene clusters within 278 strains. These clusters were classified into 64 distinct groups, of which 55 are likely to direct the synthesis of unknown compounds. Characterization of strains within five of these groups resulted in the discovery of a new archetypical pathway for phosphonate biosynthesis, the first (to our knowledge) dedicated pathway for H-phosphinates, and 11 previously undescribed phosphonic acid natural products. Among these compounds are argolaphos, a broad-spectrum antibacterial phosphonopeptide composed of aminomethylphosphonate in peptide linkage to a rare amino acid N5-hydroxyarginine; valinophos, an N-acetyl l-Val ester of 2,3-dihydroxypropylphosphonate; and phosphonocystoximate, an unusual thiohydroximate-containing molecule representing a new chemotype of sulfur-containing phosphonate natural products. Analysis of the genome sequences from the remaining strains suggests that the majority of the phosphonate biosynthetic repertoire of Actinobacteria has been captured at the gene level. This dereplicated strain collection now provides a reservoir of numerous, as yet undiscovered, phosphonate natural products. PMID:26324907

  12. Linking Automated Data Analysis and Visualization with Applications in Developmental Biology and High-Energy Physics

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Ruebel, Oliver

    2009-11-20

    Knowledge discovery from large and complex collections of today's scientific datasets is a challenging task. With the ability to measure and simulate more processes at increasingly finer spatial and temporal scales, the increasing number of data dimensions and data objects is presenting tremendous challenges for data analysis and effective data exploration methods and tools. Researchers are overwhelmed with data and standard tools are often insufficient to enable effective data analysis and knowledge discovery. The main objective of this thesis is to provide important new capabilities to accelerate scientific knowledge discovery from large, complex, and multivariate scientific data. The research covered in this thesis addresses these scientific challenges using a combination of scientific visualization, information visualization, automated data analysis, and other enabling technologies, such as efficient data management. The effectiveness of the proposed analysis methods is demonstrated via applications in two distinct scientific research fields, namely developmental biology and high-energy physics. Advances in microscopy, image analysis, and embryo registration enable for the first time measurement of gene expression at cellular resolution for entire organisms. Analysis of high-dimensional spatial gene expression datasets is a challenging task. By integrating data clustering and visualization, analysis of complex, time-varying, spatial gene expression patterns and their formation becomes possible. The analysis framework has been integrated with MATLAB and with the visualization, making advanced analysis tools accessible to biologists and enabling bioinformatics researchers to directly integrate their analysis with the visualization. Laser wakefield particle accelerators (LWFAs) promise to be a new compact source of high-energy particles and radiation, with wide applications ranging from medicine to physics. To gain insight into the complex physical processes of particle acceleration, physicists model LWFAs computationally. The datasets produced by LWFA simulations are (i) extremely large, (ii) of varying spatial and temporal resolution, (iii) heterogeneous, and (iv) high-dimensional, making analysis and knowledge discovery from complex LWFA simulation data a challenging task. To address these challenges this thesis describes the integration of the visualization system VisIt and the state-of-the-art index/query system FastBit, enabling interactive visual exploration of extremely large three-dimensional particle datasets. Researchers are especially interested in beams of high-energy particles formed during the course of a simulation. This thesis describes novel methods for automatic detection and analysis of particle beams enabling a more accurate and efficient data analysis process. By integrating these automated analysis methods with visualization, this research enables more accurate, efficient, and effective analysis of LWFA simulation data than previously possible.

  13. Behavioral and Psychosocial Considerations in Intelligence Analysis: A Preliminary Review of Literature on Critical Thinking Skills

    DTIC Science & Technology

    2009-03-01

    discovery (teaching by problem solving), and exploratory (teaching by exploration). Research suggests while guided discovery and exploratory training...34 College Student Journal 38 (2004): 482-493. MasterFILE Premier. EBSCO. 4 June 2008. - This study was conducted to determine whether an introductory

  14. An endogenous foamy virus in the aye-aye (Daubentonia madagascariensis).

    PubMed

    Han, Guan-Zhu; Worobey, Michael

    2012-07-01

    We report the discovery and analysis of an endogenous foamy virus (PSFVaye) within the genome of the aye-aye (Daubentonia madagascariensis), a strepsirrhine primate from Madagascar. Phylogenetic analyses indicate that PSFVaye is divergent from all known simian foamy viruses, suggesting an association between foamy viruses and primates since the haplorrhine-strepsirrhine split. The discovery of PSFVaye indicates that primate foamy virus might be more broadly distributed than previously thought.

  15. DNA barcoding a nightmare taxon: assessing barcode index numbers and barcode gaps for sweat bees.

    PubMed

    Gibbs, Jason

    2018-01-01

    There is an ongoing campaign to DNA barcode the world's >20 000 bee species. Recent revisions of Lasioglossum (Dialictus) (Hymenoptera: Halictidae) for Canada and the eastern United States were completed using integrative taxonomy. DNA barcode data from 110 species of L. (Dialictus) are examined for their value in identification and discovering additional taxonomic diversity. Specimen identification success was estimated using the best close match method. Error rates were 20% relative to current taxonomic understanding. Barcode Index Numbers (BINs) assigned using Refined Single Linkage Analysis (RESL) and barcode gaps using the Automatic Barcode Gap Discovery (ABGD) method were also assessed. RESL was incongruent for 44.5% of species, although some cryptic diversity may exist. Forty-three of 110 species were part of merged BINs with multiple species. The barcode gap is non-existent for the data set as a whole and ABGD showed levels of discordance similar to the RESL. The viridatum species-group is particularly problematic, so that DNA barcodes alone would be misleading for species delimitation and specimen identification. Character-based methods using fixed nucleotide substitutions could improve specimen identification success in some cases. The use of DNA barcoding for species discovery for standard taxonomic practice in the absence of a well-defined barcode gap is discussed.
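    The barcode-gap diagnostic itself is simple to compute from a pairwise distance matrix and species labels: the gap exists when the largest intraspecific distance falls below the smallest interspecific distance. A minimal sketch, assuming every species has at least two specimens:

    ```python
    import numpy as np

    def barcode_gap(dist, species):
        """Given a symmetric pairwise distance matrix and species labels,
        report the largest intraspecific and smallest interspecific
        distances, and whether a barcode gap exists between them."""
        species = np.asarray(species)
        iu = np.triu_indices(len(species), k=1)       # unique pairs only
        same = (species[:, None] == species[None, :])[iu]
        intra, inter = dist[iu][same], dist[iu][~same]
        return intra.max(), inter.min(), bool(intra.max() < inter.min())
    ```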

  16. Accelerators for Discovery Science and Security applications

    NASA Astrophysics Data System (ADS)

    Todd, A. M. M.; Bluem, H. P.; Jarvis, J. D.; Park, J. H.; Rathke, J. W.; Schultheiss, T. J.

    2015-05-01

    Several Advanced Energy Systems (AES) accelerator projects that span applications in Discovery Science and Security are described. The design and performance of the IR and THz free electron laser (FEL) at the Fritz-Haber-Institut der Max-Planck-Gesellschaft in Berlin that is now an operating user facility for physical chemistry research in molecular and cluster spectroscopy as well as surface science, is highlighted. The device was designed to meet challenging specifications, including a final energy adjustable in the range of 15-50 MeV, low longitudinal emittance (<50 keV-psec) and transverse emittance (<20 π mm-mrad), at more than 200 pC bunch charge with a micropulse repetition rate of 1 GHz and a macropulse length of up to 15 μs. Secondly, we will describe an ongoing effort to develop an ultrafast electron diffraction (UED) source that is scheduled for completion in 2015 with prototype testing taking place at the Brookhaven National Laboratory (BNL) Accelerator Test Facility (ATF). This tabletop X-band system will find application in time-resolved chemical imaging and as a resource for drug-cell interaction analysis. A third active area at AES is accelerators for security applications where we will cover some top-level aspects of THz and X-ray systems that are under development and in testing for stand-off and portal detection.

  17. Identification of Trypanocidal Activity for Known Clinical Compounds Using a New Trypanosoma cruzi Hit-Discovery Screening Cascade.

    PubMed

    De Rycker, Manu; Thomas, John; Riley, Jennifer; Brough, Stephen J; Miles, Tim J; Gray, David W

    2016-04-01

    Chagas disease is a significant health problem in Latin America and the available treatments have significant issues in terms of toxicity and efficacy. There is thus an urgent need to develop new treatments either via a repurposing strategy or through the development of new chemical entities. A key first step is the identification of compounds with anti-Trypanosoma cruzi activity from compound libraries. Here we describe a hit discovery screening cascade designed to specifically identify hits that have the appropriate anti-parasitic properties to warrant further development. The cascade consists of a primary imaging-based assay followed by newly developed and appropriately scaled secondary assays to predict the cidality and rate-of-kill of the compounds. Finally, we incorporated a cytochrome P450 CYP51 biochemical assay to remove compounds that owe their phenotypic response to inhibition of this enzyme. We report the use of the cascade in profiling two small libraries containing clinically tested compounds and identify Clemastine, Azelastine, Ifenprodil, Ziprasidone and Clofibrate as molecules having appropriate profiles. Analysis of clinical derived pharmacokinetic and toxicity data indicates that none of these are appropriate for repurposing but they may represent suitable start points for further optimisation for the treatment of Chagas disease.

  18. The Increasing Interest of ANAMMOX Research in China: Bacteria, Process Development, and Application

    PubMed Central

    Chai, Li-Yuan; Tang, Chong-Jian; Zheng, Ping; Min, Xiao-Bo; Yang, Zhi-Hui; Song, Yu-Xia

    2013-01-01

    Nitrogen pollution has created severe environmental problems and has increasingly become an important issue in China. Since the first discovery of ANAMMOX in the early 1990s, the related technology has become a promising and sustainable bioprocess for treating strong nitrogenous wastewater. Many Chinese research groups have concentrated their efforts on ANAMMOX research, including the bacteria, process development, and application, during the past 20 years. A series of new and outstanding outcomes have been achieved, including the discovery of new ANAMMOX bacterial species (Brocadia sinica), sulfate-dependent ANAMMOX bacteria (Anammoxoglobus sulfate and Bacillus benzoevorans), and the highest nitrogen removal performance reported worldwide (74.3-76.7 kg-N/m³/d) in lab-scale granule-based UASB reactors. The characteristics, structure, packing pattern, and floatation mechanism of the high-rate ANAMMOX granules in ANAMMOX reactors were also carefully illustrated by native researchers. Nowadays, some pilot- and full-scale ANAMMOX reactors have been constructed to treat different types of ammonium-rich wastewater, including monosodium glutamate wastewater, pharmaceutical wastewater, and leachate. The prime objective of the present review is to elucidate ongoing ANAMMOX research in China from lab-scale to full-scale applications, to compare and evaluate the significant findings, and to chart a course for bringing ANAMMOX research to maturity. PMID:24381935

  19. ChIP-PaM: an algorithm to identify protein-DNA interaction using ChIP-Seq data.

    PubMed

    Wu, Song; Wang, Jianmin; Zhao, Wei; Pounds, Stanley; Cheng, Cheng

    2010-06-03

    ChIP-Seq is a powerful tool for identifying the interaction between genomic regulators and their bound DNAs, especially for locating transcription factor binding sites. However, high cost and high rate of false discovery of transcription factor binding sites identified from ChIP-Seq data significantly limit its application. Here we report a new algorithm, ChIP-PaM, for identifying transcription factor target regions in ChIP-Seq datasets. This algorithm makes full use of a protein-DNA binding pattern by capitalizing on three lines of evidence: 1) the tag count modelling at the peak position, 2) pattern matching of a specific tag count distribution, and 3) motif searching along the genome. A novel data-based two-step eFDR procedure is proposed to integrate the three lines of evidence to determine significantly enriched regions. Our algorithm requires no technical controls and efficiently discriminates falsely enriched regions from regions enriched by true transcription factor (TF) binding on the basis of ChIP-Seq data only. An analysis of real genomic data is presented to demonstrate our method. In a comparison with other existing methods, we found that our algorithm provides more accurate binding site discovery while maintaining comparable statistical power.
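    Two of the algorithm's three lines of evidence, tag-count modelling and an empirical FDR cutoff, can be sketched in a simplified, control-free form. This is not ChIP-PaM itself, which adds peak-shape pattern matching, motif evidence, and a two-step eFDR; the Poisson background and simulation-based null below are assumptions of the sketch.

    ```python
    import numpy as np
    from scipy import stats

    def enrichment_cutoff(tag_counts, efdr=0.01, n_sim=200):
        """Choose an enrichment-score cutoff by empirical FDR: score each
        window by its Poisson tail probability under a robust background
        rate, simulate background windows for the null, and keep the most
        permissive cutoff whose estimated FDR stays below `efdr`."""
        rng = np.random.default_rng(0)
        counts = np.asarray(tag_counts)
        lam = max(float(np.median(counts)), 0.1)   # background tag rate
        obs = -stats.poisson.logsf(counts, lam)    # observed window scores
        null = -stats.poisson.logsf(
            rng.poisson(lam, (n_sim, counts.size)), lam)
        best = obs.max()
        for cut in np.sort(obs)[::-1]:             # strictest cutoff first
            exp_fp = (null >= cut).sum() / n_sim   # expected false positives
            if exp_fp / max((obs >= cut).sum(), 1) <= efdr:
                best = cut                         # still meets target eFDR
            else:
                break
        return best
    ```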

  20. Knowledge-based analysis of microarrays for the discovery of transcriptional regulation relationships

    PubMed Central

    2010-01-01

    Background The large amount of high-throughput genomic data has facilitated the discovery of the regulatory relationships between transcription factors and their target genes. While early methods for discovery of transcriptional regulation relationships from microarray data often focused on the high-throughput experimental data alone, more recent approaches have explored the integration of external knowledge bases of gene interactions. Results In this work, we develop an algorithm that provides improved performance in the prediction of transcriptional regulatory relationships by supplementing the analysis of microarray data with a new method of integrating information from an existing knowledge base. Using a well-known dataset of yeast microarrays and the Yeast Proteome Database, a comprehensive collection of known information of yeast genes, we show that knowledge-based predictions demonstrate better sensitivity and specificity in inferring new transcriptional interactions than predictions from microarray data alone. We also show that comprehensive, direct and high-quality knowledge bases provide better prediction performance. Comparison of our results with ChIP-chip data and growth fitness data suggests that our predicted genome-wide regulatory pairs in yeast are reasonable candidates for follow-up biological verification. Conclusion High quality, comprehensive, and direct knowledge bases, when combined with appropriate bioinformatic algorithms, can significantly improve the discovery of gene regulatory relationships from high throughput gene expression data. PMID:20122245
